Speaker
Ian Ozsvald
Material
Note
pandas
saving RAM
- Strings are expensive and slow → Categorical
- e.g., df.CompanyCategory.astype('category')
- cheaper and faster
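A minimal sketch of the Categorical saving, assuming a made-up DataFrame: the three category labels and the row count are hypothetical, and only the `CompanyCategory` column name comes from the notes. It compares the memory footprint of the same column before and after the conversion.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 100,000 rows drawn from three made-up company categories.
rng = np.random.default_rng(0)
labels = ["Private Limited Company", "Public Limited Company", "Charity"]
df = pd.DataFrame({"CompanyCategory": rng.choice(labels, size=100_000)})

# deep=True asks pandas to count the string payloads, not just the pointers.
as_strings = df["CompanyCategory"].memory_usage(deep=True)
as_category = df["CompanyCategory"].astype("category").memory_usage(deep=True)

print(f"strings:  {as_strings:,} bytes")
print(f"category: {as_category:,} bytes")
```

The saving depends on how few distinct values the column holds; a column of mostly unique strings is unlikely to benefit.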
float64 is default and a bit expensive
float32 is "half-price" and a bit faster
- dtype_diet
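To illustrate the "half-price" point, here is a small sketch on randomly generated data (the Series and its size are assumptions for illustration). It downcasts by hand with `astype` and does not use dtype_diet.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one million random floats; NumPy produces float64 by default.
rng = np.random.default_rng(0)
s64 = pd.Series(rng.random(1_000_000))
s32 = s64.astype("float32")

# index=False measures only the values, so the comparison is like for like.
bytes64 = s64.memory_usage(index=False)
bytes32 = s32.memory_usage(index=False)

print(s64.dtype, f"{bytes64:,} bytes")
print(s32.dtype, f"{bytes32:,} bytes")
```

float32 keeps roughly seven significant decimal digits, so it is worth checking that the lost precision does not matter for the column before downcasting.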
drop to NumPy if you know you can
- e.g., df['age_years'].sum() is much slower than df['age_years'].values.sum()
- bypasses a lot of method lookups
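The comparison above can be tried with a rough timing sketch like the following. The DataFrame contents and the repeat count are assumptions, and the measured gap will vary with data size, pandas version, and hardware.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: a small integer column, where per-call overhead dominates.
rng = np.random.default_rng(0)
df = pd.DataFrame({"age_years": rng.integers(0, 100, size=1_000)})

# Both routes should give the same total on data without missing values.
pandas_total = df["age_years"].sum()
numpy_total = df["age_years"].values.sum()

t_pandas = timeit.timeit(lambda: df["age_years"].sum(), number=1_000)
t_numpy = timeit.timeit(lambda: df["age_years"].values.sum(), number=1_000)

print(f"pandas sum: {t_pandas:.4f} s for 1,000 calls")
print(f"numpy sum:  {t_numpy:.4f} s for 1,000 calls")
```

One reason for the "if you know you can" caveat: the pandas `sum` skips NaN values by default, while the NumPy `sum` returns NaN if any are present, so the two only agree on clean data.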
install optional pandas dependencies
mistakes slow us down
Other than pandas
- compile with Numba
- Dask for multi-core
- make plain-python code multi-core
- Vaex
- Modin