Speaker

Eyal Trabelsi

Material

Note

  • use what you need
    • keep needed columns and rows only
  • avoid loops
    • use vectorized operations
  • type matters
    • supported types
      • int64
      • float64
      • bool
      • object
      • datetime64
      • timedelta
      • Category
      • Sparse Types
      • Nullable Integer / Nullable boolean
  • pandas usage
    • chunks
    • query
      • use numexpr
        • e.g., replace df[df.col == "val"] with df.query("col == 'val'")
    • use concat instead of append
    • groupby
      • filter early
      • custom functions are slow
    • merge
      • filter / aggregate early
      • join on index
  • compiled code
    • Cython
    • Numba
  • general Python techniques
    • cache
    • use intermediate variables
    • concurrency and parallelism
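
Several of the pandas notes above can be sketched in a few lines. This is a minimal illustration and not code from the talk: the DataFrame, its column names (city, price, qty, unused), and the chunk size are all hypothetical, chosen only to make the example self-contained.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({
    "city": ["Tel Aviv", "Haifa", "Tel Aviv", "Eilat"] * 250,
    "price": np.arange(1000, dtype="float64"),
    "qty": np.arange(1000) % 7,
    "unused": ["x"] * 1000,
})

# Use what you need: keep only the needed rows and columns.
small = df.loc[df["qty"] > 0, ["city", "price", "qty"]]

# Avoid loops: one vectorized expression instead of iterating over rows.
small = small.assign(total=small["price"] * small["qty"])

# Type matters: a low-cardinality string column usually shrinks as a category.
string_bytes = small["city"].memory_usage(deep=True)
category_bytes = small["city"].astype("category").memory_usage(deep=True)

# query: selects the same rows as boolean indexing.
by_mask = small[small["city"] == "Haifa"]
by_query = small.query("city == 'Haifa'")

# concat: collect the pieces in a list and concatenate once,
# rather than growing a DataFrame piece by piece.
parts = [small[small["qty"] == q] for q in sorted(small["qty"].unique())]
combined = pd.concat(parts)

# chunks: process a large CSV piece by piece, reading only the needed column.
csv_text = df.to_csv(index=False)
total_qty = 0
for chunk in pd.read_csv(io.StringIO(csv_text), usecols=["qty"], chunksize=250):
    total_qty += int(chunk["qty"].sum())
```

On the concat point, note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so pd.concat is the only option on current versions. Whether query is actually faster than boolean indexing depends on the data size and on numexpr being installed, so it is worth measuring on real data.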


Published

Category

EuroPython 2020
