TensorLearn
Back to Course
Data Intelligence: NumPy & Pandas
Module 13 of 15

14. Scaling: Dask & Ray

1. The RAM Limit

Pandas crashes if your dataset > RAM. Dask breaks the big dataframe into small chunks and processes them on disk.

2. Distributed Computing

Use Ray to parallelize Python code across a cluster of 100 machines.

python
import ray ray.init() @ray.remote def heavy_task(x): return x * x futures = [heavy_task.remote(i) for i in range(1000)] results = ray.get(futures)

Mark as Completed

TensorLearn - AI Engineering for Professionals