TensorLearn
Data Intelligence: NumPy & Pandas
Module 2 of 15

2. NumPy Basics

1. The n-dimensional Array (ndarray)

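Why is Python slow? Because the interpreter has to check the type of every single element on every pass through a loop. Why is NumPy fast? Because it cheats: every element in an array has the same type, so NumPy checks it once and pushes the loop down to compiled C.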
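To make this concrete, here is a minimal sketch that computes the same sum of squares twice: once with a plain Python loop and once with NumPy. The function names and the array size are made up for illustration, and the timings depend on your machine, so treat the printed numbers as a rough indication only.

```python
import timeit

import numpy as np

n = 100_000  # illustrative size, not a benchmark standard
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)


def python_loop_sum_of_squares(values):
    # The interpreter inspects the type of v on every iteration.
    total = 0
    for v in values:
        total += v * v
    return total


def numpy_sum_of_squares(arr):
    # One call: the multiply and the sum both loop inside compiled code.
    return int((arr * arr).sum())


loop_result = python_loop_sum_of_squares(py_list)
vec_result = numpy_sum_of_squares(np_arr)

loop_time = timeit.timeit(lambda: python_loop_sum_of_squares(py_list), number=5)
vec_time = timeit.timeit(lambda: numpy_sum_of_squares(np_arr), number=5)

print(f"same answer: {loop_result == vec_result}")
print(f"python loop: {loop_time:.4f}s, numpy: {vec_time:.4f}s")
```

Both versions do the same arithmetic. The only difference is where the loop runs.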

Memory Layout

A Python list is an array of pointers to objects scattered across the heap. A NumPy array stores the raw values themselves in one contiguous block of memory.

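  • CPU Cache: The CPU never fetches a single number. It loads a whole cache line (typically 64 bytes), so reading one number also pulls in its neighbors: one line holds 16 float32 values or 8 int64 values.
  • Locality: Because NumPy arrays are contiguous, the next value is usually already in the cache and the hardware prefetcher guesses right. Together with the compiled loop, this often makes NumPy 50x-100x faster than a plain Python loop, though the exact speedup depends on the operation and the hardware.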
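The layout difference can be inspected directly. The sketch below assumes a 64-byte cache line (typical on x86-64, but not universal) and uses `sys.getsizeof`, which reports CPython-specific sizes, so the printed byte counts may differ on your system.

```python
import sys

import numpy as np

CACHE_LINE_BYTES = 64  # assumption: common on x86-64, varies by CPU

arr = np.arange(1000, dtype=np.int64)
py_list = list(range(1000))

# The array is one contiguous buffer of raw 8-byte values.
contiguous = arr.flags['C_CONTIGUOUS']
values_per_line = CACHE_LINE_BYTES // arr.itemsize

# The list's own buffer holds only pointers; every int is a separate object.
list_buffer_bytes = sys.getsizeof(py_list)
int_object_bytes = sys.getsizeof(py_list[-1])

print(f"contiguous: {contiguous}, array bytes: {arr.nbytes}")
print(f"values per cache line: {values_per_line}")
print(f"list buffer: {list_buffer_bytes} bytes, plus {int_object_bytes} bytes per int object")
```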

2. DataTypes (Dtypes)

In Python, an integer is a full object of variable size (28 bytes or more on 64-bit CPython). In NumPy, you define the EXACT size.

  • int8: 1 byte (-128 to 127).
  • int64: 8 bytes (the default integer type on most 64-bit platforms).
  • float32: 4 bytes (single precision, the usual choice on GPUs).
python
import numpy as np

# Force the array to use small integers to save RAM
a = np.array([1, 2, 3], dtype='int8')
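A short sketch of what the smaller dtype buys you, and what it costs. The variable names are illustrative. The second half shows the catch: fixed-size integers wrap around silently when a result does not fit.

```python
import numpy as np

values = [1, 2, 3]
small = np.array(values, dtype=np.int8)
large = np.array(values, dtype=np.int64)

# Same three numbers, very different footprint.
print(small.nbytes, large.nbytes)  # 3 bytes vs 24 bytes

# The valid range for each dtype is available from np.iinfo.
int8_info = np.iinfo(np.int8)
print(int8_info.min, int8_info.max)  # -128 127

# The price of a small dtype: array arithmetic wraps around on overflow.
edge = np.array([127], dtype=np.int8)
one = np.array([1], dtype=np.int8)
wrapped = edge + one
print(wrapped)  # [-128]
```

Pick the smallest dtype that safely covers your data's range, not just the smallest dtype.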

3. Shape & Strides

How does a 1D block of memory act like a 2D matrix? Math.

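  • Shape: The size of each dimension, e.g., (3, 4) means 3 rows and 4 columns.
  • Strides: How many bytes to step to reach the next element along each dimension. A (3, 4) int64 array in row-major order has strides (32, 8): 32 bytes to the next row, 8 bytes to the next column.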
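The "math" is a single formula: the byte offset of element (i, j) is i * strides[0] + j * strides[1]. Here is a minimal sketch that does the lookup by hand. The names `flat`, `grid`, `i`, and `j` are chosen for illustration.

```python
import numpy as np

flat = np.arange(12, dtype=np.int64)  # the 1D block: 0, 1, ..., 11
grid = flat.reshape(3, 4)             # the same block seen as 3 rows x 4 cols

print(grid.shape)    # (3, 4)
print(grid.strides)  # (32, 8): 4 cols * 8 bytes per row step, 8 bytes per col step

# Find element (2, 1) by hand using the strides.
i, j = 2, 1
offset_bytes = i * grid.strides[0] + j * grid.strides[1]
flat_index = offset_bytes // grid.itemsize

print(offset_bytes, flat_index)          # 72 9
print(flat[flat_index] == grid[i, j])    # True
```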

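If you change the shape, NumPy usually just changes the metadata (shape and strides) and returns a view on the same data. No copy is made. (Crucial for performance.) The exception: when the new shape cannot be described with strides over the existing buffer, for example when flattening a transposed array, NumPy has to copy.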
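You can check whether a reshape shared the data with `np.shares_memory`. The sketch below uses illustrative variable names and shows both the cheap case (a view) and a case where NumPy is forced to copy.

```python
import numpy as np

flat = np.arange(12, dtype=np.int64)
grid = flat.reshape(3, 4)

# Same buffer, new metadata: writing through the view changes the original.
is_view = np.shares_memory(flat, grid)
grid[0, 0] = 99
print(is_view, flat[0])  # True 99

# Transposing is also metadata-only: the strides are simply swapped.
transposed = grid.T
print(transposed.strides)  # (8, 32)

# Flattening the transposed array cannot be done with strides alone,
# so reshape falls back to copying the data.
flattened = transposed.reshape(12)
is_copy = not np.shares_memory(transposed, flattened)
print(is_copy)  # True
```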

