NumPy for Data Science
NumPy Arrays, Vectorization & Broadcasting
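NumPy provides the fast array operations that underpin most numerical computing in Python, including libraries such as pandas, SciPy and scikit-learn. Understanding arrays and broadcasting lets you replace explicit Python loops with shorter array expressions that run in compiled code.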
Creating & Inspecting Arrays
Conceptually, a NumPy array is a block of memory (contiguous when the array is first created, though views such as slices or transposes may not be) described by three things:
the data type (dtype), the shape (the size of each dimension) and the strides (how many bytes
to step to reach the next element along each dimension). Because every element has the same type and size, NumPy can run vectorized operations
in compiled C code instead of slow Python loops.
import numpy as np
# 1D and 2D arrays
v = np.array([1, 2, 3])
M = np.array([[1, 2, 3],
              [4, 5, 6]])
print("v shape:", v.shape) # (3,)
print("M shape:", M.shape) # (2, 3)
# Ranges and random
arr = np.arange(0, 10, 2) # [0, 2, 4, 6, 8]
rand = np.random.default_rng().standard_normal((3, 3)) # standard normal matrix (Generator API, recommended over the legacy randn)
print("arr:", arr)
print("rand mean:", rand.mean())
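The three descriptors mentioned above (dtype, shape, strides) can be inspected directly on any array. The following is a minimal sketch rather than part of the original example: the dtype is fixed to `int64` as an assumption so that the byte counts do not depend on the platform default, and the final comparison against a Python loop only illustrates that a vectorized expression gives the same result.

```python
import numpy as np

# Assumption: force int64 so each element is exactly 8 bytes on every platform.
M = np.array([[1, 2, 3],
              [4, 5, 6]], dtype=np.int64)

print(M.dtype)    # int64
print(M.shape)    # (2, 3)
print(M.strides)  # (24, 8): 24 bytes to the next row, 8 bytes to the next column

# Transposing returns a view: same memory, with shape and strides swapped.
T = M.T
print(T.shape, T.strides)       # (3, 2) (8, 24)
print(np.shares_memory(M, T))   # True
print(T.flags["C_CONTIGUOUS"])  # False: the view is no longer row-major contiguous

# A vectorized expression and an explicit Python loop produce the same values;
# the vectorized form does the element-wise work in compiled code.
squares_vec = M * M
squares_loop = np.array([[x * x for x in row] for row in M])
print(np.array_equal(squares_vec, squares_loop))  # True
```

Because the transpose only swaps the strides, no data is copied, which is why reshaping and slicing are usually cheap in NumPy.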
Slicing, Boolean Indexing & Broadcasting
import numpy as np
x = np.array([10, 20, 30, 40, 50])
# Slicing
print(x[1:4]) # [20 30 40]
# Boolean indexing
mask = x >= 30
print("mask:", mask)
print("x[mask]:", x[mask])
# Broadcasting: add a scalar to all elements
print("x + 5:", x + 5)
M = np.array([[1, 2, 3],
              [4, 5, 6]])
col_means = M.mean(axis=0)
# Subtract column means from each row (broadcasting)
centered = M - col_means
print("centered:\n", centered)
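Subtracting column means works because NumPy compares shapes from the trailing axis backwards: `(2, 3)` and `(3,)` are compatible. Centering each row instead needs one extra step, since a shape of `(2,)` does not line up with the trailing axis of length 3. The sketch below reuses `M` from the example above; the names `row_means_2d` and `row_centered` are illustrative only.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])

# Column means have shape (3,), which matches M's trailing axis of length 3.
col_means = M.mean(axis=0)
print((M - col_means).shape)  # (2, 3)

# Row means have shape (2,), which does NOT match the trailing axis of length 3.
row_means = M.mean(axis=1)
try:
    M - row_means
except ValueError as err:
    print("broadcast error:", err)

# keepdims=True keeps the reduced axis with length 1, giving shape (2, 1),
# which broadcasts across the three columns.
row_means_2d = M.mean(axis=1, keepdims=True)
row_centered = M - row_means_2d
print(row_centered)
# [[-1.  0.  1.]
#  [-1.  0.  1.]]
```

An equivalent alternative is `row_means[:, np.newaxis]`, which adds the length-1 axis after the reduction instead of during it.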
Basic Linear Algebra with NumPy
import numpy as np
A = np.array([[1, 2],
              [3, 4]])
b = np.array([5, 6])
# Matrix-vector product
y = A @ b
# Transpose, inverse, eigenvalues
AT = A.T
invA = np.linalg.inv(A)
eig_vals, eig_vecs = np.linalg.eig(A)
print("A @ b:", y)
print("A^T:\n", AT)
print("A inverse:\n", invA)
print("Eigenvalues:", eig_vals)
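A quick way to build confidence in these results is to check them against their definitions. The sketch below, which assumes the same `A` and `b` as above, also shows `np.linalg.solve`: when the goal is to solve `A @ x = b`, it is generally preferred over computing the inverse explicitly.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
b = np.array([5, 6])

# Solve A @ x = b directly, without forming the inverse.
x = np.linalg.solve(A, b)
print("x:", x)
print(np.allclose(A @ x, b))  # True

# The inverse times A should give the identity (up to rounding error).
invA = np.linalg.inv(A)
print(np.allclose(A @ invA, np.eye(2)))  # True

# Each column of eig_vecs is an eigenvector v with A @ v == lambda * v.
# Multiplying by eig_vals broadcasts so that column j is scaled by eig_vals[j].
eig_vals, eig_vecs = np.linalg.eig(A)
print(np.allclose(A @ eig_vecs, eig_vecs * eig_vals))  # True
```

The comparisons use `np.allclose` rather than `==` because floating-point results are rarely exact to the last bit.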