Image thresholding
Global cv2.threshold
The signature is retval, dst = cv2.threshold(src, thresh, maxval, type). For binary output, pixels above thresh become maxval (often 255), others become 0. THRESH_BINARY_INV flips the roles—useful when objects are darker than the background.
import cv2
gray = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)
t, bin_img = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
_, bin_inv = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)
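To see the rule without loading a file, the comparison can be mimicked in NumPy. This is only a sketch of the behavior described above, not OpenCV's implementation; the helper names and the `sample` array are made up for illustration.

```python
import numpy as np

def binary_threshold(src, thresh, maxval):
    """Pixels strictly above thresh become maxval, all others 0."""
    return np.where(src > thresh, maxval, 0).astype(np.uint8)

def binary_threshold_inv(src, thresh, maxval):
    """Inverse rule: pixels strictly above thresh become 0, others maxval."""
    return np.where(src > thresh, 0, maxval).astype(np.uint8)

# Hypothetical 2x3 "image" covering values around the threshold
sample = np.array([[0, 100, 127],
                   [128, 200, 255]], dtype=np.uint8)

bin_img = binary_threshold(sample, 127, 255)
bin_inv = binary_threshold_inv(sample, 127, 255)
print(bin_img)
print(bin_inv)
```

A pixel exactly equal to thresh (127 here) lands on the 0 side for the binary mode, because the comparison is strict.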
Truncate and zero modes
# Values above 180 become 180; below unchanged (still grayscale)
_, trunc = cv2.threshold(gray, 180, 255, cv2.THRESH_TRUNC)
# Below threshold → 0; above → unchanged
_, tozero = cv2.threshold(gray, 100, 255, cv2.THRESH_TOZERO)
_, tozero_inv = cv2.threshold(gray, 100, 255, cv2.THRESH_TOZERO_INV)
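The three grayscale-preserving modes can be written as one-liners in NumPy. The sketch below assumes the same strict "above thresh" comparison as the binary modes; the `row` array and the threshold of 120 are arbitrary illustration values.

```python
import numpy as np

row = np.array([50, 100, 150, 200], dtype=np.uint8)
thresh = 120

# THRESH_TRUNC: values above thresh are clamped to thresh, others unchanged
trunc = np.minimum(row, thresh)

# THRESH_TOZERO: values above thresh are kept, others become 0
tozero = np.where(row > thresh, row, 0)

# THRESH_TOZERO_INV: values above thresh become 0, others are kept
tozero_inv = np.where(row > thresh, 0, row)

print(trunc, tozero, tozero_inv)
```

Unlike the binary modes, none of these outputs is a two-level mask, and maxval plays no role in the result.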
Otsu and Triangle (automatic thresh)
Otsu picks a threshold by maximizing the between-class variance of the histogram, so it works well for roughly bimodal histograms (clear foreground/background). Pass THRESH_OTSU as a flag combined with THRESH_BINARY; the returned t is the chosen value. Triangle draws a line from the histogram peak to the far end of the longer tail and picks the bin that lies farthest from that line; it is a good fit when one tail is long (e.g. bright objects on dark background).
import cv2
gray = cv2.imread("cells.png", cv2.IMREAD_GRAYSCALE)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
t_otsu, bin_o = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Otsu threshold:", t_otsu)
t_tri, bin_t = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE)
print("Triangle threshold:", t_tri)
The thresh argument (the second positional argument, after src) is ignored when Otsu/Triangle is used; OpenCV still requires a placeholder (commonly 0).
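The "maximize between-class variance" criterion can be made concrete with a short NumPy sketch. It is a textbook formulation written for illustration, not OpenCV's code, and it assumes an 8-bit single-channel input; `otsu_threshold` and `demo` are hypothetical names.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the level t that maximizes between-class variance.

    Pixels <= t form one class, pixels > t the other.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    w0 = np.cumsum(prob)                 # weight of the low class for each t
    w1 = 1.0 - w0                        # weight of the high class
    cum_mean = np.cumsum(prob * levels)  # unnormalized mean of the low class
    total_mean = cum_mean[-1]

    # Between-class variance w0*w1*(mu0 - mu1)^2, rewritten so that no
    # per-class mean has to be divided out; empty classes are skipped.
    valid = (w0 > 0) & (w1 > 0)
    sigma_b = np.zeros(256)
    sigma_b[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / (
        w0[valid] * w1[valid]
    )
    return int(np.argmax(sigma_b))

# Hypothetical two-level image: half the pixels at 50, half at 200
demo = np.concatenate([np.full(8, 50), np.full(8, 200)]).astype(np.uint8)
demo = demo.reshape(4, 4)
t = otsu_threshold(demo)
print("sketch threshold:", t)
```

For this two-valued demo every level from 50 to 199 separates the classes equally well, and argmax reports the first of them. On real images cv2.threshold with THRESH_OTSU remains the practical choice; the sketch only shows what is being optimized.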
Otsu on inverted image
# If objects are dark on light paper, invert first or use BINARY_INV
inv = cv2.bitwise_not(gray)
t2, bin2 = cv2.threshold(inv, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
Adaptive thresholding
When illumination varies across the scene, a single global t fails. cv2.adaptiveThreshold computes a threshold from a blockSize × blockSize neighborhood around each pixel (odd size, e.g. 11, 21). ADAPTIVE_THRESH_MEAN_C uses the mean minus C; GAUSSIAN_C uses a weighted Gaussian window minus C.
import cv2
gray = cv2.imread("receipt.jpg", cv2.IMREAD_GRAYSCALE)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
block, C = 15, 4
ad_mean = cv2.adaptiveThreshold(
gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, block, C)
ad_gauss = cv2.adaptiveThreshold(
gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 21, 5)
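Increase blockSize for smoother, more global behavior. C is subtracted from the local mean, so with THRESH_BINARY a larger C lowers the local threshold and turns more pixels white, while a smaller (or negative) C turns more pixels black.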
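The mean variant can be sketched in NumPy to show why a local threshold survives an illumination gradient. The sketch assumes replicated borders and exact floating-point means, so results may differ slightly from cv2.adaptiveThreshold near edges or after rounding; `adaptive_mean_threshold` and the synthetic `page` are made up for illustration, and `sliding_window_view` needs NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def adaptive_mean_threshold(gray, block_size, c, maxval=255):
    """Mean-based adaptive threshold: pixel > local_mean - c -> maxval."""
    if block_size % 2 == 0 or block_size < 3:
        raise ValueError("block_size must be odd and at least 3")
    pad = block_size // 2
    # Assumption of this sketch: borders are handled by edge replication
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    windows = sliding_window_view(padded, (block_size, block_size))
    local_mean = windows.mean(axis=(-2, -1))
    return np.where(gray > local_mean - c, maxval, 0).astype(np.uint8)

# Synthetic page: brightness ramps from 60 (left) to 200 (right),
# with one dark horizontal stroke that is 50 levels below the paper.
ramp = np.tile(np.linspace(60, 200, 15), (15, 1))
ramp[7, :] -= 50
page = ramp.astype(np.uint8)

ad = adaptive_mean_threshold(page, block_size=5, c=10)
global_bw = np.where(page > 127, 255, 0).astype(np.uint8)

print("adaptive: stroke row is black:", bool((ad[7] == 0).all()))
print("global: dark left side lost:", bool((global_bw[0, :6] == 0).all()))
```

The adaptive result keeps the stroke black and the paper white across the whole ramp, while the fixed threshold of 127 turns the dim left side of the paper black as well.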
Combined pipelines
Real workflows often chain blur → threshold → morphology (next chapter). Example: isolate dark text after smoothing out noise.
import cv2
gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
_, bw = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
# bw: likely text as white — ready for morphological cleanup
Color “thresholding” with inRange
For colored objects, threshold each channel in HSV (or LAB) space. cv2.inRange returns a binary mask where all channel constraints hold.
import cv2
import numpy as np
bgr = cv2.imread("fruit.jpg")
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
lower = np.array([35, 60, 60])
upper = np.array([85, 255, 255])
mask = cv2.inRange(hsv, lower, upper)
fg = cv2.bitwise_and(bgr, bgr, mask=mask)
Tune lower/upper with sliders or by sampling pixels from the object; watch OpenCV’s H hue scale (0–179 for 8-bit).
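What inRange computes can be illustrated with a small NumPy sketch: a pixel passes only if every channel lies inside its inclusive bounds. The helper `in_range` and the four hand-written HSV pixels are illustration values, not part of OpenCV.

```python
import numpy as np

def in_range(img, lower, upper):
    """255 where every channel lies in [lower, upper] (inclusive), else 0."""
    inside = np.all((img >= lower) & (img <= upper), axis=-1)
    return np.where(inside, 255, 0).astype(np.uint8)

# Four hypothetical HSV pixels arranged as a 2x2 image
hsv = np.array([[[60, 200, 200], [60, 30, 200]],
                [[10, 200, 200], [85, 255, 255]]], dtype=np.uint8)
lower = np.array([35, 60, 60])
upper = np.array([85, 255, 255])

mask = in_range(hsv, lower, upper)
print(mask)
```

The second pixel fails on saturation and the third on hue, so one out-of-range channel is enough to reject a pixel; the fourth sits exactly on the upper bounds and still passes.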
Takeaways
- Otsu for clean bimodal scenes; Triangle for skewed histograms.
- Adaptive for shadows and uneven lighting on documents or outdoor text.
- Use HSV + inRange when “brightness threshold” is not enough—separate hue from value.
Quick FAQ
cv2.threshold expects single-channel 8-bit (or other supported types). For BGR, convert to gray or threshold each channel separately and combine masks with bitwise logic.
Morphological operations
Structuring elements
cv2.getStructuringElement(shape, ksize) builds the probe: MORPH_RECT, MORPH_ELLIPSE, or MORPH_CROSS. Larger kernels have stronger geometric effect—use odd sizes (3, 5, 7, …).
import cv2
k3 = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
k5e = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
k7c = cv2.getStructuringElement(cv2.MORPH_CROSS, (7, 7))
Erosion
Shrinks bright regions, removes thin protrusions, breaks narrow bridges.
Dilation
Grows bright regions, fills small holes, reconnects broken strokes.
Iterations
Pass iterations=n to repeat erosion or dilation n times for a stronger effect without resorting to a huge kernel.
Erosion and dilation
import cv2
# bw: uint8 binary, foreground white (255)
bw = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
er = cv2.erode(bw, kernel, iterations=1)
dl = cv2.dilate(bw, kernel, iterations=1)
er2 = cv2.erode(bw, kernel, iterations=2)
dl3 = cv2.dilate(bw, kernel, iterations=3)
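Border pixels: the default border type is BORDER_CONSTANT, and the default border value is a special sentinel that acts as the neutral element (maximum for erosion, minimum for dilation), so the image edge alone does not shrink or grow objects. If you pass borderValue=0 explicitly, erosion treats everything outside the image as black, and many iterations can “eat” foreground that touches the edge.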
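With a 3×3 rectangular kernel, erosion is a neighborhood minimum and dilation a neighborhood maximum. The NumPy sketch below illustrates that idea on a synthetic mask; `erode3` and `dilate3` are hypothetical helpers, and padding with the neutral value (255 for min, 0 for max) is a choice made for this sketch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def erode3(img):
    """3x3 rectangular erosion: minimum over each pixel's neighborhood."""
    padded = np.pad(img, 1, mode="constant", constant_values=255)
    return sliding_window_view(padded, (3, 3)).min(axis=(-2, -1))

def dilate3(img):
    """3x3 rectangular dilation: maximum over each pixel's neighborhood."""
    padded = np.pad(img, 1, mode="constant", constant_values=0)
    return sliding_window_view(padded, (3, 3)).max(axis=(-2, -1))

bw = np.zeros((7, 7), dtype=np.uint8)
bw[2:5, 2:5] = 255            # a 3x3 white square

er = erode3(bw)               # shrinks the square by one pixel per side
dl = dilate3(bw)              # grows the square by one pixel per side

print("white pixels:", int((bw == 255).sum()),
      "eroded:", int((er == 255).sum()),
      "dilated:", int((dl == 255).sum()))
```

Applying either helper twice corresponds to iterations=2 with the same kernel.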
Opening and closing
Opening = erosion then dilation—removes small bright noise and smooths boundaries without growing the main objects much. Closing = dilation then erosion—fills small dark holes inside foreground and bridges narrow gaps.
import cv2
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
opened = cv2.morphologyEx(bw, cv2.MORPH_OPEN, kernel, iterations=1)
closed = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, kernel, iterations=1)
# Equivalent explicit opening:
# tmp = cv2.erode(bw, kernel); opened = cv2.dilate(tmp, kernel)
Noise vs broken strokes
k3 = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
k5e = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
# Salt on black background: opening cleans white specks
clean_fg = cv2.morphologyEx(bw, cv2.MORPH_OPEN, k3, iterations=1)
# Gaps in text strokes: closing helps
solid_text = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, k5e, iterations=1)
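The two compositions can be checked on synthetic masks where the expected outcome is obvious. This NumPy sketch uses a fixed 3×3 rectangular kernel and hypothetical helper names; it illustrates the erosion/dilation ordering only and is not a replacement for cv2.morphologyEx.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def erode3(img):
    padded = np.pad(img, 1, mode="constant", constant_values=255)
    return sliding_window_view(padded, (3, 3)).min(axis=(-2, -1))

def dilate3(img):
    padded = np.pad(img, 1, mode="constant", constant_values=0)
    return sliding_window_view(padded, (3, 3)).max(axis=(-2, -1))

def open3(img):
    """Opening: erosion followed by dilation."""
    return dilate3(erode3(img))

def close3(img):
    """Closing: dilation followed by erosion."""
    return erode3(dilate3(img))

# Reference object: a solid 5x5 white square
clean = np.zeros((9, 9), dtype=np.uint8)
clean[2:7, 2:7] = 255

noisy = clean.copy()
noisy[0, 8] = 255             # isolated white speck ("salt")

holed = clean.copy()
holed[4, 4] = 0               # one-pixel dark hole inside the object

opened = open3(noisy)         # speck removed, square restored
closed = close3(holed)        # hole filled, square size unchanged

print(np.array_equal(opened, clean), np.array_equal(closed, clean))
```

Using the wrong operation is costly on these masks: opening `holed` with this kernel erases the square entirely, because every 3×3 neighborhood that fits inside it contains the hole.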
Gradient, top-hat, black-hat
Morphological gradient ≈ dilation minus erosion—outline of objects. Top-hat = image minus its opening—highlights small bright details on a dark background. Black-hat = closing minus image—dark details on bright background.
import cv2
grad = cv2.morphologyEx(bw, cv2.MORPH_GRADIENT, kernel)
gray = cv2.imread("texture.png", cv2.IMREAD_GRAYSCALE)
k = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 9))
tophat = cv2.morphologyEx(gray, cv2.MORPH_TOPHAT, k)
blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, k)
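The three derived operators are differences of the basic ones, which a small grayscale example makes visible. The sketch below assumes a 3×3 rectangular kernel; `morph3` is a hypothetical helper, and the flat image with a single bright pixel is synthetic.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def morph3(img, reduce_fn, pad_value):
    """Apply a 3x3 min (erosion) or max (dilation) filter."""
    padded = np.pad(img, 1, mode="constant", constant_values=pad_value)
    return reduce_fn(sliding_window_view(padded, (3, 3)), axis=(-2, -1))

gray = np.full((7, 7), 50, dtype=np.uint8)
gray[3, 3] = 200                          # one small bright detail

eroded = morph3(gray, np.min, 255)
dilated = morph3(gray, np.max, 0)
opening = morph3(eroded, np.max, 0)       # erosion then dilation
closing = morph3(dilated, np.min, 255)    # dilation then erosion

gradient = dilated - eroded               # dilation minus erosion
tophat = gray - opening                   # image minus its opening
blackhat = closing - gray                 # closing minus image

print("top-hat peak:", int(tophat[3, 3]),
      "black-hat max:", int(blackhat.max()))
```

The top-hat isolates the bright pixel (value 150 above the background) and the black-hat stays empty because the image has no dark detail. The uint8 subtractions are safe here since dilation is never below erosion, the opening never exceeds the image, and the closing never falls below it.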
Hit-or-miss (shape matching)
MORPH_HITMISS finds pixels where a binary pattern and its complement align with a template kernel (hits, misses, and “don’t care” positions—see your OpenCV version’s rules for exact matrix values). Common on skeletonized text or thin structures to locate T-junctions or endpoints.
import cv2
import numpy as np
# Placeholder kernel — replace with a pattern from OpenCV hit-or-miss docs for your task
hitmiss_kernel = np.array([[0, 1, 0],
[1, 1, 0],
[0, 0, 0]], dtype=np.int8)
out = cv2.morphologyEx(bw, cv2.MORPH_HITMISS, hitmiss_kernel)
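In OpenCV's convention, 1 marks positions that must be foreground, -1 positions that must be background, and 0 “don’t care” positions. Other tutorials sometimes use a different encoding, so always cross-check the official cv.morphologyEx hit-miss section for your build.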
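The matching rule itself is easy to sketch in NumPy. The version below assumes the convention where 1 means "must be foreground", -1 means "must be background", and 0 means "don't care", handles only 3×3 kernels, and treats everything outside the image as background; `hit_or_miss` and the isolated-pixel kernel are illustration choices, not OpenCV code.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def hit_or_miss(bw, kernel):
    """Mark pixels whose 3x3 neighborhood matches the kernel pattern."""
    if kernel.shape != (3, 3):
        raise ValueError("this sketch only supports 3x3 kernels")
    fg = np.pad(bw > 0, 1, mode="constant", constant_values=False)
    windows = sliding_window_view(fg, (3, 3))
    # Every position marked 1 must be foreground ...
    hits_ok = np.all(windows | (kernel != 1), axis=(-2, -1))
    # ... and every position marked -1 must be background.
    miss_ok = np.all(~windows | (kernel != -1), axis=(-2, -1))
    return np.where(hits_ok & miss_ok, 255, 0).astype(np.uint8)

# Pattern for an isolated pixel: white center, black 8-neighborhood
isolated = np.array([[-1, -1, -1],
                     [-1,  1, -1],
                     [-1, -1, -1]], dtype=np.int8)

bw = np.zeros((5, 5), dtype=np.uint8)
bw[1, 1] = 255                # isolated pixel
bw[3, 2:4] = 255              # two-pixel horizontal segment

out = hit_or_miss(bw, isolated)
print(np.argwhere(out == 255))
```

Only the isolated pixel matches; each pixel of the two-pixel segment has a white neighbor and is rejected by the -1 entries.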
End-to-end: clean a binary scan
import cv2
gray = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)
blur = cv2.GaussianBlur(gray, (3, 3), 0)
_, bw = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
k = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
bw = cv2.morphologyEx(bw, cv2.MORPH_OPEN, k, iterations=1)
bw = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, k, iterations=1)
Takeaways
- Opening removes small foreground noise; closing fills holes in foreground.
- Prefer several small iterations or moderate kernels over one huge kernel for smoother control.
- Gradient / top-hat / black-hat extract boundaries and fine structure on binary or gray images.
Quick FAQ
If your foreground is black on a white background, apply cv2.bitwise_not first so the semantics match your intent.