'How to crop image to only text section with Python OpenCV?

I want to crop the image to only extract the text sections. There are thousands of them with different sizes so I can't hardcode coordinates. I'm trying to remove the unwanted lines on the left and on the bottom. How can I do this?

Original Expected
image_1 image_2


Solution 1:[1]

Here's a simple approach:

  1. Obtain binary image. Load the image, grayscale, Gaussian blur, then Otsu's threshold to obtain a binary black/white image.

  2. Remove horizontal lines. Since we're trying to only extract text, we remove horizontal lines to aid us in our next step so incorrect contours will not merge together.

  3. Merge text into a single contour. The idea is that characters which are adjacent to each other are part of the wall of text. So we can dilate individual contours together to obtain a single contour to extract.

  4. Find contours and extract ROI. We find contours, sort contours by area, then extract the largest contour ROI using Numpy slicing.


Here's the visualization of each step:

Binary image -> Removed horizontal lines in green

1 2
thresh horizontal

Dilate to combine into a single contour -> Detected ROI to extract in green

3 4
dilate ROI

Result

Code

import cv2
import numpy as np

# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3, 3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=1)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(thresh, [c], -1, 0, -1)

# Dilate to merge into a single contour
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,30))
dilate = cv2.dilate(thresh, vertical_kernel, iterations=3)

# Find contours, sort for largest contour and extract ROI
cnts, _ = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:-1]
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 4)
    ROI = original[y:y+h, x:x+w]
    break

cv2.imshow('image', image)
cv2.imshow('dilate', dilate)
cv2.imshow('thresh', thresh)
cv2.imshow('ROI', ROI)
cv2.waitKey()

Solution 2:[2]

Determine the least spanning bounding box by finding all the non-zero points in the image. Finally, crop your image using this bounding box. Finding the contours is time-consuming and unnecessary here, especially because your text is axis-aligned. You may accomplish your goal by combining cv2.findNonZero and cv2.boundingRect.

Hope this will work ! :

import numpy as np
import cv2
img = cv2.imread(r"W430Q.png")
  # Read in the image and convert to grayscale
img = img[:-20, :-20]  # Perform pre-cropping
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = 255*(gray < 50).astype(np.uint8)  # To invert the text to white
gray = cv2.morphologyEx(gray, cv2.MORPH_OPEN, np.ones(
    (2, 2), dtype=np.uint8))  # Perform noise filtering
coords = cv2.findNonZero(gray)  # Find all non-zero points (text)
x, y, w, h = cv2.boundingRect(coords)  # Find minimum spanning bounding box
# Crop the image - note we do this on the original image
rect = img[y:y+h, x:x+w]
cv2.imshow("Cropped", rect)  # Show it
cv2.waitKey(0)
cv2.destroyAllWindows()

in above code from forth line of code is where I set the threshold below 50 to make the dark text white. However, because this outputs a binary image, I convert to uint8, then scale by 255. The text is effectively inverted.

Then, using cv2.findNonZero, we discover all of the non-zero locations for this image.We then passed this to cv2.boundingRect, which returns the top-left corner of the bounding box, as well as its width and height. Finally, we can utilise this to crop the image. This is done on the original image, not the inverted version.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 nathancy
Solution 2 Parthiban Marimuthu