Basics of OpenCV

Abhishek Suryavanshi
Aug 17, 2021


OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. It was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products.

Image processing is a form of signal processing in which the input is an image, such as a photograph or video frame, and the output is either an image or a set of characteristics related to that image. OpenCV is a library of programming functions used mainly for image processing.

Some common image processing techniques are image filtering, image transformation, object tracking, and feature detection.

Dealing With Images:

Digital color images are built from three primary colors: red, green, and blue. Mixing them in the right proportions produces almost any other color. This concept has been in use since the cathode ray tube televisions of a few decades ago. So how exactly does this work?

Each of these colors is stored as an 8-bit integer, so every entry in a color's matrix ranges from 0 to 255. The reason is that 2^8 is 256, and 0–255 covers exactly 256 values. Each color channel is a matrix of values in this range, and the three channels are stacked on top of each other to form a 3-dimensional image. Because that is a slightly more complex case, let us switch to grayscale images, which are easier to understand: they have a single channel in which each pixel is a shade of gray from 0 (black) to 255 (white).
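The pixel layout described here can be sketched with plain NumPy arrays, which is the type OpenCV uses for images in Python. The 3x3 values below are made up for illustration. One detail that is not mentioned above but is worth knowing: OpenCV orders the color channels as blue, green, red (BGR) rather than RGB.

```python
import numpy as np

# An 8-bit channel holds 2**8 = 256 distinct values: 0 through 255.
levels = 2 ** 8
info = np.iinfo(np.uint8)

# Grayscale image: a single 2-D matrix, where 0 is black and 255 is white.
gray = np.array([[0, 64, 128],
                 [64, 128, 192],
                 [128, 192, 255]], dtype=np.uint8)

# Color image: three such matrices stacked along a third axis.
red = np.full((3, 3), 255, dtype=np.uint8)
green = np.full((3, 3), 128, dtype=np.uint8)
blue = np.zeros((3, 3), dtype=np.uint8)
color = np.dstack([blue, green, red])  # OpenCV's channel order is B, G, R

print(levels, info.min, info.max)
print(gray.shape, color.shape)
print(color[0, 0])  # the three channel values of the top-left pixel
```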


pip install opencv-python

Reading and Displaying the image:

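import cv2  # Import the OpenCV module

# Read the image from disk (imread returns None if the file cannot be read)
image = cv2.imread("lena.png")

# Display the image in a window titled "Picture"
cv2.imshow("Picture", image)
cv2.waitKey(0)  # Keep the window open until a key is pressed
cv2.destroyAllWindows()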


Writing the image:

cv2.imwrite("lena1.png", image)

Edge detection is a method of segmenting an image into regions by locating discontinuities. It is widely used in digital image processing tasks such as:

  • pattern recognition
  • image morphology
  • feature extraction

Edge detection lets us find the places in an image where the gray level changes significantly. Such a change indicates the end of one region in the image and the beginning of another. Keeping only the edges reduces the amount of data in an image while preserving its structural properties.

A contour is a curve joining all the continuous points along the boundary of a shape that have the same color or intensity. Contours come in handy for shape analysis, finding the size of an object of interest, and object detection.

Contour Approximation Method:
As described above, contours are the boundaries of a shape with the same intensity, stored as the (x, y) coordinates of the boundary points. But are all the coordinates stored? That is determined by the contour approximation method.
If we pass cv2.CHAIN_APPROX_NONE, all the boundary points are stored. But do we actually need all of them? For example, to represent the contour of a straight line we need only its two endpoints. This is what cv2.CHAIN_APPROX_SIMPLE does: it removes redundant points and compresses the contour, thereby saving memory.

Applications of Computer Vision

Beyond the topics discussed so far in this article, there are many more project choices available to you. I will mention a few of these projects and the methodologies behind them, and you can try them out if you feel comfortable. At the end of this section, I will also provide a helpful link that guides you through five computer vision projects. Let us dive into the applications of computer vision.

  1. Face detection and face recognition are some of the most popular computer vision projects. Face detection is the task of distinguishing a human face from the rest of the body and the background; it is the simpler of the two and can be considered a beginner-level project. Face recognition goes one step further: after a face has been detected, it identifies who the person is, for example by matching the face to the name of an authorized user. Face detection is therefore one of the steps required for face recognition.
  2. Object detection and object tracking are other popular choices for computer vision projects. Object detection is a computer vision technique that allows us to identify and locate objects in an image or video. With this kind of identification and localization, object detection can be used to count the objects in a scene and to determine their precise locations, all while labeling them. Object tracking is the task of following an already identified object throughout a video or in real time.
  3. Image segmentation is the task of dividing a frame or image into regions and labeling each one with a fixed class name, based on color, pattern, or some other fixed characteristic. Segmentation is extremely useful for helping a program understand and visualize its surroundings so that it can be trained to perform a specific task. Trained segmentation models are capable of achieving good results in tasks such as content-based image retrieval, traffic control and analysis systems, video surveillance, and biomedical applications.
  4. Optical character recognition (OCR) is another basic project well suited to beginners. OCR is the conversion of two-dimensional text data, such as a scanned page or a photo of text, into machine-encoded text by an electronic or mechanical device. You use computer vision to read the image files, then use the pytesseract module in Python to extract the text from the image or PDF as a string that can be displayed in Python. OCR finds applications in data entry, billing details, OCR receivers, and OCR clients, among many other use cases.
  5. Emotion or gesture recognition is another computer vision application; it combines deep learning with computer vision to perform highly complex tasks. Faces are detected and classified according to the emotion each one shows. The models can also detect and classify different hand gestures based on the recognized fingers. After identifying the emotion or gesture, the trained model provides a vocal response stating its prediction. This is a slightly complex task that requires many steps to accomplish successfully. Please refer to the conclusion section for further links on how to develop these projects from scratch. You can follow my guides to implement these projects on your own.