A collection of Jupyter notebooks that explore some basic concepts of Computer Vision.
Following some topics presented in the Introduction to Python for Data Science, this article introduces OpenCV and presents a collection of Jupyter notebooks for beginners that use the OpenCV Python interface.
What is Computer Vision?
Computer vision, often abbreviated as CV, is an interdisciplinary scientific field that is concerned with the development of techniques to help computers analyze and understand the content of a single image or a video. It involves the development of a theoretical and algorithmic basis to achieve automatic visual understanding, and it is concerned with the theory behind artificial systems that extract information from images.
Helping computers to see turns out to be very hard. An image is just an array of pixel values without any other meaningful data explicit to the computer.
The goal of computer vision is to extract useful information from images. This has proved a surprisingly challenging task; it has occupied thousands of intelligent and creative minds over the last four decades, and despite this we are still far from being able to build a general-purpose “seeing machine.”
— Page 16, Computer Vision: Models, Learning, and Inference, 2012.
What is OpenCV?
OpenCV, short for Open Source Computer Vision Library, is a popular open-source computer vision library originally developed by Intel in 1999 and actively used in both industry and academia. The library is cross-platform and free to use under the BSD license (releases from 4.5.0 onward use the Apache 2 license instead).
Written in efficient C/C++, OpenCV has also offered a stable Python interface since 2009. Function prototypes in the Python API can differ from their C++ counterparts, but the official OpenCV documentation presents both versions for reference. Through its dnn module, it can also run models trained with the popular deep learning frameworks TensorFlow, PyTorch and Caffe. The collection presented in this article focuses on OpenCV's Python API.
In C++, OpenCV uses its Mat matrix structure to represent image data, but the Python interface represents images as NumPy N-dimensional arrays (ndarray). Some familiarity with NumPy is therefore recommended, though not mandatory, to follow this collection. In my other article you will find a NumPy notebook that introduces the required knowledge.
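To make the ndarray representation concrete, here is a minimal sketch that uses only NumPy. The array is synthetic and its size and colors are arbitrary, illustrative choices; a color image loaded with cv2.imread() has the same layout (rows, columns, channels) and the same uint8 data type.

```python
import numpy as np

# Synthetic color image: 4 rows (height), 6 columns (width), 3 channels.
# The dimensions are arbitrary values chosen for illustration.
height, width = 4, 6
image = np.zeros((height, width, 3), dtype=np.uint8)

# Indexing is [row, column], that is [y, x], not [x, y].
# Read in OpenCV's BGR channel order, (255, 0, 0) is pure blue.
image[1, 2] = (255, 0, 0)

# Slicing selects a rectangular region: rows 2-3, columns 0-2.
image[2:4, 0:3] = (0, 255, 0)

print(type(image).__name__)   # ndarray
print(image.shape)            # (4, 6, 3)
print(image.dtype)            # uint8
print(image[1, 2].tolist())   # [255, 0, 0]
```

Because an image is an ordinary array, any NumPy operation (slicing, arithmetic, statistics) can be applied to it directly.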
Notebooks
This collection of Jupyter notebooks provides an introduction to OpenCV’s Python interface.
All notebooks were initially developed and released by Hannah, with some changes, code updates and other customizations made by me.
The target audience is broad and includes:
- People who have studied computer science (perhaps to graduate level) but have not used OpenCV before
- People who are studying other subjects and want to play with computer vision
The notebooks are divided by topic. Each one contains a lesson with an estimated completion time.
- OpenCV fundamentals (20 min)
- Image stats and image processing (20 min)
- Features in computer vision (20 min)
- Cascade Classification, optional (20 min)
Total estimated time needed: 80 min
Why does OpenCV use the BGR color format?
People often wonder why OpenCV uses the BGR color format instead of RGB.
The early OpenCV developers chose BGR because, at the time, it was a popular channel order among camera manufacturers and software providers. That is no longer the case.
It was a choice made for historical reasons, and we now have to live with it. For more details, I recommend reading this article.
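In practice, the channel order matters whenever an OpenCV image is handed to a library that expects RGB, such as Matplotlib. The sketch below uses only NumPy and a made-up 1x3 image to show what the reordering does. With OpenCV installed, cv2.cvtColor(image, cv2.COLOR_BGR2RGB) performs the same conversion.

```python
import numpy as np

# A 1x3 image in BGR order: one blue, one green and one red pixel.
# The pixel values are illustrative, not taken from a real image.
bgr = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the first and third channels (BGR -> RGB).
# The slice alone is a view with a negative stride, so ascontiguousarray
# copies it into a regular memory layout.
rgb = np.ascontiguousarray(bgr[..., ::-1])

print(bgr[0, 2].tolist())  # [0, 0, 255]  red pixel stored as B, G, R
print(rgb[0, 2].tolist())  # [255, 0, 0]  same pixel stored as R, G, B
```

Skipping this step is a common cause of images that display with red and blue swapped.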
A Note of Curiosity
The following video shows Canny edge detection with OpenCV's cv2.Canny() function on the left, side by side with deep neural network inference done with OpenCV and OpenVINO on the right.
The original video was recorded during a bike ride on Copacabana Beach.
About Vin Busquet
Software engineer, Full Stack Developer, Machine Learning Expert, Cybersecurity centered guy, Cryptocurrency enthusiast, hobbyist game dev and Lifelong Learner!