A transfer-learning-based object detection API that detects objects in an image, a video, or a live webcam feed. An SSD MobileNet model and a Faster R-CNN model, both pretrained on the COCO dataset, were used along with a label map in TensorFlow. These models detect objects captured in an image, a video, or a real-time webcam stream. OpenCV was used for video streaming and preprocessing.

kaushikjadhav01 Last update: Nov 15, 2023

Real-Time-Object-Detection-API-using-TensorFlow

Screenshots

Object Detection output for image using SSD

Object Detection output for video files using SSD

Object Detection output for webcam using SSD

Object Detection output for image using Faster R-CNN

Object Detection output for video files using Faster R-CNN

Object Detection output for webcam using Faster R-CNN

Technical Concepts

Faster R-CNN is an object detection architecture presented by Ross Girshick, Shaoqing Ren, Kaiming He and Jian Sun in 2015. Along with YOLO (You Only Look Once) and SSD (Single Shot Detector), it is one of the best-known object detection architectures built on convolutional neural networks.
More information can be found here

Single Shot Detector (SSD), like YOLO, detects multiple objects in an image in a single forward pass of the network, using MultiBox. It is significantly faster than two-stage detectors such as Faster R-CNN while still achieving high accuracy.
More information can be found here
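Both SSD and Faster R-CNN rely on box overlap: SSD matches its default boxes to ground truth by Jaccard overlap (intersection over union, IoU), and both detectors typically prune duplicate detections with non-maximum suppression. The following is a minimal sketch of those two ideas in plain Python, not the code used inside either model. The function names and the 0.5 threshold are illustrative assumptions, and boxes are assumed to be (ymin, xmin, ymax, xmax) tuples.

```python
def iou(box_a, box_b):
    """Intersection over union of two (ymin, xmin, ymax, xmax) boxes."""
    inter_h = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_w = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_h * inter_w
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it too much.

    Returns the indices of the kept boxes, highest score first.
    The 0.5 default is an assumption chosen for illustration.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(non_max_suppression(boxes, scores))
```

In the example, the second box overlaps the first heavily and has a lower score, so it is suppressed, while the third box does not overlap anything and is kept.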

Label Maps: TensorFlow requires a label map, which maps each of the labels used to an integer value. This label map is used by both the training and detection processes. (This is unrelated to cartographic labeling, the craft of placing text on a map in relation to the map symbols.)
More information can be found here
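To make the label map idea concrete, here is a small sketch that reads label map text in the item { ... } layout used by the TensorFlow Object Detection API's .pbtxt files and builds an id-to-name dictionary. The parse_label_map helper is hypothetical and written for illustration only; the API ships its own label map utilities, which the notebooks would normally use instead.

```python
import re

# One "item { ... }" block, and one "key: value" field inside a block.
_ITEM_RE = re.compile(r"item\s*\{(.*?)\}", re.DOTALL)
_FIELD_RE = re.compile(r"(\w+)\s*:\s*(?:\"([^\"]*)\"|'([^']*)'|(\S+))")


def parse_label_map(text):
    """Return {integer id: label name} from label map text (illustrative helper)."""
    id_to_name = {}
    for body in _ITEM_RE.findall(text):
        fields = {}
        for key, double_quoted, single_quoted, bare in _FIELD_RE.findall(body):
            fields[key] = double_quoted or single_quoted or bare
        if "id" in fields:
            # Prefer the human-readable display_name when it is present.
            name = fields.get("display_name") or fields.get("name", "")
            id_to_name[int(fields["id"])] = name
    return id_to_name


SAMPLE_LABEL_MAP = """
item {
  name: "/m/01g317"
  id: 1
  display_name: "person"
}
item {
  name: "/m/0199g"
  id: 2
  display_name: "bicycle"
}
"""

if __name__ == "__main__":
    print(parse_label_map(SAMPLE_LABEL_MAP))
```

The detector outputs integer class ids, so a dictionary like this is what turns a predicted id into the text drawn next to each box.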

An Inference Graph, in the TensorFlow Object Detection API, is the exported ("frozen") model graph in which the trained weights are stored as constants, so that the model can be loaded and run for detection without the training code. The pretrained SSD and Faster R-CNN models are loaded from such frozen graphs. (The same term is also used in knowledge representation for a propositional graph whose arcs are augmented with channels that carry "I am true" and "you are true" messages between antecedent, rule and consequent nodes; that sense does not apply here.)
More information can be found here
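In the TensorFlow Object Detection API, running a frozen inference graph on an image yields parallel arrays of boxes, scores and class ids, with boxes given as normalized [ymin, xmin, ymax, xmax] values. The sketch below shows one way such output could be turned into labeled pixel boxes. The filter_detections helper and the 0.5 score cut-off are assumptions made for illustration and are not taken from the notebooks.

```python
def filter_detections(boxes, scores, classes, image_width, image_height,
                      id_to_name=None, min_score=0.5):
    """Keep detections scoring at least min_score and convert boxes to pixels.

    boxes   : sequence of normalized (ymin, xmin, ymax, xmax) values in [0, 1]
    scores  : sequence of confidence scores, parallel to boxes
    classes : sequence of integer class ids, parallel to boxes
    """
    id_to_name = id_to_name or {}
    results = []
    for box, score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        results.append({
            "label": id_to_name.get(int(class_id), "N/A"),
            "score": float(score),
            # Pixel coordinates as (left, top, right, bottom).
            "box_pixels": (
                int(round(xmin * image_width)),
                int(round(ymin * image_height)),
                int(round(xmax * image_width)),
                int(round(ymax * image_height)),
            ),
        })
    return results


if __name__ == "__main__":
    detections = filter_detections(
        boxes=[(0.1, 0.2, 0.5, 0.6), (0.0, 0.0, 1.0, 1.0)],
        scores=[0.9, 0.3],
        classes=[1, 18],
        image_width=640,
        image_height=480,
        id_to_name={1: "person", 18: "dog"},
    )
    print(detections)
```

Only the first detection survives here, because the second one falls below the score cut-off.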

Protocol Buffers (Protobuf) is a method of serializing structured data. It is useful for developing programs that communicate with each other over a network and for storing data. The method involves an interface description language that describes the structure of some data, and a program that generates source code from that description for generating or parsing a stream of bytes that represents the structured data.
More information can be found here
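As a rough illustration of what "a stream of bytes that represents the structured data" means, the sketch below hand-encodes a single integer field the way the Protobuf wire format does: a key made of the field number and wire type, followed by the value as a base-128 varint. This is for illustration only; real code would use the classes generated by the Protobuf compiler, and the function names here are hypothetical.

```python
def encode_varint(value):
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F          # lowest 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, pos=0):
    """Decode a varint starting at data[pos]; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_int_field(field_number, value):
    """Encode one integer field: key = (field_number << 3) | wire type 0."""
    key = (field_number << 3) | 0    # wire type 0 means varint
    return encode_varint(key) + encode_varint(value)


if __name__ == "__main__":
    print(encode_varint(300).hex())
    print(encode_int_field(1, 150).hex())
```

Small numbers take one byte and larger ones grow as needed, which is part of why Protobuf messages are compact compared with text formats.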

Technologies Used

How to Install & Use

Install the TensorFlow Object Detection API by following the instructions here
Download my repo and place its Jupyter notebooks in the models/research/object_detection folder of the TensorFlow Object Detection API
To use a smartphone camera in place of the laptop webcam, install the IP Webcam app on your smartphone from the app store, open the app, and click Start Server
Replace the IP address in my Jupyter notebooks with the IP address shown in the app on your smartphone, then run the notebooks
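The last two steps can be sketched as follows. This is a minimal example, not the code from the notebooks: it assumes the IP Webcam app serves its stream at http://<ip>:8080/video (port and path are assumptions; use whatever address the app displays), and the helper names and the sample IP address are made up for illustration. OpenCV is imported inside the function so the URL helper can be used without it.

```python
def build_stream_url(ip_address, port=8080, path="/video"):
    """Build the stream URL for a phone running the IP Webcam app.

    The default port and path are assumptions; check the address shown in the app.
    """
    return "http://{}:{}{}".format(ip_address, port, path)


def open_stream(source):
    """Open a video source with OpenCV.

    source can be a stream URL (smartphone camera) or 0 (default laptop webcam).
    """
    import cv2  # imported here so build_stream_url works without OpenCV installed

    capture = cv2.VideoCapture(source)
    if not capture.isOpened():
        raise RuntimeError("Could not open video source: {!r}".format(source))
    return capture


if __name__ == "__main__":
    url = build_stream_url("192.168.1.5")  # replace with the IP shown in the app
    print(url)
    # capture = open_stream(url)       # smartphone camera
    # capture = open_stream(0)         # laptop webcam instead
    # ok, frame = capture.read()       # frame is a BGR image array when ok is True
    # capture.release()
```

The phone and the computer need to be on the same network for the stream address to be reachable.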