Object detection is a computer-vision and image-processing technique for locating instances of objects in images or videos. Common applications include face detection and object tracking. Object detection algorithms typically leverage machine learning or deep learning to produce meaningful results.
Questions tagged [object-detection]
348 questions
27 votes
2 answers
What is the difference between semantic segmentation, object detection and instance segmentation?
I'm fairly new to computer vision and I've read an explanation in a Medium post; however, it still isn't clear to me how they truly differ.
Guilherme Marques
14 votes
3 answers
What is Darknet and why is it needed for YOLO object detection?
What is Darknet and why is it needed for YOLO object detection? I read that it's a neural network framework written in C, but why is it needed for YOLO object detection when we have lots of machine learning frameworks and APIs like TensorFlow, Keras, and PyTorch?
Im…
star
12 votes
5 answers
Unsupervised image segmentation
I am trying to implement an algorithm that, given an image with several objects on a flat table, outputs a segmentation mask for each object. Unlike with CNNs, the objective here is to detect objects in an unfamiliar environment.…
MuhsinFatih
11 votes
2 answers
Find missing object(s) in image with a priori knowledge about the missing object(s) (w.r.t base image)
Problem Statement:
I am working on developing a method, or borrowing/modifying/combining existing ones, that, given a golden image (a reference or base image with all expected objects present), is able to identify the missing objects and draw a bounding…
TwinPenguins
9 votes
5 answers
How can we extract fields from images?
I am making a document parser that extracts data fields from documents and stores them in a structured way. Each field in my dataset is horizontal, which makes it easy to extract.
But the model fails on the following type of example:
Is there any way…
hR 312
8 votes
4 answers
How do anchors work with the sliding window in the Faster R-CNN RPN layer?
I am trying to understand Faster R-CNN as a whole.
From https://www.quora.com/How-does-the-region-proposal-network-RPN-in-Faster-R-CNN-work:
Then a sliding window is run spatially on these feature maps. The size of sliding window is n×n (here 3×3).…
Mithril
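As a rough illustration of how anchors relate to the sliding window described in the excerpt above, the sketch below places a fixed set of anchor boxes at every position of a feature map. The stride of 16, the scales (128, 256, 512), and the aspect ratios (1:2, 1:1, 2:1) follow the defaults reported in the Faster R-CNN paper; the function name `generate_anchors`, the half-cell centering, and the `(x1, y1, x2, y2)` layout are assumptions made for this example, not any library's API.

```python
import math


def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) tuples in input-image pixels.

    Each feature-map position gets len(scales) * len(ratios) anchors,
    all centered on that position. `ratios` are height / width.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Assumption: the anchor center is the middle of the stride cell.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    # Keep the area at scale**2 while changing the shape.
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors


if __name__ == "__main__":
    anchors = generate_anchors(feat_h=2, feat_w=3)
    print(len(anchors), "anchors;", "first:", anchors[0])
```

In this sketch the sliding window itself is not modeled; only the anchor geometry is. The RPN would then predict an objectness score and box offsets for each of these anchors.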
6 votes
2 answers
Does resizing images during training affect the bounding box annotations?
I am using the TensorFlow Object Detection API to train on my own custom dataset and am preparing annotations for it. I see from the config file of my pre-trained SSD Inception network that images are resized to 300 x 300 during training.…
Sangathamilan Ravichandran
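To make the geometry behind this question concrete, here is a minimal sketch of what a plain resize (no cropping or padding) does to a bounding box. It assumes boxes are stored as `(xmin, ymin, xmax, ymax)` and sizes as `(width, height)`; the helper names `scale_box` and `normalize_box` are hypothetical and are not part of the TensorFlow Object Detection API.

```python
def scale_box(box, orig_size, new_size):
    """Rescale a pixel-space (xmin, ymin, xmax, ymax) box after a plain resize."""
    orig_w, orig_h = orig_size
    new_w, new_h = new_size
    sx = new_w / orig_w  # horizontal scale factor
    sy = new_h / orig_h  # vertical scale factor
    xmin, ymin, xmax, ymax = box
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


def normalize_box(box, size):
    """Express a pixel-space box as fractions of the image width and height."""
    w, h = size
    xmin, ymin, xmax, ymax = box
    return (xmin / w, ymin / h, xmax / w, ymax / h)


if __name__ == "__main__":
    box = (100, 50, 300, 200)   # pixel coordinates in the original image
    original = (600, 400)       # (width, height)
    resized = (300, 300)

    print("scaled box:", scale_box(box, original, resized))
    print("normalized before:", normalize_box(box, original))
    print("normalized after: ",
          normalize_box(scale_box(box, original, resized), resized))
```

Pixel coordinates change with the resize, while normalized coordinates stay the same, so annotations stored in normalized form would not need to be recomputed for this kind of resize.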
6 votes
1 answer
Smart data split (train/eval) for Object Detection
I am looking for a smart way of splitting object detection data (images with labelled objects inside them) while taking into account the distribution of the objects themselves and not just the images.
I have a dataset composed of many images. Each…
CarlosUziel
6 votes
2 answers
What is the difference between intersection over union (IoU) and intersection over bounding box (IoBB)?
Can someone give a detailed explanation of IoU and IoBB, along with the differences between them?
p126018 Ali Raza
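For readers comparing the two metrics named in this question, the sketch below computes both for axis-aligned boxes. IoU divides the overlap by the area of the union. For IoBB, this example assumes the common reading in which the overlap is divided by the area of the predicted box only; that definition, the `(x1, y1, x2, y2)` box layout, and the function names are assumptions of this sketch rather than something stated in the question.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes count as 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def intersection_area(a, b):
    """Area of the overlap between two boxes (0 if they do not overlap)."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]),
               min(a[2], b[2]), min(a[3], b[3]))
    return box_area(overlap)


def iou(pred, truth):
    """Intersection over union: overlap / area covered by either box."""
    inter = intersection_area(pred, truth)
    union = box_area(pred) + box_area(truth) - inter
    return inter / union if union > 0 else 0.0


def iobb(pred, truth):
    """Intersection over bounding box (assumed: overlap / predicted-box area)."""
    pred_area = box_area(pred)
    return intersection_area(pred, truth) / pred_area if pred_area > 0 else 0.0


if __name__ == "__main__":
    truth = (0, 0, 4, 4)
    small_pred = (1, 1, 2, 2)  # lies entirely inside the ground-truth box
    print("IoU: ", iou(small_pred, truth))
    print("IoBB:", iobb(small_pred, truth))
```

Under this assumed definition, a small prediction that lies entirely inside a large ground-truth box scores low on IoU but reaches the maximum IoBB of 1.0, which illustrates why the two metrics can rank the same prediction very differently.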
5 votes
1 answer
Which is the "best" deep learning model for custom object detection on images and in real time: YOLO v3, v4, v5, or EfficientDet?
Whenever I look for an object detection model, I find YOLO v3 most of the time, which might be because it is the last version created by the original authors and is also more stable. In 2020, a new author released an unofficial version called…
Deshwal
5 votes
2 answers
Keyword localization in audio file
I want to build a model that can localize occurrences of a particular word in an audio file. For example, I want to find the word "pizza" in a ~5min recording. The program should return an array with (start, stop) objects describing the start and…
xana
5 votes
1 answer
Creating an object detection model from scratch using Keras
I have a dataset of 330 images that contain guns. Each image has an associated text file that contains:
The number of objects (guns) in the image.
Coordinates for bounding boxes around the gun in the…
Shubham Panchal
5 votes
2 answers
Bounding Boxes in YOLO Model
The YOLO model splits the image into a grid of cells, and each cell is responsible for predicting 5 bounding boxes.
My question is: how does the model produce these bounding boxes for every grid cell? Does each box have a predefined offset with respect to…
Tanmay Bhatnagar
5 votes
1 answer
Does an image's background matter for detector training (CNN)?
Does an image's background matter for detection/localisation during training (using a CNN)?
For example, if I want to make a face detector, which one is better as a training dataset?
A dataset of cropped faces
A dataset of faces in a global scene
Does it…
Jean Luc
5 votes
1 answer
What is the difference between TensorFlow saved_model.pb and frozen_inference_graph.pb?
I've re-trained a model (following this tutorial) from Google's object detection model zoo (ssd_inception_v2_coco) on the WIDER FACE dataset, and it seems to work if I use frozen_inference_graph.pb from Python, but if I take saved_model.pb and put it to…
vi-kun