Questions tagged [computer-vision]

Questions about algorithms for processing and analyzing images and videos.

Computer vision is a field that focuses on processing and analyzing images and videos.

For questions that ask for an algorithm to perform some computer vision task, it is often helpful to include in the question one or more example images, along with what you would like the computer vision algorithm to output for each such image. Also, if you have tried any algorithm, you might want to show that algorithm's output for one or more such images and explain why its output does not meet your needs.

Other sites: Signal Processing Stack Exchange, Stack Overflow.

290 questions
32
votes
5 answers

What is the difference between object detection, semantic segmentation and localization?

I've read those words in quite a lot of publications and I would like to have some nice definitions for those terms which make it clear what the difference between object detection vs semantic segmentation vs localization is. It would be nice if you…
Martin Thoma
  • 2,360
  • 1
  • 22
  • 41
24
votes
3 answers

What are the differences between computer vision and image processing?

What are the differences between computer vision and image processing? For example, in object recognition, what are the roles of computer vision and image processing?
gena
  • 251
  • 1
  • 2
  • 5
22
votes
3 answers

What is the difference between the fundamental matrix and the essential matrix?

Could someone, in plain English, explain the distinction between the fundamental matrix and the essential matrix in multi-view computer vision? How are they different, and how can each be used in computing the 3D position of a point imaged from…
s-low
  • 323
  • 1
  • 2
  • 5
12
votes
1 answer

Google DeepDream Elaborated

I've seen a few questions on this site about DeepDream; however, none of them seem to actually explain what DeepDream is doing, specifically. As far as I've gathered, they seem to have changed the objective function, and also changed…
11
votes
1 answer

Deriving the Sobel equations from derivatives

Many sites give the Sobel operators as the convolution mask for smoothing an image. However, I haven't found a single site that describes how you can derive the operators from partial first derivatives. If anyone can explain the derivation, I would…
Quanquan Liu
  • 233
  • 2
  • 9
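One common way to read the Sobel kernels, offered here as a minimal sketch rather than as the derivation the question asks for, is as the outer product of a 1-D central difference (a discrete approximation of a first partial derivative) and a 1-D smoothing filter applied in the perpendicular direction. The helper name `outer` and the sign convention of the difference filter are assumptions made for this illustration; some references flip the signs.

```python
def outer(column, row):
    """Outer product of two 1-D sequences, returned as a list of rows."""
    return [[c * r for r in row] for c in column]

# Central difference: approximates a first derivative (sign convention assumed).
difference = [-1, 0, 1]
# Binomial smoothing applied across the perpendicular direction.
smoothing = [1, 2, 1]

# Differentiate along x, smooth along y.
sobel_x = outer(smoothing, difference)
# Differentiate along y, smooth along x.
sobel_y = outer(difference, smoothing)

for name, kernel in (("sobel_x", sobel_x), ("sobel_y", sobel_y)):
    print(name)
    for kernel_row in kernel:
        print(kernel_row)
```

Under this reading, the weights 1, 2, 1 come from the smoothing step and the -1, 0, 1 pattern from the derivative approximation.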
10
votes
3 answers

Intuition for convolution in image processing

I have read many documents about convolution in image processing, and most of them discuss its formula and some additional parameters. None of them explains the intuition and real meaning behind doing convolution on an image. For example, intuition of…
hqt
  • 223
  • 1
  • 7
9
votes
1 answer

What is the difference between 'features' and 'descriptors' in computer vision / machine learning?

I've read sentences similar to the following multiple times: Finally, for standard image classification bag-of-words features based on SIFT descriptors have been found critical for high performances. We first compute a standard SIFT descriptor at regular…
Martin Thoma
  • 2,360
  • 1
  • 22
  • 41
8
votes
1 answer

Convert HSV to RGB colors

HSV colors are composed of a triple of numbers: hue $\in [0, 360)$ (in degrees), saturation $\in [0, 1]$ and value or brightness $\in [0, 1]$. RGB colors, by contrast, are better known and are also composed of a triple of numbers all of them in the…
user20691
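As a rough sketch of the conversion being asked about, Python's standard `colorsys` module can be used, assuming hue is given in degrees as in the excerpt. The wrapper name `hsv_degrees_to_rgb` is hypothetical, and the output range of $[0, 1]$ per channel is the convention of `colorsys`, not necessarily the range the question has in mind.

```python
import colorsys

def hsv_degrees_to_rgb(hue_degrees, saturation, value):
    """Convert HSV to RGB.

    hue_degrees is in [0, 360); saturation and value are in [0, 1].
    Returns an (r, g, b) tuple with each channel in [0, 1].
    """
    # colorsys expects hue as a fraction of a full turn, so rescale from degrees.
    hue_fraction = (hue_degrees % 360) / 360.0
    return colorsys.hsv_to_rgb(hue_fraction, saturation, value)

# Example calls: pure red, pure green, and a mid gray (zero saturation).
print(hsv_degrees_to_rgb(0, 1.0, 1.0))
print(hsv_degrees_to_rgb(120, 1.0, 1.0))
print(hsv_degrees_to_rgb(0, 0.0, 0.5))
```

If 8-bit channels are wanted instead, each component can be scaled by 255 and rounded.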
7
votes
0 answers

Detecting the damaged regions in cars

Detecting the regions where a car has been damaged and the extent to which it has been damaged is a very interesting problem. It has potential applications in automatic auto insurance claims. Currently I'm trying to tackle this problem and am…
7
votes
1 answer

Computer vision: object detection with labels that are single coordinates

Are there papers in the literature that address the following object detection task? The task can be described as follows: Given a set of images, the labels are just coordinates (x,y) that represent the object locations that we wish to detect. A…
7
votes
1 answer

Computer vision: Why do random filters perform similarly to edge detectors?

I read here that "a randomly initialized filter acts very much like an edge detector!". I want to know if there are any papers describing and explaining this phenomenon.
Suhas Lohit
  • 289
  • 2
  • 4
7
votes
2 answers

Automated lip-reading: inferring what someone is saying, based upon video of them speaking

Some humans can lip-read fairly well: by watching someone who is speaking, they can tell what the speaker is saying (even without hearing the speech). Has there been any work on building computer software to lip-read? In other words, given a video…
D.W.
  • 167,959
  • 22
  • 232
  • 500
6
votes
3 answers

In lieu of neural networks and deep learning, what are some approaches to computer vision?

So much software related to computer vision relies on AI and neural networks that I wonder about approaches which don't use those methods. How could some of the mainstays of computer vision (e.g. recognition and identification) be achieved without…
6
votes
1 answer

Using orientation sensor data to predict image points

I want to use Android's gyro/accelerometer/magnetometer to predict (over a very short time interval) how image points corresponding to a fixed object will change (without trying to scan for them in the image). In particular, suppose we have an…
Charles F
  • 191
  • 5
6
votes
1 answer

Computer Vision - What is spatial histogram/pyramid feature?

I am trying to re-implement the model in this paper https://arxiv.org/pdf/1511.02917v2.pdf. But the paper omitted some details in feature extraction (Section 4.1). I am new to computer vision so I am confused when the paper just said they got the…