Queries to count points lying on arbitrary line

Question

Suppose we have $N$ points on the $XY$ plane, i.e. points $(x, y)$ with $x, y \in \mathbb{Z}$, and multiple queries, each of the form $y = mx + c$ with $m, c \in \mathbb{Z}$.

Is it possible to count the number of points that lie on the given line more efficiently than $O(N)$ per query?

I believe some preprocessing might help, or some data structure like a quadtree, which counts points inside an arbitrary rectangle.

The queries are offline, so a batch-processing technique or even sqrt decomposition would be acceptable, as long as it beats naive brute force.

score 8 · Accepted Answer · answered May 29 '23 at 07:52

Note that via point-line duality (i.e. mapping each point $(p_x,p_y)$ to the line $y=p_x x - p_y$ and each line $y=mx +c$ to the point $(m,-c)$), this problem is equivalent to: given a set $L$ of $n$ lines, count the number of lines that contain a query point.
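As a sanity check on the duality above, here is a minimal Python sketch that counts incidences both in the original (primal) setting and in the dual one, and should give the same number either way. The function names are made up for illustration; lines are represented as `(slope, intercept)` pairs.

```python
def point_to_dual_line(px, py):
    """Point (px, py) -> dual line y = px*x - py, as (slope, intercept)."""
    return (px, -py)


def line_to_dual_point(m, c):
    """Line y = m*x + c -> dual point (m, -c)."""
    return (m, -c)


def on_line(point, line):
    """True if the point lies on the line y = slope*x + intercept."""
    x, y = point
    slope, intercept = line
    return y == slope * x + intercept


def count_primal(points, m, c):
    """Brute force: how many input points lie on the query line."""
    return sum(on_line(p, (m, c)) for p in points)


def count_dual(points, m, c):
    """Same count in the dual: how many dual lines contain the dual point."""
    q = line_to_dual_point(m, c)
    return sum(on_line(q, point_to_dual_line(px, py)) for px, py in points)
```

Both functions are still $O(N)$ per query; the point is only that the two formulations agree, since $p_y = m p_x + c$ holds exactly when $-c = p_x m - p_y$.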

The subdivision of the plane induced by $L$ is known as an arrangement of lines $\mathcal{A}(L)$, and has a complexity of $O(n^2)$. A doubly connected edge list (DCEL) that represents the faces, edges and vertices of $\mathcal{A}(L)$ can be computed in $O(n^2)$ time. For each vertex of the DCEL, store the number of lines passing through it. Next, construct a trapezoidal decomposition of the DCEL in $O(n^2\log n)$ expected time.

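When given a query point, first find the trapezoid that contains the point, then traverse the boundary edges and vertices of the trapezoid to determine whether the query point coincides with a vertex, lies on an edge, or lies in the interior of a face. The answer is then the count stored at the vertex, 1 for an edge, or 0 for a face (assuming the input points, and hence the dual lines, are distinct). Finding the trapezoid takes $O(\log n)$ expected time, the rest takes $O(1)$, as a trapezoid has $O(1)$ vertices and edges.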

So, with $O(n^2\log n)$ expected preprocessing time and $O(n^2)$ expected storage, queries can be answered in $O(\log n)$ expected time. This is one way to get a smaller query time, although the storage requirement is rather large, so there may be a better way.

Pablo H · Answer 2 · 2023-05-30T13:55:38.307

Depending on the size of N (the number of points) and M (the number of queries), it may be enough to precompute all the answers: there are no more than N*N lines passing through two or more of the points. :-) You can represent all lines by exact slope and 0-crossing using reduced (i.e. simplified) rationals. Then look up each query: binary search, hashing, perfect hashing, etc.

With the constraint that slope and 0-crossing are integers in queries you can probably discard some lines, and perhaps use some optimizations such as using an array indexed by m,c. You can run a prepass to compute minimum and maximum m and c.

Update: this idea fails for the case where the answer is exactly one. See comments. :-(
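A rough Python sketch of the precomputation idea, under a few assumptions that are not in the answer itself: the points are distinct, a plain dictionary keyed by `(m, c)` serves as the hash table, and pairs whose line is vertical or has a non-integer slope are dropped because no integer query can match them. The function names are hypothetical. The table only knows about lines through two or more points, so a miss means the answer is 0 or 1; this sketch falls back to a linear scan in that case, which is exactly the weakness the update mentions.

```python
from collections import defaultdict


def build_line_table(points):
    """Map (m, c) -> number of points on y = m*x + c, for every
    integer-slope line through at least two of the (distinct) points."""
    pts = list(points)
    members = defaultdict(set)
    for i, (x1, y1) in enumerate(pts):
        for x2, y2 in pts[i + 1:]:
            dx, dy = x2 - x1, y2 - y1
            # Skip vertical lines and non-integer slopes: no query matches them.
            if dx == 0 or dy % dx != 0:
                continue
            m = dy // dx          # exact, since dx divides dy
            c = y1 - m * x1       # integer because m, x1, y1 are integers
            members[(m, c)].update(((x1, y1), (x2, y2)))
    return {line: len(on_it) for line, on_it in members.items()}


def count_on_line(table, points, m, c):
    """Answer one query; O(1) on a table hit, O(N) scan on a miss."""
    hit = table.get((m, c))
    if hit is not None:
        return hit
    # Miss: the line holds 0 or 1 points, which the table cannot distinguish.
    return sum(1 for x, y in points if y == m * x + c)
```

Building the table takes $O(N^2)$ time and, in this naive form, up to $O(N^2)$ memory, so it only pays off when M is large relative to N and most queries hit lines containing at least two points.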
