Timothy M. Chan's Publications

Timothy M. Chan's Publications: Range searching

Finding triangles and other small subgraphs in geometric intersection graphs

We consider problems related to finding short cycles, small cliques, small independent sets, and small subgraphs in geometric intersection graphs. We obtain a plethora of new results. For example:

For the intersection graph of n line segments in the plane, we give algorithms to find a 3-cycle in O(n^{1.408}) time, a size-3 independent set in O(n^{1.652}) time, a 4-clique in near-O(n^{24/13}) time, and a k-clique (or any k-vertex induced subgraph) in O(n^{0.565k+O(1)}) time for any constant k; we can also compute the girth in near-O(n^{3/2}) time.
For the intersection graph of n axis-aligned boxes in a constant dimension d, we give algorithms to find a 3-cycle in O(n^{1.408}) time for any d, a 4-clique (or any 4-vertex induced subgraph) in O(n^{1.715}) time for any d, a size-4 independent set in near-O(n^{3/2}) time for any d, a size-5 independent set in near-O(n^{4/3}) time for d=2, and a k-clique (or any k-vertex induced subgraph) in O(n^{0.429k+O(1)}) time for any d and any constant k.
For the intersection graph of n fat objects in any constant dimension d, we give an algorithm to find any k-vertex (non-induced) subgraph in O(n log n) time for any constant k, generalizing a result by Kaplan, Klost, Mulzer, Roditty, Seiferth, and Sharir (ESA'19) for 3-cycles in 2D disk graphs.

A variety of techniques is used, including geometric range searching, biclique covers, "high-low" tricks, graph degeneracy and separators, and shifted quadtrees. We also prove a near-Omega(n^{4/3}) conditional lower bound for finding a size-4 independent set for boxes.
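
To make the 3-cycle problem for boxes concrete, here is a minimal brute-force baseline in Python. It is not the algorithm of the paper: it builds the intersection graph explicitly in quadratic time and then looks for a triangle. The box representation (a pair of lower and upper corner tuples, treated as closed boxes) and the function names are assumptions made for illustration.

```python
from itertools import combinations

def boxes_intersect(a, b):
    """Closed axis-aligned boxes, each given as (lo, hi) corner tuples."""
    (alo, ahi), (blo, bhi) = a, b
    return all(l1 <= h2 and l2 <= h1
               for l1, h1, l2, h2 in zip(alo, ahi, blo, bhi))

def find_triangle(boxes):
    """Return indices (i, j, k), i < j < k, of three pairwise
    intersecting boxes, or None if the intersection graph has no 3-cycle."""
    n = len(boxes)
    adj = [set() for _ in range(n)]
    # Build the intersection graph explicitly (quadratic time).
    for i, j in combinations(range(n), 2):
        if boxes_intersect(boxes[i], boxes[j]):
            adj[i].add(j)
            adj[j].add(i)
    # A triangle is an edge (i, j) whose endpoints share a neighbor k.
    for i in range(n):
        for j in sorted(adj[i]):
            if j > i:
                for k in sorted(adj[i] & adj[j]):
                    if k > j:
                        return (i, j, k)
    return None
```

The subquadratic bounds quoted above avoid exactly the explicit construction of the graph that this sketch performs.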

Simplex range searching revisited: How to shave logs in multi-level data structures

(with Da Wei Zheng)

We revisit the classic problem of simplex range searching and related problems in computational geometry. We present a collection of new results which improve previous bounds by multiple logarithmic factors that were caused by the use of multi-level data structures. Highlights include the following:

For a set of n points in a constant dimension d, we give data structures with O(n^d) (or slightly better) space that can answer simplex range counting queries in optimal O(log n) time and simplex range reporting queries in optimal O(log n + k) time, where k denotes the output size. For semigroup range searching, we obtain O(log n) query time with O(n^d polylog n) space. Previous data structures with similar space bounds by Matousek from nearly three decades ago had O(log^{d+1}n) or O(log^{d+1}n + k) query time.
For a set of n simplices in a constant dimension d, we give data structures with O(n) space that can answer stabbing counting queries (counting the number of simplices containing a query point) in O(n^{1-1/d}) time, and stabbing reporting queries in O(n^{1-1/d} + k) time. Previous data structures had extra log^d n factors in space and query time.
For a set of n (possibly intersecting) line segments in 2D, we give a data structure with O(n) space that can answer ray shooting queries in O(sqrt{n}) time. This improves Wang's recent data structure [SoCG'20] with O(n log n) space and O(sqrt{n}log n) query time.

On the number of incidences when avoiding an induced biclique in geometric settings

(with Sariel Har-Peled)

Given a set of points P and a set of regions O, an incidence is a pair (p,o) in P x O such that p is inside o. We obtain a number of new results on a classical question in combinatorial geometry: What is the number of incidences (under certain restrictive conditions)?

We prove a bound of O(kn(log n/loglog n)^{d-1}) on the number of incidences between n points and n axis-parallel boxes in R^d, if no k boxes contain k common points, that is, if the incidence graph between the points and the boxes does not contain K_{k,k} as a subgraph. This new bound improves over previous work, by Basit, Chernikov, Starchenko, Tao, and Tran (2021), by more than a factor of log^d n for d > 2. Furthermore, it matches a lower bound implied by the work of Chazelle (1990), for k=2, thus settling the question for points and boxes.

We also study several other variants of the problem. For halfspaces, using shallow cuttings, we get a linear bound in two and three dimensions. We also present linear (or near linear) bounds for shapes with low union complexity, such as pseudodisks and fat triangles.

Hopcroft's problem, log-star shaving, 2D fractional cascading, and decision trees

(with Da Wei Zheng)

We revisit Hopcroft's problem and related fundamental problems about geometric range searching. Given n points and n lines in the plane, we show how to count the number of point-line incidence pairs or the number of point-above-line pairs in O(n^{4/3}) time, which matches the conjectured lower bound and improves the best previous time bound of n^{4/3}2^{O(log^*n)} obtained almost 30 years ago by Matoušek.

We describe two interesting and different ways to achieve the result: the first is randomized and uses a new 2D version of fractional cascading for arrangements of lines; the second is deterministic and uses decision trees in a manner inspired by the sorting technique of Fredman (1976). The second approach extends to any constant dimension.

Many consequences follow from these new ideas: for example, we obtain an O(n^{4/3})-time algorithm for line segment intersection counting in the plane, O(n^{4/3})-time randomized algorithms for bichromatic closest pair and Euclidean minimum spanning tree in three or four dimensions, and a randomized data structure for halfplane range counting in the plane with O(n^{4/3}) preprocessing time and space and O(n^{1/3}) query time.
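
For reference, the two counting problems stated above can be written down as a quadratic-time baseline. This is only a sketch of the problem definition, not of the O(n^{4/3}) algorithms. Representing a line as a pair (a, b) meaning y = a x + b, and using integer coordinates so that equality tests are exact, are assumptions made here.

```python
def count_incidences_and_above(points, lines):
    """points: list of (x, y); lines: list of (a, b) meaning y = a*x + b.
    Returns (number of point-on-line pairs, number of point-above-line pairs)."""
    incidences = 0
    above = 0
    for x, y in points:
        for a, b in lines:
            line_y = a * x + b
            if y == line_y:
                incidences += 1
            elif y > line_y:
                above += 1
    return incidences, above
```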

Dynamic colored orthogonal range searching

(with Zhengcheng Huang)

In the colored orthogonal range reporting problem, we want a data structure for storing n colored points so that given a query axis-aligned rectangle, we can report the distinct colors among the points inside the rectangle. This natural problem has been studied in a series of papers, but most prior work focused on the static case. In this paper, we give a dynamic data structure in the 2D case which can answer queries in O(log^{1+o(1)}n + k log^{1/2+o(1)}n) time, where k denotes the output size (the number of distinct colors in the query range), and which can support insertions and deletions in O(log^{2+o(1)}n) time (amortized) in the standard RAM model. This is the first dynamic structure with polylogarithmic update time whose query cost per color reported is sublogarithmic (near sqrt{log n}). We also give an alternative data structure with O(log^{1+o(1)} n + k log^{3/4+o(1)}n) query time and O(log^{3/2+o(1)}n) update time (amortized). We also mention extensions to higher constant dimensions.
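
As an illustration of the query being supported, the following Python sketch answers a colored orthogonal range reporting query by a linear scan. It is a naive baseline, not the dynamic data structure of the paper. The point format (x, y, color) and the closed query rectangle are assumptions.

```python
def report_colors(points, x1, x2, y1, y2):
    """Return the set of distinct colors among points (x, y, color)
    lying in the closed rectangle [x1, x2] x [y1, y2]."""
    return {c for (x, y, c) in points if x1 <= x <= x2 and y1 <= y <= y2}
```

Note that the output size k is the number of distinct colors, which can be far smaller than the number of points inside the rectangle. This is why the problem does not reduce to ordinary range reporting.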

Further results on colored range searching

(with Qizheng He and Yakov Nekrich)

We present a number of new results about range searching for colored (or "categorical") data:

For a set of n colored points in three dimensions, we describe randomized data structures with O(n polylog n) space that can report the distinct colors in any query orthogonal range (axis-aligned box) in O(k polyloglog n) expected time, where k is the number of distinct colors in the range, assuming that coordinates are in {1,...,n}. Previous data structures require O(log n / loglog n + k) query time. Our result also implies improvements in higher constant dimensions.
Our data structures can be adapted to halfspace ranges in three dimensions (or circular ranges in two dimensions), achieving O(k log n) expected query time. Previous data structures require O(k log^2 n) query time.
For a set of n colored points in two dimensions, we describe a data structure with O(n polylog n) space that can answer colored "type-2" range counting queries: report the number of occurrences of every distinct color in a query orthogonal range. The query time is O(log n / loglog n + k loglog n), where k is the number of distinct colors in the range. Naively performing k uncolored range counting queries would require O(k log n / loglog n) time.

Our data structures are designed using a variety of techniques, including colored variants of randomized incremental construction (which may be of independent interest), colored variants of shallow cuttings, and bit-packing tricks.

In Proc. 36th Symposium on Computational Geometry (SoCG), pages 28:1-28:15, 2020

Better data structures for colored orthogonal range reporting

(with Yakov Nekrich)

Range searching on categorical, or "colored", data has been studied extensively for over two decades. In this paper, we obtain the current best results for perhaps the most basic, and most often studied, version of the geometric problem: colored orthogonal range reporting.

Given n colored points in two-dimensional space [U]^2, we present a data structure with O(n log^{3/4+epsilon} n) space, for an arbitrarily small constant epsilon > 0, so that all k distinct colors in any axis-aligned query rectangle can be reported in (optimal) O(loglog U + k) time; this is the first method to break the O(n log n) space barrier.

In three dimensions, we present a data structure with O(n log^{9/5+epsilon} n) space and O(log n/loglog n + k) time; this improves the previous space bound of O(n log^4 n).

Range closest-pair search in higher dimensions

(with Saladi Rahul and Jie Xue)

Range closest-pair (RCP) search is a range-search variant of the classical closest-pair problem, which aims to store a given set S of points into some space-efficient data structure such that when a query range Q is specified, the closest pair in S \cap Q can be reported quickly. RCP search has received attention over the years, but the primary focus was only on R^2. In this paper, we study RCP search in higher dimensions. We give the first nontrivial RCP data structures for orthogonal, simplex, halfspace, and ball queries in R^d for any constant d. Furthermore, we prove a conditional lower bound for orthogonal RCP search for d >= 3.

Orthogonal range reporting and rectangle stabbing for fat rectangles

(with Yakov Nekrich and Michiel Smid)

In this paper we study two geometric data structure problems in the special case when input objects or queries are fat rectangles. We show that in this case a significant improvement compared to the general case can be achieved. We describe data structures that answer two- and three-dimensional orthogonal range reporting queries in the case when the query range is a fat rectangle. Our two-dimensional data structure uses O(n) words and supports queries in O(loglog U + k) time, where n is the number of points in the data structure, U is the size of the universe and k is the number of points in the query range. Our three-dimensional data structure needs O(n log^eps U) words of space and answers queries in O(loglog U + k) time. We also consider the rectangle stabbing problem on a set of three-dimensional fat rectangles. Our data structure uses O(n) space and answers stabbing queries in O(log U log log U + k) time.

Orthogonal point location and rectangle stabbing queries in 3-d

(with Yakov Nekrich, Saladi Rahul, and Konstantinos Tsakalidis)

In this work, we present a collection of new results on two fundamental problems in geometric data structures: orthogonal point location and rectangle stabbing.

Orthogonal point location. We give the first linear-space data structure that supports 3-d point location queries on n disjoint axis-aligned boxes with optimal O(log n) query time in the (arithmetic) pointer machine model. This improves the previous O(log^{3/2} n) bound of Rahul [SODA 2015]. We similarly obtain the first linear-space data structure in the I/O model with optimal query cost, and also the first linear-space data structure in the word RAM model with sub-logarithmic query time.
Rectangle stabbing. We give the first linear-space data structure that supports 3-d 4-sided and 5-sided rectangle stabbing queries in optimal O(log_w n + k) time in the word RAM model. We similarly obtain the first optimal data structure for the closely related problem of 2-d top-k rectangle stabbing in the word RAM model, and also improved results for 3-d 6-sided rectangle stabbing.

For point location, our solution is simpler than previous methods, and is based on an interesting variant of the van Emde Boas recursion, applied in a round-robin fashion over the dimensions, combined with bit-packing techniques. For rectangle stabbing, our solution is a variant of Alstrup, Brodal, and Rauhe's grid-based recursive technique (FOCS 2000), combined with a number of new ideas.

Dynamic orthogonal range searching, revisited

(with Konstantinos Tsakalidis)

We study a longstanding problem in computational geometry: 2-d dynamic orthogonal range reporting. We present a new data structure achieving optimal O(log n/loglog n + k) query time and O(log^{2/3+o(1)}n) update time (amortized) in the word RAM model, where n is the number of data points and k is the output size. This is the first improvement in over 10 years on Mortensen's previous result [SIAM J. Comput., 2006], which has O(log^{7/8+epsilon}n) update time for an arbitrarily small constant epsilon.

In the case of 3-sided queries, our update time reduces to O(log^{1/2+epsilon}n), improving Wilkinson's previous bound [ESA 2014] of O(log^{2/3+epsilon}n).

Orthogonal range searching in moderate dimensions: k-d trees and range trees strike back

We revisit the orthogonal range searching problem and the exact l_infinity nearest neighbor searching problem for a static set of n points when the dimension d is moderately large. We give the first data structure with near linear space that achieves truly sublinear query time when the dimension is any constant multiple of log n. Specifically, when d = c log n, the preprocessing time/space is O(n^{1+delta}) for any constant delta > 0, and the expected query time is n^{1 - 1/O(c log c)}. The data structure is simple and is based on a new "augmented, randomized, lopsided" variant of k-d trees. It matches (in fact, slightly improves) the performance of previous combinatorial algorithms that work only in the case of offline queries [Impagliazzo, Lovett, Paturi, and Schneider (2014) and Chan (SODA'15)]. It leads to faster combinatorial algorithms for all-pairs shortest paths in general weighted graphs and rectangular Boolean matrix multiplication.

In the offline case, we show that the problem can be reduced to the Boolean orthogonal vectors problem and thus admits an n^{2 - 1/O(log c)}-time non-combinatorial algorithm [Abboud, Williams, and Yu (SODA'15)]. This reduction is also simple and is based on range trees.

Finally, we use a similar approach to obtain a small improvement to Indyk's data structure [FOCS'98] for approximate l_infinity nearest neighbor search when d=c log n.

A clustering-based approach to kinetic closest pair

(with Zahed Rahmati)

Given a set P of n moving points in fixed dimension d, where the trajectory of each point is a polynomial of degree bounded by some constant, we present a kinetic data structure (KDS) for maintenance of the closest pair on P. Assuming the closest pair distance is between 1 and Delta over time, our KDS uses O(n log Delta) space and processes O(n^2 beta log Delta log n + n^2 beta log Delta loglog Delta) events, each in worst-case time O(log^2 n + log^2 log Delta). Here, beta is an extremely slow-growing function. The locality of the KDS is O(log n + loglog Delta). Our closest pair KDS supports insertions and deletions of points. An insertion or deletion takes worst-case time O(log Delta log^2 n +log Delta log^2log Delta). Also, we use a similar approach to provide a KDS for the all epsilon-nearest neighbors in R^d. The complexities of the previous KDSs, for both closest pair and all epsilon-nearest neighbors, have polylogarithmic factors, where the number of logs depends on dimension d. Assuming Delta is polynomial in n, our KDSs obtain improvements on the previous KDSs. Our solutions are based on a kinetic clustering on P. Though we use ideas from the previous clustering KDS by Hershberger, we simplify and improve his work.

Two approaches to building time-windowed geometric data structures

(with John Hershberger and Simon Pratt)

Given a set of geometric objects each associated with a time value, we wish to determine whether a given property is true for a subset of those objects whose time values fall within a query time window. We call such problems time-windowed decision problems, and they have been the subject of much recent attention, for instance studied by Bokal, Cabello, and Eppstein [SoCG 2015]. In this paper, we present new approaches to this class of problems that are conceptually simpler than Bokal et al.'s, and also lead to faster algorithms. For instance, we present algorithms for preprocessing for the time-windowed 2D diameter decision problem in O(n log n) time and the time-windowed 2D convex hull area decision problem in O(n log n) time, improving Bokal et al.'s O(n log^2 n) and O(n log n loglog n) solutions respectively.

Our first approach is to reduce time-windowed decision problems to a generalized range successor problem, which we solve using a novel way to search range trees. Our other approach is to use dynamic data structures directly, taking advantage of a new observation that the total number of combinatorial changes to a planar convex hull is near linear for any FIFO update sequence, in which deletions occur in the same order as insertions. We also apply these approaches to obtain the first O(n polylog n) algorithms for the time-windowed 3D diameter decision and 2D orthogonal segment intersection detection problems.

Multidimensional range selection

(with Gelin Zhou)

We study the problem of supporting (orthogonal) range selection queries over a set of n points in constant-dimensional space. Under the standard word-RAM model with word size w = Omega(lg n), we present data structures that occupy O(n (lg n/lglg n)^{d-1}) words of space and support d-dimensional range selection queries using O((lg n/lglg n)^d) query time. This improves the best known data structure by a factor of lglg n in query time. To develop our data structures, we generalize the "parallel counting" technique of Brodal, Gfeller, Jørgensen, and Sanders (2011) for one-dimensional range selection to higher dimensions.

As a byproduct, we design data structures to support d-dimensional range counting queries within O(n (log_w n)^{d-2}) words of space and O((log_w n)^{d-1}) query time, for any word size w = Omega(lg n). This improves the best known result of JaJa, Mortensen, and Shi (2004) when lg w >> lglg n.

In Proc. 26th International Symposium on Algorithms and Computation (ISAAC), Lecture Notes in Computer Science, volume 9472, pages 83-92, 2015

Time-windowed closest pair

(with Simon Pratt)

Given a set of points in any constant dimension, each of which is associated with a time during which that point is active, we design a data structure with O(n log n) space that can find the closest pair of active points within a query interval of time in O(loglog n) time using a quadtree-based approach in the word-RAM model.
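
The query answered by this data structure can be stated as a brute-force baseline. This is a sketch of the problem only, not the quadtree-based method. It assumes each point carries a single timestamp and that the query window [t1, t2] is closed. `math.dist` requires Python 3.8 or later.

```python
import math
from itertools import combinations

def windowed_closest_pair(timed_points, t1, t2):
    """timed_points: list of (time, coords). Returns (distance, p, q) for the
    closest pair among points with t1 <= time <= t2, or None if fewer than
    two points are active in the window."""
    active = [p for (t, p) in timed_points if t1 <= t <= t2]
    best = None
    for p, q in combinations(active, 2):
        d = math.dist(p, q)
        if best is None or d < best[0]:
            best = (d, p, q)
    return best
```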

In Proc. 27th Canadian Conference on Computational Geometry (CCCG), pages 141-144, 2015

Approximating the minimum closest pair distance and nearest neighbor distances of linearly moving points

(with Zahed Rahmati)

Given a set of n moving points in R^d, where each point moves along a linear trajectory at arbitrary but constant velocity, we present an O~(n^{5/3})-time algorithm to compute a (1+epsilon)-factor approximation to the minimum closest pair distance over time, for any constant epsilon>0 and any constant dimension d. This addresses an open problem posed by Gupta, Janardan, and Smid (1996).

More generally, we consider a data structure version of the problem: for any linearly moving query point q, we want a (1+epsilon)-factor approximation to the minimum nearest neighbor distance to q over time. We present a data structure that requires O~(n^{5/3}) space and O~(n^{2/3}) query time, O~(n^5) space and polylogarithmic query time, or O~(n) space and O~(n^{4/5}) query time, for any constant epsilon>0 and any constant dimension d.
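
For intuition, the exact quantity being approximated can be computed in quadratic time, because the squared distance between two linearly moving points is a quadratic function of time. The Python sketch below does this pair by pair. It is a baseline for comparison, not the O~(n^{5/3}) algorithm, and it assumes that time ranges over t >= 0.

```python
import math
from itertools import combinations

def min_closest_pair_distance(positions, velocities):
    """Exact minimum over t >= 0 of the closest-pair distance for points
    p_i(t) = positions[i] + t * velocities[i], by checking all pairs."""
    best = math.inf
    for i, j in combinations(range(len(positions)), 2):
        dp = [a - b for a, b in zip(positions[i], positions[j])]
        dv = [a - b for a, b in zip(velocities[i], velocities[j])]
        vv = sum(c * c for c in dv)
        pv = sum(a * b for a, b in zip(dp, dv))
        # |dp + t*dv|^2 is minimized at t = -pv/vv; clamp to t >= 0.
        t = max(0.0, -pv / vv) if vv > 0 else 0.0
        d = math.sqrt(sum((a + t * b) ** 2 for a, b in zip(dp, dv)))
        best = min(best, d)
    return best
```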

Computational Geometry: Theory and Applications, 60:2-7, 2017 (CCCG special issue)
In Proc. 27th Canadian Conference on Computational Geometry (CCCG), pages 136-140, 2015

Fast string dictionary lookup with one error

(with Moshe Lewenstein)

A set of strings, called a string dictionary, is a basic string data structure. The most primitive query, where one seeks the existence of a pattern in the dictionary, is called a lookup query. Answering approximate lookup queries, i.e., looking up the existence of a pattern with a bounded number of errors, is a fundamental string problem. Several data structures have been proposed to do so efficiently. Almost all solutions consider a single error, as will this result. Lately, Belazzougui and Venturini (CPM 2013) raised the question whether one can construct efficient indexes that support lookup queries with one error in optimal query time, that is, O(|p|/w + occ), where p is the query, w the machine word-size, and occ the number of occurrences.

Specifically, for the problem of one mismatch and constant alphabet size, we obtain optimal query time. For a dictionary of d strings our proposed index uses O(w d log^{1+eps}d) additional bit space (beyond the dictionary which can be maintained in compressed form). Our results are parameterized for a space-time tradeoff.

We propose more results for the case of lookup queries with one insertion/deletion on dictionaries over a constant sized alphabet. These results are especially effective for large patterns.

Optimal deterministic algorithms for 2-d and 3-d shallow cuttings

(with Konstantinos Tsakalidis)

We present optimal deterministic algorithms for constructing shallow cuttings in an arrangement of lines in two dimensions or planes in three dimensions. Our results improve the deterministic polynomial-time algorithm of Matousek (1992) and the optimal but randomized algorithm of Ramos (1999). This leads to efficient derandomization of previous algorithms for numerous well-studied problems in computational geometry, including halfspace range reporting in 2-d and 3-d, k nearest neighbors search in 2-d, (<= k)-levels in 3-d, order-k Voronoi diagrams in 2-d, linear programming with k violations in 2-d, dynamic convex hulls in 3-d, dynamic nearest neighbor search in 2-d, convex layers (onion peeling) in 3-d, epsilon-nets for halfspace ranges in 3-d, and more. As a side product we also describe an optimal deterministic algorithm for constructing standard (non-shallow) cuttings in two dimensions, which is arguably simpler than the known optimal algorithms by Matousek (1991) and Chazelle (1993).

Succinct indices for path minimum with applications to path reporting

(with Meng He, J. Ian Munro, and Gelin Zhou)

In the path minimum query problem, we preprocess a tree on n weighted nodes, such that given an arbitrary path, we can locate the node with the smallest weight along this path. We design novel succinct indices for this problem; one of our index structures supports queries in O(alpha(m,n)) time, and occupies O(m) bits of space in addition to the space required for the input tree, where m is an integer greater than or equal to n and alpha(m,n) is the inverse-Ackermann function. These indices give us the first succinct data structures for the path minimum problem, and allow us to obtain new data structures for path reporting queries, which report the nodes along a query path whose weights are within a query range. We achieve three different time/space tradeoffs for path reporting by designing (a) an O(n)-word structure with O(lg^eps n + occ lg^eps n) query time, where occ is the number of nodes reported; (b) an O(n lglg n)-word structure with O(lglg n + occ lglg n) query time; and (c) an O(n lg^eps n)-word structure with O(lglg n + occ) query time. These tradeoffs match the state of the art of two-dimensional orthogonal range reporting queries which can be treated as a special case of path reporting queries. When the number of distinct weights is much smaller than n, we further improve both the query time and the space cost of these three results.
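
A path minimum query can be answered naively, without any index, by walking both endpoints up to their lowest common ancestor. The Python sketch below does this in time proportional to the tree depth. It only illustrates the query semantics and is unrelated to the succinct indices of the paper. The parent-array tree representation, with parent[root] == -1, is an assumption.

```python
def path_minimum(parent, weight, u, v):
    """Return the node of smallest weight on the tree path between u and v.
    parent[x] is the parent of x, with parent[root] == -1."""
    def depth(x):
        d = 0
        while parent[x] != -1:
            x = parent[x]
            d += 1
        return d

    du, dv = depth(u), depth(v)
    best = u if weight[u] <= weight[v] else v
    # Lift the deeper endpoint (or u on ties) until the two walks meet
    # at the lowest common ancestor, tracking the lightest node seen.
    while u != v:
        if du >= dv:
            u = parent[u]
            du -= 1
            cand = u
        else:
            v = parent[v]
            dv -= 1
            cand = v
        if weight[cand] < weight[best]:
            best = cand
    return best
```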

Deterministic rectangle enclosure and offline dominance reporting on the RAM

(with Peyman Afshani and Konstantinos Tsakalidis)

We revisit a classical problem in computational geometry that has been studied since the 1980s: in the rectangle enclosure problem we want to report all k enclosing pairs of n input rectangles in 2D. We present the first deterministic algorithm that takes O(n log n + k) worst-case time and O(n) space in the word-RAM model. This improves previous deterministic algorithms with O((n log n + k) loglog n) running time. We achieve the result by derandomizing the algorithm of Chan, Larsen and Patrascu [SoCG'11] that attains the same time complexity but in expectation.

The 2D rectangle enclosure problem is related to the offline dominance range reporting problem in 4D, and our result leads to the currently fastest deterministic algorithm for offline dominance reporting in any constant dimension d >= 4.

A key tool behind Chan et al.'s previous randomized algorithm is shallow cuttings for 3D dominance ranges. Recently, Afshani and Tsakalidis [SODA'14] obtained a deterministic O(n log n)-time algorithm to construct such cuttings. We first present an improved deterministic construction algorithm that runs in O(n loglog n) time in the word-RAM; this result is of independent interest. Many additional ideas are then incorporated, including a linear-time algorithm for merging shallow cuttings and an algorithm for an offline tree point location problem.
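
For concreteness, the rectangle enclosure problem can be stated as a quadratic-time baseline. The sketch below is not the O(n log n + k) algorithm. The rectangle format (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2, and the use of non-strict inequalities, are assumptions.

```python
def enclosure_pairs(rects):
    """rects: list of (x1, y1, x2, y2). Returns all index pairs (i, j),
    i != j, such that rectangle i encloses rectangle j."""
    def encloses(r, s):
        # Equivalent to 4D dominance: the point (-r.x1, -r.y1, r.x2, r.y2)
        # dominates the corresponding point of s in every coordinate.
        return r[0] <= s[0] and r[1] <= s[1] and s[2] <= r[2] and s[3] <= r[3]

    return [(i, j)
            for i, r in enumerate(rects)
            for j, s in enumerate(rects)
            if i != j and encloses(r, s)]
```

The comment inside `encloses` spells out why the 2D enclosure problem is an instance of offline dominance reporting in 4D.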

Adaptive and approximate orthogonal range counting

(with Bryan T. Wilkinson)

We present three new results on one of the most basic problems in geometric data structures, 2-D orthogonal range counting. All the results are in the w-bit word RAM model.

It is well known that there are linear-space data structures for 2-D orthogonal range counting with worst-case optimal query time O(log n/loglog n). We give an O(n loglog n)-space adaptive data structure that improves the query time to O(loglog n + log k/loglog n), where k is the output count. When k=O(1), our bounds match the state of the art for the 2-D orthogonal range emptiness problem [Chan, Larsen, and Patrascu, SoCG 2011].
We give an O(n loglog n)-space data structure for 2-D approximate orthogonal range counting that can compute a (1+delta)-factor approximation to the count in O(loglog n) time for any fixed constant delta > 0. Again, our bounds match the state of the art for the 2-D orthogonal range emptiness problem.
Lastly we consider the 1-D range selection problem, where a query in an array involves finding the k-th least element in a given subarray. This problem is closely related to 3-sided 2-D orthogonal range counting. Recently, Jørgensen and Larsen [SODA 2011] presented a linear-space adaptive data structure with query time O(loglog n + log k/loglog n). We give a new linear-space structure that improves the query time to O(1 + log k/loglog n), exactly matching the lower bound proved by Jørgensen and Larsen.
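
The 1-D range selection query in the last item has a one-line naive solution by sorting the subarray, shown here in Python as a reference point. It takes O(m log m) time for a subarray of length m, in contrast to the O(1 + log k/loglog n) query time above. The 0-based inclusive indexing and 1-based rank k are assumptions.

```python
def range_select(a, i, j, k):
    """Return the k-th smallest element (k >= 1) of a[i..j], inclusive,
    using 0-based array indices."""
    window = sorted(a[i:j + 1])
    return window[k - 1]
```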

ACM Transactions on Algorithms, 12(4):45:1-45:15, 2016
In Proc. 24th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 241-251, 2013

Linear-space data structures for range minority query in arrays

(with Stephane Durocher, Matthew Skala, and Bryan T. Wilkinson)

We consider range queries in arrays that search for low-frequency elements: least frequent elements and alpha-minorities. An alpha-minority of a query range has multiplicity no greater than an alpha fraction of the elements in the range. Our data structure for the least frequent element range query problem requires O(n) space, O(n^{3/2}) preprocessing time, and O(sqrt{n}) query time. A reduction from boolean matrix multiplication to this problem shows the hardness of simultaneous improvements in both preprocessing time and query time. Our data structure for the alpha-minority range query problem requires O(n) space and O(1/alpha) query time, and allows alpha to be specified at query time.
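
The two query types defined above can be illustrated with a naive Python baseline that counts frequencies in the query range directly. This takes time linear in the range length and is not the data structure of the paper. The 0-based inclusive indices and the choice to return None when no alpha-minority exists are assumptions.

```python
from collections import Counter

def least_frequent(a, i, j):
    """Return an element of a[i..j] (inclusive) with the smallest
    multiplicity in that range."""
    counts = Counter(a[i:j + 1])
    return min(counts, key=counts.get)

def alpha_minority(a, i, j, alpha):
    """Return an element of a[i..j] whose multiplicity is at most
    alpha * (j - i + 1), or None if no such element exists."""
    counts = Counter(a[i:j + 1])
    limit = alpha * (j - i + 1)
    for x, c in counts.items():
        if c <= limit:
            return x
    return None
```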

Linear-space data structures for range mode query in arrays

(with Stephane Durocher, Kasper Green Larsen, Jason Morrison, and Bryan T. Wilkinson)

A mode of a multiset S is an element a in S of maximum multiplicity; that is, a occurs at least as frequently as any other element in S. Given an array A[1:n] of n elements, we consider a basic problem: constructing a static data structure that efficiently answers range mode queries on A. Each query consists of an input pair of indices (i, j) for which a mode of A[i:j] must be returned. The best previous data structure with linear space, by Krizanc, Morin, and Smid (ISAAC 2003), requires O(sqrt{n} loglog n) query time. We improve their result and present an O(n)-space data structure that supports range mode queries in O(sqrt{n / log n}) worst-case time. Furthermore, we present strong evidence that a query time significantly below sqrt{n} cannot be achieved by purely combinatorial techniques; we show that boolean matrix multiplication of two sqrt{n} by sqrt{n} matrices reduces to n range mode queries in an array of size O(n). Additionally, we give linear-space data structures for orthogonal range mode in higher dimensions (queries in near O(n^{1-1/(2d)}) time) and for halfspace range mode in higher dimensions (queries in O(n^{1-1/d^2}) time).
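
A range mode query as defined above can be answered naively by counting the elements of the subarray, as in the Python sketch below. This linear-time-per-query baseline is shown only to fix the semantics. Unlike the A[1:n] notation of the abstract, it uses 0-based inclusive indices, which is an assumption of the sketch.

```python
from collections import Counter

def range_mode(a, i, j):
    """Return (mode, multiplicity) for the subarray a[i..j], inclusive,
    using 0-based indices. Ties are broken by Counter.most_common."""
    return Counter(a[i:j + 1]).most_common(1)[0]
```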

Theory of Computing Systems, 55:719-741, 2014 (STACS special issue)
In Proc. 29th Symposium on Theoretical Aspects of Computer Science (STACS), pages 290-301, 2012

Orthogonal range searching on the RAM, revisited

(with Kasper Green Larsen and Mihai Patrascu)

We present a number of new results on one of the most extensively studied topics in computational geometry, orthogonal range searching. All our results are in the standard word RAM model:

We present two data structures for 2-d orthogonal range emptiness. The first achieves O(n lglg n) space and O(lglg n) query time, assuming that the n given points are in rank space. This improves the previous results by Alstrup, Brodal, and Rauhe (FOCS'00), with O(n lg^eps n) space and O(lglg n) query time, or with O(n lglg n) space and O(lg^2lg n) query time. Our second data structure uses O(n) space and answers queries in O(lg^eps n) time. The best previous O(n)-space data structure, due to Nekrich (WADS'07), answers queries in O(lg n/lglg n) time.
We give a data structure for 3-d orthogonal range reporting with O(n lg^{1+eps}n) space and O(lglg n + k) query time for points in rank space, for any constant eps>0. This improves the previous results by Afshani (ESA'08), Karpinski and Nekrich (COCOON'09), and Chan (SODA'11), with O(n lg^3 n) space and O(lglg n + k) query time, or with O(n lg^{1+eps}n) space and O(lg^2lg n + k) query time. Consequently, we obtain improved upper bounds for orthogonal range reporting in all constant dimensions above 3.
Our approach also leads to a new data structure for 2-d orthogonal range minimum queries with O(n lg^eps n) space and O(lglg n) query time for points in rank space.
We give a randomized algorithm for 4-d offline dominance range reporting/emptiness with running time O(n lg n) plus the output size. This resolves two open problems (both appeared in Preparata and Shamos' seminal book):
(a) given a set of n axis-aligned rectangles in the plane, we can report all k enclosure pairs (i.e., pairs (r_1,r_2) where rectangle r_1 completely encloses rectangle r_2) in O(n lg n + k) expected time;
(b) given a set of n points in 4-d, we can find all maximal points (points not dominated by any other points) in O(n lg n) expected time.
The most recent previous development on (a) was reported back in SoCG'95 by Gupta, Janardan, Smid, and Dasgupta, whose main result was an O([n lg n + k] lglg n) algorithm. The best previous result on (b) was an O(n lg n lglg n) algorithm due to Gabow, Bentley, and Tarjan, from STOC'84! As a consequence, we also obtain the current-record time bound for the maxima problem in all constant dimensions above 4.

Optimal partition trees

We revisit one of the most fundamental classes of data structure problems in computational geometry: range searching. Back in SoCG'92, Matousek gave a partition tree method for d-dimensional simplex range searching achieving O(n) space and O(n^{1-1/d}) query time. Although this method is generally believed to be optimal, it is complicated and requires O(n^{1+eps}) preprocessing time for any fixed eps > 0. An earlier method by Matousek (SoCG'91) requires O(n log n) preprocessing time but O(n^{1-1/d} polylog n) query time. We give a new method that achieves simultaneously O(n log n) preprocessing time, O(n) space, and O(n^{1-1/d}) query time with high probability. Our method has several advantages:

It is conceptually simpler than Matousek's SoCG'92 method. Our partition trees satisfy many ideal properties (e.g., constant degree, optimal crossing number at almost all layers, and disjointness of the children's cells at each node).
It leads to more efficient multilevel partition trees, which are important in many data structural applications (each level adds at most one logarithmic factor to the space and query bounds, better than in all previous methods).
A similar improvement applies to a shallow version of partition trees, yielding O(n log n) time, O(n) space, and O(n^{1-1/(d/2)}) query time for halfspace range emptiness in even dimensions d >= 4.

Numerous consequences follow (e.g., improved results for computing spanning trees with low crossing number, ray shooting among line segments, intersection searching, exact nearest neighbor search, linear programming queries, finding extreme points, ...).

Counting inversions, offline orthogonal range counting, and related problems

(with Mihai Patrascu)

We give an O(n sqrt{lg n})-time algorithm for counting the number of inversions in a permutation on n elements. This improves a long-standing previous bound of O(n lg n/lg lg n) that followed from Dietz's data structure [WADS'89], and answers a question of Andersson and Petersson [SODA'95]. As Dietz's result is known to be optimal for the related dynamic rank problem, our result demonstrates a significant improvement in the offline setting. Our new technique is quite simple: we perform a "vertical partitioning" of a trie (akin to van Emde Boas trees), and use ideas from external memory. However, the technique finds numerous applications: for example, we obtain

in d dimensions, an algorithm to answer n offline orthogonal range counting queries in time O(n lg^{d-2+1/d} n);
an improved construction time for online data structures for orthogonal range counting;
an improved update time for the partial sums problem;
faster Word RAM algorithms for finding the maximum depth in an arrangement of axis-aligned rectangles, and for the slope selection problem.

As a bonus, we also give a simple (1+epsilon)-approximation algorithm for counting inversions that runs in linear time, improving the previous O(n lg lg n) bound by Andersson and Petersson.
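
For reference, the problem itself can be sketched with the textbook merge-sort method, which runs in O(n lg n) time. This is only a baseline that illustrates what is being counted; it is not the O(n sqrt{lg n}) algorithm of the paper, and the function name is an assumption.

```python
def count_inversions(a):
    """Return the number of pairs i < j with a[i] > a[j] (merge-sort baseline)."""
    def sort(xs):
        if len(xs) <= 1:
            return xs, 0
        mid = len(xs) // 2
        left, inv_left = sort(xs[:mid])
        right, inv_right = sort(xs[mid:])
        merged = []
        inv = inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] precedes every remaining element of left,
                # so it forms an inversion with each of them.
                merged.append(right[j])
                j += 1
                inv += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort(list(a))[1]

example = count_inversions([2, 0, 3, 1])  # inversions: (2,0), (2,1), (3,1)
```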

PostScript file (preliminary version)
In Proc. 21st ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 161-173, 2010

Optimal halfspace range reporting in three dimensions

(with Peyman Afshani)

We give the first optimal solution to a standard problem in computational geometry: three-dimensional halfspace range reporting. We show that n points in 3-d can be stored in a linear-space data structure so that all k points inside a query halfspace can be reported in O(log n + k) time. The data structure can be built in O(n log n) expected time. The previous methods with optimal query time required superlinear (O(n log log n)) space.

We also mention consequences, for example, to higher dimensions and to external-memory data structures. As an aside, we partially answer another open question concerning the crossing number in Matousek's shallow partition theorem in the 3-d case (a tool used in many known halfspace range reporting methods).

On approximate range counting and depth

(with Peyman Afshani)

Improving previous methods by Aronov and Har-Peled (SODA'05) and Kaplan and Sharir (SODA'06), we present a randomized data structure of O(n) expected size which can answer 3D approximate halfspace range counting queries in O(log (n/k)) expected time, where k is the actual value of the count. This is the first optimal method for the problem in the standard decision tree model; moreover, unlike previous methods, the new method is Las Vegas instead of Monte Carlo. In addition, we describe new results for several related problems, including approximate Tukey depth queries in 3D, approximate regression depth queries in 2D, and approximate linear programming with violations in low dimensions.

PostScript file
Discrete and Computational Geometry, 42:3-21, 2009 (SoCG special issue)
In Proc. 23rd ACM Symposium on Computational Geometry (SoCG), pages 337-343, 2007

Dynamic connectivity for axis-parallel rectangles

(with Peyman Afshani)

In this paper we give a fully dynamic data structure to maintain the connectivity of the intersection graph of n axis-parallel rectangles. The amortized update time (insertion and deletion of rectangles) is O(n^{10/11} polylog n) and the query time (deciding whether two given rectangles are connected) is O(1). It slightly improves on the O(n^{0.94}) update time of the previous method while drastically reducing the query time, which was previously near O(n^{1/3}). Our method does not use fast matrix multiplication results and supports a wider range of queries.
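
To illustrate what a connectivity query means here, the following Python sketch recomputes the connected components of the intersection graph from scratch with a union-find structure. It is a naive static baseline that takes quadratic time per rebuild, not the dynamic data structure of the paper. The function names and the rectangle encoding (x1, y1, x2, y2) are assumptions.

```python
def intersects(a, b):
    """True if closed axis-parallel rectangles a and b share a point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def components(rects):
    """Label each rectangle with a representative of its connected component."""
    parent = list(range(len(rects)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(len(rects)):
        for j in range(i + 1, len(rects)):
            if intersects(rects[i], rects[j]):
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(rects))]

rects = [(0, 0, 2, 2), (1, 1, 3, 3), (2.5, 2.5, 4, 4), (10, 10, 11, 11)]
labels = components(rects)

# Deleting the middle rectangle disconnects the first and third ones.
labels_after_delete = components([rects[0], rects[2], rects[3]])
```

Two rectangles are connected exactly when their labels agree. A dynamic structure avoids this full rebuild after each insertion or deletion.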

Random sampling, halfspace range reporting, and construction of (<= k)-levels in three dimensions

Given n points in three dimensions, we show how to answer halfspace range reporting queries in O(log n + k) expected time for an output size k. Our data structure can be preprocessed in optimal O(n log n) expected time. We apply this result to obtain the first optimal randomized algorithm for the construction of the (<= k)-level in an arrangement of n planes in three dimensions. The algorithm runs in O(n log n + nk^2) expected time. Our techniques are based on random sampling. Applications in two dimensions include an improved data structure for "k nearest neighbors" queries, and an algorithm that constructs the order-k Voronoi diagram in O(n log n + nk log k) expected time.

On enumerating and selecting distances

Given an n-point set, the problems of enumerating the k closest pairs and selecting the k-th smallest distance are revisited. For the enumeration problem, we give simpler randomized and deterministic algorithms with O(n log n + k) running time in any fixed-dimensional Euclidean space. For the selection problem, we give a randomized algorithm with running time O(n log n + n^{2/3}k^{1/3}log^{5/3}n) in the Euclidean plane. We also describe output-sensitive results for halfspace range counting that are of use in more general distance selection problems. None of our algorithms requires parametric search.
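
The two problems can be stated precisely with a brute-force Python sketch that examines all pairs. It takes quadratic time and is meant only to pin down the definitions, not to reflect the algorithms of the paper. The function names are assumptions, and math.dist requires Python 3.8 or later.

```python
import heapq
import math
from itertools import combinations

def k_closest_pairs(points, k):
    """Enumerate the k pairs of points with the smallest Euclidean distance."""
    return heapq.nsmallest(k, combinations(points, 2),
                           key=lambda pq: math.dist(pq[0], pq[1]))

def kth_smallest_distance(points, k):
    """Select the k-th smallest interpoint distance (k is 1-indexed)."""
    dists = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    return dists[k - 1]

points = [(0, 0), (1, 0), (3, 0), (7, 0)]
closest = k_closest_pairs(points, 2)
third = kth_smallest_distance(points, 3)
```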

Fixed-dimensional linear programming queries made easy

We derive two results from Clarkson's randomized algorithm for linear programming in a fixed dimension d. The first is a simple general method that reduces the problem of answering linear programming queries to the problem of answering halfspace range queries. For example, this yields a randomized data structure with O(n) space and O(n^{1-1/floor(d/2)} 2^{O(log* n)}) query time for linear programming on n halfspaces (d > 3). The second result is a simpler proof of the following: a sequence of q linear programming queries on n halfspaces can be answered in O(n log q) time, if q <= n^{alpha_d} for a certain constant alpha_d > 0. Unlike previous methods, our algorithms do not require parametric searching.

Copyright Notice

The documents contained in this directory are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Timothy Chan (Last updated Aug 2023)