Timothy M. Chan's Publications: Range searching

Dynamic orthogonal range searching, revisited

Konstantinos Tsakalidis)

We study a longstanding problem in computational geometry: 2-d dynamic orthogonal range reporting. We present a new data structure achieving O(log n/loglog n + k) optimal query time and O(log^{2/3+o(1)}n) update time (amortized) in the word RAM model, where n is the number of data points and k is the output size. This is the first improvement in over 10 years of Mortensen's previous result [SIAM J. Comput., 2006], which has O(log^{7/8+epsilon}n) update time for an arbitrarily small constant epsilon.

In the case of 3-sided queries, our update time reduces to O(log^{1/2+epsilon}n), improving Wilkinson's previous bound [ESA 2014] of O(log^{2/3+epsilon}n).

Orthogonal range searching in moderate dimensions: k-d trees and range trees strike back

We revisit the orthogonal range searching problem and the exact l_infinity nearest neighbor searching problem for a static set of n points when the dimension d is moderately large. We give the first data structure with near linear space that achieves truly sublinear query time when the dimension is any constant multiple of log n. Specifically, the preprocessing time/space is O(n^{1+delta}) for any constant delta > 0, and the expected query time is n^{1 - 1/O(c log c)}. The data structure is simple and is based on a new "augmented, randomized, lopsided" variant of k-d trees. It matches (in fact, slightly improves) the performance of previous combinatorial algorithms that work only in the case of offline queries [Impagliazzo, Lovett, Paturi, and Schneider (2014) and Chan (SODA'15)]. It leads to faster combinatorial algorithms for all-pairs shortest paths in general weighted graphs and rectangular Boolean matrix multiplication.

In the offline case, we show that the problem can be reduced to the Boolean orthogonal vectors problem and thus admits an n^{2 - 1/O(log c)}-time non-combinatorial algorithm [Abboud, Williams, and Yu (SODA'15)]. This reduction is also simple and is based on range trees.

Finally, we use a similar approach to obtain a small improvement to Indyk's data structure [FOCS'98] for approximate l_infinity nearest neighbor search when d=c log n.

Two approaches to building time-windowed geometric data structures

(with Simon Pratt)

Given a set of geometric objects each associated with a time value, we wish to determine whether a given property is true for a subset of those objects whose time values fall within a query time window. We call such problems time-windowed decision problems, and they have been the subject of much recent attention, for instance studied by Bokal, Cabello, and Eppstein [SoCG 2015]. In this paper, we present new approaches to this class of problems that are conceptually simpler than Bokal et al.'s, and also lead to faster algorithms. For instance, we present algorithms for preprocessing for the time-windowed 2D diameter decision problem in O(n log n) time and the time-windowed 2D convex hull area decision problem in O(n alpha(n) log n) time (where alpha is the inverse Ackermann function), improving Bokal et al.'s O(n log^2 n) and O(n log n loglog n) solutions respectively.

Our first approach is to reduce time-windowed decision problems to a generalized range successor problem, which we solve using a novel way to search range trees. Our other approach is to use dynamic data structures directly, taking advantage of a new observation that the total number of combinatorial changes to a planar convex hull is near linear for any FIFO update sequence, in which deletions occur in the same order as insertions. We also apply these approaches to obtain the first O(n polylog n) algorithms for the time-windowed 3D diameter decision and 2D orthogonal segment intersection detection problems.

Multidimensional range selection

(with Gelin Zhou)

We study the problem of supporting (orthogonal) range selection queries over a set of n points in constant-dimensional space. Under the standard word-RAM model with word size w = Omega(lg n), we present data structures that occupy O(n (lg n/lglg n)^{d-1}) words of space and support d-dimensional range selection queries using O((lg n/lglg n)^d) query time. This improves the best known data structure by a factor of lglg n in query time. To develop our data structures, we generalize the "parallel counting" technique of Brodal, Gfeller, Jørgensen, and Sanders (2011) for one-dimensional range selection to higher dimensions.

As a byproduct, we design data structures to support d-dimensional range counting queries within O(n (log_w n)^{d-2}) words of space and O((log_w n)^{d-1}) query time, for any word size w = Omega(lg n). This improves the best known result of JaJa, Mortensen, and Shi (2004) when lg w >> lglg n.

Time-windowed closest pair

(with Simon Pratt)

Given a set of points in any constant dimension, each of which is associated with a time during which that point is active, we design a data structure with O(n log n) space that can find the closest pair of active points within a query interval of time in O(loglog n) time using a quadtree-based approach in the word-RAM model.

Approximating the minimum closest pair distance and nearest neighbor distances of linearly moving points

(with Zahed Rahmati)

Given a set of n moving points in R^d, where each point moves along a linear trajectory at arbitrary but constant velocity, we present an O~(n^{5/3})-time algorithm to compute a (1+epsilon)-factor approximation to the minimum closest pair distance over time, for any constant epsilon>0 and any constant dimension d. This addresses an open problem posed by Gupta, Janardan, and Smid (1996).

More generally, we consider a data structure version of the problem: for any linearly moving query point q, we want a (1+epsilon)-factor approximation to the minimum nearest neighbor distance to q over time. We present a data structure that requires O~(n^{5/3}) space and O~(n^{2/3}) query time, O~(n^5) space and polylogarithmic query time, or O~(n) space and O~(n^{4/5}) query time, for any constant epsilon>0 and any constant dimension d.

Fast string dictionary lookup with one error

Moshe Lewenstein)

A set of strings, called a string dictionary, is a basic string data structure. The most primitive query, where one seeks the existence of a pattern in the dictionary, is called a lookup query. Approximate lookup queries, i.e., to lookup the existence of a pattern with a bounded number of errors, is a fundamental string problem. Several data structures have been proposed to do so efficiently. Almost all solutions consider a single error, as will this result. Lately, Belazzougui and Venturini (CPM 2013) raised the question whether one can construct efficient indexes that support lookup queries with one error in optimal query time, that is, O(|p|/w + occ), where p is the query, w the machine word-size, and occ the number of occurrences.

Specifically, for the problem of one mismatch and constant alphabet size, we obtain optimal query time. For a dictionary of d strings our proposed index uses O(w d log^{1+eps}d) additional bit space (beyond the dictionary which can be maintained in compressed form). Our results are parameterized for a space-time tradeoff.

We propose more results for the case of lookup queries with one insertion/deletion on dictionaries over a constant sized alphabet. These results are especially effective for large patterns.

Optimal deterministic algorithms for 2-d and 3-d shallow cuttings

Konstantinos Tsakalidis)

We present optimal deterministic algorithms for constructing shallow cuttings in an arrangement of lines in two dimensions or planes in three dimensions. Our results improve the deterministic polynomial-time algorithm of Matousek (1992) and the optimal but randomized algorithm of Ramos (1999). This leads to efficient derandomization of previous algorithms for numerous well-studied problems in computational geometry, including halfspace range reporting in 2-d and 3-d, k nearest neighbors search in 2-d, (<= k)-levels in 3-d, order-k Voronoi diagrams in 2-d, linear programming with k violations in 2-d, dynamic convex hulls in 3-d, dynamic nearest neighbor search in 2-d, convex layers (onion peeling) in 3-d, epsilon-nets for halfspace ranges in 3-d, and more. As a side product we also describe an optimal deterministic algorithm for constructing standard (non-shallow) cuttings in two dimensions, which is arguably simpler than the known optimal algorithms by Matousek (1991) and Chazelle (1993).

Succinct indices for path minimum with applications to path reporting

Meng He, J. Ian Munro, and Gelin Zhou)

In the path minimum query problem, we preprocess a tree on n weighted nodes, such that given an arbitrary path, we can locate the node with the smallest weight along this path. We design novel succinct indices for this problem; one of our index structures supports queries in O(alpha(m,n)) time, and occupies O(m) bits of space in addition to the space required for the input tree, where m is an integer greater than or equal to n and alpha(m,n) is the inverse-Ackermann function. These indices give us the first succinct data structures for the path minimum problem, and allow us to obtain new data structures for path reporting queries, which report the nodes along a query path whose weights are within a query range. We achieve three different time/space tradeoffs for path reporting by designing (a) an O(n)-word structure with O(lg^eps n + occ lg^eps n) query time, where occ is the number of nodes reported; (b) an O(n lglg n)-word structure with O(lglg n + occ lglg n) query time; and (c) an O(n lg^eps n)- word structure with O(lglg n + occ) query time. These tradeoffs match the state of the art of two-dimensional orthogonal range reporting queries which can be treated as a special case of path reporting queries. When the number of distinct weights is much smaller than n, we further improve both the query time and the space cost of these three results.

Deterministic rectangle enclosure and offline dominance reporting on the RAM

Peyman Afshani and Konstantinos Tsakalidis)

We revisit a classical problem in computational geometry that has been studied since the 1980s: in the rectangle enclosure problem we want to report all k enclosing pairs of n input rectangles in 2D. We present the first deterministic algorithm that takes O(n log n + k) worst-case time and O(n) space in the word-RAM model. This improves previous deterministic algorithms with O((n log n + k) loglog n) running time. We achieve the result by derandomizing the algorithm of Chan, Larsen and Patrascu [SoCG'11] that attains the same time complexity but in expectation.

The 2D rectangle enclosure problem is related to the offline dominance range reporting problem in 4D, and our result leads to the currently fastest deterministic algorithm for offline dominance reporting in any constant dimension d >= 4.

A key tool behind Chan et al.'s previous randomized algorithm is shallow cuttings for 3D dominance ranges. Recently, Afshani and Tsakalidis [SODA'14] obtained a deterministic O(n log n)-time algorithm to construct such cuttings. We first present an improved deterministic construction algorithm that runs in O(n loglog n) time in the word-RAM; this result is of independent interest. Many additional ideas are then incorporated, including a linear-time algorithm for merging shallow cuttings and an algorithm for an offline tree point location problem.

Adaptive and approximate orthogonal range counting

Bryan T. Wilkinson)

We present three new results on one of the most basic problems in geometric data structures, 2-D orthogonal range counting. All the results are in the w-bit word RAM model.

Linear-space data structures for range minority query in arrays

Stephane Durocher, Matthew Skala, and Bryan T. Wilkinson)

We consider range queries in arrays that search for low-frequency elements: least frequent elements and alpha-minorities. An alpha-minority of a query range has multiplicity no greater than an alpha fraction of the elements in the range. Our data structure for the least frequent element range query problem requires O(n) space, O(n^{3/2}) preprocessing time, and O(sqrt{n}) query time. A reduction from boolean matrix multiplication to this problem shows the hardness of simultaneous improvements in both preprocessing time and query time. Our data structure for the alpha-minority range query problem requires O(n) space and O(1/alpha) query time, and allows alpha to be specified at query time.

Linear-space data structures for range mode query in arrays

Stephane Durocher, Kasper Green Larsen, Jason Morrison, and Bryan T. Wilkinson)

A mode of a multiset S is an element a in S of maximum multiplicity; that is, a occurs at least as frequently as any other element in S. Given an array A[1:n] of n elements, we consider a basic problem: constructing a static data structure that efficiently answers range mode queries on A. Each query consists of an input pair of indices (i, j) for which a mode of A[i:j] must be returned. The best previous data structure with linear space, by Krizanc, Morin, and Smid (ISAAC 2003), requires O(sqrt(n) loglog n) query time. We improve their result and present an O(n)-space data structure that supports range mode queries in O(sqrt(n / log n)) worst-case time. Furthermore, we present strong evidence that a query time significantly below sqrt(n) cannot be achieved by purely combinatorial techniques; we show that boolean matrix multiplication of two sqrt(n) by sqrt(n) matrices reduces to n range mode queries in an array of size O(n). Additionally, we give linear-space data structures for orthogonal range mode in higher dimensions (queries in near O(n^(1-1/2d)) time) and for halfspace range mode in higher dimensions (queries in O(n^(1-1/d^2)) time).

Orthogonal range searching on the RAM, revisited

Kasper Green Larsen and Mihai Patrascu)

We present a number of new results on one of the most extensively studied topics in computational geometry, orthogonal range searching. All our results are in the standard word RAM model:

Optimal partition trees

We revisit one of the most fundamental classes of data structure problems in computational geometry: range searching. Back in SoCG'92, Matousek gave a partition tree method for d-dimensional simplex range searching achieving O(n) space and O(n^{1-1/d}) query time. Although this method is generally believed to be optimal, it is complicated and requires O(n^{1+eps}) preprocessing time for any fixed eps > 0. An earlier method by Matousek (SoCG'91) requires O(n log n) preprocessing time but O(n^{1-1/d} polylog n) query time. We give a new method that achieves simultaneously O(n log n) preprocessing time, O(n) space, and O(n^{1-1/d}) query time with high probability. Our method has several advantages:

Numerous consequences follow (e.g., improved results for computing spanning trees with low crossing number, ray shooting among line segments, intersection searching, exact nearest neighbor search, linear programming queries, finding extreme points, ...).

Counting inversions, offline orthogonal range counting, and related problems

Mihai Patrascu)

We give an O(n sqrt{lg n})-time algorithm for counting the number of inversions in a permutation on n elements. This improves a long-standing previous bound of O(n lg n/lg lg n) that followed from Dietz's data structure [WADS'89], and answers a question of Andersson and Petersson [SODA'95]. As Dietz's result is known to be optimal for the related dynamic rank problem, our result demonstrates a significant improvement in the offline setting. Our new technique is quite simple: we perform a "vertical partitioning" of a trie (akin to van Emde Boas trees), and use ideas from external memory. However, the technique finds numerous applications: for example, we obtain

As a bonus, we also give a simple (1+epsilon)-approximation algorithm for counting inversions that runs in linear time, improving the previous O(n lg lg n) bound by Andersson and Petersson.

Optimal halfspace range reporting in three dimensions

Peyman Afshani)

We give the first optimal solution to a standard problem in computational geometry: three-dimensional halfspace range reporting. We show that n points in 3-d can be stored in a linear-space data structure so that all k points inside a query halfspace can be reported in O(log n + k) time. The data structure can be built in O(n log n) expected time. The previous methods with optimal query time required superlinear (O(n log log n)) space.

We also mention consequences, for example, to higher dimensions and to external-memory data structures. As an aside, we partially answer another open question concerning the crossing number in Matousek's shallow partition theorem in the 3-d case (a tool used in many known halfspace range reporting methods).

On approximate range counting and depth

Peyman Afshani)

Improving previous methods by Aronov and Har-Peled (SODA'05) and Kaplan and Sharir (SODA'06), we present a randomized data structure of O(n) expected size which can answer 3D approximate halfspace range counting queries in O(log (n/k)) expected time, where k is the actual value of the count. This is the first optimal method for the problem in the standard decision tree model; moreover, unlike previous methods, the new method is Las Vegas instead of Monte Carlo. In addition, we describe new results for several related problems, including approximate Tukey depth queries in 3D, approximate regression depth queries in 2D, and approximate linear programming with violations in low dimensions.

Dynamic connectivity for axis-parallel rectangles

Peyman Afshani)

In this paper we give a fully dynamic data structure to maintain the connectivity of the intersection graph of n axis-parallel rectangles. The amortized update time (insertion and deletion of rectangles) is O(n^{10/11} polylog n) and the query time (deciding whether two given rectangles are connected) is O(1). It slightly improves the update time (O(n^{0.94})) of the previous method while drastically reducing the query time (near O(n^{1/3})). Our method does not use fast matrix multiplication results and supports a wider range of queries.

Random sampling, halfspace range reporting, and construction of (<= k)-levels in three dimensions

Given n points in three dimensions, we show how to answer halfspace range reporting queries in O(log n + k) expected time for an output size k. Our data structure can be preprocessed in optimal O(n log n) expected time. We apply this result to obtain the first optimal randomized algorithm for the construction of the (<= k)-level in an arrangement of n planes in three dimensions. The algorithm runs in O(n log n + nk^2) expected time. Our techniques are based on random sampling. Applications in two dimensions include an improved data structure for "k nearest neighbors" queries, and an algorithm that constructs the order-k Voronoi diagram in O(n log n + nk log k) expected time.

On enumerating and selecting distances

Given an n-point set, the problems of enumerating the k closest pairs and selecting the k-th smallest distance are revisited. For the enumeration problem, we give simpler randomized and deterministic algorithms with O(n log n + k) running time in any fixed-dimensional Euclidean space. For the selection problem, we give a randomized algorithm with running time O(n log n + n^{2/3}k^{1/3}log^{5/3}n) in the Euclidean plane. We also describe output-sensitive results for halfspace range counting that are of use in more general distance selection problems. None of our algorithms requires parametric search.

Fixed-dimensional linear programming queries made easy

We derive two results from Clarkson's randomized algorithm for linear programming in a fixed dimension d. The first is a simple general method that reduces the problem of answering linear programming queries to the problem of answering halfspace range queries. For example, this yields a randomized data structure with O(n) space and O(n^{1-1/floor(d/2)} 2^O(log* n)) query time for linear programming on n halfspaces (d > 3). The second result is a simpler proof of the following: a sequence of q linear programming queries on n halfspaces can be answered in O(n log q) time, if q <= n^{alpha_d} for a certain constant alpha_d > 0. Unlike previous methods, our algorithms do not require parametric searching.

Copyright Notice

The documents contained in this directory are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Timothy Chan (Last updated 20 Mar 2017)