We study the time complexity of the discrete k-center problem and related (exact) geometric set cover problems when k or the size of the cover is small. We obtain a plethora of new results:
In SoCG 2022, Conroy and Tóth presented several constructions of sparse, low-hop spanners in geometric intersection graphs, including an O(n log n)-size 3-hop spanner for n disks (or fat convex objects) in the plane, and an O(n log^2 n)-size 3-hop spanner for n axis-aligned rectangles in the plane. Their work left open two major questions: (i) can the size be made closer to linear by allowing larger constant stretch? and (ii) can near-linear size be achieved for more general classes of intersection graphs?
We address both questions simultaneously, by presenting new constructions of constant-hop spanners that have almost linear size and that hold for a much larger class of intersection graphs. More precisely, we prove the existence of an O(1)-hop spanner for arbitrary string graphs with O(n alpha_k(n)) size for any constant k, where alpha-k(n) denotes the k-th function in the inverse Ackermann hierarchy. We similarly prove the existence of an O(1)-hop spanner for intersection graphs of d-dimensional fat objects with O(n alpha_k(n)) size for any constant k and d.
We also improve on some of Conroy and Tóth's specific previous results, in either the number of hops or the size: we describe an O(n log n)-size 2-hop spanner for disks (or more generally objects with linear union complexity) in the plane, and an O(n log n)-size 3-hop spanner for axis-aligned rectangles in the plane. Our proofs are all simple, using separator theorems, recursion, shifted quadtrees, and shallow cuttings.
We present a (combinatorial) algorithm with running time close to O(n^d) for computing the minimum directed L_infinity Hausdorff distance between two sets of n points under translations in any constant dimension d. This substantially improves the best previous time bound near O(n^{5d/4}) by Chew, Dor, Efrat, and Kedem from more than twenty years ago. Our solution is obtained by a new generalization of Chan's algorithm [FOCS'13] for Klee's measure problem.
To complement this algorithmic result, we also prove a nearly matching conditional lower bound close to Omega(n^d) for combinatorial algorithms, under the Combinatorial k-Clique Hypothesis.
In this paper we carefully combine Fredman's trick [SICOMP'76] and Matousek's approach for dominance product [IPL'91] to obtain powerful results in fine-grained complexity:
We consider problems related to finding short cycles, small cliques, small independent sets, and small subgraphs in geometric intersection graphs. We obtain a plethora of new results. For example:
We revisit the classic problem of simplex range searching and related problems in computational geometry. We present a collection of new results which improve previous bounds by multiple logarithmic factors that were caused by the use of multi-level data structures. Highlights include the following:
Given a set of points P and a set of regions O, an \emph{incidence} is a pair (p,o) in P x O such that p is inside o. We obtain a number of new results on a classical question in combinatorial geometry: What is the number of incidences (under certain restrictive conditions)?
We prove a bound of O(kn(log n/loglog n)^{d-1}) on the number of incidences between n points and n axis-parallel boxes in R^d, if no k boxes contain k common points, that is, if the incidence graph between the points and the boxes does not contain K_{k,k} as a subgraph. This new bound improves over previous work, by Basit, Chernikov, Starchenko, Tao, and Tran (2021), by more than a factor of log^d n for d > 2. Furthermore, it matches a lower bound implied by the work of Chazelle (1990), for k=2, thus settling the question for points and boxes.
We also study several other variants of the problem. For halfspaces, using shallow cuttings, we get a linear bound in two and three dimensions. We also present linear (or near linear) bounds for shapes with low union complexity, such as pseudodisks and fat triangles.
The 3SUM hypothesis, the APSP hypothesis and SETH are the three main hypotheses in fine-grained complexity. So far, within the area, the first two hypotheses have mainly been about integer inputs in the Word RAM model of computation. The "Real APSP" and "Real 3SUM" hypotheses, which assert that the APSP and 3SUM hypotheses hold for real-valued inputs in a reasonable version of the Real RAM model, are even more believable than their integer counterparts.
Under the very believable hypothesis that at least one of the Integer 3SUM hypothesis, Integer APSP hypothesis or SETH is true, Abboud, Vassilevska W. and Yu [STOC 2015] showed that a problem called Triangle Collection requires n^{3-o(1)} time on an n-node graph.
Our main result is a nontrivial lower bound for a slight generalization of Triangle Collection, called All-Color-Pairs Triangle Collection, under the even more believable hypothesis that at least one of the Real 3SUM, the Real APSP, and the OV hypotheses is true. Combined with slight modifications of prior reductions, we obtain polynomial conditional lower bounds for problems such as the (static) ST-Max Flow problem and dynamic Max Flow, now under the new weaker hypothesis.
Our main result is built on the following two lines of reductions.
We revisit Hopcroft's problem and related fundamental problems about geometric range searching. Given n points and n lines in the plane, we show how to count the number of point-line incidence pairs or the number of point-above-line pairs in O(n^{4/3}) time, which matches the conjectured lower bound and improves the best previous time bound of n^{4/3}2^{O(log^*n)} obtained almost 30 years ago by Matou�ek.
We describe two interesting and different ways to achieve the result: the first is randomized and uses a new 2D version of fractional cascading for arrangements of lines; the second is deterministic and uses decision trees in a manner inspired by the sorting technique of Fredman (1976). The second approach extends to any constant dimension.
Many consequences follow from these new ideas: for example, we obtain an O(n^{4/3})-time algorithm for line segment intersection counting in the plane, O(n^{4/3})-time randomized algorithms for bichromatic closest pair and Euclidean minimum spanning tree in three or four dimensions, and a randomized data structure for halfplane range counting in the plane with O(n^{4/3}) preprocessing time and space and O(n^{1/3}) query time.
Geometric set cover is a classical problem in computational geometry, which has been extensively studied in the past. In the dynamic version of the problem, points and ranges may be inserted and deleted, and our goal is to efficiently maintain a set cover solution (satisfying certain quality requirement). In this paper, we give a plethora of new dynamic geometric set cover data structures in 1D and 2D, which significantly improve and extend the previous results:
In the colored orthogonal range reporting problem, we want a data structure for storing n colored points so that given a query axis-aligned rectangle, we can report the distinct colors among the points inside the rectangle. This natural problem has been studied in a series of papers, but most prior work focused on the static case. In this paper, we give a dynamic data structure in the 2D case which can answer queries in O(log^{1+o(1)}n + k log^{1/2+o(1)}n) time, where k denotes the output size (the number of distinct colors in the query range), and which can support insertions and deletions in O(log^{2+o(1)}n) time (amortized) in the standard RAM model. This is the first dynamic structure with polylogarithmic update time whose query cost per color reported is sublogarithmic (near sqrt{log n}). We also give an alternative data structure with O(log^{1+o(1)} n + k log^{3/4+o(1)}n) query time and O(log^{3/2+o(1)}n) update time (amortized). We also mention extensions to higher constant dimensions.
Given a graph with n vertices and real edge weights in [0,1], we investigate an approximate version of the standard all-pairs shortest paths (APSP) problem where distances are estimated with additive error at most epsilon. Yuster (2012) introduced this natural variant of approximate APSP, and presented an algorithm for directed graphs running in O~(n^{(3+omega)/2}) < O(n^{2.687}) time for an arbitrarily small constant epsilon > 0, where omega denotes the matrix multiplication exponent. We give a faster algorithm for undirected graphs running in O~(n^{(3+omega^2)/(omega+1)}) < O(n^{2.559}) time for any constant epsilon > 0. If omega = 2, the time bound is O~(n^{7/3}), matching a previous result for undirected graphs by Dor, Halperin, and Zwick (2000) which only guaranteed additive error at most 2.
All-Pairs Shortest Paths (APSP) is one of the most well studied problems in graph algorithms. This paper studies several variants of APSP in unweighted graphs or graphs with small integer weights.
APSP with small integer weights in undirected graphs [Seidel'95, Galil and Margalit'97] has an O~(n^omega) time algorithm, where omega < 2.373 is the matrix multiplication exponent. APSP in directed graphs with small weights however, has a much slower running time that would be Omega(n^{2.5}) even if omega = 2 [Zwick'02]. To understand this n^{2.5} bottleneck, we build a web of reductions around directed unweighted APSP. We show that it is fine-grained equivalent to computing a rectangular Min-Plus product for matrices with integer entries; the dimensions and entry size of the matrices depend on the value of omega. As a consequence, we establish an equivalence between APSP in directed unweighted graphs, APSP in directed graphs with small (O~(1)) integer weights, All-Pairs Longest Paths in DAGs with small weights, c-Red-APSP in undirected graphs with small weights, for any c >= 2 (computing all-pairs shortest path distances among paths that use at most c red edges), #_{<= c}APSP in directed graphs with small weights (counting the number of shortest paths for each vertex pair, up to c), and approximate APSP with additive error c in directed graphs with small weights, for c <= O~(1).
We also provide fine-grained reductions from directed unweighted APSP to All-Pairs Shortest Lightest Paths (APSLP) in undirected graphs with {0,1} weights and #_{mod c}APSP in directed unweighted graphs (computing counts mod c), thus showing that unless the current algorithms for APSP in directed unweighted graphs can be improved substantially, these problems need at least Omega(n^{2.528}) time.
We complement our hardness results with new algorithms. We improve the known algorithms for APSLP in directed graphs with small integer weights (previously studied by Zwick [STOC'99]) and for approximate APSP with sublinear additive error in directed unweighted graphs (previously studied by Roditty and Shapira [ICALP'08]). Our algorithm for approximate APSP with sublinear additive error is optimal, when viewed as a reduction to Min-Plus product. We also give new algorithms for variants of #APSP (such as #_{<= U}APSP and #_{mod U}APSP for U <= n^{O~(1)}) in unweighted graphs, as well as a near-optimal O~(n^3)-time algorithm for the original #APSP problem in unweighted graphs (when counts may be exponentially large). This also implies an O~(n^3)-time algorithm for Betweenness Centrality, improving on the previous O~(n^4) running time for the problem. Our techniques also lead to a simpler alternative to Shoshan and Zwick�s algorithm [FOCS'99] for the original APSP problem in undirected graphs with small integer weights.
We revisit a classical problem in computational geometry: finding the largest-volume axis-aligned empty box (inside a given bounding box) amidst n given points in d dimensions. Previously, the best algorithms known have running time O(n log^2 n) for d = 2 (by Aggarwal and Suri [SoCG'87]) and near n^d for d >= 3. We describe faster algorithms with running time
We study geometric set cover problems in dynamic settings, allowing insertions and deletions of points and objects. We present the first dynamic data structure that can maintain an O(1)-approximation in sublinear update time for set cover for axis-aligned squares in 2D. More precisely, we obtain randomized update time O(n^{2/3+delta}) for an arbitrarily small constant delta > 0. Previously, a dynamic geometric set cover data structure with sublinear update time was known only for unit squares by Agarwal, Chang, Suri, Xiao, and Xue [SoCG 2020]. If only an approximate size of the solution is needed, then we can also obtain sublinear amortized update time for disks in 2D and halfspaces in 3D. As a byproduct, our techniques for dynamic set cover also yield an optimal randomized O(n log n)-time algorithm for static set cover for 2D disks and 3D halfspaces, improving our earlier O(n log n (log log n)^{O(1)}) result [SoCG 2020].
In this paper, we present simple randomized multi-pass streaming algorithms for fundamental computational geometry problems of finding the skyline (maximal) points and the extreme points of the convex hull. For the skyline problem, one of our algorithm occupies O(h) space and performs O(log n) passes, where h is the number of skyline points. This improves the space bound of the currently best known result by Das Sarma, Lall, Nanongkai, and Xu [VLDB'09] by a logarithmic factor. For the extreme points problem, we present the first non-trivial result for any constant dimension greater than two: an O(h log^{O(1)}n) space and O(log^d n) pass algorithm, where h is the number of extreme points. Finally, we argue why randomization seems unavoidable for these problems, by proving lower bounds on the performance of deterministic algorithms for a related problem of finding maximal elements in a poset.
We revisit classical problems about searching in totally monotone and Monge matrices, which have many applications in computational geometry and other areas. We present a number of new results, including the following:
We revisit classical problems about searching in totally monotone matrices, which have many applications in computational geometry and other areas. In a companion paper, we gave new (near-)linear-time algorithms for a number of such problems. In the present paper, we describe new subquadratic results for more basic problems, including the following:
These matrix searching problems are intimately related to problems about arrangements of pseudo-lines. In particular, our selection algorithm implies an O(n^{4/3} polylog n) algorithm for computing incidences between n points and n pseudo-lines in the plane. This improves, extends, and simplifies a previous method by Agarwal and Sharir [SODA'02].
In SODA'99, Chan introduced a simple type of planar straight-line upward order-preserving drawings of binary trees, known as LR drawings: such a drawing is obtained by picking a root-to-leaf path, drawing the path as a straight line, and recursively drawing the subtrees along the paths. Chan proved that any binary tree with n nodes admits an LR drawing with O(n^{0.48}) width. In SODA'17, Frati, Patrignani, and Roselli proved that there exist families of n-node binary trees for which any LR drawing has Omega(n^{0.418}) width. In this note, we improve Chan's upper bound to O(n^{0.437}) and Frati et al.'s lower bound to Omega(n^{0.429}).
Given a set of n integer-valued coin types and a target value t, the well-known change-making problem asks for the minimum number of coins that sum to t, assuming an unlimited number of coins in each type. In the more general all-targets version of the problem, we want the minimum number of coins summing to j, for every j=0,...,t. For example, the textbook dynamic programming algorithms can solve the all-targets problem in O(nt) time. Recently, Chan and He (SOSA'20) described a number of O(t polylog t)-time algorithms for the original (single-target) version of the change-making problem, but not the all-targets version.
In this paper, we obtain a number of new results on change-making and related problems:
We present a number of new results about range searching for colored (or "categorical") data:
Our data structures are designed using a variety of techniques, including colored variants of randomized incremental construction (which may be of independent interest), colored variants of shallow cuttings, and bit-packing tricks.
We improve the running times of O(1)-approximation algorithms for the set cover problem in geometric settings, specifically, covering points by disks in the plane, or covering points by halfspaces in three dimensions. In the unweighted case, Agarwal and Pan [SoCG 2014] gave a randomized O(n log^4 n)-time, O(1)-approximation algorithm, by using variants of the multiplicative weight update (MWU) method combined with geometric data structures. We simplify the data structure requirement in one of their methods and obtain a deterministic O(n log^3 n log log n)-time algorithm. With further new ideas, we obtain a still faster randomized O(n log n (log log n)^O(1))-time algorithm. For the weighted problem, we also give a randomized O(n log^4 n log log n)-time, O(1)-approximation algorithm, by simple modifications to the MWU method and the quasi-uniform sampling technique.
We revisit a fundamental problem in string matching: given a pattern of length m and a text of length n, both over an alphabet of size sigma, compute the Hamming distance (i.e., the number of mismatches) between the pattern and the text at every location. Several randomized (1+eps)-approximation algorithms have been proposed in the literature (e.g., by Karloff (Inf. Proc. Lett., 1993), Indyk (FOCS 1998), and Kopelowitz and Porat (SOSA 2018)), with running time of the form O(eps^{-O(1)} n log n log m), all using fast Fourier transform (FFT). We describe a simple randomized (1+eps)-approximation algorithm that is faster and does not need FFT. Combining our approach with additional ideas leads to numerous new results (all Monte-Carlo randomized) in different settings:
Given a set S of n numbers, the 3SUM problem asks to determine whether there exist three elements a,b,c in S such that a+b+c = 0. The related Convolution-3SUM problem asks to determine whether there exist a pair of indices i,j such that A[i]+A[j] = A[i+j], where A is a given array of nnumbers.
When the numbers are integers, a randomized reduction from 3SUM to Convolution-3SUM was given in a seminal paper by Patrascu [STOC 2010], which was later improved by Kopelowitz, Pettie, and Porat [SODA 2016] with an O(log n) factor slowdown. In this paper, we present a simple deterministic reduction from 3SUM to Convolution-3SUM for integers bounded by U. We also describe additional ideas to obtaining further improved reductions, with only a (loglog n)^{O(1)} factor slowdown in the randomized case, and a (log U)^{O(1)} factor slowdown in the deterministic case.
Given a set of n non-negative integers representing a coin system, the change-making problem seeks the fewest number of coins that sum up to a given value t, where each type of coin can be used an unlimited number of times. This problem is a popular homework exercise in dynamic programming, where the textbook solution runs in O(nt) time.
It is not hard to solve this problem in O(t polylog t) time by using convolution. In this paper, we present a simple deterministic O(t log t loglog t) time algorithm, and later improve the running time to O(t log t) by randomization.
Eppstein (1995) gave a technique to transform any data structure for dynamic nearest neighbor queries into a data structure for dynamic closest pair, for any distance function; the transformation increases the time bound by two logarithmic factors. We present a similar, simple transformation that is just as good, and can avoid the extra logarithmic factors when the query and update time of the given structure exceed n^epsilon for some constant epsilon > 0.
Consequently, in the case of an arbitrary distance function, we obtain an optimal O(n)-space data structure to maintain the dynamic closest pair of n points in O(n) amortized time plus O(n) distance evaluations per update.
We present a deterministic, truly subquadratic algorithm for offline (1+epsilon)-approximate nearest or farthest neighbor search (in particular, the closest pair or diameter problem) in Hamming space in any dimension d < n^delta, for a sufficiently small constant delta > 0. The running time of the algorithm is roughly n^{2-epsilon^{1/2+O(delta)}} for nearest neighbors, or n^{2-Omega(sqrt{epsilon}/log(1/epsilon))} for farthest. The algorithm follows from a simple combination of expander walks, Chebyshev polynomials, and rectangular matrix multiplication.
We also show how to eliminate errors in the previous Monte Carlo randomized algorithm of Alman, Chan, and Williams [FOCS'16] for offline approximate nearest or farthest neighbors, and obtain a Las Vegas randomized algorithm with expected running time n^{2-Omega(epsilon^{1/3}/log(1/epsilon))}.
Finally, we note a simplification of Alman, Chan, and Williams' method and obtain a slightly improved Monte Carlo randomized algorithm with running time n^{2-Omega(epsilon^{1/3}/log^{2/3}(1/epsilon))}.
As one application, we obtain improved deterministic and randomized (1+epsilon)-approximation algorithms for MAX-SAT.
Range searching on categorical, or "colored", data has been studied extensively for over two decades. In this paper, we obtain the current best results for perhaps the most basic, and most often studied, version of the geometric problem: colored orthogonal range reporting.
Given n colored points in two-dimensional space [U]^2, we present a data structure with O(n log^{3/4+epsilon} n) space, for an arbitrarily small constant epsilon > 0, so that all k distinct colors in any axis-aligned query rectangle can be reported in (optimal) O(loglog U + k) time; this is the first method to break the O(n log n) space barrier.
In three dimensions, we present a data structure with O(n log^{9/5+epsilon} n) space and O(log n/loglog n + k) time; this improves the previous space bound of O(n log^4 n).
Range closest-pair (RCP) search is a range-search variant of the classical closest-pair problem, which aims to store a given set S of points into some space-efficient data structure such that when a query range Q is specified, the closest pair in S \cap Q can be reported quickly. RCP search has received attention over years, but the primary focus was only on R^2. In this paper, we study RCP search in higher dimensions. We give the first nontrivial RCP data structures for orthogonal, simplex, halfspace, and ball queries in R^d for any constant d. Furthermore, we prove a conditional lower bound for orthogonal RCP search for d >= 3.
In this paper we study two geometric data structure problems in the special case when input objects or queries are fat rectangles. We show that in this case a significant improvement compared to the general case can be achieved. We describe data structures that answer two- and three-dimensional orthogonal range reporting queries in the case when the query range is a fat rectangle. Our two-dimensional data structure uses O(n) words and supports queries in O(loglog U + k) time, where n is the number of points in the data structure, U is the size of the universe and k is the number of points in the query range. Our three-dimensional data structure needs O(n log^eps U) words of space and answers queries in O(loglog U + k) time. We also consider the rectangle stabbing problem on a set of three-dimensional fat rectangles. Our data structure uses O(n) space and answers stabbing queries in O(log U log log U + k) time.
We present new results on a number of fundamental problems about dynamic geometric data structures:
Given a set of n points in the plane, and a parameter k, we consider the problem of computing the minimum (perimeter or area) axis-aligned rectangle enclosing k points. We present the first near quadratic time algorithm for this problem, improving over the previous near-O(n^{5/2})-time algorithm by Kaplan et al. [ESA 2017]. We provide an almost matching conditional lower bound, under the assumption that (min,+)-convolution cannot be solved in truly subquadratic time. Furthermore, we present a new reduction (for either perimeter or area) that can make the time bound sensitive to k, giving near O(nk) time. We also present a near linear time (1+eps)-approximation algorithm to the minimum area of the optimal rectangle containing k points. In addition, we study related problems including the 3-sided, arbitrarily oriented, weighted, and subset sum versions of the problem.
We consider the problem of computing Shapley values for points in the plane, where each point is interpreted as a player, and the value of a coalition is defined by the area of usual geometric objects, such as the convex hull or the minimum axis-parallel bounding box.
For sets of n points in the plane, we show how to compute in roughly O(n^{3/2}) time the Shapley values for the area of the minimum axis-parallel bounding box and the area of the union of the rectangles spanned by the origin and the input points. When the points form an increasing or decreasing chain, the running time can be improved to near-linear. In all these cases, we use linearity of the Shapley values and algebraic methods.
We also show that Shapley values for the area of the convex hull or the minimum enclosing disk can be computed in O(n^2) and O(n^3) time, respectively. These problems are closely related to the model of stochastic point sets considered in computational geometry, but here we have to consider random insertion orders of the points instead of a probabilistic existence of points.
For any constant d and parameter eps > 0, we show the existence of (roughly) 1/eps^d orderings on the unit cube [0,1)^d, such that any two points p,q in [0,1)^d that are close together under the Euclidean metric are "close together" in one of these linear orderings in the following sense: the only points that could lie between p and q in the ordering are points with Euclidean distance at most eps d(p,q) from p or q. These orderings are extensions of the "Z-order", and they can be efficiently computed.
Functionally, the orderings can be thought of as a replacement to quadtrees and related structures (like well-separated pair decompositions). We use such orderings to obtain surprisingly simple algorithms for a number of basic problems in low-dimensional computational geometry, including (i) dynamic approximate bichromatic closest pair, (ii) dynamic spanners, (iii) dynamic approximate minimum spanning trees, (iv) static and dynamic fault-tolerant spanners, and (v) approximate nearest neighbor search.
We initiate the study of the following natural geometric optimization problem. The input is a set of axis-aligned rectangles in the plane. The objective is to find a set of horizontal line segments of minimum total length so that every rectangle is stabbed by some line segment. A line segment stabs a rectangle if it intersects its left and its right boundary. The problem, which we call Stabbing, can be motivated by a resource allocation problem and has applications in geometric network design. To the best of our knowledge, only special cases of this problem have been considered so far.
Stabbing is a weighted geometric set cover problem, which we show to be NP-hard. A constrained variant of Stabbing turns out to be even APX-hard. While for general set cover the best possible approximation ratio is Theta(log n), it is an important field in geometric approximation algorithms to obtain better ratios for geometric set cover problems. Chan et al. [SODA'12] generalize earlier results by Varadarajan [STOC'10] to obtain sub-logarithmic performances for a broad class of weighted geometric set cover instances that are characterized by having low shallow-cell complexity. The shallow-cell complexity of Stabbing instances, however, can be high so that a direct application of the framework of Chan et al. gives only logarithmic bounds. We still achieve a constant-factor approximation by decomposing general instances into what we call laminar instances that have low enough complexity.
Our decomposition technique yields constant-factor approximations also for the variant where rectangles can be stabbed by horizontal and vertical segments and for two further geometric set cover problems.
For two sequences P and Q of n points in R^d, we compute an approximation to the discrete Fr�chet distance. Our f-approximation algorithm runs in time O(n logn +n^2/f^2), for any f in [1, n/logn] and d=O(1), which improves (and, at the same time, slightly simplifies) the previous O(n logn +n^2/f)-time algorithm by Bringmann and Mulzer [SoCG�15].
In this work, we present a collection of new results on two fundamental problems in geometric data structures: orthogonal point location and rectangle stabbing.
We make progress on a number of open problems concerning the area requirement for drawing trees on a grid. We prove that
For many algorithms dealing with sets of points in the plane, the only relevant information carried by the input is the combinatorial configuration of the points: the orientation of each triple of points in the set (clockwise, counterclockwise, or collinear). This information is called the order type of the point set. In the dual, realizable order types and abstract order types are combinatorial analogues of line arrangements and pseudoline arrangements. Too often in the literature we analyze algorithms in the real-RAM model for simplicity, putting aside the fact that computers as we know them cannot handle arbitrary real numbers without some sort of encoding. Encoding an order type by the integer coordinates of a realizing point set is known to yield doubly exponential coordinates in some cases. Other known encodings can achieve quadratic space or fast orientation queries, but not both. In this contribution, we give a compact encoding for abstract order types that allows efficient query of the orientation of any triple: the encoding uses O(n^2) bits and an orientation query takes O(log n) time in the word-RAM model with word size w >= log n. This encoding is space-optimal for abstract order types. We show how to shorten the encoding to O(n^2 (loglog n)^2 / log n) bits for realizable order types, giving the first subquadratic encoding for those order types with fast orientation queries. We further refine our encoding to attain O(log n / loglog n) query time at the expense of a negligibly larger space requirement. In the realizable case, we show that all those encodings can be computed efficiently. Finally, we generalize our results to the encoding of point configurations in higher dimension.
We present the first near-linear-time (1+epsilon)-approximation algorithm for the diameter of a weighted unit-disk graph of n vertices, running in O(n log^2 n) time, for any constant epsilon > 0, improving the near-O(n^{3/2})-time algorithm of Gao and Zhang [STOC 2003]. Using similar ideas, we can construct a (1+epsilon)-approximate distance oracle for weighted unit-disk graphs with O(1) query time, with a similar improvement in the preprocessing time, from near O(n^{3/2}) to O(n log^3 n). We also obtain new results for a number of other related problems in the weighted unit-disk graph metric, such as the radius and bichromatic closest pair.
As a further application, we use our new distance oracle, along with additional ideas, to solve the (1+epsilon)-approximate all-pairs bounded-leg shortest paths problem for a set of n planar points, with near O(n^{2.579}) preprocessing time, O(n^2 log n) space, and O(log log n) query time, improving thus the near-cubic preprocessing bound by Roditty and Segal [SODA 2007].
We study a longstanding problem in computational geometry: dynamic 2-d orthogonal point location, i.e., vertical ray shooting among n horizontal line segments. We present a data structure achieving O(log n / loglog n) optimal expected query time and O(log^{1/2+epsilon} n) update time (amortized) in the word-RAM model for any constant epsilon > 0, under the assumption that the x-coordinates are integers bounded polynomially in n. This substantially improves previous results of Giyora and Kaplan [SODA 2007] and Blelloch [SODA 2008] with O(log n) query and update time, and of Nekrich (2010) with O(log n / loglog n) query time and O(log^{1+epsilon} n) update time. Our result matches the best known upper bound for simpler problems such as dynamic 2-d dominance range searching.
We also obtain similar bounds for orthogonal line segment intersection reporting queries, vertical ray stabbing, and vertical stabbing-max, improving previous bounds, respectively, of Blelloch [SODA 2008] and Mortensen [SODA 2003], of Tao (2014), and of Agarwal, Arge, and Yi [SODA 2005] and Nekrich [ISAAC 2011].
We revisit the standard 0-1 knapsack problem. The latest polynomial-time approximation scheme by Rhee (2015) with approximation factor 1+epsilon has running time near O~(n+(1/epsilon)^{5/2}) (ignoring polylogarithmic factors), and is randomized. We present a simpler algorithm which achieves the same result and is deterministic.
With more effort, our ideas can actually lead to an improved time bound near O~(n + (1/epsilon)^{12/5}), and still further improvement for small n.
We present an algorithm that solves the 3SUM problem for n real numbers in O((n^2/log^2n) (loglog n)^{O(1)}) time, improving previous solutions by about a logarithmic factor. Our framework for shaving off two logarithmic factors can be applied to other problems, such as (median,+)-convolution/matrix multiplication and algebraic generalizations of 3SUM. We also obtain the first subquadratic results on some 3SUM-hard problems in computational geometry, for example, deciding whether (the interiors of) a constant number of simple polygons have a common intersection.
Let T be an n-node tree of maximum degree 4, and let P be a set of n points in the plane with no two points on the same horizontal or vertical line. It is an open question whether T always has a planar drawing on P such that each edge is drawn as an orthogonal path with one bend (an "L-shaped" edge). By giving new methods for drawing trees, we improve the bounds on the size of the point set P for which such drawings are possible to: O(n^{1.55}) for maximum degree 4 trees; O(n^{1.22}) for maximum degree 3 (binary) trees; and O(n^{1.142}) for perfect binary trees.
Drawing ordered trees with L-shaped edges is harder---we give an example that cannot be done and a bound of O(n log n) points for L-shaped drawings of ordered caterpillars, which contrasts with the known linear bound for unordered caterpillars.
We present an algorithm that computes a (1+epsilon)-approximation of the diameter of a weighted, undirected planar graph of n vertices with non-negative edge lengths in O(n log n (log n + (1/epsilon)^5)) expected time, improving upon the O(n ((1/epsilon)^4 log^4 n + 2^{O(1/epsilon)}))-time algorithm of Weimann and Yuster [ICALP 2013]. Our algorithm makes two improvements over that result: first and foremost, it replaces the exponential dependency on 1/epsilon with a polynomial one, by adapting and specializing Cabello's recent abstract-Voronoi-diagram-based technique [SODA 2017] for approximation purposes; second, it shaves off two logarithmic factors by choosing a better sequence of error parameters during recursion.
Moreover, using similar techniques, we improve the (1+epsilon)-approximate distance oracle of Gu and Xu [ISAAC 2015] by first replacing the exponential dependency on 1/epsilon on the preprocessing time and space with a polynomial one and second removing a logarithmic factor from the preprocessing time.
We address the All-Pairs Shortest Paths (APSP) problem for a number of unweighted, undirected geometric intersection graphs. We present a general reduction of the problem to static, offline intersection searching (specifically detection). As a consequence, we can solve APSP for intersection graphs of n arbitrary disks in O(n^2 log n) time, axis-aligned line segments in O(n^2 loglog n) time, arbitrary line segments in O(n^{7/3} log^{1/3} n) time, d-dimensional axis-aligned boxes in O(n^2 log^{d-1.5} n) time for d >= 2, and d-dimensional axis-aligned unit hypercubes in O(n^2 loglog n) time for d=3 and O(n^2 log^{d-3} n) time for d >= 4.
In addition, we show how to solve the Single-Source Shortest Paths (SSSP) problem in unweighted intersection graphs of axis-aligned line segments in O(n log n) time, by a reduction to dynamic orthogonal point location.
We study a longstanding problem in computational geometry: 2-d dynamic orthogonal range reporting. We present a new data structure achieving O(log n/loglog n + k) optimal query time and O(log^{2/3+o(1)}n) update time (amortized) in the word RAM model, where n is the number of data points and k is the output size. This is the first improvement in over 10 years of Mortensen's previous result [SIAM J. Comput., 2006], which has O(log^{7/8+epsilon}n) update time for an arbitrarily small constant epsilon.
In the case of 3-sided queries, our update time reduces to O(log^{1/2+epsilon}n), improving Wilkinson's previous bound [ESA 2014] of O(log^{2/3+epsilon}n).
We revisit the orthogonal range searching problem and the exact l_infinity nearest neighbor searching problem for a static set of n points when the dimension d is moderately large. We give the first data structure with near linear space that achieves truly sublinear query time when the dimension is any constant multiple of log n. Specifically, the preprocessing time/space is O(n^{1+delta}) for any constant delta > 0, and the expected query time is n^{1 - 1/O(c log c)}. The data structure is simple and is based on a new "augmented, randomized, lopsided" variant of k-d trees. It matches (in fact, slightly improves) the performance of previous combinatorial algorithms that work only in the case of offline queries [Impagliazzo, Lovett, Paturi, and Schneider (2014) and Chan (SODA'15)]. It leads to faster combinatorial algorithms for all-pairs shortest paths in general weighted graphs and rectangular Boolean matrix multiplication.
In the offline case, we show that the problem can be reduced to the Boolean orthogonal vectors problem and thus admits an n^{2 - 1/O(log c)}-time non-combinatorial algorithm [Abboud, Williams, and Yu (SODA'15)]. This reduction is also simple and is based on range trees.
Finally, we use a similar approach to obtain a small improvement to Indyk's data structure [FOCS'98] for approximate l_infinity nearest neighbor search when d=c log n.
We apply the polynomial method---specifically, Chebyshev polynomials---to obtain a number of new results on geometric approximation algorithms in low constant dimensions. For example, we give an algorithm for constructing epsilon-kernels (coresets for approximate width and approximate convex hull) in close to optimal time O(n + (1/epsilon)^{(d-1)/2}), up to a small near-(1/epsilon)^{3/2} factor, for any d-dimensional n-point set. We obtain an improved data structure for Euclidean approximate nearest neighbor search with close to O(n log n + (1/epsilon)^{d/4}n) preprocessing time and O((1/epsilon)^{d/4} log n) query time. We obtain improved approximation algorithms for discrete Voronoi diagrams, diameter, and bichromatic closest pair in the L_s-metric for any even integer constant s > 2. The techniques are general and may have further applications.
In this paper we study the all-pairs shortest paths problem in (unweighted) unit-disk graphs. The previous best solution for this problem required O(n^2 log n) time, by running the O(n log n)-time single-source shortest path algorithm of Cabello and Jejcic (2015) from every source vertex, where n is the number of vertices. We not only manage to eliminate the logarithmic factor, but also obtain the first (slightly) subquadratic algorithm for the problem, running in O(n^2 sqrt{loglog n/log n}) time. Our algorithm computes an implicit representation of all the shortest paths, and, in the same amount of time, can also compute the diameter of the graph.
Given a set P of n moving points in fixed dimension d, where the trajectory of each point is a polynomial of degree bounded by some constant, we present a kinetic data structure (KDS) for maintenance of the closest pair on P. Assuming the closest pair distance is between 1 and Delta over time, our KDS uses O(n log Delta) space and processes O(n^2 beta log Delta log n + n^2 beta log Delta loglog Delta) events, each in worst-case time O(log^2 n + log^2 log Delta). Here, beta is an extremely slow-growing function. The locality of the KDS is O(log n + loglog Delta). Our closest pair KDS supports insertions and deletions of points. An insertion or deletion takes worst-case time O(log Delta log^2 n +log Delta log^2log Delta). Also, we use a similar approach to provide a KDS for the all epsilon-nearest neighbors in R^d. The complexities of the previous KDSs, for both closest pair and all epsilon-nearest neighbors, have polylogarithmic factors, where the number of logs depends on dimension d. Assuming Delta is polynomial in n, our KDSs obtain improvements on the previous KDSs. Our solutions are based on a kinetic clustering on P. Though we use ideas from the previous clustering KDS by Hershberger, we simplify and improve his work.
We design new polynomials for representing threshold functions in three different regimes: probabilistic polynomials of low degree, which need far less randomness than previous constructions, polynomial threshold functions (PTFs) with "nice" threshold behavior and degree almost as low as the probabilistic polynomials, and a new notion of probabilistic PTFs where we combine the above techniques to achieve even lower degree with similar "nice" threshold behavior. Utilizing these polynomial constructions, we design faster algorithms for a variety of problems:
Introduced by Agarwal, Har-Peled, and Varadarajan (2004), an epsilon-kernel of a point set is a coreset that can be used to approximate the width, minimum enclosing cylinder, minimum bounding box, and solve various related geometric optimization problems. Such coresets form one of the most important tools in the design of linear-time approximation algorithms in computational geometry, as well as efficient insertion-only streaming algorithms and dynamic (non-streaming) data structures. In this paper, we continue the theme and explore dynamic streaming algorithms (in the so-called turnstile model).
Andoni and Nguyen [SODA'12] described a dynamic streaming algorithm for maintaining a (1+epsilon)-approximation of the width using O(polylog U) space and update time for a point set in [U]^d for any constant dimension d and any constant epsilon > 0. Their sketch, based on a polynomial method, does not explicitly maintain an epsilon-kernel. We extend their method to maintain an epsilon-kernel, and at the same time reduce some of logarithmic factors. As an application, we obtain the first randomized dynamic streaming algorithm for the width problem (and related geometric optimization problems) that supports k outliers, using poly(k, log U) space and time.
Given a set of geometric objects each associated with a time value, we wish to determine whether a given property is true for a subset of those objects whose time values fall within a query time window. We call such problems time-windowed decision problems, and they have been the subject of much recent attention, for instance studied by Bokal, Cabello, and Eppstein [SoCG 2015]. In this paper, we present new approaches to this class of problems that are conceptually simpler than Bokal et al.'s, and also lead to faster algorithms. For instance, we present algorithms for preprocessing for the time-windowed 2D diameter decision problem in O(n log n) time and the time-windowed 2D convex hull area decision problem in O(n log n) time, improving Bokal et al.'s O(n log^2 n) and O(n log n loglog n) solutions respectively.
Our first approach is to reduce time-windowed decision problems to a generalized range successor problem, which we solve using a novel way to search range trees. Our other approach is to use dynamic data structures directly, taking advantage of a new observation that the total number of combinatorial changes to a planar convex hull is near linear for any FIFO update sequence, in which deletions occur in the same order as insertions. We also apply these approaches to obtain the first O(n polylog n) algorithms for the time-windowed 3D diameter decision and 2D orthogonal segment intersection detection problems.
We show how to solve all-pairs shortest paths on n nodes in deterministic n^3 / 2^{Omega(sqrt{log n})} time, and how to count the pairs of orthogonal vectors among n 0-1 vectors in d = c log n dimensions in deterministic n^{2 - 1/O(log c)} time. These running times essentially match the best known randomized algorithms of (Williams, STOC'14) and (Abboud, Williams, and Yu, SODA 2015) respectively, and the ability to count was open even for randomized algorithms. By reductions, these two results yield faster deterministic algorithms for many other problems. Our techniques can also be used to count k-SAT assignments on n variable formulas in 2^{n - n/O(k)} time, roughly matching the best known running times for detecting satisfiability and resolving an open problem of Santhanam (2013).
A key to our constructions is an efficient way to deterministically simulate certain probabilistic polynomials critical to the algorithms of prior work, applying epsilon-biased sets and modulus-amplifying polynomials.
At SODA'93, Chazelle and Matousek presented a derandomization of Clarkson's sampling-based algorithm for solving linear programs with n constraints and d variables in d^{(7+o(1))d} n deterministic time. The time bound can be improved to d^{(5+o(1))d} n with subsequent work by Bronnimann, Chazelle, and Matousek [FOCS'93]. We first point out a much simpler derandomization of Clarkson's algorithm that avoids epsilon-approximations and runs in d^{(3+o(1))d} n time. We then describe a few additional ideas that eventually improve the deterministic time bound to d^{(1/2+o(1))d} n.
We study the problem of supporting (orthogonal) range selection queries over a set of n points in constant-dimensional space. Under the standard word-RAM model with word size w = Omega(lg n), we present data structures that occupy O(n (lg n/lglg n)^{d-1}) words of space and support d-dimensional range selection queries using O((lg n/lglg n)^d) query time. This improves the best known data structure by a factor of lglg n in query time. To develop our data structures, we generalize the "parallel counting" technique of Brodal, Gfeller, J�rgensen, and Sanders (2011) for one-dimensional range selection to higher dimensions.
As a byproduct, we design data structures to support d-dimensional range counting queries within O(n (log_w n)^{d-2}) words of space and O((log_w n)^{d-1}) query time, for any word size w = Omega(lg n). This improves the best known result of JaJa, Mortensen, and Shi (2004) when lg w >> lglg n.
We describe a fully dynamic linear-space data structure for point location in connected planar subdivisions, or more generally vertical ray shooting among non-intersecting line segments, that supports queries in O(log n (loglog n)^2) time and updates in O(log n loglog n) time. This is the first data structure that achieves close to logarithmic query and update time simultaneously, ignoring loglog n factors. We further show how to reduce the query time to O(log n loglog n) in the RAM model with randomization. Alternatively, the query time can be lowered to O(log n) if the update time is increased to O(log^{1+eps}n) for any constant eps>0, or vice versa.
Given a set of points in any constant dimension, each of which is associated with a time during which that point is active, we design a data structure with O(n log n) space that can find the closest pair of active points within a query interval of time in O(loglog n) time using a quadtree-based approach in the word-RAM model.
Given a set of n moving points in R^d, where each point moves along a linear trajectory at arbitrary but constant velocity, we present an O~(n^{5/3})-time algorithm to compute a (1+epsilon)-factor approximation to the minimum closest pair distance over time, for any constant epsilon>0 and any constant dimension d. This addresses an open problem posed by Gupta, Janardan, and Smid (1996).
More generally, we consider a data structure version of the problem: for any linearly moving query point q, we want a (1+epsilon)-factor approximation to the minimum nearest neighbor distance to q over time. We present a data structure that requires O~(n^{5/3}) space and O~(n^{2/3}) query time, O~(n^5) space and polylogarithmic query time, or O~(n) space and O~(n^{4/5}) query time, for any constant epsilon>0 and any constant dimension d.
We give a fully dynamic data structure for maintaining an approximation of the Hausdorff distance between two point sets in a constant dimension d, a standard problem in computational geometry. Our solution has an approximation factor of 1+epsilon for any constant epsilon>0 and expected update time O(log U/loglog n}). The result of the paper greatly improves over the previous exact method, which required O~(n^{5/6}) time and worked only in a semi-online setting. The model of computation is the word RAM model.
A set of strings, called a string dictionary, is a basic string data structure. The most primitive query, where one seeks the existence of a pattern in the dictionary, is called a lookup query. Approximate lookup queries, i.e., to lookup the existence of a pattern with a bounded number of errors, is a fundamental string problem. Several data structures have been proposed to do so efficiently. Almost all solutions consider a single error, as will this result. Lately, Belazzougui and Venturini (CPM 2013) raised the question whether one can construct efficient indexes that support lookup queries with one error in optimal query time, that is, O(|p|/w + occ), where p is the query, w the machine word-size, and occ the number of occurrences.
Specifically, for the problem of one mismatch and constant alphabet size, we obtain optimal query time. For a dictionary of d strings our proposed index uses O(w d log^{1+eps}d) additional bit space (beyond the dictionary which can be maintained in compressed form). Our results are parameterized for a space-time tradeoff.
We propose more results for the case of lookup queries with one insertion/deletion on dictionaries over a constant sized alphabet. These results are especially effective for large patterns.
Chazelle [FOCS'89] gave a linear-time algorithm to compute the intersection of two convex polyhedra in three dimensions. We present a simpler algorithm to do the same.
We present optimal deterministic algorithms for constructing shallow cuttings in an arrangement of lines in two dimensions or planes in three dimensions. Our results improve the deterministic polynomial-time algorithm of Matousek (1992) and the optimal but randomized algorithm of Ramos (1999). This leads to efficient derandomization of previous algorithms for numerous well-studied problems in computational geometry, including halfspace range reporting in 2-d and 3-d, k nearest neighbors search in 2-d, (<= k)-levels in 3-d, order-k Voronoi diagrams in 2-d, linear programming with k violations in 2-d, dynamic convex hulls in 3-d, dynamic nearest neighbor search in 2-d, convex layers (onion peeling) in 3-d, epsilon-nets for halfspace ranges in 3-d, and more. As a side product we also describe an optimal deterministic algorithm for constructing standard (non-shallow) cuttings in two dimensions, which is arguably simpler than the known optimal algorithms by Matousek (1991) and Chazelle (1993).
We present a collection of new results on problems related to 3SUM, including:
We present a new combinatorial algorithm for Boolean matrix multiplication that runs in O(n^3 (loglog n)^3 / log^3 n) time. This improves the previous combinatorial algorithm by Bansal and Williams [FOCS'09] that runs in O(n^3 (loglog n)^2 / log^{9/4} n) time. Whereas Bansal and Williams' algorithm uses regularity lemmas for graphs, the new algorithm is simple and uses entirely elementary techniques: table lookup, word operations, plus a deceptively straightforward divide-and-conquer.
Our algorithm is in part inspired by a recent result of Impagliazzo, Lovett, Paturi, and Schneider (2014) on a different geometric problem, offline dominance range reporting; we improve their analysis for that problem as well.
We investigate the problem of constructing planar drawings with few bends for two related problems, the partially embedded graph (PEG) problem to extend a straight-line planar drawing of a subgraph to a planar drawing of the whole graph and the simultaneous planarity (SEFE) problem to find planar drawings of two graphs that coincide on shared vertices and edges. In both cases we show that if the required planar drawings exist, then there are planar drawings with a linear number of bends per edge and, in the case of simultaneous planarity, a constant number of crossings between every pair of edges. Our proofs provide efficient algorithms if the combinatorial embedding information about the drawing is given. Our result on partially embedded graph drawing generalizes a classic result of Pach and Wenger showing that any planar graph can be drawn with fixed locations for its vertices and with a linear number of bends per edge.
In the path minimum query problem, we preprocess a tree on n weighted nodes, such that given an arbitrary path, we can locate the node with the smallest weight along this path. We design novel succinct indices for this problem; one of our index structures supports queries in O(alpha(m,n)) time, and occupies O(m) bits of space in addition to the space required for the input tree, where m is an integer greater than or equal to n and alpha(m,n) is the inverse-Ackermann function. These indices give us the first succinct data structures for the path minimum problem, and allow us to obtain new data structures for path reporting queries, which report the nodes along a query path whose weights are within a query range. We achieve three different time/space tradeoffs for path reporting by designing (a) an O(n)-word structure with O(lg^eps n + occ lg^eps n) query time, where occ is the number of nodes reported; (b) an O(n lglg n)-word structure with O(lglg n + occ lglg n) query time; and (c) an O(n lg^eps n)- word structure with O(lglg n + occ) query time. These tradeoffs match the state of the art of two-dimensional orthogonal range reporting queries which can be treated as a special case of path reporting queries. When the number of distinct weights is much smaller than n, we further improve both the query time and the space cost of these three results.
Jumbled indexing is the problem of indexing a text T for queries that ask whether there is a substring of T matching a pattern represented as a Parikh vector, i.e., the vector of frequency counts for each character. Jumbled indexing has garnered a lot of interest in the last four years. There is a naive algorithm that preprocesses all answers in O(n^2 |Sigma|) time allowing quick queries afterwards, and there is another naive algorithm that requires no preprocessing but has O(n log |Sigma|) query time. Despite a tremendous amount of effort there has been little improvement over these running times.
In this paper we provide good reason for this. We show that, under a 3SUM-hardness assumption, jumbled indexing for alphabets of size omega(1) requires Omega(n^{2-epsilon}) preprocessing time or Omega(n^{1-delta}) query time for any epsilon,delta>0. In fact, under a stronger 3SUM-hardness assumption, for any constant alphabet size r >= 3 there exist describable fixed constant epsilon_r and delta_r such that jumbled indexing requires Omega(n^{2-epsilon_r}) preprocessing time or Omega(n^{1-delta_r}) query time.
We revisit a classical problem in computational geometry that has been studied since the 1980s: in the rectangle enclosure problem we want to report all k enclosing pairs of n input rectangles in 2D. We present the first deterministic algorithm that takes O(n log n + k) worst-case time and O(n) space in the word-RAM model. This improves previous deterministic algorithms with O((n log n + k) loglog n) running time. We achieve the result by derandomizing the algorithm of Chan, Larsen and Patrascu [SoCG'11] that attains the same time complexity but in expectation.
The 2D rectangle enclosure problem is related to the offline dominance range reporting problem in 4D, and our result leads to the currently fastest deterministic algorithm for offline dominance reporting in any constant dimension d >= 4.
A key tool behind Chan et al.'s previous randomized algorithm is shallow cuttings for 3D dominance ranges. Recently, Afshani and Tsakalidis [SODA'14] obtained a deterministic O(n log n)-time algorithm to construct such cuttings. We first present an improved deterministic construction algorithm that runs in O(n loglog n) time in the word-RAM; this result is of independent interest. Many additional ideas are then incorporated, including a linear-time algorithm for merging shallow cuttings and an algorithm for an offline tree point location problem.
Recently, Arya, da Fonseca, and Mount [STOC 2011, SODA 2012] made notable progress in improving the epsilon-dependencies in the space/query-time tradeoffs for (1+epsilon)-factor approximate nearest neighbor search in fixed-dimensional Euclidean spaces. However, epsilon-dependencies in the preprocessing time were not considered, and so their data structures cannot be used to derive faster algorithms for offline proximity problems. Known algorithms for many such problems, including approximate bichromatic closest pair (BCP) and approximate Euclidean minimum spanning trees (EMST), typically have factors near (1/epsilon)^{d/2 +/- O(1)} in the running time when the dimension d is a constant.
We describe a technique that breaks the (1/epsilon)^{d/2} barrier and yields new results for many well-known proximity problems, including:
The improvement arises from a new time bound for exact "discrete Voronoi diagrams", which were previously used in the construction of epsilon-kernels (or extent-based coresets), a well-known tool for another class of fundamental problems. This connection leads to more results, including:
Many standard problems in computational geometry have been solved asymptotically optimally as far as comparison-based algorithms are concerned, but there has been little work focusing on improving the constant factors hidden in big-Oh bounds on the number of comparisons needed. In this paper, we consider orthogonal-type problems and present a number of results that achieve optimality in the constant factors of the leading terms, including:
Our algorithms and data structures use a variety of techniques, including Seidel and Adamy's planar point location method, weighted binary search, and height-optimal BSP trees.
We consider the classical selection and sorting problems in a model where the initial permutation of the input has to be restored after completing the computation. While the requirement of the restoration is stringent compared to the classical versions of the problems, this model is more relaxed than a read-only memory where the input elements are not allowed to be moved within the input array.
We first show that for a sequence of n integers, selection (finding the median or more generally the k-th smallest element for a given k) can be done in O(n) time using O(lg n) words of extra space in this model. In contrast, no linear-time selection algorithm is known which uses polylogarithmic space in the read-only memory model.
For sorting n integers in this model, we first present an O(n lg n)-time algorithm using O(lg n) words of extra space. When the universe size U is polynomial in n, we give a faster O(n)-time algorithm (analogous to radix sort) which uses O(n^eps) words of extra space for an arbitrarily small constant eps>0. More generally, we show how to match the time bound of any word-RAM integer-sorting algorithm using O(n^eps) words of extra space. In sharp contrast, there is an Omega(n^2/S)-time lower bound for integer sorting using O(S) bits of space in the read-only memory model. Extension of our results to arbitrary input types beyond integers is not possible: for "indivisible" input elements, we can prove the same Omega(n^2/S) lower bound for sorting in our model.
En route, we develop linear-time in-place algorithms to extract leading bits of the input array and to compress and decompress strings with low entropy; these techniques may be of independent interest.
Starting with Munro and Paterson (1980), the selection or median-finding problem has been extensively studied in the read-only memory model and in streaming models. Munro and Paterson's deterministic algorithm and its subsequent refinements require at least polylogarithmic or logarithmic space, whereas the algorithms by Munro and Raman (1996) and Raman and Ramnath (1999) can be made to use just O(1) storage cells but take O(n^{1+eps}) time for an arbitrarily small constant eps>0.
In this paper, we show that faster selection algorithms in read-only memory are possible if the input is a sequence of integers. For example, one algorithm uses O(1) storage cells and takes O(n lg U) time where U is the universe size. Another algorithm uses O(1) storage cells and takes O(n lg n lglg U) time. We also describe an O(n)-time algorithm for finding an approximate median using O(lg^eps U) storage cells.
All our algorithms are simple and deterministic. Interestingly, one of our algorithms works by making multiple calls to the textbook algorithm to find the majority of a sequence of bits. This is to find the `centroid' of the trie of the binary representation of the sequence of integers. This technique could be of independent interest.
We consider the problem of finding a planar embedding of a graph at fixed vertex locations that minimizes the total edge length. The problem is known to be NP-hard. We give polynomial time algorithms achieving an O(n^{1/2} log n) approximation for paths and matchings, and an O(n) approximation for general graphs.
We present a new algorithm for a classic problem in computational geometry, Klee's measure problem: given a set of n axis-parallel boxes in d-dimensional space, compute the volume of the union of the boxes. The algorithm runs in O(n^{d/2}) time for any constant d >= 3. Although it improves the previous best algorithm by "just" an iterated logarithmic factor, the real surprise lies in the simplicity of the new algorithm.
We also show that it is theoretically possible to beat the O(n^{d/2}) time bound by logarithmic factors for integer input in the word RAM model, and for other variants of the problem.
With additional work, we obtain an O(n^{d/3} polylog n)-time algorithm for the important special case of orthants or unit hypercubes (which include the so-called "hypervolume indicator problem"), and an O(n^{(d+1)/3} polylog n)-time algorithm for the case of arbitrary hypercubes or fat boxes, improving a previous O(n^{(d+2)/3})-time algorithm by Bringmann.
Given a set P of n points in R^d, where each point p of P is associated with a weight w(p) (positive or negative), the Maximum-Weight Box problem consists in finding an axis-aligned box B maximizing the sum of w(p) over all points p in B. We describe algorithms for this problem in two dimensions that run in the worst case in O(n^2) time, and much less on more specific classes of instances. In particular, these results imply similar ones for the Maximum Bichromatic Discrepancy Box problem. These improve by a factor of log n on the best worst-case complexity previously known for these problems, O(n^2 lg n) [Cortes et al., J. Alg., 2009; Dobkin et al., J. Comput. Syst. Sci., 1996].
We study a geometric version of the Red-Blue Set Cover problem originally proposed by Carr, Doddi, Konjevod, and Marathe (SODA 2000): given a red point set, a blue point set, and a set of objects, we want to use objects to cover all the blue points, while minimizing the number of red points covered. We prove that the problem is NP-hard even when the objects are unit squares in 2D, and we give the first PTAS for this case. The technique we use simplifies and unifies previous PTASes for the weighted geometric set cover problem and the unique maximum coverage problem for 2D unit squares.
One advantage of smart grids is that they can reduce the peak load by distributing electricity-demands over multiple short intervals. Finding a schedule that minimizes the peak load corresponds to a variant of a strip packing problem. Normally, for strip packing problems, a given set of axis-aligned rectangles must be packed into a fixed-width strip, and the goal is to minimize the height of the strip. The electricity-allocation application can be modelled as strip packing with slicing: each rectangle may be cut vertically into multiple slices and the slices may be packed into the strip as individual pieces. The stacking constraint forbids solutions in which a vertical line intersects two slices of the same rectangle.
We give a fully polynomial time approximation scheme for this problem, as well as a practical polynomial time algorithm that slices each rectangle at most once and yields a solution of height at most 5/3 times the optimal height.
In 1944, Cairns proved the following theorem: given any two straight-line planar drawings of a triangulation with the same outer face, there exists a morph (i.e., a continuous transformation) between the two drawings so that the drawing remains straight-line planar at all times. Cairns's original proof required exponentially many morphing steps. We prove that there is a morph that consists of O(n^2) steps, where each step is a linear morph that moves each vertex at constant speed along a straight line. Using a known result on compatible triangulations this implies that for a general planar graph G and any two straight-line planar drawings of G with the same embedding, there is a morph between the two drawings that preserves straight-line planarity and consists of O(n^4) steps.
We present three new results on one of the most basic problems in geometric data structures, 2-D orthogonal range counting. All the results are in the w-bit word RAM model.
In this paper we introduce self-approaching graph drawings. A straight-line drawing of a graph is self-approaching if, for any origin vertex s and any destination vertex t, there is an st-path in the graph such that, for any point q on the path, as a point p moves continuously along the path from the origin to q, the Euclidean distance from p to q is always decreasing. This is a more stringent condition than a greedy drawing (where only the distance between vertices on the path and the destination vertex must decrease), and guarantees that the drawing is a 5.33-spanner.
We study three topics: (1) recognizing self-approaching drawings; (2) constructing self-approaching drawings of a given graph; (3) constructing a self-approaching Steiner network connecting a given set of points.
We show that: (1) there are efficient algorithms to test if a polygonal path is self-approaching in R^2 and R^3, but it is NP-hard to test if a given graph drawing in R^3 has a self-approaching uv-path; (2) we can characterize the trees that have self-approaching drawings; (3) for any given set of terminal points in the plane, we can find a linear sized network that has a self-approaching path between any ordered pair of terminals.
We consider range queries in arrays that search for low-frequency elements: least frequent elements and alpha-minorities. An alpha-minority of a query range has multiplicity no greater than an alpha fraction of the elements in the range. Our data structure for the least frequent element range query problem requires O(n) space, O(n^{3/2}) preprocessing time, and O(sqrt{n}) query time. A reduction from boolean matrix multiplication to this problem shows the hardness of simultaneous improvements in both preprocessing time and query time. Our data structure for the alpha-minority range query problem requires O(n) space and O(1/alpha) query time, and allows alpha to be specified at query time.
In the conflict-free coloring problem, for a given range space, we want to bound the minimum value F(n) such that every set P of n points can be colored with F(n) colors with the property that every nonempty range contains a unique color. We prove a new upper bound O(n^{0.368}) with respect to orthogonal ranges in two dimensions (i.e., axis-parallel rectangles), which is the first improvement over the previous bound O(n^{0.382}) by Ajwani, Elbassioni, Govindarajan, and Ray [SPAA'07]. This result leads to an O(n^{1-0.632/2^{d-2}}) upper bound with respect to orthogonal ranges (boxes) in dimension d, and also an O(n^{1-0.632/(2^{d-3}-0.368)}) upper bound with respect to dominance ranges (orthants) in dimension d >= 4.
We also observe that combinatorial results on conflict-free coloring can be applied to the analysis of approximation algorithms for discrete versions of geometric independent set problems. Here, given a set P of (weighted) points and a set S of ranges, we want to select the largest(-weight) subset Q of P with the property that every range of S contains at most one point of Q. We obtain, for example, a randomized O(n^{0.368})-approximation algorithm for this problem with respect to orthogonal ranges in the plane.
A mode of a multiset S is an element a in S of maximum multiplicity; that is, a occurs at least as frequently as any other element in S. Given an array A[1:n] of n elements, we consider a basic problem: constructing a static data structure that efficiently answers range mode queries on A. Each query consists of an input pair of indices (i, j) for which a mode of A[i:j] must be returned. The best previous data structure with linear space, by Krizanc, Morin, and Smid (ISAAC 2003), requires O(sqrt(n) loglog n) query time. We improve their result and present an O(n)-space data structure that supports range mode queries in O(sqrt(n / log n)) worst-case time. Furthermore, we present strong evidence that a query time significantly below sqrt(n) cannot be achieved by purely combinatorial techniques; we show that boolean matrix multiplication of two sqrt(n) by sqrt(n) matrices reduces to n range mode queries in an array of size O(n). Additionally, we give linear-space data structures for orthogonal range mode in higher dimensions (queries in near O(n^(1-1/2d)) time) and for halfspace range mode in higher dimensions (queries in O(n^(1-1/d^2)) time).
The minimum-weight set cover problem is widely known to be O(log n)-approximable, with no improvement possible in the general case. We take the approach of exploiting problem structure to achieve better results, by providing a geometry-inspired algorithm whose approximation guarantee depends solely on an instance-specific combinatorial property known as shallow cell complexity (SCC). Roughly speaking, a set cover instance has low SCC if any column-induced submatrix of the corresponding element-set incidence matrix has few distinct rows. By adapting and improving Varadarajan's recent quasi-uniform random sampling method for weighted geometric covering problems, we obtain strong approximation algorithms for a structurally rich class of weighted covering problems with low SCC.
Our main result has several immediate consequences. Among them, we settle an open question of Chakrabarty et al. by showing that weighted instances of the capacitated covering problem with underlying network structure have O(1)-approximations. Additionally, our improvements to Varadarajan's sampling framework yield several new results for weighted geometric set cover, hitting set, and dominating set problems. In particular, for weighted covering problems exhibiting linear (or near-linear) union complexity, we obtain approximability results agreeing with those known for the unweighted case. For example, we obtain a constant approximation for the weighted disk cover problem, improving upon the 2^{O(log* n)}-approximation known prior to our work and matching the O(1)-approximation known for the unweighted variant. We also obtain an O(log log* n)-approximation for weighted fat triangle cover.
We study several geometric set cover problems in which the goal is to compute a minimum cover of a given set of points in Euclidean space by a family of geometric objects. We give a short proof that this problem is APX-hard when the objects are axis-aligned fat rectangles, even when each rectangle is an epsilon-perturbed copy of a single unit square. We extend this result to several other classes of objects including almost-circular ellipses, axis-aligned slabs, downward shadows of line segments, downward shadows of graphs of cubic functions, 3-dimensional unit balls, and axis-aligned cubes, as well as some related hitting set problems. Our hardness results are all proven by encoding a highly structured minimum vertex cover problem which we believe may be of independent interest.
In contrast, we give a polynomial-time dynamic programming algorithm for 2-dimensional set cover where the objects are pseudodisks containing the origin or are downward shadows of pairwise 2-intersecting x-monotone curves. Our algorithm extends to the weighted case where a minimum-cost cover is required.
We give an algorithm for bichromatic line segment intersection counting that runs in O(n sqrt{log n}) time under the word RAM model via a reduction to dynamic predecessor search, offline point location, and offline dynamic ranking. This algorithm is the first to solve bichromatic line segment intersection counting in o(n log n) time.
At SODA'10, Agarwal and Sharathkumar presented a streaming algorithm for approximating the minimum enclosing ball of a set of points in d-dimensional Euclidean space. Their algorithm requires one pass, uses O(d) space, and was shown to have approximation factor at most (1+sqrt{3})/2 + eps ~ 1.3661. We prove that the same algorithm has approximation factor less than 1.22, which brings us much closer to a (1+sqrt{2})/2 ~ 1.207 lower bound given by Agarwal and Sharathkumar.
We also apply this technique to the dynamic version of the minimum enclosing ball problem (in the non-streaming setting). We give an O(dn)-space data structure that can maintain a 1.22-approximate minimum enclosing ball in O(d log n) expected amortized time per insertion/deletion.
Given a (master) set M of n points in d-dimensional Euclidean space, consider drawing a random subset that includes each point m_i in M with an independent probability p_i. How difficult is it to compute elementary statistics about the closest pair of points in such a subset? For instance, what is the probability that the distance between the closest pair of points in the random subset is no more than l, for a given value l? Or, can we preprocess the master set M such that given a query point q, we can efficiently estimate the expected distance from q to its nearest neighbor in the random subset? These basic computational geometry problems, whose complexity is quite well-understood in the deterministic setting, prove to be surprisingly hard in our stochastic setting. We obtain hardness results and approximation algorithms for stochastic problems of this kind.
We present a number of new results on one of the most extensively studied topics in computational geometry, orthogonal range searching. All our results are in the standard word RAM model:
The most recent previous development on (a) was reported back in SoCG'95 by Gupta, Janardan, Smid, and Dasgupta, whose main result was an O([n lg n + k] lglg n) algorithm. The best previous result on (b) was an O(n lg n lglg n) algorithm due to Gabow, Bentley, and Tarjan---from STOC'84! As a consequence, we also obtain the current-record time bound for the maxima problem in all constant dimensions above 4.
We present three results related to dynamic convex hulls:
We study the complexity of geometric minimum spanning trees under a stochastic model of input: Suppose we are given a master set of points {s_1,s_2,...,s_n} in d-dimensional Euclidean space, where each point s_i is active with some independent and arbitrary but known probability p_i. We want to compute the expected length of the minimum spanning tree (MST) of the active points. This particular form of stochastic problems has not been investigated before in computational geometry to our knowledge, and is motivated by uncertainty inherent in many sources of geometric data.
We answer a basic data structuring question (for example, raised by Dietz and Raman back in SODA 1991): can van Emde Boas trees be made persistent, without changing their asymptotic query/update time? We present a (partially) persistent data structure that supports predecessor search in a set of integers in {1,...,U} under an arbitrary sequence of n insertions and deletions, with O(loglog U) expected query time and expected amortized update time, and O(n) space. The query bound is optimal in U for linear-space structures and improves previous near-O((loglog U)^2) methods.
The same method solves a fundamental problem from computational geometry: point location in orthogonal planar subdivisions (where edges are vertical or horizontal). We obtain the first static data structure achieving O(loglog U) worst-case query time and linear space. This result is again optimal in U for linear-space structures and improves the previous O((loglog U)^2) method by de Berg, Snoeyink, and van Kreveld (1992). The same result also holds for higher-dimensional subdivisions that are orthogonal binary space partitions, and for certain nonorthogonal planar subdivisions such as triangulations without small angles. Many geometric applications follow, including improved query times for orthogonal range reporting for dimensions >= 3 on the RAM.
Our key technique is an interesting new van-Emde-Boas-style recursion that alternates between two strategies, both quite simple.
We revisit one of the most fundamental classes of data structure problems in computational geometry: range searching. Back in SoCG'92, Matousek gave a partition tree method for d-dimensional simplex range searching achieving O(n) space and O(n^{1-1/d}) query time. Although this method is generally believed to be optimal, it is complicated and requires O(n^{1+eps}) preprocessing time for any fixed eps > 0. An earlier method by Matousek (SoCG'91) requires O(n log n) preprocessing time but O(n^{1-1/d} polylog n) query time. We give a new method that achieves simultaneously O(n log n) preprocessing time, O(n) space, and O(n^{1-1/d}) query time with high probability. Our method has several advantages:
We give an O(n sqrt{lg n})-time algorithm for counting the number of inversions in a permutation on n elements. This improves a long-standing previous bound of O(n lg n/lg lg n) that followed from Dietz's data structure [WADS'89], and answers a question of Andersson and Petersson [SODA'95]. As Dietz's result is known to be optimal for the related dynamic rank problem, our result demonstrates a significant improvement in the offline setting. Our new technique is quite simple: we perform a "vertical partitioning" of a trie (akin to van Emde Boas trees), and use ideas from external memory. However, the technique finds numerous applications: for example, we obtain
As a bonus, we also give a simple (1+epsilon)-approximation algorithm for counting inversions that runs in linear time, improving the previous O(n lg lg n) bound by Andersson and Petersson.
We prove the existence of an algorithm A for computing 2-d or 3-d convex hulls that is optimal for every point set in the following sense: for every set S of n points and for every algorithm A' in a certain class C, the maximum running time of A on input s_1,...,s_n is at most a constant factor times the maximum running time of A' on s_1,...,s_n, where the maximum is taken over all permutations s_1,...,s_n of S. In fact, we can establish a stronger property: for every S and A', the maximum running time of A is at most a constant factor times the average running time of A' over all permutations of S. We call algorithms satisfying these properties instance-optimal in the order-oblivious and random-order setting. Such instance-optimal algorithms simultaneously subsume output-sensitive algorithms and distribution-dependent average-case algorithms, and all algorithms that do not take advantage of the order of the input or that assume the input is given in a random order.
The class C under consideration consists of all algorithms in a decision tree model where the tests involve only multilinear functions with a constant number of arguments. To establish an instance-specific lower bound, we deviate from traditional Ben-Or-style proofs and adopt an interesting adversary argument. For 2-d convex hulls, we prove that a version of the well known algorithm by Kirkpatrick and Seidel (1986) or Chan, Snoeyink, and Yap (1995) already attains this lower bound. For 3-d convex hulls, we propose a new algorithm.
To demonstrate the potential of the concept, we further obtain instance-optimal results for a few other standard problems in computational geometry, such as maxima in 2-d and 3-d, orthogonal line segment intersection in 2-d, finding bichromatic L_infty-close pairs in 2-d, off-line orthogonal range searching in 2-d, off-line dominance reporting in 2-d and 3-d, off-line halfspace range reporting in 2-d and 3-d, and off-line point location in 2-d.
We present approximation algorithms for maximum independent set of pseudo-disks in the plane, both in the weighted and unweighted cases. For the unweighted case, we prove that a local search algorithm yields a PTAS. For the weighted case, we suggest a novel rounding scheme based on an LP relaxation of the problem, that leads to a constant-factor approximation.
Most previous algorithms for maximum independent set (in geometric settings) relied on packing arguments that are not applicable in this case. As such, the analysis of both algorithms requires some new combinatorial ideas, which we believe to be of independent interest.
We describe the first optimal randomized in-place algorithm for the basic 3-d convex hull problem (and, in particular, for 2-d Voronoi diagrams). The algorithm runs in O(n log n) expected time using only O(1) extra space; this improves the previous O(n log^3 n) bound by Bronnimann, Chan, and Chen [SoCG'04]. The same approach leads to an optimal randomized in-place algorithm for the 2-d line segment intersection problem, with O(n log n + K) expected running time for output size K, improving the previous O(n log^2 n + K) bound by Vahrenhold [WADS'05]. As a bonus, we also point out a simplification of a known optimal cache-oblivious (non-in-place) algorithm by Kumar and Ramos (2002) for 3-d convex hulls, and observe its applicability to 2-d segment intersection, extending a recent result for red/blue segment intersection by Arge, Molhave, and Zeh [ESA'08]. Our results are all obtained by standard random sampling techniques, with some interesting twists.
This note describes a data structure that has the same theoretical performance as Fibonacci heaps, supporting decrease-key operations in O(1) amortized time and delete-min operations in O(log n) amortized time. The data structure is simple to explain and analyze, and may be of pedagogical value.
We establish the first nontrivial lower bounds on time-space tradeoffs for the selection problem. We prove that any comparison-based randomized algorithm for finding the median requires Omega(n log log_S n) expected time in the RAM model (or more generally in the comparison branching program model), if we have S bits of extra space besides the read-only input array. This bound is tight for all S >> log n, and remains true even if the array is given in a random order. Our result thus answers a 16-year-old question of Munro and Raman, and also complements recent lower bounds that are restricted to sequential access, as in the multi-pass streaming model [Chakrabarti et al., SODA 2008].
We also prove that any comparison-based, deterministic, multi-pass streaming algorithm for finding the median requires Omega(n log^* (n/s) + n log_s n) worst-case time (in scanning plus comparisons), if we have s cells of space. This bound is also tight for all s >> log^2 n. We get deterministic lower bounds for I/O-efficient algorithms as well.
All proofs in this paper involve "elementary" techniques only.
We give the first optimal solution to a standard problem in computational geometry: three-dimensional halfspace range reporting. We show that n points in 3-d can be stored in a linear-space data structure so that all k points inside a query halfspace can be reported in O(log n + k) time. The data structure can be built in O(n log n) expected time. The previous methods with optimal query time required superlinear (O(n log log n)) space.
We also mention consequences, for example, to higher dimensions and to external-memory data structures. As an aside, we partially answer another open question concerning the crossing number in Matousek's shallow partition theorem in the 3-d case (a tool used in many known halfspace range reporting methods).
Dynamic connectivity is a well-studied problem, but so far the most compelling progress has been confined to the edge-update model: maintain an understanding of connectivity in an undirected graph, subject to edge insertions and deletions. In this paper, we study two more challenging, yet equally fundamental problems:
Subgraph connectivity asks to maintain an understanding of connectivity under vertex updates: updates can turn vertices on and off, and queries refer to the subgraph induced by on vertices. (For instance, this is closer to applications in networks of routers, where node faults may occur.) We describe a data structure supporting vertex updates in O~(m^{2/3}) amortized time, where m denotes the number of edges in the graph. This greatly improves over the previous result [STOC'02], which required fast matrix multiplication and had an update time of O(m^{0.94}). The new data structure is also simpler.
Geometric connectivity asks to maintain a dynamic set of n geometric objects, and query connectivity in their intersection graph. (For instance, the intersection graph of balls describes connectivity in a network of sensors with bounded transmission radius.) Previously, nontrivial fully dynamic results were known only for special cases like axis-parallel line segments and rectangles. We provide similarly improved update times, O~(n^{2/3}), for these special cases. Moreover, we show how to obtain sublinear update bounds for virtually all families of geometric objects which allow sublinear-time range queries. In particular, we obtain the first sublinear update time for arbitrary 2D line segments: O~(n^{9/10}); for d-dimensional simplices: O~(n^{1-1/(d(2d+1))}); and for d-dimensional balls: O~(n^{1-1/((d+1)(2d+3))}).
Given n axis-parallel boxes in a fixed dimension d >= 3, how efficiently can we compute the volume of the union? This standard problem in computational geometry, commonly referred to as Klee's measure problem, can be solved in time O(n^{d/2} log n) by an algorithm of Overmars and Yap (FOCS 1988). We give the first (albeit small) improvement: our new algorithm runs in time n^{d/2} 2^{O(log* n)}, where log* denotes the iterated logarithm.
For the related problem of computing the depth in an arrangement of n boxes, we further improve the time bound to near O(n^{d/2} / log^{d/2-1} n), ignoring log log n factors. Other applications and lower-bound possibilities are discussed. The ideas behind the improved algorithms are simple.
We give a dynamic data structure that can maintain an epsilon-coreset of n points, with respect to the extent measure, in O(log n) time for any constant epsilon > 0 and any constant dimension. The previous method by Agarwal, Har-Peled, and Varadarajan requires polylogarithmic update time. For points with integer coordinates bounded by U, we alternatively get O(log log U) time. Numerous applications follow, for example, on dynamically approximating the width, smallest enclosing cylinder, minimum bounding box, or minimum-width annulus. We can also use the same approach to maintain approximate k-centers in O(min{log n, log log U}) randomized amortized time for any constant k and any constant dimension. For the smallest enclosing cylinder problem, we also show that a constant-factor approximation can be maintained in O(1) randomized amortized time on the word RAM.
We revisit the problem of bounding the combinatorial complexity of the k-level in a two-dimensional arrangement of n curves. We give a number of small improvements over the results from the author's previous paper (FOCS'03). For example:
Given a point set in a fixed dimension, we note that a well-separated pair decomposition can be found in linear time if we assume that the ratio of the farthest pair distance to the closest pair distance is polynomially bounded. Many consequences follow; for example, we can construct spanners or solve the all-nearest-neighbors problem in linear time (under the same assumption), and we compute an approximate Euclidean minimum spanning tree in linear time (without any assumption).
We revisit a classic problem in computational geometry: preprocessing a planar n-point set to answer nearest neighbor queries. In SoCG 2004, Bronnimann, Chan, and Chen showed that it is possible to design an efficient data structure that takes no extra space at all other than the input array holding a permutation of the points. The best query time known for such "in-place data structures" is O(log^2 n). In this paper, we break the O(log^2 n) barrier by providing a method that answers nearest neighbor queries in time
O((log n)^{log_{3/2} 2} loglog n) = O(log^{1.71} n).The new method uses divide-and-conquer (based on planar separators) in a way that is quite unlike traditional point location methods, and extends previous 1-d data structuring techniques (specifically the van Emde Boas layout). The method has further applications, for example, in answering extreme point queries for a 3-d point set on the boundary of a convex set of constant complexity.
We study a bichromatic version of the well-known k-set problem: given two sets R and B of points of total size n and an integer k, how many subsets of the form (R\cap h) \cup (B - h) can have size exactly k over all halfspaces h? In the dual, the problem is asymptotically equivalent to determining the worst-case combinatorial complexity of the k-level in an arrangement of n halfspaces.
Disproving a conjecture by Linhart (1993), we present the first nontrivial upper bound for all k << n in two dimensions: O(nk^{1/3} + n^{5/6-e}k^{2/3+2e} + k^2) for any fixed e > 0. In three dimensions, we obtain the bound O(nk^{3/2} + n^{0.5034}k^{2.4932} + k^3). Incidentally, this also implies a new upper bound for the original k-set problem in four dimensions: O(n^2k^{3/2} + n^{1.5034}k^{2.4932} + nk^3), which improves the best previous result for all k << n^{0.923}. Extensions to other cases, such as arrangements of disks, are also discussed.
We reexamine fundamental problems from computational geometry in the word RAM model, where input coordinates are integers that fit in a machine word. We develop a new algorithm for offline point location, a two-dimensional analog of sorting where one needs to order points with respect to segments. This result implies, for example, that the Voronoi diagram of n points in the plane can be constructed in (randomized) time n . 2^{O(\sqrt{lg lg n})}. Similar bounds hold for numerous other geometric problems, such as three-dimensional convex hulls, planar Euclidean minimum spanning trees, line segment intersection, and triangulation of non-simple polygons.
In FOCS'06, we developed a data structure for online point location, which implied a bound of O(n lg n / lg lg n) for Voronoi diagrams and the other problems. Our current bounds are dramatically better, and a convincing improvement over the classic O(n lg n) algorithms. As in the field of integer sorting, the main challenge is to find ways to manipulate information, while avoiding the online problem (in that case, predecessor search).
In the first part of the paper, we reexamine the all-pairs shortest paths (APSP) problem and present a new algorithm with running time approaching O(n^3 log^3log n / log^2 n), which improves all known algorithms for general real-weighted dense graphs.
In the second part of the paper, we use fast matrix multiplication to obtain truly subcubic APSP algorithms for a large class of "geometrically weighted" graphs, where the weight of an edge is a function of the coordinates of its vertices. For example, for graphs embedded in Euclidean space of a constant dimension d, we obtain a time bound near O(n^{3-(3-w)/(2d+4)}), where w < 2.376; in two dimensions, this is O(n^{2.922}). Our framework greatly extends the previously considered case of small-integer-weighted graphs, and incidentally also yields the first truly subcubic result (near O(n^{3-(3-w)/4}) = O(n^{2.844}) time) for APSP in real-vertex-weighted graphs, as well as an improved result (near O(n^{(3+w)/2}) = O(n^{2.688}) time) for the all-pairs lightest shortest path problem for small-integer-weighted graphs.
Improving previous methods by Aronov and Har-Peled (SODA'05) and Kaplan and Sharir (SODA'06), we present a randomized data structure of O(n) expected size which can answer 3D approximate halfspace range counting queries in O(log (n/k)) expected time, where k is the actual value of the count. This is the first optimal method for the problem in the standard decision tree model; moreover, unlike previous methods, the new method is Las Vegas instead of Monte Carlo. In addition, we describe new results for several related problems, including approximate Tukey depth queries in 3D, approximate regression depth queries in 2D, and approximate linear programming with violations in low dimensions.
Given a planar subdivision whose coordinates are integers bounded by U <= 2^w, we present a linear-space data structure that can answer point location queries in O(min{ lg n/lglg n, sqrt{lg U/lglg U} }) time on the unit-cost RAM with word size w. This is the first result to beat the standard Theta(lg n) bound for infinite precision models.
As a consequence, we obtain the first o(n lg n) (randomized) algorithms for many fundamental problems in computational geometry for arbitrary integer input on the word RAM, including: constructing the convex hull of a three-dimensional point set, computing the Voronoi diagram or the Euclidean minimum spanning tree of a planar point set, triangulating a polygon with holes, and finding intersections among a set of line segments. Higher-dimensional extensions and applications are also discussed.
Though computational geometry with bounded precision input has been investigated for a long time, improvements have been limited largely to problems of an orthogonal flavor. Our results surpass this long-standing limitation, answering, for example, a question of Willard (SODA'92).
In this paper, we consider the online version of the following problem: partition a set of input points into subsets, each enclosable by a unit ball, so as to minimize the number of subsets used. In the one-dimensional case, we show that surprisingly the naive upper bound of 2 on the competitive ratio can be beaten: we present a new randomized 15/8-competitive online algorithm. We also provide some lower bounds and an extension to two dimensions.
In this paper we give a fully dynamic data structure to maintain the connectivity of the intersection graph of n axis-parallel rectangles. The amortized update time (insertion and deletion of rectangles) is O(n^{10/11} polylog n) and the query time (deciding whether two given rectangles are connected) is O(1). It slightly improves the update time (O(n^{0.94})) of the previous method while drastically reducing the query time (near O(n^{1/3})). Our method does not use fast matrix multiplication results and supports a wider range of queries.
We give subquadratic algorithms that, given two necklaces each with n beads at arbitrary positions, compute the optimal rotation of the necklaces to best align the beads. Here alignment is measured according to the l_p norm of the vector of distances between pairs of beads from opposite necklaces in the best perfect matching. We show surprisingly different results for p=1, p=2, and p=infty. For p=2, we reduce the problem to standard convolution, while for p=infty and p=1, we reduce the problem to (min,+) convolution and (median,+) convolution. Then we solve the latter two convolution problems in subquadratic time, which are interesting results in their own right. These results shed some light on the classic sorting X+Y problem, because the convolutions can be viewed as computing order statistics on the antidiagonals of the X+Y matrix. All of our algorithms run in o(n^2) time, whereas the obvious algorithms for these problems run in Theta(n^2) time.
We analyze an extremely simple approximation algorithm for computing the minimum enclosing ball (or the 1-center) of a set of points in high dimensions. We prove that this algorithm computes a 3/2-factor approximation in any dimension using minimum space in just one pass over the data points.
We consider the standard problem of approximate nearest neighbor search, for a given set of n points with integer coordinates in a constant-dimensional Euclidean space. We describe a simple implementation of a randomized algorithm that guarantees O(log n) expected query time and O(n log n) preprocessing time. The entire C++ code is under 100 lines long and requires no extra space other than the input array. The algorithm can easily be made dynamic as well.
We present a fully dynamic randomized data structure that can answer queries about the convex hull of a set of n points in three dimensions, where insertions take O(log^3 n) expected amortized time, deletions take O(log^6 n) expected amortized time, and extreme-point queries take O(log^2 n) worst-case time. This is the first method that guarantees polylogarithmic update and query cost for arbitrary sequences of insertions and deletions, and improves the previous O(n^epsilon)-time method by Agarwal and Matousek a decade ago. As a consequence, we obtain similar results for nearest neighbor queries in two dimensions and improved results for numerous fundamental geometric problems (such as levels in three dimensions and dynamic Euclidean minimum spanning trees in the plane).
We revisit the all-pairs-shortest-paths problem for an unweighted undirected graph with n vertices and m edges. We present new algorithms with the following running times:
The piercing problem seeks the minimum number of points for a set of objects such that each object contains at least one of the points. We present a polynomial-time approximation scheme (PTAS) for the piercing problem for a set of axis-parallel unit-height rectangles. We also examine the problem in a dynamic setting and show how to maintain a factor-2 approximation under insertions in logarithmic amortized time, by solving an incremental version of the maximum independent set problem for interval graphs.
We study two problems for a given n-point set in 3-space: finding a largest subset with diameter at most one, and finding a subset of k points with minimum diameter. For the former problem we suggest several polynomial-time algorithms with constant approximation factors, the best of which has factor pi / arccos(1/3) < 2.553. For the latter problem we observe that there is a polynomial-time approximation scheme.
We give space-efficient geometric algorithms for two related problems. Given a set of n axis-aligned rectangles in the plane, we calculate the area covered by the union of these rectangles (Klee's measure problem) in O(n^{3/2} log n) time with O(sqrt{n}) extra space. If the input can be destroyed and there are no degenerate cases and input coordinates are all integers, we can solve Klee's measure problem in O(n log^2 n) time with O(log^2 n) extra space. Given a set of n points in the plane, we find the axis-aligned unit square that covers the maximum number of points in O(n log^3 n) time with O(log^2 n) extra space.
We describe an O(n^3 / log n)-time algorithm for the all-pairs-shortest-paths problem for a real-weighted directed graph with n vertices. This slightly improves a series of previous, slightly subcubic algorithms by Fredman (1976), Takaoka (1992), Dobosiewicz (1990), Han (2004), Takaoka (2004), and Zwick (2004). The new algorithm is surprisingly simple and different from previous ones.
We initiate the study of exact geometric algorithms that require limited storage and make only a small number of passes over the input. Fundamental problems such as low-dimensional linear programming and convex hulls are considered.
A favorite open problem in combinatorial geometry is to determine the worst-case complexity of a level in an arrangement. Up to now, nontrivial upper bounds in three dimensions are known only for the linear cases of planes and triangles. We propose the first technique that can deal with more general surfaces in three dimensions. For example, in an arrangement of n "pseudo-planes" or "pseudo-spheres" (where each triple of surfaces has at most two common intersections), we prove that there are at most O(n^{2.997}) vertices of any given level.
Given a parametric graph with n vertices and m edges where edge weights change linearly over time, we show how to find the time value at which the heaviest edge weight in the minimum spanning tree is minimized in O(n(m/n)^epsilon log n + m) expected time...
We give three related algorithmic results concerning a simple polygon P:
We study the problem of maintaining a (1+epsilon)-factor approximation of the diameter of a stream of points under the sliding window model. In one dimension, we give a simple algorithm that only needs to store O((1/epsilon) log R) points at any time, where the parameter R denotes the "spread" of the point set. This bound is optimal and improves Feigenbaum, Kannan, and Zhang's recent solution by two logarithmic factors. We then extend our one-dimensional algorithm to higher constant dimensions and, at the same time, correct an error in the previous solution. In high nonconstant dimensions, we also observe a constant-factor approximation algorithm that requires sublinear space. Related optimization problems, such as the width, are also considered in the two-dimensional case.
We speed up previous (1+epsilon)-factor approximation algorithms for a number of geometric optimization problems in fixed dimensions: diameter, width, minimum-radius enclosing cylinder, minimum-width annulus, minimum-volume bounding box, minimum-width cylindrical shell, etc. Linear time bounds were known before; we further improve the dependence of the "constants" in terms of epsilon.
We next consider the data stream model and present new (1+epsilon)-factor approximation algorithms that need only constant space for all of the above problems in any fixed dimension. Previously, such a result was known only for diameter.
Both sets of results are obtained using the core-set framework recently proposed by Agarwal, Har-Peled, and Varadarajan.
For many geometric problems, there are efficient algorithms that surprisingly use very little extra space other than the given array holding the input. For many geometric query problems, there are efficient data structures that need no extra space at all other than an array holding a permutation of the input. In this paper, we obtain the first such space-economical solutions for a number of fundamental problems, including three-dimensional convex hulls, two-dimensional Delaunay triangulations, fixed-dimensional range queries, and fixed-dimensional nearest neighbor queries.
We present space-efficient algorithms for computing the convex hull of a simple polygonal line in-place, in linear time. It turns out that the problem is as hard as stable partition, i.e., if there were a truly simple solution then stable partition would also have a truly simple solution, and vice versa. Nevertheless, we present a simple self-contained solution that uses O(log n) space, and indicate how to improve it to O(1) space with the same techniques used for stable partition. If the points inside the convex hull can be discarded, then there is a truly simple solution that uses a single call to stable partition, and even that call can be spared if only extreme points are desired (and not their order). If the polygonal line is closed, then the problem admits a very simple solution which does not call for stable partitioning at all.
We present the first optimal algorithm to compute the maximum Tukey depth (also known as location or halfspace depth) for a non-degenerate point set in the plane. The algorithm is randomized and requires O(n log n) expected time for n data points. In a higher fixed dimension d >= 3, the expected time bound is O(n^{d-1}), which is probably optimal as well. The result is obtained using an interesting variant of the author's randomized optimization technique, capable of solving "implicit" linear-programming-type problems; some other applications of this technique are briefly mentioned.
Finding the maximum independent set in the intersection graph of n axis-parallel rectangles is NP-hard. We re-examine two known approximation results for this problem. For the case of rectangles of unit height, Agarwal, van Kreveld, and Suri (1997) gave a (1+1/k)-factor algorithm with an O(n log n + n^{2k-1}) time bound for any integer constant k >= 1; we describe a similar algorithm running in only O(n log n + nD^{k-1}) time, where D <= n denotes the maximum number of rectangles a point can be in. For the general case, Berman, DasGupta, Muthukrishnan, and Ramaswami (2001) gave a log_k n-factor algorithm with an O(n^{k+1}) time bound for any integer constant k >= 2; we describe similar algorithms running in O(n log n + nD^{k-2}) and n^{O(k/log k)} time.
We give a surprisingly short proof that in any planar arrangement of n curves where each pair intersects at most a fixed number (s) of times, the k-level has subquadratic (O(n^{2-1/2s})) complexity. This answers one of the main open problems from the author's previous paper (FOCS'00), which provided a weaker bound for a restricted class of curves only (graphs of degree-s polynomials). When combined with existing tools (cutting curves, sampling, etc.), the new idea generates a slew of improved k-level results for most of the curve families studied earlier, including a near-O(n^{3/2}) bound for parabolas.
We examine the space requirement for the classic line-segment intersection problem. Using so-called implicit data structures, we show how to make the standard sweep-line algorithm run in O((n+k)log^2 n) time with only O(log^2 n) extra space, where n is the number of line segments and k is the number of intersections. If division is allowed and input can be destroyed, the algorithm can run in O((n+k)log n) time with O(1) extra space.
We consider the problem of finding the shortest curve in the plane that has unit width. This problem was first posed as the "river shore" puzzle by Ogilvy (1972) and is related to the area of on-line searching. Adhikari and Pitman (1989) proved that the optimal solution has length 2.2782... We present a simpler proof, which exploits the fact that the width of a polygon does not decrease under a certain convexification operation.
We give a simple interpretation and a simple implementation of the classical divide-and-conquer algorithm for computing 3-d convex hulls (and in particular, 2-d Delaunay triangulations and Voronoi diagrams). The entire C++ code is under 100 lines long, requires no special data structures, and uses only 6n pointers for space.
Let tau_K^{(d)} be the analogous ratio in d-dimensional space. Khuller et al. showed that tau_3^{(d)} < 1.667 for any d. We observe that tau_3^{(d)} < 1.633.
Applications in the plane include improved algorithms for finding a line that misclassifies the fewest among a set of bichromatic points, and finding the smallest circle enclosing all but k points. We also discuss related problems of finding local minima in levels.
We prove that there is a polyhedron with genus 6 whose faces are orthogonal polygons (equivalently, rectangles) and yet the angles between some faces are not multiples of 90 degrees, so the polyhedron itself is not orthogonal. On the other hand, we prove that any such polyhedron must have genus at least 3. These results improve the bounds of Donoso and O'Rourke (2001) that there are nonorthogonal polyhedra with orthogonal faces and genus 7 or larger, and any such polyhedron must have genus at least 2. We also demonstrate nonoverlapping one-piece edge-unfoldings (nets) for the genus-7 and genus-6 polyhedra.
We give a tradeoff theorem between the area and the aspect ratio required by any planar straight-line drawing of K_{2,n} on the integer lattice. In particular we show that if the drawing is contained in a rectangle of area O(n) then the rectangle must have aspect ratio Omega(n), and conversely, if the aspect ratio is 1 then the area must be Omega(n^2/log^2 n).
In this note, we point out that much simpler algorithms with the same performance are possible using standard, though nonalgebraic, RAM operations. This is interesting, considering that nonalgebraic operations have been used before in the literature...
Usually, binary search only makes sense in sorted arrays. We show that insertion sort based on repeated "binary searches" in an initially unsorted array also sorts n elements in time Theta(n^2 log n). If n is a power of two then the expected termination point of a binary search in a random permutation of n elements is exactly the cell where the element should be if the array was sorted. We further show that we can sort in expected time Theta(n^2 log n) by always picking two random cells and swapping their contents if they are not ordered correctly.
While discrepancy theory is normally only studied in the context of 2-colorings, we explore the problem of k-coloring, for k>=2, a set of vertices to minimize imbalance among a family of subsets of vertices. The imbalance is the maximum, over all subsets in the family, of the largest difference between the size of any two color classes in that subset. The discrepancy is the minimum possible imbalance. We show that the discrepancy is always at most 4d-3, where d (the "dimension") is the maximum number of subsets containing a common vertex. For 2-colorings, the bound on the discrepancy is at most max{2d-3, 2}. Finally, we prove that several restricted versions of computing the discrepancy are NP-complete.
In planning a flight, stops at intermediate airports are sometimes necessary to minimize fuel consumption, even if a direct flight is available. We investigate the problem of finding the cheapest path from one airport to another, given a set of n airports in R^2 and a function l: R^2 x R^2 -> R^+ representing the cost of a direct flight between any pair.
Given a source airport s, the cheapest-path map is a subdivision of R^2 where two points lie in the same region iff their cheapest paths from s use the same sequence of intermediate airports. We show a quadratic lower bound on the combinatorial complexity of this map for a class of cost functions. Nevertheless, we are able to obtain subquadratic algorithms to find the cheapest path from s to all other airports for any well-behaved cost function l: our general algorithm runs in O(n^{4/3+eps}) time, and a simpler, more practical variant runs in O(n^{3/2+eps}) time, while a special class of cost functions requires just O(n log n) time.
We consider the problem of bounding the complexity of the k-th level in an arrangement of n curves or surfaces, a problem dual to, and an extension of, the well-known k-set problem. Among other results, we prove a new bound, O(n k^{5/3}), on the complexity of the k-th level in an arrangement of n planes in R^3, or on the number of k-sets in a set of n points in three dimensions, and we show that the complexity of the k-th level in an arrangement of n line segments in the plane is O(n sqrt(k) alpha(n/k)), and that the complexity of the k-th level in an arrangement of n triangles in 3-space is O(n^2 k^{5/6} alpha(n/k)).
We investigate the problem of drawing an arbitrary n-node binary tree orthogonally and upwardly in an integer grid using straight-line edges. We show that one can simultaneously achieve good area bounds while also allowing the aspect ratio to be chosen as being a constant or even an arbitrary parameter. In addition, we show that one can also achieve an additional desirable aesthetic criterion, which we call "subtree separation." Our drawings require O(n log n) area, which is optimal to within a constant factor. An improvement for non-upward drawings is briefly mentioned.
This note considers the complexity of a free region in the configuration space of a polygonal robot translating amidst polygonal obstacles in the plane. Specifically, given polygonal sets P and Q with k and n vertices, respectively (k < n), the number of edges and vertices bounding a single face of the complement of the Minkowski sum P + Q is Theta(nk alpha(k)) in the worst case. The lower bound comes from a construction based on lower envelopes of line segments; the upper bound comes from a combinatorial bound on Davenport-Schinzel sequences that satisfy two alternation conditions.
In this paper, we give an algorithm for output-sensitive construction of an f-face convex hull of a set of n points in general position in E^4. Our algorithm runs in O((n+f) log^2 f) time and uses O(n+f) space. This is the first algorithm within a polylogarithmic factor of optimal O(n log f + f) time over the whole range of f. By a standard lifting map, we obtain output-sensitive algorithms for the Voronoi diagram or Delaunay triangulation in E^3 and for the portion of a Voronoi diagram that is clipped to a convex polytope. Our approach simplifies the "ultimate convex hull algorithm" of Kirkpatrick and Seidel in E^2 and also leads to improved output-sensitive results on constructing convex hulls in E^d for any even constant d > 4.
The documents contained in this directory are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.