ghost February 2016

Count number of intervals containing another interval?

Given two lists each containing N intervals (subset of the number line), each interval with form of a start point and endpoint. How many pairs of these intervals from one list contains intervals from another list?

For example:

If list A is {(1,7), (2,9)} and list B is {(3,6), (5,8)}

Then the count of pairs where A has an interval that contains an interval in B would be 3 pairs:

(1,7),(3,6)
(2,9)(3,6)
(2,9)(5,8)

The goal is to shoot for O(n log n).

My algorithm currently is to sort by x-coordinate first and take that as one list. Then sort the list by y-coordinate and count the inversions between the two lists. But my question is why does this work? Any insight would be appreciated.

The way I have I'm currently visualizing is in the following geometric fashion (where each intersection of the lines is a count of the num inversion):

enter image description here

Note: I'm not sure how to go about checking inversions in a list of list. Just trying to get an approach that would give O(n log n). If there are any other approaches happy to hear suggestions.

Answers


Baj Mile February 2016

If you decide to try also the tree/gird approach I will explain how this may work. For your task you do not need 2D but a one-dimensional interval map or even grid. Lets choose the grid because it is more clear.

Suppose that your pairs are integers from 1 to 100. Then you can afford to have one array with size 100. Each cell from the array contains empty set (ordered list). See the picture below:

enter image description here

Now we start adding the intervals in the grid. We add 1 in all grid sells between 1,7 and 2 in all grid cells between 2,9 (1,2 are IDs which we increase by 1 with each inserted interval, insert in this way is inefficient but this can be fixed).

Now how we check for the intervals from B? We just get each ID from the first cell and check if it is also in the second cell. Since the cells are sets the checks takes O(log n). We need n O(log n) operations in the worst case to check the overlapping intervals count that one interval from B has inside A.

This can be extended to use interval map instead of grid (if the numbers are not small integers). Also if you have fixed number of intervals in A, and no memory requirements, O(logN) can become O(1) if we replace the sets with arrays for example.


MichaƂ Komorowski February 2016

I'll answer the first question about why the solution with inversion works. Firstly, I'll clarify one thing. You shouldn't count all inversions (intersections of lines) but only these that occur between an element from A list and an element from B list. In your example there is no difference but let's assume that A = {(1,7), (2,5)} and B = {(3,6), (5,8)}. If we visualize these case as in your example there would be 2 intersection but there is only 1 pair we are looking for i.e. (1,7), (3,6).

Now let's assume that we have 2 intervals: I1=(x1,y1) and I2=(x2,y2). I2 is included in I1. It means that x1 <= x2 and y1 >= y2. Now, if you sort the list of intervals by x then I1 will always be before I2. Analogically if you sort the list of intervals by y then I1 will always be after I2. It also beans that if we connect I1, I2 in the first list with I1, I2 in the second list then the lines must cross.

However, let's assume that x1 <= x2 and y1 < y2. Now I1 will be before I2 in the first and in the second list. If we connect I1, I2 in the first list with I1, I2 in the second list then the lines will never cross. The same situation is if x1 > x2 and y1 >= y2

Here is the visualization of these cases:

Post Status

Asked in February 2016
Viewed 3,741 times
Voted 9
Answered 2 times

Search




Leave an answer