Understanding the line sweep technique for intervals
In some array problems, the data items stored in the array may represent an interval. An interval can be thought of as two points on the x-axis of a cartesian plane, where the x-axis represents any scalar metrics like time, distance, etc. An interval is denoted by a start and end coordinate that is often stored as a pair at each index in the array.
An interval denoted by the start and end coordinate on x-axis of a cartesian plane
Line sweep technique
The line sweep technique, also known as the sweep line algorithm, is a powerful approach to efficiently solve geometric problems involving events (such as intervals) on an axis of the cartesian plane. It works by "sweeping" a line across the set of events while maintaining a data structure that keeps track of active events as the line progresses. Let's understand how the line sweep technique can be applied to intervals.
1. Sort the events
The first step is to sort the intervals, as the line sweep algorithm requires the events to be sorted in some order in the array holding them. Intervals are generally sorted by their start or end values. If sorted by their start values, intervals with the same start values are further sorted by their end values. Similarly, if the intervals are sorted by their end values, intervals with the same end values are further sorted by their start values.
Sort the interval array in a non-decreasing order of start and end points.
This way, the array can be viewed as the x-axis of the cartesian plane with events marked on it from left to right. This allows us to traverse the array as if we were swiping a line through it on the x-axis of the cartesian plane.
Sorting the given array of intervals using start values first and then end values.
2. Sweep the line
Once we have sorted the array of intervals, traversing it from start to end becomes the same as moving on the x-axis from left to right. We keep some data structure like an event counter, set, or list to keep state information as we traverse the array from start to end. This can be visualized as a vertical line moving from left to right on the x-axis and hence is called a line sweep. As the sweep line moves, it dynamically updates the state variables to correctly process overlaps, merges, or other conditions dictated by the problem.
Iterating through the sorted array is the same as sweeping a line through the intervals and processing them
Complexity Analysis
The sweep line algorithm sorts the input array of intervals in O(NlogN) time, where N is the size of the array, and then traverses it only once in linear O(N) time. And so, the time complexity in any case is O(NlogN).
Based on the problem, we may not be able to sort the array in place and may need to create a sorted copy, resulting in a linear O(N) space complexity. In other cases, we may be able to sort the input array in-place and have a constant O(1) space complexity.
Best Case: Can sort the input array in place
- Space Complexity - O(1)
- Time Complexity - O(NlogN)
Worst Case: Need to create a new array
- Space Complexity - O(N)
- Time Complexity - O(NlogN)