Tuesday, July 5, 2011

Find the Median of Two Sorted Arrays

Problem: given two sorted arrays, find the median.

Solution: Assume the length of the two arrays are m and n. When m+n is even, the median is the average of two numbers. The neat solution has a time complexity of O(log(m+n)). Basically, we need to leverage binary search. The details of this algorithm is very complex due to many corner cases to handle. The following just gives the main idea.

  1. get the median of two arrays Ai and Bj, where i = m/2 and j = n/2. If  Ai <= Bj, the median will be between Ai and Bj.
  2. therefore, we can discard A0 to Ai-1 and Bj+1 to Bn-1. However, we cannot do this in a naive way. To reduce to an equivalent sub-problem, we need to discard the same number of elements from each array. Then we keep comparing the middle elements of the sub-arrays. To discard the same number of elements, we just need to get Min(in-j-1).
Besides, we can also use the techniques in Find the Kth Smallest Element of Two Sorted Arrays.
    More: a more general problem is to find the median for K sorted arrays. There are two ways:
    1. Guess a number, search each array to see how many elements are smaller than this number. Then we can have a total number. If this total is smaller than half, we guess a bigger number; otherwise, we guess a smaller number. Try to repeat binary search until our goal is met.
    2. The second approach is to first find the medians of all the arrays. Then we can know the bounds of these medians (low_m, high_m). For each array, throw the elements that are out of the bounds. Make sure the elements thrown at the two ends of the array should be the same number. Then repeat until our goal is met.

    No comments:

    Post a Comment