`
leonzhx
  • 浏览: 800666 次
  • 性别: Icon_minigender_1
  • 来自: 上海
社区版块
存档分类
最新评论
阅读更多

1.  The i th order statistic of a set of n elements is the i th smallest element. Regardless of the parity of n, medians occur at i = ⌊(n + 1)/2⌋ (the lower median) and i = ⌈(n+1)/2⌉ (the upper median). Here, we consistently use the phrase “the median” to refer to the lower median.

 

2.  We formally specify the selection problem as follows:

Input: A set A of n (distinct) numbers and an integer , with ≤ i ≤ n.

Output: The element x ∈ A that is larger than exactly i-1 other elements of A.

 

3.  we can find both the minimum and the maximum using at most 3⌊n/2⌋

comparisons. We do so by maintaining both the minimum and maximum elements

seen thus far. We compare pairs of elements from the input first with each other, and then we compare the smaller with the current minimum and the larger to the current maximum, at a cost of 3 comparisons for every 2 elements. How we set up initial values for the current minimum and maximum depends on whether n is odd or even. If n is odd, we set both the minimum and maximum to the value of the first element, and then we process the rest of the elements in pairs. Then we perform 3⌊n/2⌋ comparisons. If n is even, we perform 1 comparison on the first 2 elements to determine the initial values of the minimum and maximum, and then process the rest of the elements in pairs as in the case for odd n. We perform 1 initial comparison followed by 3(n - 2)/2 comparisons, for a total of 3n/2-2. Thus, in either case, the total number of comparisons is at most 3⌊n/2⌋.

 

4.  As in quicksort, randomSelect partition the input array recursively. But unlike quicksort, which recursively processes both sides of the partition, it works on only one side of the partition. Quicksort has an expected running time of θ(n lg n), the expected running time of randomSelect is θ(n), assuming that the elements are distinct:

 

int randomSelect(int[] arr, int start, int end, int i ) {
    // if the array only has one element
    if (start==end) return arr[start]; 
    int q = randomPartition(arr, start, end);
    int k = q - start + 1;
    // if pivot is the answer
    if ( k == i ) return arr[q];
    // if i < k , then the answer is the ith smallest one in left subarray
    if ( i < k ) return randomSelect(arr, start, q - 1, i);
    // if i > k , then the answer is the i-k th smallest one in right subarray
    else return randomSelect(arr, q + 1, end , i - k );

}
The worst-case running time for randomSelect is θ(n^2), even to find the minimum, because we could be extremely unlucky and always partition around

 

the largest remaining element, and partitioning takes θ(n) time. However, randomSelect can find any order statistic, and in particular the median, in

expected linear time, assuming that the elements are distinct.

 

5.  The algorithm select finds the desired element in O(n)  time in worst case by recursively partitioning the input array and guaranteeing a good split upon partitioning the array. The select algorithm determines the i th smallest of an input array of n > 1 distinct elements by executing the following steps. (If n = 1, then select merely returns its only input value as the i th smallest.)

    1)  Divide the n elements of the input array into ⌊n/5⌋ groups of 5 elements each and at most one group made up of the remaining n mod 5 elements.

    2)  Find the median of each of the ⌊n/5⌋ groups by first insertion-sorting the elements of each group (of which there are at most 5) and then picking the median from the sorted list of group elements.

    3)  Use select recursively to find the median x of the ⌊n/5⌋ medians found in

step 2). (If there are an even number of medians, then x is the lower median.)

    4)  Partition the input array around the median-of-medians x. Let k be one more than the number of elements on the low side of the partition, so that x is the kth smallest element and there are n-k elements on the high side of the partition.

    5)  If i = k, then return x. Otherwise, use select recursively to find the i th

smallest element on the low side if i < k, or the (i-k)th smallest element on

the high side if i > k.

At least half of the medians found in step 2) are greater than or equal to the median-of-medians x.Thus, at least half of the ⌊n/5⌋ groups contribute at least 3 elements that are greater than x, except for the one group that has fewer than 5 elements if 5 does not divide n exactly, and the one group containing x itself. Discounting these two groups, it follows that the number of elements greater than x is at least 3n/10 - 6Similarly, at least 3n/10 - 6 elements are less than x. Thus, in the worst case, step 5) calls select recursively on at most 7n/10 + 6 elements.




6. Implementation:

 

public class MedianOfMediansSelector {
    private int[] arr;
    
    public MedianOfMediansSelector( int[] arr) {
        this.arr = arr;
    }
    
    public int select(int index) {
        if (index < 0 && index >= arr.length) throw new IllegalArgumentException("Index out of boundary");
	int end = arr.length;
	int start = 0;
	int pivotIndex = choosePivot(start, end);
	pivotIndex = partition(start, end, pivotIndex);
	while (pivotIndex != index) {
		if( pivotIndex > index ) {
			end = pivotIndex;
			pivotIndex = choosePivot(start, end);
			pivotIndex = partition(start, end, pivotIndex);
					
		} else {
			start = pivotIndex + 1;
			pivotIndex = chooser.choosePivot(start, end);
			pivotIndex = partition(start, end, pivotIndex);
		}
	}
			
	return arr[index];
    }
		
    private int partition(int start, int end , int pivotIndex){
	//swap with the partition element with the first one
	int temp = arr[start];
	arr[start] = arr[pivotIndex];
	arr[pivotIndex] = temp;
	int pivot = arr[start];
	int i=start+1,j=end-1;
	while( i <= j ) {
		while ( i <= j && arr[i] <= pivot ) i ++;
		if (i > j) break;
		while (arr[j] > pivot ) j--;
		if ( i > j) break;
        	temp = arr[i];
		arr[i] = arr[j];
		arr[j] = temp;
	}
	i--;
	arr[start] = arr[i];
	arr[i] = pivot;
	return i;
    }
		
    private int choosePivot(int start, int end) {
	if ( start >= end ) throw new IllegalArgumentException("end should bigger than start!");
	if ( end > arr.length ) throw new IllegalArgumentException("end should not bigger than arry length!");
        if ( end - start <= 2) {
            return start;
        }
			
	// sort elements in each block
	int BLOCK_SIZE = 5;
	int i = BLOCK_SIZE;
	int num = 1;
	for ( ; i < end ; i += BLOCK_SIZE , num ++) {
		insertionSort(i - BLOCK_SIZE, i);
	}
	i -= BLOCK_SIZE;
	insertionSort(i, end);
	// choose medium of each group
	int medium = BLOCK_SIZE/2;
	int[] mediums = new int[num];
	int r = ( i + end ) / 2; // medium item in last block
	i = 0 ;
	for ( int j = medium; i < num - 1 ; j += BLOCK_SIZE, i ++ ) {
		mediums[i] = arr[j];
	}
	mediums[i] = arr[r];
	MedianOfMediansSelector selector = new MedianOfMediansSelector (mediums);
	int pivot = selector.select(num/2);
	for ( i = start; i < end; i ++) {
		if ( arr[i] == pivot) return i;
	}
	throw new IllegalStateException("Medium of Medium didn't find a correct pivot!");
    }

    private void insertionSort(int start, int end) {
	int temp;
	for ( int i = start + 1; i < end ; i ++ ) {
	    for ( int j = i ; j > 0 ; j --) {
                if ( arr[j] < arr[j-1] ) {
		    temp = arr[j];
		    arr[j] = arr[j-1];
		    arr[j-1] = temp;
		} else {
		    break;
		}
	    }
	}
    }
		
}

 

  • 大小: 13.6 KB
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics