Insertion Sort, Merge Sort and Quick Sort
Introduction
Insertion Sort Algorithm
Straight Insertion Sort:
Think of the n elements to be sorted as two tables: an ordered table and an unordered table. At the beginning, the ordered table contains only 1 element, while the unordered table contains the other n-1. On each pass, the first element is taken out of the unordered table and inserted at the appropriate position in the ordered table, producing a new, longer ordered table. Repeating this n-1 times completes the sort.
What we should do:
- Take the first number in the unordered region and find its corresponding position in the ordered region.
- Insert that number into the ordered region; if necessary, shift the relevant data in the ordered region to make room.
Merge Sort Algorithm
Operation
Merge already-ordered subsequences to obtain a completely ordered sequence: first order each subsequence, then merge the ordered segments. Merging two ordered tables into one ordered table is called a two-way merge.
Application scenarios
Its speed is second only to quicksort, and it is a stable sorting algorithm. It is well suited to sequences that are unordered overall but whose subsegments are relatively ordered.
Quick Sort Algorithm
Operation
After one pass of partitioning, the data to be sorted is divided into two independent parts, with every element in one part smaller than every element in the other. Each of the two parts is then quick-sorted recursively in the same way.
Application scenarios
Quicksort is unstable. It requires no auxiliary array. Its typical application is large-scale data sorting, where its practical performance is better than merge sort's.
Code
Algorithm Code
Insertion Sort Algorithm
/*Insertion Sorting*/
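The listing itself did not survive extraction; a minimal straight insertion sort in C along the lines described above (the function name `InsertSort` is an assumption):

```c
#include <stddef.h>

/* Straight insertion sort: A[0..i-1] is the ordered table,
 * A[i..n-1] the unordered table; each pass inserts A[i]. */
void InsertSort(int A[], int n) {
    for (int i = 1; i < n; i++) {
        int key = A[i];          /* first element of the unordered table */
        int j = i - 1;
        while (j >= 0 && A[j] > key) {
            A[j + 1] = A[j];     /* shift larger elements one slot right */
            j--;
        }
        A[j + 1] = key;          /* insert into the ordered table */
    }
}
```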
Merge Sort Algorithm
/*MergeSort*/
i = 0; j = 0;
void MergeSortUp(int A[], int p, int r)
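Only the skeleton of this listing survives. A sketch of `MergeSortUp(int A[], int p, int r)` consistent with it (the `Merge` helper name and the temporary-array details are assumptions):

```c
#include <stdlib.h>
#include <string.h>

/* Merge A[p..q] and A[q+1..r], both already ordered, into A[p..r]. */
static void Merge(int A[], int p, int q, int r) {
    int n1 = q - p + 1, n2 = r - q;
    int *L = malloc(n1 * sizeof(int));
    int *R = malloc(n2 * sizeof(int));
    memcpy(L, A + p, n1 * sizeof(int));
    memcpy(R, A + q + 1, n2 * sizeof(int));
    int i = 0, j = 0, k = p;
    while (i < n1 && j < n2)
        A[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];  /* <= keeps the sort stable */
    while (i < n1) A[k++] = L[i++];                 /* copy any remaining tail */
    while (j < n2) A[k++] = R[j++];
    free(L); free(R);
}

/* Sort A[p..r] by two-way merging. */
void MergeSortUp(int A[], int p, int r) {
    if (p < r) {
        int q = p + (r - p) / 2;
        MergeSortUp(A, p, q);
        MergeSortUp(A, q + 1, r);
        Merge(A, p, q, r);
    }
}
```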
Quick Sort Algorithm
/*QuickSort*/
void QuickSortUp(int A[], int p, int r)
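A sketch matching `QuickSortUp(int A[], int p, int r)`, taking the first element as the pivot (the `Partition` helper name is an assumption). Using the first element as pivot is what produces the O(n²) behavior on ordered and reverse-ordered input observed in the tests:

```c
/* Partition A[p..r] around A[p] as pivot; return the pivot's final index. */
static int Partition(int A[], int p, int r) {
    int pivot = A[p];
    int i = p;
    for (int j = p + 1; j <= r; j++) {
        if (A[j] < pivot) {
            i++;
            int t = A[i]; A[i] = A[j]; A[j] = t;
        }
    }
    int t = A[p]; A[p] = A[i]; A[i] = t;  /* place the pivot at its final slot */
    return i;
}

void QuickSortUp(int A[], int p, int r) {
    if (p < r) {
        int q = Partition(A, p, r);
        QuickSortUp(A, p, q - 1);
        QuickSortUp(A, q + 1, r);
    }
}
```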
Test Code
/*Test*/
/*Generating test data set*/
/*output function*/
main()
/*Insert Sorting*/
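The test-code bodies are likewise missing; a sketch of what the data-set generator might look like (the function name and the one-number-per-line file format are assumptions, since the original format did not survive):

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Write n random integers, one per line, to the file at `path`.
 * Returns 0 on success, -1 if the file cannot be opened. */
int WriteRandomDataset(const char *path, int n) {
    FILE *f = fopen(path, "w");
    if (!f) return -1;
    srand((unsigned)time(NULL));     /* seed once per data set */
    for (int i = 0; i < n; i++)
        fprintf(f, "%d\n", rand());
    fclose(f);
    return 0;
}
```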
Test
I prepared 12 test datasets for the algorithms, with data sizes of 10,000, 50,000 and 100,000.
For each size there are randomly generated datasets (for testing average-case behavior) and reverse-ordered datasets (for testing the worst case).
To reduce error, two datasets of each type were prepared, giving 3×2×2 = 12 datasets in total.
The correctness of each algorithm was verified manually on small inputs, so here we focus only on comparing running times.
Test Data
Test data are generated by the "randomly generate test set" part of the main function.
To test worst-case behavior, the corresponding "reverse-order dataset" is produced with the help of the quicksort algorithm.
Result
Data set | Insertion Sort (s) | Merge Sort (s) | Quick Sort (s) |
---|---|---|---|
S1: 10,000, random array (average condition) | 0.082 | 0.001 | 0.002 |
S2: 10,000, random array (average condition) | 0.069 | 0.002 | 0.001 |
S3: 10,000, reverse array (worst condition) | 0.469 | 0.003 | 0.554 |
S4: 10,000, reverse array (worst condition) | 0.463 | 0.003 | 0.546 |
S5: 50,000, random array (average condition) | 1.718 | 0.009 | 0.009 |
S6: 50,000, random array (average condition) | 1.881 | 0.010 | 0.009 |
S7: 50,000, reverse array (worst condition) | 8.540 | 0.015 | / |
S8: 50,000, reverse array (worst condition) | 7.956 | 0.015 | / |
S9: 100,000, random array (average condition) | 7.023 | 0.022 | 0.017 |
S10: 100,000, random array (average condition) | 6.860 | 0.020 | 0.018 |
S11: 100,000, reverse array (worst condition) | 31.549 | 0.027 | / |
S12: 100,000, reverse array (worst condition) | 30.989 | 0.029 | / |
Note: for the 50,000- and 100,000-element reverse arrays, the quicksort program could not complete the sort, most likely because the O(n) recursion depth of the worst case overflows the stack.
Algorithm complexity analysis
Insertion Sort Algorithm
- Best condition
If the array is already in order, each pass makes 1 comparison and 2 moves, so there are n-1 comparisons and 2(n-1) moves in total: O(n).
- Worst condition
If the array is in reverse order, the i-th pass makes i comparisons and i+1 moves, so roughly n²/2 comparisons and n²/2 moves in total: O(n²).

Therefore, the time complexity of straight insertion sort is O(n²), and it depends on how ordered the input sequence already is.
Merge Sort Algorithm
According to the recursion-tree method, the complexity of merge sort is O(n log n), and its time complexity is independent of the order of the input sequence.
Quick Sort Algorithm
- Worst Condition
When the input is in order or in reverse order, each partition fixes the position of only one element, so the worst-case time complexity is O(n²).
- Average Condition
O(n log n). The more evenly the pivot splits the two sides, the lower the running time.
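The two cases correspond to these recurrences:

```latex
% Worst case: the pivot is always an extreme value,
% so one side of the partition is empty
T(n) = T(n-1) + \Theta(n) \;\Rightarrow\; T(n) = \Theta(n^2)

% Average case: the pivot splits the array roughly evenly
T(n) = 2\,T(n/2) + \Theta(n) \;\Rightarrow\; T(n) = \Theta(n \log n)
```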
Optimizing
Sentry
in merge sort
In merge sort, when merging two sorted sequences together, we used to handle the tails like this:

```c
if (i == n1) {
    while (j < n2) A[k++] = R[j++];
} else if (j == n2) {
    while (i < n1) A[k++] = L[i++];
}
```
If a sentinel, namely the maximum value MAX, is appended to the right end of each of the two sequences being merged, there is no need to test whether either sequence has been exhausted, which effectively reduces the number of judgments.
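With sentinels, the merge body collapses to a single loop. A sketch using `INT_MAX` as MAX (the function name is an assumption):

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Merge with sentinels: each auxiliary array gets INT_MAX appended,
 * so one loop suffices and no exhaustion checks are needed. */
static void MergeWithSentinel(int A[], int p, int q, int r) {
    int n1 = q - p + 1, n2 = r - q;
    int *L = malloc((n1 + 1) * sizeof(int));
    int *R = malloc((n2 + 1) * sizeof(int));
    memcpy(L, A + p, n1 * sizeof(int));
    memcpy(R, A + q + 1, n2 * sizeof(int));
    L[n1] = INT_MAX;   /* sentinel: never chosen while real data remain */
    R[n2] = INT_MAX;
    for (int i = 0, j = 0, k = p; k <= r; k++)
        A[k] = (L[i] <= R[j]) ? L[i++] : R[j++];
    free(L); free(R);
}
```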
How to choose pivot in quick sort
The choice of pivot has a significant impact on the time complexity of quicksort. From the "reverse test data" above, we can see that the complexity of quicksort can reach O(n²) if the pivot selected each time is the maximum/minimum value.
Picking the pivot at random during each run usually yields better results.
I adopted the median-of-three method: select the median of the first, middle and last elements as the pivot, which effectively avoids the worst condition.
//median-of-three pivot rule
Then swap the selected pivot with the first element of the subarray.
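A sketch of a median-of-three selector consistent with that description (the name `MedianOfThree` is an assumption); it leaves the chosen pivot in `A[p]`, matching the swap with the first element described above:

```c
/* Pick the median of A[p], A[mid], A[r] and move it to A[p],
 * so the existing partition routine can use A[p] as the pivot. */
static void MedianOfThree(int A[], int p, int r) {
    int mid = p + (r - p) / 2;
    if (A[mid] < A[p]) { int t = A[p]; A[p] = A[mid]; A[mid] = t; }
    if (A[r]   < A[p]) { int t = A[p]; A[p] = A[r];   A[r]   = t; }
    if (A[r]   < A[mid]) { int t = A[mid]; A[mid] = A[r]; A[r] = t; }
    /* now A[p] <= A[mid] <= A[r]: the median sits at mid */
    int t = A[p]; A[p] = A[mid]; A[mid] = t;
}
```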
Improve the stability
The reason quicksort is unstable is that in the last step of a partition, the pivot is swapped with the element at the partition position, which can jump it over equal elements.
Example (elements of the same value are distinguished by '):
- 3 1 3' 5 2 6 1'
- 3 1 2 5 3' 6 1'
- 3 1 2 1' 3' 6 5 (no problem so far)
- 1' 1 2 3 3' 6 5 (the order of 1' and 1 is changed in the last step, when the pivot 3 is swapped into place)
Solution: at the last step of each partition, iterate over the part of the array before the pivot's final position and move every element equal to the pivot backward, so that it ends up next to the pivot.
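A sketch of that final pass (the helper name is an assumption). After the pivot lands at index `q`, elements equal to it in `A[p..q-1]` are slid back by adjacent swaps, so all other elements keep their relative order:

```c
/* Slide every element of A[p..q-1] that equals the pivot A[q]
 * back so it sits just before index q. Adjacent swaps preserve
 * the relative order of the remaining elements.
 * Returns the first index of the equal-to-pivot block. */
static int GroupEqualBeforePivot(int A[], int p, int q) {
    int dst = q - 1;                        /* next free slot before the pivot */
    for (int i = dst; i >= p; i--) {
        if (A[i] == A[q]) {
            for (int j = i; j < dst; j++) { /* rotate A[i..dst] left by one */
                int t = A[j]; A[j] = A[j + 1]; A[j + 1] = t;
            }
            dst--;
        }
    }
    return dst + 1;
}
```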
Summary
In this experiment, I learned three important algorithms: insertion sort algorithm, merge sort algorithm and quicksort algorithm, and understood their principles and applicable scenarios.
I think merge sort is more robust than quicksort: its time complexity is unaffected by the order of the input sequence, and it is a stable sort. Perhaps because the datasets were not large enough, I did not fully observe the time advantage of quicksort.
The worst-case time complexity can differ greatly from the average time complexity, so we should take both into account when analyzing algorithms in the future.