CS 111 w20 lecture 24 outline
1 Merge Sort
An example of a divide and conquer approach: recursively split the problem into smaller pieces and solve those
1.1 merge
Function: take two sorted lists, combine them into a single sorted list. Pseudocode:
Given two sorted lists, left and right, and an empty list, merged:
    while elements remain in both left and right:
        compare the first element in left to the first element in right
        remove the smaller one from its list and append it to merged
    append any remaining elements from left or right to merged
def merge(left, right):
    """assume left and right are already sorted"""
    left_finger = 0
    right_finger = 0
    merged = []
    while left_finger < len(left) and right_finger < len(right):
        if left[left_finger] <= right[right_finger]:
            merged.append(left[left_finger])
            left_finger += 1
        else:
            merged.append(right[right_finger])
            right_finger += 1
    # we've gone through all the items from one list;
    # just append everything from the remaining list
    if left_finger == len(left):
        # append the rest of the right; equivalent alternatives:
        # for i in range(right_finger, len(right)):
        #     merged.append(right[i])
        # merged.extend(right[right_finger:])
        merged += right[right_finger:]
    else:
        # append the rest of the left
        merged += left[left_finger:]
    return merged
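A quick check on small inputs (example values are our own):

    print(merge([1, 3, 5], [2, 4, 6]))   # [1, 2, 3, 4, 5, 6]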
1.2 Recursive Sorting
def merge_sort(arr):
    # find an index to split arr in half
    midpoint = len(arr) // 2
    left = merge_sort(arr[:midpoint])
    right = merge_sort(arr[midpoint:])
    return merge(left, right)

As written, this recurses forever: merge_sort always calls itself, even on an empty list. We need a base case.
1.3 Base Case
def merge_sort(arr):
    # base case: nothing to sort for a list of 1 or 0 elements
    if len(arr) <= 1:
        return arr
    # find an index to split arr in half
    midpoint = len(arr) // 2
    left = merge_sort(arr[:midpoint])
    right = merge_sort(arr[midpoint:])
    return merge(left, right)
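A quick check (example input is our own):

    print(merge_sort([5, 2, 4, 7, 1, 3]))   # [1, 2, 3, 4, 5, 7]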
1.4 Diagram
(recursion tree drawn in lecture: the list splits in half at each level until single-element lists remain, then merges combine pairs back together level by level)
1.5 Analysis
- merge is \(O(n)\)
- how many merges? the number of times \(n\) can be divided by 2 before we reach the base case: \(O(\log_2 n)\)
- so overall, merge sort is \(O(n\log_2 n)\) (see the recurrence sketch below)
- seems like it should be better than insertion sort's \(O(n^2)\)
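To make the counting precise (a standard recurrence argument, sketched here): each call makes two recursive calls on halves plus an \(O(n)\) merge, so the running time satisfies
\[T(n) = 2T(n/2) + cn,\]
which unrolls into \(O(\log_2 n)\) levels of \(cn\) work each, giving \(O(n\log_2 n)\).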
2 Comparing Algorithms
2.1 Insertion Sort vs Merge Sort
- how do \(O(n^2)\) and \(O(n\log_2 n)\) compare empirically?
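One way to get a feel for it is to time both sorts on random lists of increasing size. A minimal sketch, using merge_sort from above; this insertion_sort is a standard textbook version assumed here, not necessarily the one from lecture:

    import random
    import time

    def insertion_sort(arr):
        """standard in-place insertion sort (assumed implementation)"""
        for i in range(1, len(arr)):
            key = arr[i]
            j = i - 1
            # shift larger elements right to make room for key
            while j >= 0 and arr[j] > key:
                arr[j + 1] = arr[j]
                j -= 1
            arr[j + 1] = key

    for n in [1000, 2000, 4000, 8000]:
        data = [random.random() for _ in range(n)]

        start = time.perf_counter()
        insertion_sort(data[:])   # sort a copy
        print(n, "insertion:", time.perf_counter() - start)

        start = time.perf_counter()
        merge_sort(data[:])
        print(n, "merge:", time.perf_counter() - start)

Doubling \(n\) roughly quadruples insertion sort's time but only slightly more than doubles merge sort's.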
2.2 Scenarios
2.2.1 in-place vs not
- can the algorithm sort the original list in place, or does it necessarily make a copy?
- insertion sort is in-place; the merge sort we implemented is not (each slice like arr[:midpoint] allocates a new list)
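To see the difference concretely (merge_sort from above; insertion_sort from the timing sketch in 2.1):

    data = [3, 1, 2]
    result = merge_sort(data)
    print(data)     # [3, 1, 2] -- original untouched; merge_sort built new lists
    print(result)   # [1, 2, 3]

    insertion_sort(data)
    print(data)     # [1, 2, 3] -- sorted in place, no copy made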
2.2.2 stability
- equal elements remain in the same relative order before and after sorting
- essential if we want to sort by one attribute and then another
- e.g., sort search results by price and then by relevance (equally relevant items will remain sorted by price)
- both merge sort and insertion sort are stable, selection sort is not
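A small illustration with Python's built-in sorted (which is stable, like merge sort); the search-result tuples are made up for the example:

    results = [("mug", 5, 2), ("pen", 3, 1), ("cup", 5, 1)]   # (name, relevance, price)
    by_price = sorted(results, key=lambda r: r[2])
    by_relevance = sorted(by_price, key=lambda r: r[1], reverse=True)
    print(by_relevance)
    # [('cup', 5, 1), ('mug', 5, 2), ('pen', 3, 1)]
    # the two relevance-5 items are still in price order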
2.2.3 streaming data
- insertion sort is great for sorting data as it comes in (\(O(n)\) to insert a single element); with merge sort we have to run the whole sort again
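For example, keeping a running sorted list with Python's bisect module (a sketch of the streaming case):

    import bisect

    sorted_so_far = []
    for x in [5, 2, 8, 1]:   # data arriving one element at a time
        # finding the position is O(log n); shifting elements makes the insert O(n)
        bisect.insort(sorted_so_far, x)
        print(sorted_so_far)
    # final: [1, 2, 5, 8]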
2.2.4 ideal algorithm
The ideal algorithm would have the following properties:
- Stable: Equal keys aren't reordered.
- Operates in place, requiring \(O(1)\) extra space.
- Worst-case \(O(n\log n)\) comparisons
- Worst-case \(O(n)\) swaps.
- Adaptive: Speeds up to \(O(n)\) when data is nearly sorted or when there are lots of duplicates.
There is no algorithm that has all of these properties, and so the choice of sorting algorithm depends on the application.