Mastering Algorithms

Tim Sort

Overview

Tim Sort is a hybrid stable sorting algorithm derived from merge sort and insertion sort. It was designed by Tim Peters in 2002 for use in Python's list.sort() method and has since been adopted by many other languages and libraries, including Java (for non-primitive types) and Android.

Tim Sort is designed to perform well on many kinds of real-world data. It takes advantage of runs (consecutive sequences of elements that are already sorted) in the data, making it highly efficient for partially sorted arrays.

How It Works

The algorithm works by:

  1. Finding Runs: Identify naturally occurring runs (ascending or descending sequences)
  2. Extending Runs: Extend short runs using insertion sort to a minimum run size
  3. Merging Runs: Merge runs using a merge sort-like approach
  4. Optimization: Use a stack to merge runs efficiently, maintaining balance

The algorithm uses insertion sort for small subarrays (typically < 64 elements) and merge sort for larger arrays, combining the best of both algorithms.

Algorithm


TimSort(arr):
    min_run = calculate_min_run(length of arr)
    
    // Find and extend runs
    runs = []
    i = 0
    while i < length of arr:
        run = find_run(arr, i)
        if length of run < min_run:
            extend_run(arr, i, min_run)  // Use insertion sort
        runs.append(run)
        i += length of run
    
    // Merge runs using stack
    stack = []
    for each run in runs:
        stack.push(run)
        while should_merge(stack):
            merge_two_runs(stack)
    
    return arr
                

Implementation


MIN_MERGE = 32

def calculate_min_run(n):
    """Calculate minimum run size"""
    r = 0
    while n >= MIN_MERGE:
        r |= n & 1
        n >>= 1
    return n + r

def insertion_sort(arr, left, right):
    """Insertion sort for small subarrays"""
    for i in range(left + 1, right + 1):
        key = arr[i]
        j = i - 1
        while j >= left and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key

def find_run(arr, start):
    """Find natural run starting at start"""
    if start == len(arr) - 1:
        return 1
    
    run = 2
    if arr[start] > arr[start + 1]:  # Descending run
        while start + run < len(arr) and arr[start + run - 1] > arr[start + run]:
            run += 1
        # Reverse descending run
        arr[start:start + run] = arr[start:start + run][::-1]
    else:  # Ascending run
        while start + run < len(arr) and arr[start + run - 1] <= arr[start + run]:
            run += 1
    
    return run

def merge(arr, left, mid, right):
    """Merge two sorted subarrays"""
    len1 = mid - left + 1
    len2 = right - mid
    left_arr = arr[left:mid + 1]
    right_arr = arr[mid + 1:right + 1]
    
    i = j = 0
    k = left
    
    while i < len1 and j < len2:
        if left_arr[i] <= right_arr[j]:
            arr[k] = left_arr[i]
            i += 1
        else:
            arr[k] = right_arr[j]
            j += 1
        k += 1
    
    while i < len1:
        arr[k] = left_arr[i]
        i += 1
        k += 1
    
    while j < len2:
        arr[k] = right_arr[j]
        j += 1
        k += 1

def tim_sort(arr):
    """Tim Sort implementation"""
    n = len(arr)
    if n < 2:
        return arr
    
    min_run = calculate_min_run(n)
    
    # Sort individual subarrays of size min_run
    for start in range(0, n, min_run):
        end = min(start + min_run - 1, n - 1)
        insertion_sort(arr, start, end)
    
    # Merge runs
    size = min_run
    while size < n:
        for left in range(0, n, 2 * size):
            mid = min(n - 1, left + size - 1)
            right = min(left + 2 * size - 1, n - 1)
            if mid < right:
                merge(arr, left, mid, right)
        size *= 2
    
    return arr
                

Complexity Analysis

  • Time Complexity:
    • Best Case: O(n) - when array is already sorted
    • Average Case: O(n log n)
    • Worst Case: O(n log n)
  • Space Complexity: O(n) - for temporary arrays during merging

Tim Sort performs exceptionally well on real-world data because it takes advantage of existing order. It's the default sorting algorithm in Python and many other systems because of its excellent performance characteristics.

Characteristics

  • Stable: Yes - maintains relative order of equal elements
  • In-place: No - requires O(n) extra space for merging
  • Adaptive: Yes - very efficient for partially sorted data
  • Online: No - requires the entire array
  • Hybrid: Yes - combines insertion sort and merge sort

When to Use Tim Sort

Tim Sort is ideal when:

  • You need a general-purpose, stable sorting algorithm
  • Data is likely to be partially sorted
  • You want consistent O(n log n) performance
  • Stability is important (maintaining order of equal elements)
  • Used as default sorting in many programming languages

Tim Sort is the default sorting algorithm in:

  • Python (list.sort() and sorted())
  • Java (for non-primitive types)
  • Android platform
  • Many other modern systems

Example

Tim Sort excels when data has natural runs:

Array: [5, 2, 8, 1, 9, 3, 7, 4, 6]

Step 1: Find runs
  Run 1: [5, 2] → [2, 5] (reversed)
  Run 2: [8] (extended with insertion sort)
  Run 3: [1, 9] (extended)
  ...

Step 2: Merge runs
  Merge [2, 5] and [1, 8, 9] → [1, 2, 5, 8, 9]
  ...

Final: [1, 2, 3, 4, 5, 6, 7, 8, 9]
                

Related Algorithms

Explore other sorting algorithms: