Greedy Algorithms
Introduction to Greedy Algorithms
Greedy algorithms are a class of algorithms that make locally optimal choices at each step with the hope of finding a global optimum. The term "greedy" comes from the strategy of always choosing the option that looks best at the moment, without considering the broader context or future consequences.
Unlike dynamic programming, which considers all possible subproblems, greedy algorithms make a single choice and never reconsider it. This makes them typically faster and simpler, but they only work correctly when the problem has specific properties that guarantee the greedy choice leads to an optimal solution.
Greedy algorithms are widely used in optimization problems, scheduling, graph algorithms, and compression. They're particularly valuable when you need efficient solutions and can prove that the greedy strategy is optimal.
When to Use Greedy Algorithms
Greedy algorithms are suitable when:
- The problem has the greedy choice property - a globally optimal solution can be reached by making locally optimal choices
- The problem has optimal substructure - an optimal solution contains optimal solutions to subproblems
- You need a fast, efficient solution (often O(n log n) or better)
- The problem can be solved by making a series of choices where each choice is independent
Greedy vs Dynamic Programming
Understanding when to use greedy algorithms versus dynamic programming is crucial:
| Feature | Greedy Algorithms | Dynamic Programming |
|---|---|---|
| Decision Making | Make one choice, never reconsider | Consider all possibilities, store results |
| Time Complexity | Usually O(n log n) or O(n) | Usually O(n²) or higher |
| Space Complexity | Usually O(1) or O(n) | Usually O(n) or O(n²) |
| Optimality | Requires proof of greedy choice property | Guaranteed optimal if problem has optimal substructure |
| Best For | Optimization with greedy choice property | Problems with overlapping subproblems |
Key Properties of Greedy Algorithms
1. Greedy Choice Property
A problem exhibits the greedy choice property if a globally optimal solution can be reached by making a locally optimal (greedy) choice at each step. This means that at each decision point, we can make the choice that looks best right now, and this will lead to an optimal solution.
Example: In the activity selection problem, always choosing the activity that ends earliest (greedy choice) leads to the maximum number of non-overlapping activities (optimal solution).
2. Optimal Substructure
A problem has optimal substructure if an optimal solution to the problem contains optimal solutions to its subproblems. This property is shared with dynamic programming, but greedy algorithms use it differently - they make one choice and solve one subproblem, rather than solving all subproblems.
Example: In the minimum spanning tree problem, removing any edge from an MST splits it into two subtrees, and each subtree is itself an MST of the subgraph induced by its vertices.
3. Proving Greedy Algorithms
To prove a greedy algorithm is correct, you typically need to:
- Prove the greedy choice property: Show that making the greedy choice is always safe (doesn't exclude optimal solutions)
- Prove optimal substructure: Show that the problem can be solved by combining the greedy choice with an optimal solution to the remaining subproblem
- Use an exchange argument: Show that any optimal solution can be transformed to include the greedy choice without making it worse
Activity Selection Problem
The activity selection problem is a classic greedy algorithm problem: given a set of activities with start and finish times, select the maximum number of non-overlapping activities.
Problem Statement
Given n activities with start times s[i] and finish times f[i], where activity i takes place during [s[i], f[i]), find the maximum number of activities that can be scheduled without overlapping.
Greedy Strategy
The greedy choice is to always select the activity that finishes earliest. This leaves the maximum amount of time for remaining activities.
Algorithm
```python
def activity_selection(activities):
    if not activities:
        return []
    # Sort activities by finish time
    activities.sort(key=lambda x: x[1])
    selected = [activities[0]]  # The earliest-finishing activity is always safe
    last_finish = activities[0][1]
    for start, finish in activities[1:]:
        if start >= last_finish:  # No overlap with the last selected activity
            selected.append((start, finish))
            last_finish = finish
    return selected

# Example
activities = [(1, 4), (3, 5), (0, 6), (5, 7), (8, 9), (5, 9)]
result = activity_selection(activities)
# Result: [(1, 4), (5, 7), (8, 9)] - 3 activities
```
Complexity Analysis
- Time Complexity: O(n log n) - dominated by sorting
- Space Complexity: O(n) - for storing selected activities
Why It Works
The greedy choice property holds because if we have an optimal solution that doesn't include the earliest-finishing activity, we can replace the first activity in that solution with the earliest-finishing one without reducing the total count (since it finishes no later, it remains compatible with every other activity in the solution).
Fractional Knapsack Problem
The fractional knapsack problem allows taking fractions of items, making it solvable with a greedy approach. This contrasts with the 0-1 knapsack problem, which requires dynamic programming.
Problem Statement
Given items with weights and values, and a knapsack with capacity W, maximize the total value. You can take fractions of items (unlike 0-1 knapsack where items are indivisible).
Greedy Strategy
Sort items by value-to-weight ratio (value/weight) in descending order. Take items in this order, taking as much as possible of each item until the knapsack is full.
Algorithm
```python
def fractional_knapsack(items, capacity):
    # Sort by value/weight ratio (descending)
    items.sort(key=lambda x: x[1] / x[0], reverse=True)
    total_value = 0
    remaining_capacity = capacity
    for weight, value in items:
        if remaining_capacity >= weight:
            # Take the entire item
            total_value += value
            remaining_capacity -= weight
        else:
            # Take a fraction of the item and stop: the knapsack is full
            fraction = remaining_capacity / weight
            total_value += value * fraction
            break
    return total_value

# Example
items = [(10, 60), (20, 100), (30, 120)]  # (weight, value)
capacity = 50
result = fractional_knapsack(items, capacity)
# Takes all of item 1 (10, 60), all of item 2 (20, 100), and 2/3 of
# item 3 (weight 20 of 30, adding value 80), for a total value of 240
```
Complexity Analysis
- Time Complexity: O(n log n) - sorting dominates
- Space Complexity: O(1) auxiliary - only a few variables (the input list is sorted in place)
Why Greedy Works for Fractional Knapsack
Since we can take fractions, the item with the highest value-to-weight ratio should be prioritized. If we don't take as much as possible of the best ratio item, we can always improve the solution by replacing some of a lower-ratio item with the higher-ratio item.
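To make this concrete, here is the exchange step as a one-line inequality (a sketch; $r_i = v_i / w_i$ denotes the value-to-weight ratio of item $i$). Shifting a small weight $\delta > 0$ from a lower-ratio item $j$ to a higher-ratio item $i$ changes the total value by

$$\Delta V = \delta\,(r_i - r_j) \ge 0 \quad \text{whenever } r_i \ge r_j,$$

so any solution that leaves capacity unused on a higher-ratio item can be exchanged into one at least as good; repeating the exchange yields the greedy solution.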
Huffman Coding
Huffman coding is a lossless data compression algorithm that creates optimal prefix-free codes for characters based on their frequencies. It's used in file compression formats like ZIP and JPEG.
Problem Statement
Given characters and their frequencies, assign binary codes to each character such that:
- No code is a prefix of another (prefix-free property)
- The total encoded length is minimized
Greedy Strategy
Build a binary tree bottom-up by repeatedly merging the two nodes with lowest frequencies. The characters with higher frequencies get shorter codes.
Algorithm
```python
import heapq

class Node:
    def __init__(self, char, freq, left=None, right=None):
        self.char = char
        self.freq = freq
        self.left = left
        self.right = right

    def __lt__(self, other):  # Lets heapq compare nodes by frequency
        return self.freq < other.freq

def build_huffman_tree(frequencies):
    # Create priority queue (min-heap) of leaf nodes
    heap = [Node(char, freq) for char, freq in frequencies.items()]
    heapq.heapify(heap)
    # Build the tree by repeatedly merging the two lowest-frequency nodes
    while len(heap) > 1:
        left = heapq.heappop(heap)
        right = heapq.heappop(heap)
        merged = Node(None, left.freq + right.freq, left, right)
        heapq.heappush(heap, merged)
    return heap[0]  # Root of the Huffman tree

def build_codes(root, code="", codes=None):
    if codes is None:  # Avoid the shared-mutable-default-argument pitfall
        codes = {}
    if root.char is not None:
        codes[root.char] = code if code else "0"  # Single-character edge case
    else:
        build_codes(root.left, code + "0", codes)
        build_codes(root.right, code + "1", codes)
    return codes

# Example
frequencies = {'a': 5, 'b': 9, 'c': 12, 'd': 13, 'e': 16, 'f': 45}
tree = build_huffman_tree(frequencies)
codes = build_codes(tree)
# Result: {'f': '0', 'c': '100', 'd': '101', 'a': '1100',
#          'b': '1101', 'e': '111'}
```
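To see the prefix-free property in action, here is a minimal encode/decode sketch built on the `tree` and `codes` from the example above (the helper names `huffman_encode` and `huffman_decode` are illustrative, not from any library; the decoder assumes at least two distinct characters):

```python
def huffman_encode(text, codes):
    # Concatenate each character's code; prefix-freeness makes this reversible
    return "".join(codes[ch] for ch in text)

def huffman_decode(bits, root):
    # Walk the tree bit by bit; reaching a leaf emits a character
    result, node = [], root
    for bit in bits:
        node = node.left if bit == "0" else node.right
        if node.char is not None:
            result.append(node.char)
            node = root
    return "".join(result)

encoded = huffman_encode("abc", codes)   # '1100' + '1101' + '100'
assert huffman_decode(encoded, tree) == "abc"
```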
Complexity Analysis
- Time Complexity: O(n log n) - building heap and merging nodes
- Space Complexity: O(n) - for the tree and codes
Applications
- File compression (ZIP, GZIP)
- Image compression (JPEG uses Huffman coding)
- Network protocols
- Data transmission optimization
Minimum Spanning Tree (MST)
A minimum spanning tree connects all vertices in a weighted, undirected graph with minimum total edge weight. Two famous greedy algorithms solve this: Kruskal's and Prim's.
Kruskal's Algorithm
Kruskal's algorithm builds the MST by repeatedly adding the smallest edge that doesn't form a cycle. It uses the Union-Find data structure to efficiently detect cycles.
Algorithm
```python
def kruskal_mst(edges, num_vertices):
    # Sort edges by weight
    edges.sort(key=lambda x: x[2])  # Each edge is (u, v, weight)
    mst = []
    parent = list(range(num_vertices))

    def find(x):
        if parent[x] != x:
            parent[x] = find(parent[x])  # Path compression
        return parent[x]

    def union(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_x] = root_y
            return True
        return False

    for u, v, weight in edges:
        if union(u, v):  # Edge joins two components, so no cycle
            mst.append((u, v, weight))
            if len(mst) == num_vertices - 1:
                break
    return mst
```
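A quick usage sketch on a small made-up graph (vertices numbered 0-3):

```python
# Edges as (u, v, weight) on 4 vertices
edges = [(0, 1, 10), (0, 2, 6), (0, 3, 5), (1, 3, 15), (2, 3, 4)]
mst = kruskal_mst(edges, 4)
# mst == [(2, 3, 4), (0, 3, 5), (0, 1, 10)], total weight 19
```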
Complexity
- Time Complexity: O(E log E) - sorting edges dominates (equivalently O(E log V), since E ≤ V²)
- Space Complexity: O(V) - for Union-Find structure
Prim's Algorithm
Prim's algorithm grows the MST from a starting vertex, always adding the minimum-weight edge that connects a vertex in the MST to a vertex outside it.
Algorithm
```python
import heapq

def prim_mst(graph, start):
    # graph[u] is a list of (neighbor, weight) pairs (adjacency list)
    n = len(graph)
    mst = []
    visited = [False] * n
    pq = [(0, start, -1)]  # (weight, vertex, parent)
    while pq and len(mst) < n - 1:
        weight, u, parent = heapq.heappop(pq)
        if visited[u]:
            continue
        visited[u] = True
        if parent != -1:
            mst.append((parent, u, weight))
        for v, w in graph[u]:
            if not visited[v]:
                heapq.heappush(pq, (w, v, u))
    return mst
```
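Since `prim_mst` expects an adjacency list, here is a usage sketch on the same made-up graph as the Kruskal example, converted to that format:

```python
# graph[u] = list of (neighbor, weight); undirected, so each edge appears twice
graph = [
    [(1, 10), (2, 6), (3, 5)],   # vertex 0
    [(0, 10), (3, 15)],          # vertex 1
    [(0, 6), (3, 4)],            # vertex 2
    [(0, 5), (1, 15), (2, 4)],   # vertex 3
]
mst = prim_mst(graph, 0)
# mst == [(0, 3, 5), (3, 2, 4), (0, 1, 10)], total weight 19
```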
Complexity
- Time Complexity: O(E log V) with binary heap, O(E + V log V) with Fibonacci heap
- Space Complexity: O(V) - for priority queue and visited array
Kruskal's vs Prim's
| Feature | Kruskal's | Prim's |
|---|---|---|
| Data Structure | Union-Find | Priority Queue |
| Best For | Sparse graphs (E ≈ V) | Dense graphs (E ≈ V²) |
| Time Complexity | O(E log E) | O(E log V) |
| Approach | Edge-based (adds edges) | Vertex-based (grows from vertex) |
For more details on Union-Find, see Graph Algorithms.
Interval Scheduling Problems
Interval scheduling problems are a class of problems where you need to schedule intervals (time periods, tasks, etc.) optimally. Many variations use greedy strategies.
Maximum Non-Overlapping Intervals
This is the same as the activity selection problem - find the maximum number of non-overlapping intervals.
Interval Partitioning (Minimum Rooms)
Given intervals, find the minimum number of resources (rooms, machines) needed so no two overlapping intervals use the same resource.
Algorithm
```python
def min_rooms(intervals):
    # Create events: (time, type) where type 1 = start, -1 = end
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end, -1))
    # Sort by time, then by type (ends before starts at the same time,
    # so back-to-back intervals don't require an extra room)
    events.sort(key=lambda x: (x[0], x[1]))
    max_rooms = 0
    current_rooms = 0
    for time, event_type in events:
        current_rooms += event_type
        max_rooms = max(max_rooms, current_rooms)
    return max_rooms

# Example
intervals = [(1, 4), (2, 5), (3, 6), (5, 7)]
result = min_rooms(intervals)
# Result: 3 rooms needed (at time 3, three intervals overlap)
```
Complexity
- Time Complexity: O(n log n) - sorting events
- Space Complexity: O(n) - for events array
Coin Change (Greedy Version)
The coin change problem asks: given coins of different denominations and a target amount, find the minimum number of coins needed. A greedy approach works for certain coin systems (like US currency) but not all.
When Greedy Works
Greedy is optimal only for so-called canonical coin systems; US currency (1, 5, 10, 25, 50, 100) is one. Note that there is no simple rule such as "each coin is at least twice the next smaller one": the system [1, 5, 11] satisfies that rule yet greedy fails (for amount 15, greedy gives 11 + 1 + 1 + 1 + 1 = 5 coins, but 5 + 5 + 5 = 3 is optimal), while [1, 2, 3] violates it yet greedy is always optimal. In practice, canonicity can be verified by comparing greedy against dynamic programming over a sufficient range of amounts.
Algorithm
```python
def coin_change_greedy(coins, amount):
    coins.sort(reverse=True)  # Sort denominations descending
    count = 0
    i = 0
    while amount > 0 and i < len(coins):
        if coins[i] <= amount:
            num_coins = amount // coins[i]  # Take as many of this coin as fit
            count += num_coins
            amount -= num_coins * coins[i]
        i += 1
    return count if amount == 0 else -1

# Example (US coins)
coins = [1, 5, 10, 25]
amount = 67
result = coin_change_greedy(coins, amount)
# Result: 6 coins (2 quarters, 1 dime, 1 nickel, 2 pennies)
```
When Greedy Fails
Greedy fails for coin systems like [1, 3, 4] and amount 6:
- Greedy: 4 + 1 + 1 = 3 coins
- Optimal: 3 + 3 = 2 coins
For general coin systems, use dynamic programming (see Dynamic Programming).
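For illustration, a minimal bottom-up DP sketch (one standard formulation, not the only one), run against the greedy function above, shows the gap on the [1, 3, 4] system:

```python
def coin_change_dp(coins, amount):
    # dp[a] = minimum coins needed to make amount a (inf if unreachable)
    INF = float("inf")
    dp = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and dp[a - c] + 1 < dp[a]:
                dp[a] = dp[a - c] + 1
    return dp[amount] if dp[amount] != INF else -1

print(coin_change_greedy([1, 3, 4], 6))  # 3 (4 + 1 + 1)
print(coin_change_dp([1, 3, 4], 6))      # 2 (3 + 3)
```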
Common Greedy Patterns
1. Sorting + Greedy Choice
Many greedy algorithms start by sorting the input, then make greedy choices in order:
- Activity Selection: Sort by finish time
- Fractional Knapsack: Sort by value/weight ratio
- Interval Scheduling: Sort by end time
2. Priority Queue / Heap
Use a priority queue when you need to repeatedly select the best option:
- Huffman Coding: Merge nodes with lowest frequencies
- Prim's MST: Always add minimum-weight edge
- Dijkstra's Algorithm: Process closest unvisited vertex (sketched below)
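Dijkstra's algorithm is covered in Graph Algorithms, but a minimal heap-based sketch illustrates the pattern: the priority queue repeatedly surfaces the greedily best (closest) vertex. The adjacency-list format assumed here matches the Prim's example above:

```python
import heapq

def dijkstra(graph, start):
    # graph[u] = list of (neighbor, weight); returns shortest distances from start
    dist = {start: 0}
    pq = [(0, start)]  # (distance, vertex); the heap surfaces the closest vertex
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # Stale entry; a shorter path to u was already found
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist
```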
3. Two-Pointer Technique
Some greedy problems can be solved with two pointers:
- Meeting room scheduling
- Merging intervals
- Container with most water (sketched below)
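As an illustration of this pattern, here is a minimal sketch of container-with-most-water (LeetCode 11). The greedy insight is that the shorter line limits the area of every container it could form, so moving its pointer inward is always safe:

```python
def max_area(heights):
    left, right = 0, len(heights) - 1
    best = 0
    while left < right:
        # Area is limited by the shorter of the two lines
        best = max(best, (right - left) * min(heights[left], heights[right]))
        # Greedy step: the shorter line can never do better, so move past it
        if heights[left] < heights[right]:
            left += 1
        else:
            right -= 1
    return best

print(max_area([1, 8, 6, 2, 5, 4, 8, 3, 7]))  # 49
```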
Proving Greedy Algorithms Correct
Exchange Argument
The most common proof technique is the exchange argument:
- Assume you have an optimal solution that doesn't use the greedy choice
- Show you can exchange/swap elements to include the greedy choice
- Prove the new solution is at least as good (optimal)
- Conclude the greedy choice is safe
Example: Activity Selection Proof
Claim: Always choosing the activity that ends earliest is optimal.
Proof: Let S be an optimal solution that doesn't include activity a (the earliest-finishing one). Let a' be the first activity in S, ordered by finish time. Since a finishes no later than a', replacing a' with a keeps every other activity in S compatible. The new solution has the same number of activities, so it is still optimal and includes a. Therefore, the greedy choice is safe.
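The same argument in symbols (a sketch): let $a_1$ be the earliest-finishing activity, and let $S = (a'_1, a'_2, \dots, a'_k)$ be an optimal solution ordered by finish time with $a_1 \notin S$. Then

$$f(a_1) \le f(a'_1) \le s(a'_2),$$

so $S' = (a_1, a'_2, \dots, a'_k)$ is also feasible with $|S'| = |S| = k$: an optimal solution containing the greedy choice always exists.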
Mathematical Induction
Another common approach is to prove by induction that after making k greedy choices, there exists an optimal solution that includes those k choices.
Common Pitfalls and Mistakes
1. Assuming Greedy Always Works
Not all problems can be solved greedily. Always verify the greedy choice property and optimal substructure before using a greedy approach.
2. Wrong Greedy Criterion
Choosing the wrong criterion for the greedy choice leads to incorrect solutions. For example:
- Activity Selection: Must sort by finish time, not start time
- Fractional Knapsack: Must use value/weight ratio, not just value
3. Not Handling Edge Cases
Greedy algorithms often have edge cases:
- Empty input
- Single element
- All elements identical
- Boundary conditions
4. Confusing with Dynamic Programming
Remember: Greedy makes one choice and never looks back. DP considers all possibilities and stores results. If you need to reconsider previous choices, use DP, not greedy.
Practice Problems
Easy
- LeetCode 455: Assign Cookies - Match children with cookies greedily
- LeetCode 860: Lemonade Change - Greedy change making
- LeetCode 1005: Maximize Sum Of Array After K Negations - Greedy negation strategy
Medium
- LeetCode 435: Non-overlapping Intervals - Activity selection variant
- LeetCode 452: Minimum Number of Arrows to Burst Balloons - Interval scheduling
- LeetCode 621: Task Scheduler - Greedy task scheduling
- LeetCode 763: Partition Labels - Greedy partitioning
- LeetCode 122: Best Time to Buy and Sell Stock II - Greedy profit maximization
Hard
- LeetCode 135: Candy - Greedy distribution with constraints
- LeetCode 330: Patching Array - Greedy patching strategy
- LeetCode 502: IPO - Greedy project selection
Real-World Applications
Network Design
- Minimum spanning trees for network topology
- Shortest path routing (Dijkstra's algorithm)
- Network flow optimization
Data Compression
- Huffman coding for file compression
- Lempel-Ziv-Welch (LZW) compression
- Image and video compression
Scheduling
- CPU task scheduling
- Meeting room allocation
- Job scheduling in operating systems
- Resource allocation
Optimization
- Portfolio optimization (fractional knapsack)
- Resource allocation problems
- Approximation algorithms for NP-hard problems
Summary
Greedy algorithms are powerful tools for optimization problems that exhibit the greedy choice property and optimal substructure. Key takeaways:
- Greedy algorithms make locally optimal choices at each step
- They're typically faster than dynamic programming (O(n log n) vs O(n²))
- They require proof that the greedy choice leads to optimal solutions
- Common patterns include sorting + greedy choice, priority queues, and two pointers
- Classic problems: Activity Selection, Fractional Knapsack, Huffman Coding, MST
- Always verify the greedy choice property before using a greedy approach
When in doubt between greedy and DP, consider: if you need to reconsider previous choices, use DP. If making one choice and moving forward works, greedy might be the right approach.
Related Topics
Continue exploring related algorithm topics:
- Dynamic Programming - Compare greedy vs DP approaches
- Graph Algorithms - MST algorithms (Kruskal's, Prim's) and shortest paths
- Sorting Algorithms - Many greedy algorithms rely on sorting
- Trees - Huffman coding uses binary trees