Algorithm Complexity
Goals
Understand the concepts of algorithm complexity (time and space), the definition and usage of $O$ notation, and how to compare the complexity of common algorithms.
Prerequisites
- Basic understanding of algorithms
- Fundamentals of sequences and limits
Contents
§1 Overview
Algorithm complexity is a quantitative measure of the computational resources (time and memory) required for an algorithm to solve a problem. It is an indispensable concept for designing efficient algorithms, understanding the inherent difficulty of problems, and predicting the performance of practical programs.
Complexity theory classifies problems into those solvable in polynomial time (class P), those whose solutions are verifiable in polynomial time (class NP), and undecidable problems, revealing the limits and possibilities of algorithms.
§2 Time Complexity
Definition: Time Complexity
The time complexity of an algorithm is a function $T(n)$ representing the number of elementary operations executed as a function of the input size $n$.
Elementary operations include comparisons, assignments, and arithmetic operations—operations that execute in constant time.
Worst-Case Time Complexity
The maximum execution time over all inputs of size $n$: $$T_{\text{worst}}(n) = \max_{\text{size}(I) = n} T(I)$$ When we say "complexity" without qualification, we usually mean worst-case time complexity.
Average-Case Time Complexity
The expected execution time under an assumed input distribution: $$T_{\text{avg}}(n) = \mathbb{E}[T(I) \mid \text{size}(I) = n]$$
Best-Case Time Complexity
The minimum execution time over the most favorable input (usually less important).
§3 Big-O Notation (Asymptotic Notation)
Big-O notation provides a concise way to express algorithm complexity.
O (Big-O)
Definition: Big-O Notation
$f(n) = O(g(n))$ means there exist constants $C > 0$ and $n_0$ such that for all $n \geq n_0$, $$f(n) \leq C \cdot g(n).$$ That is, $f$ is asymptotically bounded above by $g$.
Ω (Big-Omega)
$f(n) = \Omega(g(n))$: $f$ is asymptotically bounded below by $g$ ($f(n) \geq C \cdot g(n)$ for large $n$).
Θ (Big-Theta)
$f(n) = \Theta(g(n))$: $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$ ($f$ and $g$ have the same order of growth).
Examples:
- $3n^2 + 5n + 10 = O(n^2)$
- $\log_2 n = O(\log n)$ (the base of the logarithm differs only by a constant factor)
- $2^n + n^{100} = O(2^n)$ (exponentials dominate polynomials)
§4 Common Complexity Classes
| Notation | Name | Examples |
|---|---|---|
| $O(1)$ | Constant time | Array element access, hash table lookup (average) |
| $O(\log n)$ | Logarithmic time | Binary search, balanced tree lookup |
| $O(n)$ | Linear time | Array traversal, linear search |
| $O(n \log n)$ | Linearithmic time | Merge sort, heapsort, FFT |
| $O(n^2)$ | Quadratic time | Bubble sort, selection sort, naive matrix multiplication |
| $O(n^3)$ | Cubic time | Gaussian elimination, Floyd–Warshall algorithm |
| $O(2^n)$ | Exponential time | Traveling salesman (brute force), subset enumeration |
| $O(n!)$ | Factorial time | Permutation enumeration |
Polynomial Time vs Exponential Time
Algorithms running in $O(n^k)$ (polynomial time) are considered "efficient," while those in $O(2^n)$ (exponential time) or worse are "inefficient." For $n = 100$, $n^3 = 10^6$ (practical), but $2^n \approx 10^{30}$ (more than the number of atoms in the universe).
§5 Examples
Example 1: Linear Search
def linear_search(arr, target):
for i in range(len(arr)): # n iterations
if arr[i] == target: # constant-time comparison
return i
return -1
Time complexity: $O(n)$ (worst case: scan the entire array)
Example 2: Binary Search
def binary_search(arr, target):
left, right = 0, len(arr) - 1
while left <= right: # range halved each time
mid = (left + right) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
return -1
Time complexity: $O(\log n)$ (search range halves each iteration; terminates after $\log_2 n$ steps)
Example 3: Bubble Sort
def bubble_sort(arr):
n = len(arr)
for i in range(n): # outer loop: n times
for j in range(n-i-1): # inner loop: up to n times
if arr[j] > arr[j+1]:
arr[j], arr[j+1] = arr[j+1], arr[j]
Time complexity: $O(n^2)$ (nested loops; $\sim n(n-1)/2$ comparisons)
Example 4: Merge Sort
A divide-and-conquer algorithm that recursively splits and sorts the array.
Time complexity: $O(n \log n)$ (recursion depth $\log n$, $O(n)$ work at each level)
§6 Space Complexity
Space complexity measures the amount of memory used by an algorithm as a function of the input size $n$.
- In-place algorithms: $O(1)$ additional memory (bubble sort, heapsort)
- Recursive algorithms: proportional to the stack depth (merge sort: $O(n)$, binary search: $O(\log n)$)
§7 Complexity Classes P and NP
Class P (Polynomial Time)
The class of problems solvable in polynomial time by a deterministic Turing machine. These are considered "efficiently solvable" problems.
Examples: sorting, shortest path (Dijkstra's algorithm), linear programming
Class NP (Nondeterministic Polynomial Time)
The class of problems whose solutions can be verified in polynomial time. Equivalently, solvable in polynomial time by a nondeterministic Turing machine.
Examples: satisfiability (SAT), graph coloring, knapsack problem
NP-Complete Problems
The hardest problems in NP. If any NP-complete problem can be solved in polynomial time, then every NP problem can be solved in polynomial time.
Examples: SAT, Hamiltonian cycle, traveling salesman problem (TSP)
P vs NP Problem
Open Problem: P = NP?
Whether P = NP is the most important open problem in computer science (one of the Millennium Prize Problems). Most researchers conjecture that P ≠ NP.
§8 Applications
Algorithm Design
Choosing and designing efficient algorithms. For large problem sizes, the difference between $O(n \log n)$ and $O(n^2)$ has a dramatic impact on execution time.
Data Structure Selection
Choosing the optimal data structure based on operation complexity (arrays, linked lists, hash tables, balanced trees, etc.).
Cryptography
The security of RSA encryption relies on the exponential complexity of integer factorization (no known polynomial-time algorithm).
Machine Learning
Evaluating the complexity of training algorithms. Depends on data size $n$, number of features $d$, and iteration count (e.g., linear regression $O(nd^2)$; k-means is iterative).
Optimization Problems
For NP-hard problems, approximation algorithms or heuristics (genetic algorithms, simulated annealing) are employed.
§9 Implementation Example (Complexity Measurement)
Python
import time
import matplotlib.pyplot as plt # pip install matplotlib
def measure_time(func, arr):
"""Measure execution time of a function"""
arr_copy = arr.copy()
start = time.time()
func(arr_copy)
return time.time() - start
# Algorithm definition
def bubble_sort(arr):
n = len(arr)
for i in range(n):
for j in range(n-i-1):
if arr[j] > arr[j+1]:
arr[j], arr[j+1] = arr[j+1], arr[j]
# Experimental verification of complexity
sizes = [100, 200, 400, 800, 1600]
times = []
for n in sizes:
arr = list(range(n, 0, -1)) # worst case (descending)
t = measure_time(bubble_sort, arr)
times.append(t)
print(f"n={n}: {t:.6f} sec")
# Plot
plt.plot(sizes, times, 'o-', label='Bubble Sort')
plt.xlabel('Array Size (n)')
plt.ylabel('Time (sec)')
plt.title('Time Complexity of Bubble Sort')
plt.legend()
plt.grid()
plt.show()
# Verify O(n^2): t ≈ C*n^2
# If t[i] / n[i]^2 is approximately constant, then O(n^2)
§10 References
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press.
- Knuth, D. E. (1997–2011). The Art of Computer Programming, Vol. 1–4A. Addison-Wesley.
- Sedgewick, R., & Wayne, K. (2011). Algorithms (4th ed.). Addison-Wesley.
- Sipser, M. (2012). Introduction to the Theory of Computation (3rd ed.). Cengage Learning.
- Arora, S., & Barak, B. (2009). Computational Complexity: A Modern Approach. Cambridge University Press.
- Weisstein, E. W. "Computational Complexity." From MathWorld—A Wolfram Web Resource.