How can we prove that there are infinitely many primes?

The most famous proof is Euclid's proof by contradiction, but this article presents an analytic approach using the uniqueness of prime factorization and logarithms. We estimate the number of natural numbers up to x by the number of exponent tuples in their prime factorizations, and derive a lower bound π(x) ≥ log x / log(1 + log x / log 2). Since the right-hand side diverges as x → ∞, we conclude π(x) → ∞, i.e., there are infinitely many primes.

How does this proof differ from Euclid's proof?

Euclid's proof is a proof by contradiction (assuming only finitely many primes exist and deriving a contradiction), whereas this proof directly shows π(x) → ∞ in a constructive manner. It is based on the ideas of analytic number theory that lead to the Prime Number Theorem π(x) ∼ x / log x.

Infinitude of Primes

Analytic Proof: A Lower Bound for $\pi(x)$ via Prime Factorization and Logarithms

日本語版

Problem

Show that there are infinitely many prime numbers.

The most famous proof is Euclid's proof by contradiction, but here we give an analytic proof that connects to the Prime Number Theorem.

Proof

Every natural number $n$ can be uniquely expressed as

$$n = \prod_{i} p_i^{k_i} \tag{1}$$

where $p_i$ is the $i$-th prime and $k_i$ is a non-negative integer (with $k_i = 0$ for all but finitely many $i$). This is the Fundamental Theorem of Arithmetic.

By this uniqueness, the natural numbers $n$ with $1 \le n \le x$ are in one-to-one correspondence with tuples of exponents $(k_i)$ satisfying $\prod_i p_i^{k_i} \le x$. Since there are $\lfloor x \rfloor$ natural numbers from $1$ to $\lfloor x \rfloor$,

$$\#\left\{(k_i) \;\middle|\; \prod_i p_i^{k_i} \le x\right\} = \lfloor x \rfloor \tag{2}$$

Next, we bound from above the number of tuples $(k_i)$ satisfying $\prod_i p_i^{k_i} \le x$. If $\prod_i p_i^{k_i} \le x$, then for each factor,

$$p_i^{k_i} \le x \tag{3}$$

Taking logarithms of both sides,

$$k_i \log p_i \le \log x \tag{4}$$

so $k_i$ satisfies

$$0 \le k_i \le \dfrac{\log x}{\log p_i} \tag{5}$$

Since $k_i$ is a non-negative integer, the number of possible values for $k_i$ is at most $\left\lfloor \dfrac{\log x}{\log p_i} \right\rfloor + 1$ (namely $0, 1, 2, \ldots, \left\lfloor \dfrac{\log x}{\log p_i} \right\rfloor$).

Note that primes $p_i > x$ can only have $k_i = 0$ (since $p_i^1 > x$), so we need only consider $p_i \le x$, i.e., $i = 1, 2, \ldots, \pi(x)$, where $\pi(x)$ denotes the number of primes up to $x$.

Therefore, the number of valid tuples is bounded above by the product of the number of choices for each $k_i$:

$$\lfloor x \rfloor \le \prod_{i=1}^{\pi(x)} \left(\left\lfloor \dfrac{\log x}{\log p_i} \right\rfloor + 1\right) \tag{6}$$

Note: The left-hand side is $\lfloor x \rfloor$ (the equality in (2)), while the right-hand side counts tuples where each $k_i$ is varied independently. The set of tuples satisfying $\prod p_i^{k_i} \le x$ is a subset of all independently chosen tuples, so the cardinality satisfies $\le$.

Since $\pi(x)$ is an integer-valued function, we may restrict $x$ to positive integers without loss of generality. For positive integer $x$, $\lfloor x \rfloor = x$. Also, removing the floor function only makes the right-hand side larger, so

$$x \le \prod_{i=1}^{\pi(x)} \left(1 + \dfrac{\log x}{\log p_i}\right) \tag{7}$$

Since every prime satisfies $p_i \ge 2$, we have $\log p_i \ge \log 2$, and hence $\dfrac{\log x}{\log p_i} \le \dfrac{\log x}{\log 2}$. Therefore,

$$x \le \left(1 + \dfrac{\log x}{\log 2}\right)^{\pi(x)} \tag{8}$$

Taking logarithms of both sides of inequality (8),

$$\log x \le \pi(x) \log\!\left(1 + \dfrac{\log x}{\log 2}\right) \tag{9}$$

Solving for $\pi(x)$,

$$\pi(x) \ge \dfrac{\log x}{\log\!\left(1 + \dfrac{\log x}{\log 2}\right)} \tag{10}$$

If we can show that this lower bound diverges as $x \to \infty$, then $\pi(x) \to \infty$, proving that there are infinitely many primes.

Comparison of π(x) and the lower bound from the Prime Number Theorem — Figure 1 (horizontal: $x$, vertical: $\pi(x)$): $\pi(x)$ (black step function) compared with the lower bound from equation (10) (red dashed line). The lower bound is much smaller than $\pi(x)$, but both diverge.

Proof That the Right-Hand Side of (10) Diverges

Step 1: Simplifying the Denominator

For $x \ge 2$, we have $\dfrac{\log x}{\log 2} \ge 1$, so $1 + \dfrac{\log x}{\log 2} \le \dfrac{2\log x}{\log 2}$. Since $\log$ is monotonically increasing,

$$\log\!\left(1 + \dfrac{\log x}{\log 2}\right) \le \log\!\left(\dfrac{2\log x}{\log 2}\right) = \log 2 + \log\log x - \log\log 2 \tag{11}$$

Dropping the term $-\log\log 2$ (which is positive) only makes the right-hand side larger, preserving the direction of the inequality:

$$\log\!\left(1 + \dfrac{\log x}{\log 2}\right) \le \log 2 + \log\log x \tag{12}$$

Replacing the denominator in (10) with (12) gives a weaker (smaller) lower bound for $\pi(x)$, but the inequality is preserved:

$$\pi(x) \ge \dfrac{\log x}{\log 2 + \log\log x} \tag{13}$$

Comparison of π(x) with the lower bounds from equations (10) and (13) — Figure 2 (horizontal: $x$, vertical: $\pi(x)$): Comparison of the lower bounds from equation (10) (gray dotted line) and equation (13) (red dashed line). Even with the enlarged denominator, the bound remains valid.

Step 2: Substituting $t = \log x$

Setting $t = \log x$, equation (13) becomes

$$\pi(x) \ge \dfrac{t}{\log 2 + \log t}$$

For sufficiently large $x$ (specifically $x \ge e^e$, i.e., $t \ge e > 2$), we have $\log 2 < \log t$, so $\log 2 + \log t \le 2\log t$, giving

$$\pi(x) \ge \dfrac{t}{2\log t} \tag{14}$$

Divergence graph of t/(2 log t) — Figure 3 (horizontal: $x$, vertical: $\pi(x)$): The relationship (10) (gray) $\ge$ (13) (red) $\ge$ (14) (blue). All three lower bounds diverge.

Step 3: Showing $t/(2\log t) \to \infty$

Let $f(t) = \dfrac{t}{2\log t}$. Differentiating,

$$f'(t) = \dfrac{\log t - 1}{2(\log t)^2}$$

For $t > e$, we have $\log t > 1$, so $f'(t) > 0$, meaning $f(t)$ is monotonically increasing.

We now show $f(t) \to \infty$. Setting $t = e^k$ ($k = 1, 2, 3, \ldots$) gives $\log t = k$, so (14) becomes

$$\pi(x) \ge \dfrac{t}{2\log t} = \dfrac{e^k}{2k}$$

We show $\dfrac{e^k}{2k} \to \infty$ as $k \to \infty$. From the Taylor expansion of $e^k$,

$$e^k = 1 + k + \dfrac{k^2}{2!} + \dfrac{k^3}{3!} + \cdots$$

Since every term is $\ge 0$, retaining only the third term gives

$$e^k \ge \dfrac{k^2}{2} \tag{15}$$

Dividing both sides by $2k > 0$,

$$\pi(x) \ge \dfrac{e^k}{2k} \ge \dfrac{k^2/2}{2k} = \dfrac{k}{4} \tag{16}$$

As $k \to \infty$, $\dfrac{k}{4} \to \infty$, so $\pi(x) \to \infty$ along the subsequence $x = e^k$. Since $\pi(x)$ is non-decreasing, the divergence holds for all $x \to \infty$:

$$\pi(x) \to \infty \quad (x \to \infty) \tag{17}$$

This completes the proof: there are infinitely many primes. $\blacksquare$

Comparison of the four lower bounds from equations (10), (13), (14), and (16) — Figure 4 (horizontal: $x$, vertical: $\pi(x)$): Comparison of the four lower bounds. Each simplification produces a weaker bound, yet even the crudest bound (16) diverges as $x \to \infty$.

Remarks

In fact, the Prime Number Theorem states that

$$\pi(x) \sim \dfrac{x}{\log x} \quad (x \to \infty)$$

This was conjectured in 1792 by 15-year-old Gauss based on his observations of prime tables, and independently published by Legendre in 1798. The proof came about 100 years later in 1896, given independently by Hadamard and de la Vallée-Poussin.

Furthermore, Riemann in his 1859 paper derived an exact formula for $\pi(x)$ using the zeros of the zeta function $\zeta(s) = \displaystyle\sum_{n=1}^{\infty} n^{-s}$:

$$\pi(x) = \operatorname{Li}(x) - \sum_{\rho} \operatorname{Li}(x^{\rho}) - \log 2 + \int_{x}^{\infty} \dfrac{dt}{t(t^2-1)\log t}$$

Here $\operatorname{Li}(x) = \displaystyle\int_2^x \dfrac{dt}{\log t}$ is the logarithmic integral, and $\displaystyle\sum_{\rho}$ is the sum over all non-trivial zeros of $\zeta(s)$. This identity reveals that the distribution of primes is governed by the zeros of the zeta function, leading to the still-unsolved Riemann Hypothesis: "Do all non-trivial zeros of $\zeta(s)$ have real part equal to $1/2$?"

The proof in this article is a variant of the famous proof by Erdős. Erdős decomposed each integer up to $x$ into a "square-free part $\times$ perfect square" to obtain $\lfloor x \rfloor \le 2^{\pi(x)}\sqrt{x}$, yielding $\pi(x) \ge \dfrac{1}{2}\log_2 x$. Our approach independently bounds the range of each exponent $k_i$ and takes the product (equation (6)). While both start from the same idea (bounding the number of tuples via the uniqueness of prime factorization), the intermediate estimates differ.