The sum of the primes below $10$ is $2 + 3 + 5 + 7 = 17$.
Find the sum of all the primes below two million.
This is a perfect problem to solve using the Sieve of Eratosthenes, and Wikipedia has a great animation showing how it works.
Usually the sieve stores a boolean for every integer up to the limit, but we can cut memory usage in half because 2 is the only even prime. Instead of storing all integers, we'll just store odd numbers. Then array index $i$ represents the odd number $2i + 1$, and an odd number $n$ maps to index $(n - 1) / 2$.
The sieving logic also changes slightly. When we find a prime $p$ at index $i$, we mark its odd multiples as composite. Since $p^2$ is odd (odd times odd), we start there. The next odd multiple is $p^2 + 2p$, then $p^2 + 4p$, etc. In terms of array indices, consecutive odd multiples are spaced $p$ apart.
function _sieve_of_eratosthenes(limit)
limit < 3 && return (Bool[], limit)
# index i represents the odd number 2i + 1
max_index = (limit - 1) ÷ 2
is_prime = fill(true, max_index)
# For each prime p, mark odd multiples starting at p² which has index (p² - 1) ÷ 2.
# The step between consecutive odd multiples is p.
i = 1
while (2i + 1)^2 <= limit
if is_prime[i]
p = 2i + 1
for j in ((p^2 - 1) ÷ 2):p:max_index
is_prime[j] = false
end
end
i += 1
end
return is_prime, limit
end
The internal _sieve_of_eratosthenes returns the boolean array directly, allowing us to build different functions on top of it. For this problem we only need the sum so we can avoid allocating a primes array by summing directly from the boolean array:
function sum_sieve_of_eratosthenes(limit)
limit < 2 && return 0
limit == 2 && return 2
is_prime, _ = _sieve_of_eratosthenes(limit)
total = 2
for i in eachindex(is_prime)
is_prime[i] && (total += 2i + 1)
end
return total
end
With this we can compute the sum of all primes below $2 \times 10^6$ in 972.300 μs using 976.63 KiB.
We can go a bit further and compute the sum of all primes below $10^8$ in — using —.
The sieve uses $\mathcal{O}(n/2)$ space for the boolean array. For time complexity, consider how much work we do crossing off multiples. For each prime $p$, we cross off roughly $n/2p$ odd multiples. The total work is
The sum of reciprocals of primes up to $n$ grows like $\log \log n$, giving us $\mathcal{O}(n \log \log n)$ time complexity.