Wednesday, August 09, 2006

Prime gap equation

My last few posts have been about what results from primes not having a preference with regard to their residue modulo another prime, which may sound complicated but it's not, as for instance, 3 has 3 residues: 0, 1, 2. And the residue of 5 modulo 3 is 2, as 5-3 = 2, whereas 7 mod 3 = 1, since 7-6=1, and I'm simply noting that if you check primes modulo 3, you will find no preference for a particular residue beyond not being 0.

An approach which can be proven rigorously to be absolutely true as I've shown in an earlier post where I related the probability that x is prime directly to the exact prime count.

But that result is not a surprise as the naturals have a rather simple ordering, which is they advance by 1, so you have 5,6,7, where you see all the possible residues, and repeat, so 8, 9, 10, shows the same ordering and, of course, that goes out to infinity. So primes modulo 3 can't prefer, say, 1 as the residue, as then the composites would have to tend towards 1 as well, when in fact, naturals split by thirds.

That simple idea leads to methods for figuring out how many primes should be in a particular interval using probability methods, as, for instance with 3, 1/2 of the primes will have 1 as a residue modulo 3, so you can say that there is a 50% chance any given prime has 1 as a residue modulo 3. Moving up to the other primes you can get a probability based on its number of non-zero residues, so like with 5, there is a 25% chance that a prime will have a residue modulo 5 of 1, and so forth.

In my previous posts I worked through some equations that you can quickly derive using this simple idea, and now I'd like to talk about yet one more, which is the equation for the odds of a particular prime gap g. A prime gap is just a situation where you have a run of composites between primes, like between 13 and 17, you have 3 composites, 14, 15, 16, but it's called a prime gap of 4, because 17-13=4.

Based on my simple idea the equation for prime gaps is as follows.

The prime gap equation given a natural number x and even gap g greater than 2 is

probPrimeGap = ((pj - 2)/(pj - 1)*...*(1/2))*(1 - ((pj - 2)/(pj-1)*...*(1/2))n*Corr

where n equals (g/2)- floor(g/4) - 1, j is the number of primes up to sqrt(x), pj is the jth prime, and Corr corrects for any odd primes that are factors of g, if there are any, and if there are none Corr=1, otherwise it adjusts to (p-1)/p for that prime versus (p-2)/(p-1).

When g=4 is a special case because if x is prime and x-4 is prime then x-2 cannot be, unless x=7. So for the g=4 case you have the same probability as for twin primes:

probTwinPrime = ((pj - 2)/(pj - 1)*...*(1/2))

So let's try it out with g=6.

For a 6 gap, n = 6/2 - floor(6/4) - 1 = 1, and instead of figuring out the value for Corr it's easier to just note that instead of using 1/2 you use 2/3. So you have:

probPrimeGap = ((5 - 2)/(5 - 1))*2/3*(1 - ((5 - 2)/(5 - 1))*2/3) = (3/4)(2/3)(1 - (3/4)(2/3)) = 0.5*0.5 = 0.25

And 6*0.25 = 1.5, so 1 case is expected and one occurs at 31, 37.

Now reiterating the idea.

First as shown in my post on twin primes if you have x a prime, then you have the probability that x-2 does NOT have a prime p that is less than or equal to sqrt(x) as a factor, is (p-2)/(p-1), but if x may not be prime, you need the probability that it is prime.

You just multiply all the odds of something not happening together to get the odds that none of them happen. So for the first piece of my prime gap equation I just multiplied the probability (p-2)/(p-1) for each prime less than or equal to sqrt(x), where for any primes that are factors of g you just have (p-1)/p as there is 0 probability that p can add to another prime and give a number divisible by itself.

For the second piece I simply consider the probability if x is prime that x-2 is NOT prime or x-4 is not prime, and then that x-6 is NOT prime, and so forth up until before the gap, like with g=10, that goes up to x-8, so n=4, as it turns out each probability is roughly the same, and that's how I get the equation, as I just subtract the probability that x-g is prime from 1.

So can it be wrong?

Well, what is the idea based upon?

It's based upon the primes not having a preference modulo the residue of another prime, so like, if they did, with 3, then it'd have to be the case that numbers leaned towards a particular residue modulo 3, but they don't as they go lockstep in a simple pattern 1, 2, 3, and then 4, 5, 6, and then 7, 8, 9 and so forth, where in each case the residues modulo 3 are the same in a perfect pattern that goes out to infinity. Each prime produces a similar pattern.

So that part of the idea cannot be wrong.

But it is probability.

The expectation value is just the most likely result, but the actual results follow the gaussian distribution also called the normal distribution, so for instance, there is a 50% probability of flipping a coin heads, so given 10 flips the expected number of heads is 5.

But you can sit and flip a coin 10 times and get 10 heads or none.

Oh, one other thing is the question of how big of a gap is possible at various levels, like, of course, you can't have a gap of 10 starting at 11, so what determines it? My guess is that the maximum gap size is p+1 where p is the largest prime less than the sqrt(x), where x is the start of the interval.

But that only matters at smaller values. As you look at bigger gaps the prime gap equation dominates, so that the gap is likely to occur much further down than when it is first allowed.

James Harris

Tuesday, August 01, 2006

Prime distribution--new approach?

The idea I presented in my previous post for looking at the twin primes probability might seem to easily apply with the prime distribution itself giving the odds that a natural number x is prime as

ProbPrime(x) = ((pj - 1)/pj)* ((pj-1 - 1)/pj-1)*...*1/2

where j is the number of primes less than sqrt(x), and pj is the jth prime.

But there's a problem with that approach as the naturals are NOT random. Let's see how well it does and then I'll explain the correct equation.

For instance, the primes up to sqrt(100) are 2, 3, 5, and 7 so I have

ProbPrime(100) = (6/7)*(4/5)*(2/3)*(1/2) = 0.2285 to four significant digits.

or 22.85%, which is nicely close to 1/(ln x), as 1/(ln 100) = 0.2171 to four significant digits.

So it is close but gets progressively off until it is about 12.2% larger than the correct count of primes, but why?

Consider that given a natural number x, roughly 1/3 of numbers up to and including x will have 3 as a factor, so x/3 is about that number, but exactly floor(x/3) will have 3 as a factor, where the floor() function just clips off the remainder, so for instance floor(10/3) = 3, exactly, and notice there are 3 numbers up to and including 10 that have 3 as a factor--3, 6 and 9.

So this idea can be looked at by finding exactly how many numbers have a particular prime as a factor, and subtracting that total from the total of numbers minus 1 as 1 is not prime to get the count of primes, and then dividing by the number.

For example, up to 10, you have

10 - floor(10/2) - floor(10/3) + floor(10/6) + 2 - 1 = 4 primes

which is explained as, subtract the evens from 10, and subtract the numbers divisible by 3, but then you are also subtracting some evens that have 3 as a factor so add back in those that have both as a factor, add 2 for the primes 2 and 3 being subtracted and subtract 1, since 1 is not prime. Divide 4 by 10 and you have 25% probability of primeness.

That way to count primes is called Legendre's Formula and it's not complicated, but what does that have to do with my probability idea?

Well, looking at 2/3 as the odds a number does NOT have 3 as a factor and 1/2 as the odds it does not have 2 as a factor, I have the equivalent of

10 - 10/2 - 10/3 + 10/6 = 10(1 - 1/2 - 1/3 + 1/6) = 10*(1/3)

which shows that this idea is really just like going through the same process as above, but without using floor() so there are remainders floating around, and without correcting for 2 and 3 being prime, as well as 1 not being prime.

But why does that work and not the other approach? Simple answer is, the naturals are NOT random! The list of counting numbers is very ordered as it is, of course, just 1, 2, 3, 4, 5 and so on out to infinity, where you advance by 1. Very ordered, not random, so an equation for random is not correct as rather than say about 2/3 of the numbers up to x have 3 as a factor you can find that exactly x - floor(x/3) do.

So the key difference is use of floor() above versus not using it, and that offers a way of writing out the Legendre's approach that is similar to what we've seen before!

x*(1 - floor(/p)) = x - floor(x/p)

where the numerator is left off to show another step is needed. And you have the rule that

floor(/p1)*floor(/p2) = floor(/p1p2)

which I will immediately show.

So that with 2 and 3 you have:

x*(1 - floor(/2)*(1 - floor(/3) = x - floor(x/2) - floor(x/3) + floor(x/6)

showing the rule that floor(/2)*floor(/3) = floor(/6)).

Notice then correct prime probability using exact numbers is

probPrime(x) = (x*(1 - floor(/pj))* (1 - floor(/pj-1)*...*(1 - floor(/2)) + pi(sqrt(x) - 1)/x

where again pj is the largest prime up to and including sqrt(x), and here I've also used the traditional prime counting function notation versus using my own with my prime counting function, adding back in the primes subtracted out as explained above, and subtracting for 1 as it's not prime as explained above, and then dividing the full count by x to get the ratio.

Notice that is very close to what we had before, except the floor functions clip off certain terms, to give exact values which can be determined because the list of naturals is NOT random, but well-ordered.

In contrast, with the twin primes probability as shown in my previous post, there is no similar way to getting the exact count, as the primes are random with respect to each other.

So there is a simple way to explain how the probability of primeness differs from the random by using the floor() function.

James Harris

Sunday, July 30, 2006

Twin Primes Probability

Here a trivial little result relating to twin primes as if x is prime and greater than 3 the probability that x+2 is prime is given by:

prob = ((pj-2)/(pj-1))*((pj-1-2)/(pj-1-1))*...*(1/2)

where j is the number of primes up to sqrt(x+2), and pj is the jth prime, pj-1 is the prime before it and so forth.

The result is easy as it is just multiplying the probability for each prime that it is NOT true that

x + 2 ≡ 0 mod p

which probability is just the result of dividing one minus the number of non-zero residues by the total number of residues together to get the total probability that a prime plus 2 is also prime.

I thought I'd check quickly to see if the series converges but stopped with the last prime before 100, for which I got:

prob = 0.1558 to 4 significant digits.

So that says that the probability that for a prime between 972 and 1002 that adding 2 to it gives a prime is about 15.58% and there are 66 primes in that interval so there should be about 10 twin primes, and a quick count shows that there are:

(9419, 9421), (9431, 9433), (9437, 9439), (9461, 9463), (9629, 9631), (9677, 9679), (9719, 9721), (9767, 9769), (9857, 9859), (9929, 9931)

Come to think of it, it probably can't converge to a non-zero value but approaches 0 asymptotically, like the prime distribution itself.

Interestingly to get the distribution of twin primes you just multiply that result times the probability that a given number is prime.

Oh yeah, remarkably, efficiently the result holds for any even difference between primes towards infinity--so for any even gap g.

So then if the Twin Primes Conjecture is true it's also true that any even composite can be written as the difference of two primes.


Hmmm...there's no reason for g to have to be positive, it could be negative, as the residue argument still holds.

Then the argument is that probability holds when

x + g ≡ 0 mod p

is NOT true for each prime p up to sqrt(x+g) if g is positive or sqrt(x-g) if it's negative.

So you could have p1 - C = -p2, where C is a composite and the p's are primes, as the residue result will also apply whether you take positive or negative residues, so you can consider the probability for the gap C, and if the Twin Primes Conjecture holds it would be non-zero, approaching 0 asymptotically.


James Harris