Translate

Friday, October 17, 2014

Modular integer factorization algorithm

The general reduced form of the binary quadratic Diophantine equation x2 - Dy2 = F can be solved modularly to give:

y2 = (r2 - 2m'r)/D mod N'

2x = r + (Dy2)/r mod N'

Where N' is any integer for which F is a quadratic residue, where: m'2 = F mod N'

This result is found by the modular factorization:

x2 - F = Dy2  = (x - m')(x + m') mod N'

by considering r, a residue of N', where: x+m' = r, and x-m' = (Dy2)/r

That (r2 - 2m'r)/D must be a quadratic residue of N gives a constraint which can be used for a modular integer factorization algorithm.

Modular algorithm:

Let D = 4. Then x2 - 4y2 = F, will have solutions for any odd F, as long as F = 1 mod 4. So F is positive or negative as every odd integer is either 1 or -1 mod 4.

And of course we have a factorization with: (x-2y)(x+2y) = F.

Then collect a list of primes p for which F is a quadratic residue. That should be the case for roughly 50% of primes. The needed size of that list will depend on the size of x and y necessary to factor. The primorial gives roughly the size of the factor that can be handled by any particular number of primes in the list.

Given two primes p1 and p2 on that list, Let N' equal the product of the two primes, and calculate y and x, which may have multiple solutions. There will be two values for m' and you need to use both! While y will tend to have two distinct solutions for each r that works, they differ by each prime only by sign, so you only need to use one.

Discard values where r = 2m' or -2m' mod N', as those always exist and give no useful information, and cover the trivial factorizations, unless there are no other solutions, as then y = 0 mod N is forced. For other cases where y2 = 0 mod N' verify that x and y calculated actually work with x2 - 4y2 = F mod N'. Discard any cases where they do not.

Use your x and y to find the factors mod N'. The two factors are (x-2y) and (x+2y).

For example, f1 = x - 2y mod N', and f2 = x+2y mod N' is an option.

There must be solutions since the explicit equation has solutions. But this approach automatically leaves off the trivial factorization. If no other solutions besides the trivial factorization, represented by r = 2m' or -2m mod N', exist then F must be a prime number.

Save possible factors by prime, so for instance if you have f1 mod p1p2 you'd take f1 mod p1 and store it, and then f1 mod p2 and store it, and increment the count for each factor residue by prime.

Save linked prime for each if it is less than the larger prime, assuming you go from smaller primes to greater in your list or vice versa otherwise. That way you save only one linked prime per prime.

Repeat with each combination of your list of primes by 2, and keep a hit count for each factor modulo a particular prime. You can optionally store x and y as well, but shouldn't need them.

Values that are the same across both primes have the highest probability of coming from the explicit solutions as they must always exist.

After you have completed looping through all prime combinations by two, and stored values by prime incrementing the counts as they are verified, you should have resonance spikes around values that come from the explicit solutions.

That is, the values with the highest counts should be values from the explicit solutions.

Now solve for the explicit factors using the solutions at each prime, which is of course easy, as for instance you could use the Chinese Remainder Theorem. If you have multiple factors you can use linked primes to tell which factors go with each other.

I'll use a small example to show some numbers without demonstrating the full algorithm as described above.

Let F = -119. I need it negative as I need F mod 4 = 1. I note that 3 and 5 are primes that will work, so will take them as a pair with N' = 15.

As m'2 = F mod N' = -119 mod 15 = 1 mod 15, m' = 1 mod 15 is one solution, while m' = 11 mod 15 is the other. Using m' = 1 mod 15 first.

So: y2 = 4(r2 - 2r) mod 15

And: 2x = r + (4y2)/r mod 15

The quadratic residues of 15 are: 1, 4, 6, 9, 10

We find that r = 2, gives 0, and works, but no other values work except:

r = 6 and r = 11, both give y2 = 6 mod 15, which has y = 6 mod 15 as a solution, and is a non-trivial case.

Solving for x with the first: 2x = 6 + (4(6))/6 mod 15 = 6+4 mod 15 = 10 mod 15, so x = 5 mod 15.

And r = 11, gives x = 10 mod 15, which of course is -5 mod 15.

So x = 5 mod 15 and y = 6 mod 15 are solutions.

And x = 5 and y = 6 work explicitly, which is why it's a small example for basic demonstration purposes. In general of course the factor can be much larger than the primes being used.

x2 - 4y2 = -119 = (x+2y)(x-2y) = (5 + 2*6)(5 - 2*6) = (17)(-7)

The other case is m' = 11 mod 15.

So: y2 = 4(r2 - 22r) mod 15 = 4(r2 - 7r) mod 15

And: 2x = r + (4y2)/r mod 15

And r = 7 mod 15 works, to give x = 11 mod 15, and y = 0 mod 15. This case is removed though because 11(11) = 121, which is larger than 119.

If you were using the algorithm you would increment the count for 2 mod 3, and for 2 mod 5, for one factor, and increment 2 mod 3 for the second and 3 mod 5 for the second.

So for factor one, 0 mod 3, and 1 mod 3 would have zero hits, while 2 mod 3 would have one hit, and 2 mod 5 would have one hit, while all other residues and 0 mod 5 would have zero hits. And for factor two, 2 mod 3 would be the only one with a hit modulo 3, while 3 mod 5 would be the only one modulo 5.

The linked prime for 2 mod 3 would be 5 with the solution 2 mod 5.

You only need to link one way, so conversely could link 2 mod 5 to 3 with the solution 2 mod 3. And probably only need one link, so might choose the next closest prime, either larger or smaller.

Linking primes allows you to distinguish multiple factorizations. Information would be unnecessary if there are only two prime factors, as in the example.


James Harris

No comments: