Sunday, March 29, 2015

Algorithm to find the bounds of a Godbach's prime couple (p,q), p+q=n, for an even number n by using "congruences paths".

This post is related with the following topics:

# Elementary number theory
# -- Goldbach's conjecture
# --- Modular arithmetic
# ---- Computational number theory
# ----- Prime number generator function

While I was studying how to calculate congruences and making some tests, I found out that it seems possible to generate always for every even number n at least one prime couple (p,q), p+q=n by using the following algorithm based on what I am calling "congruences paths". I think it is just a consequence of the Goldbach's conjecture, so I cannot demonstrate the reason (because it would suppose demonstrating the conjecture!), but I wanted to point out the possibilities of this algorithm and work a little bit on the bounds of the prime pair that I can obtain by using it.

First, I will define when it is possible to obtain (p,q).


I call C(p) a "congruences path" of p for n. Basically it is possible to reach the value of n by summing up the congruences path values (q), which is also a prime number, plus the original starting point p, which is also a prime number.

Not all the congruences paths are starting in a prime, and not all congruences paths starting on a prime sum up a prime number, but there is always at least one congruences path starting in a prime number, whose congruences path sum is also a prime number q, so both sum up to n. Or, if it does not exist, then n/2 is prime, and p=q=n/2.

One important point is that when the congruences path exists, the congruences path element k(i) will never be a factor of n. So there is always a prime p able to generate a congruences path C(p) following the above mentioned algorithm, going through a set of k(i) that are not including factors of n, or if it is not possible to find C(p), then it turns out to be that n/2 is prime, so p=q=n/2.

Trying to understand how these congruences paths look like in Excel, I found out that is possible to find a lower and upper bounds for a very special single pair of (p,q) which is easier to isolate. Here are some samples to get the idea:


There seems to be a lower bound for a very specific kind of (p,q) at LB = (((n-4)//6)*2)+1. In the image, the "X" points out the point p, and the different colors are showing all the different existing paths, in the samples we are focused on the ones targeted with an "X". 

For instance, for n = 56, p=19, C(p) is {18,19}, the path is in red color, q = 18+19 = 37, so p+q = 19+37 = 56, meaning that (p,q) = (19,37) is a Goldbach' pair prime for n=56. The lower bound that I detected (LB=17) is related with the behavior of the congruences for an even number.

There is always a 0 congruence at n/2 (because n/2 is a factor of n), and then from (n/2) the congruences grow up backwards inside the interval [LB, LB+1 .. (n/2)-1, n/2] from 0 to (n/2 - (LB+2)) by 2. For instance: in the case of n=56, LB=17, in the interval [LB,n/2] = [19..28] the congruences are [18,16,14,12,10,8,6,4,2,0].

The observation is that there is always a (p,q) prime pair inside [LB,n/2] or if there is not such pair, then n/2 = p = q is prime.

For those p primes for every even n, I have been able to find the lower bound (LB) as stated above, and an upper bound (UB) based on the family of multiplicative inverse functions (K/x). 

This is how I found the upper bound. Below there is a sample graph about the distance between the found p and its lower bound (LB) for each even number n, so f(n)=p/LB showing for each n that proportion.


As n gets higher, the relative distance from p to the lower bound gets smaller. And the shape of the decreasing functions looks familiar. First I tried to check if it was following a statistical distribution with some initial noise, so I tried to find the distribution of the dataset with some applications, but no one of them was close enough. Then I found out that by using the family of functions f(x)=K/x where K is an integer constant I was able to follow the decreasing path. The trick is, when the distance to the current upper bound is under an specific critical value, then jump to a bigger K, f(x)=K/x function and then continue following the decreasing data. Doing so, it looks like this:


In the image above, the upper bound function so far jumped twice, for instance, the first time is around n=153852 because the upper bound UB gets too closer to the p that was found. In that moment, the upper bound jumps to a new bigger function of the f(x) = K/x family, so the new UB calculation is again over the critical distance to the following p's that will be found, and then continues safely decreasing as the data set also decreases, but always being able to be a close upper bound (not touching the dataset). 

 

 

 The Python code

Here is the Python code that applies the above explanation.

# Elementary number theory
# -- Goldbach's conjecture
# --- Modular arithmetic
# ---- Computational number theory
# ----- Prime number generator function
# By David@http://hobbymaths.blogspot.jp/

#import cProfile

def Goldbach_Congruences():
    from math import floor, log, sqrt
    from gmpy2 import xmpz, c_mod, mul, is_odd, is_prime, add, sub, divexact, set_cache, is_square, digits, div

    # Print calculation time
    def printTime():
        import time, datetime
        ts = time.time()
        st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
        print(st+"\n")

    printTime()

    # Print results
    print("n"+"\t"+"q"+"\t"+"p"+"\t"+"Lower bound"+"\t"+"Upper bound"+"\t"+"C(p)"+"\t"+"Len(C(p))"+"\t"+"(K/n)")

    # Set the test limit here
    test_limit = 1000

    # upper bound function f(x) = K/x. K will vary dinamically following a rule (below)
    K = 4000
   
    # K_div_n = f(n) = K/n
    K_div_n = 1
   
    for n in range(4,test_limit,2):
   
        # when p if found finished = True
        finished = False
       
        a=[]
        acu = 0       
       
        # lower and upper bounds are calculated
        lower_bound = (((n-4)//6)*2)+1
       
        # this is the by default upper bound for n < 4000
        # there is a lot of noise in the lower integers n so the first interval is [4..4000]
        # this one is a trial-error bound due to the noise
        upper_bound = lower_bound + ((((n//2)-lower_bound)*19)//20)

        # when n>4000 the upper bound is calculated from the f(x)=K/x family of multiplicative inverse functions
        if n > 4000:
            K_div_n = K/n
            upper_bound = int((K/n)*((n//2)-lower_bound))+lower_bound
       
        # now the algorithm will try to find a p in the interval [lower_bound, upper_bound]
        p = lower_bound

        while (p<=upper_bound):
            if is_prime(p):   
                a=[]
                current_k = p
                last_a = 0
                acu = 0

                # tries to find the path C(p) for the current prime p
                while (not finished):
                   
                    current_k = last_a + current_k
                   
                    if (current_k == n):
                        if is_prime(acu) and is_prime(n-acu):
                            finished = True
                            break
                                           
                    current_a = (((current_k-1)*2)+(n+2))%current_k
                    a.append(current_a)
                    acu = acu + current_a
                   
                    # if a factor is found, the calculation of the path stops
                    if current_a == 0:
                        a = []
                        break
                    last_a = current_a
                   

            if finished == True:
                break
            p=p+2
           

        if a==[]:
            # (p,q) pair was not found
            if (is_prime(n//2)):
                # n/2 = p = q should be prime
                print(str(n)+"\t"+str(n//2)+"\t"+str(n//2)+"\t"+str(lower_bound )+"\t"+str(upper_bound )+"\t"+"["+str(n//2)+"]"+"\t"+"1"+"\t"+str(K_div_n))
            else:
                # If this message appears then the algorithm is wrong!
                print(str(n)+"\t"+str(n//2)+"\t"+str(n//2)+"\t"+str(lower_bound )+"\t"+str(upper_bound )+"\t"+"ERROR ["+str(n//2)+"]"+"\t"+"1"+"\t"+str(K_div_n))
                break

        else:
            # (p,q) pair was found
            print(str(n)+"\t"+str(acu)+"\t"+str(n-acu)+"\t"+str(lower_bound )+"\t"+str(upper_bound )+"\t"+str(a)+"\t"+str(len(a))+"\t"+str(K_div_n))
           
            # if n > 4000 then we are using an upper bound based on f(x) = K/x for an specific integer K.        
            if n > 4000:
                # rule of critical value: if a sixth part of the distance of p to LB (d(LB,p)/6) is greater than the distance to the upper bound (d(p,UB)
                # then jump from the current f(x)=K/x upper bound function to the next one, because the current one is going to collide with the dataset,
                # so it is not safe to use it anymore.
                if int((n-acu-lower_bound)/6) > (upper_bound-(n-acu)):
                    # the bigger n is, the lower K increasing is required (this is just an observation based on the calculations that seems to fit very well)
                    # so the new upper bound function is defined. From this n point the new function will be used to calculate the upper bound UB.
                    K = K + int(4000)

    printTime()
       
#cProfile.run('Goldbach_Congruences()')
Goldbach_Congruences()

E.g. I have generated every C(p) for the interval of even number in [4,19999998], and for every n it was able to define a correct [UB,LB] where a (p,q) Goldbach's prime pair is found relatively close to the lower bound. The algorithm checks for every even number n which one is the lowest prime number p able to generate a congruences set C(p) for n, starting from LB. If the stop condition, UB, is reached, then C(p) does not exist for the current n, and (p,q)=(n/2,n/2).

Here are some results. 

(n,    q,    p,    Lower bound,    Upper bound,    C(p),    Len(C(p)),    (K/n))
(4,   2,    2,    1,    1,    [2],    1,    1)
(6,    3,    3,    1,    2,    [3],    1,    1)
...
(14,    11,    3,    3,    6,    [2, 4, 5],    3,    1)
(16,    11,    5,    5,    7,    [1, 4, 6],    3,    1)
(18,    13,    5,    5,    8,    [3, 2, 8],    3,    1)
(20,    13,    7,    5,    9,    [6, 7],    2,    1)
(22,    11,    11,    7,    10,   [11],    1,    1)
...
(19999998,13333249,666749,6666665,6669998,[6666500, 6666749],2,0.00100000010000001)

Another interesting point is that when Len(C(p)) = 3 two consecutive times at [n,n+2], that is because p or q is a twin prime. There are more of such rules to detect the twin primes in the list depending on the content of C(p) of n and C(p) of (n+2).

E.g.:

(n,    q,    p,    Lower bound,    Upper bound,    C(p),    Len(C(p)),    (K/n))
 24    17    7    7    11    [3, 4, 10]    3    1
26    19    7    7    12    [5, 2, 12]    3    1

I did not found any kind of "congruences path" definition like this in Internet, so if anybody knows if this was studied before, it would be appreciated any reference.

PD: I did a similar review of congruences paths for odd numbers, explained at Mathematics Stack Exchange, but it seems nobody had information about it.

No comments:

Post a Comment