Monday, March 30, 2015

A binary plot of the Catalan numbers and the pseudo-Fibonacci series that can be found inside.

I was trying to find a binary plot of the Catalan numbers on the Internet, and I could not find one... so I made it myself, and here it is!

Catalan Numbers binary plot (first 2000 elements)

There are no clear patterns inside the binary plot, as there are for instance in the binary plot of the Fibonacci numbers, but it is possible to see a very interesting pattern of decreasing pseudo-Fibonacci series in the upper bits of the plot. This characteristic can be found in the binary plots of the square numbers and the prime numbers as well... why?
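For reference, a text version of such a binary plot can be generated in a few lines (my own quick sketch, one bit row per Catalan number, using the standard recurrence C(n+1) = C(n)*2(2n+1)/(n+2)):

```python
# Catalan numbers via the recurrence C(n+1) = C(n) * 2(2n+1) / (n+2)
def catalan(limit):
    c = 1
    for n in range(limit):
        yield c
        c = c * 2 * (2 * n + 1) // (n + 2)

# a small text version of the binary plot: one row per Catalan number,
# most significant bit first
for value in catalan(16):
    print(format(value, 'b'))
```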

Sunday, March 29, 2015

Algorithm to find the bounds of a Goldbach prime pair (p,q), p+q=n, for an even number n by using "congruences paths".

This post is related to the following topics:

# Elementary number theory
# -- Goldbach's conjecture
# --- Modular arithmetic
# ---- Computational number theory
# ----- Prime number generator function

While I was studying how to calculate congruences and making some tests, I found that it seems possible, for every even number n, to always generate at least one prime pair (p,q), p+q=n, by using the following algorithm based on what I am calling "congruences paths". I think it is just a consequence of Goldbach's conjecture, so I cannot prove why it works (that would amount to proving the conjecture!), but I wanted to point out the possibilities of this algorithm and work a little on the bounds of the prime pair that it produces.

First, I will define when it is possible to obtain (p,q).


I call C(p) a "congruences path" of p for n. Basically, the value of n is reached by summing up the values of the congruences path, which add up to q, a prime number, plus the original starting point p, which is also a prime number.

Not all congruences paths start at a prime, and not all congruences paths starting at a prime sum to a prime number, but there is always at least one congruences path starting at a prime number p whose sum is also a prime number q, so both sum up to n. Or, if no such path exists, then n/2 is prime, and p=q=n/2.

One important point is that when the congruences path exists, its elements k(i) are never factors of n. So there is always a prime p able to generate a congruences path C(p) following the algorithm, passing through a set of k(i) that contains no factor of n; or, if no C(p) can be found, it turns out that n/2 is prime, so p=q=n/2.
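To make the definition concrete, here is a minimal sketch of how a path might be computed (the update rule is the same one used in the full Python code later in this post; the helper name `congruences_path` is mine):

```python
def congruences_path(p, n):
    """Try to build the congruences path C(p) for an even n, starting at p.
    Returns the list of path values, or None if a factor of n is hit
    (a zero congruence aborts the path)."""
    path = []
    k = p        # current position
    last = 0
    while True:
        k = last + k
        if k == n:                        # the path reached n: C(p) is complete
            return path
        a = (2 * (k - 1) + (n + 2)) % k   # congruence at the current position
        if a == 0:
            return None
        path.append(a)
        last = a

# For n = 56 and p = 19 the path is [18, 19], so q = 18 + 19 = 37
# and p + q = 19 + 37 = 56, a Goldbach pair.
print(congruences_path(19, 56))
```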

Trying to understand what these congruences paths look like in Excel, I found that it is possible to find lower and upper bounds for a very special single pair (p,q) which is easier to isolate. Here are some samples to get the idea:


There seems to be a lower bound for a very specific kind of (p,q) at LB = (((n-4)//6)*2)+1. In the image, the "X" marks the point p, and the different colors show all the different existing paths; in the samples we focus on the ones marked with an "X".

For instance, for n = 56, p=19, C(p) is {18,19} (the path in red), q = 18+19 = 37, so p+q = 19+37 = 56, meaning that (p,q) = (19,37) is a Goldbach prime pair for n=56. The lower bound that I detected (LB=17) is related to the behavior of the congruences for an even number.

There is always a 0 congruence at n/2 (because n/2 is a factor of n), and from n/2 the congruences grow backwards inside the interval [LB+2 .. n/2], increasing by 2 from 0 up to 2*((n/2)-(LB+2)). For instance, in the case of n=56, LB=17, in the interval [LB+2, n/2] = [19..28] the congruences are [18,16,14,12,10,8,6,4,2,0].
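These congruences are simply n mod k, so the pattern is easy to check directly (using n = 56 as in the example):

```python
n = 56
LB = ((n - 4) // 6) * 2 + 1      # the lower bound formula from the post
print(LB)                        # -> 17

# congruences of n in the upper part of the interval, down to n/2
congruences = [n % k for k in range(LB + 2, n // 2 + 1)]
print(congruences)               # -> [18, 16, 14, 12, 10, 8, 6, 4, 2, 0]
```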

The observation is that there is always a (p,q) prime pair inside [LB,n/2], or, if there is no such pair, then n/2 = p = q is prime.

For those primes p, for every even n, I have been able to find the lower bound (LB) as stated above, and an upper bound (UB) based on the family of multiplicative inverse functions f(x)=K/x.

This is how I found the upper bound. Below is a sample graph of the proportion f(n)=p/LB between the found p and its lower bound LB for each even number n:


As n gets higher, the relative distance from p to the lower bound gets smaller, and the shape of the decreasing function looks familiar. First I tried to check whether it followed a statistical distribution with some initial noise, so I tried to fit the dataset with some applications, but none of them was close enough. Then I found that the family of functions f(x)=K/x, where K is an integer constant, was able to follow the decreasing path. The trick is: when the distance to the current upper bound falls under a specific critical value, jump to a bigger K in f(x)=K/x and continue following the decreasing data. Doing so, it looks like this:


In the image above, the upper bound function has jumped twice so far; for instance, the first time is around n=153852, because the upper bound UB gets too close to the p that was found. At that moment, the upper bound jumps to a new, bigger function of the f(x)=K/x family, so the new UB calculation is again over the critical distance to the next p's that will be found, and then it keeps decreasing safely as the dataset also decreases, always remaining a close upper bound (without touching the dataset).

The Python code

Here is the Python code that applies the above explanation.

# Elementary number theory
# -- Goldbach's conjecture
# --- Modular arithmetic
# ---- Computational number theory
# ----- Prime number generator function
# By David@http://hobbymaths.blogspot.jp/

#import cProfile

def Goldbach_Congruences():
    from gmpy2 import is_prime

    # Print calculation time
    def printTime():
        import time, datetime
        ts = time.time()
        st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
        print(st+"\n")

    printTime()

    # Print results
    print("n"+"\t"+"q"+"\t"+"p"+"\t"+"Lower bound"+"\t"+"Upper bound"+"\t"+"C(p)"+"\t"+"Len(C(p))"+"\t"+"(K/n)")

    # Set the test limit here
    test_limit = 1000

    # upper bound function f(x) = K/x. K will vary dynamically following a rule (below)
    K = 4000
   
    # K_div_n = f(n) = K/n
    K_div_n = 1
   
    for n in range(4,test_limit,2):
   
        # when p is found, finished is set to True
        finished = False
       
        a=[]
        acu = 0       
       
        # lower and upper bounds are calculated
        lower_bound = (((n-4)//6)*2)+1
       
        # this is the default upper bound for n < 4000
        # there is a lot of noise in the lower integers n, so the first interval is [4..4000]
        # this bound is trial-and-error due to the noise
        upper_bound = lower_bound + ((((n//2)-lower_bound)*19)//20)

        # when n>4000 the upper bound is calculated from the f(x)=K/x family of multiplicative inverse functions
        if n > 4000:
            K_div_n = K/n
            upper_bound = int((K/n)*((n//2)-lower_bound))+lower_bound
       
        # now the algorithm will try to find a p in the interval [lower_bound, upper_bound]
        p = lower_bound

        while (p<=upper_bound):
            if is_prime(p):   
                a=[]
                current_k = p
                last_a = 0
                acu = 0

                # tries to find the path C(p) for the current prime p
                while (not finished):
                   
                    current_k = last_a + current_k
                   
                    if (current_k == n):
                        if is_prime(acu) and is_prime(n-acu):
                            finished = True
                            break
                                           
                    current_a = (((current_k-1)*2)+(n+2))%current_k
                    a.append(current_a)
                    acu = acu + current_a
                   
                    # if a factor is found, the calculation of the path stops
                    if current_a == 0:
                        a = []
                        break
                    last_a = current_a
                   

            if finished == True:
                break
            p=p+2
           

        if a==[]:
            # (p,q) pair was not found
            if (is_prime(n//2)):
                # n/2 = p = q should be prime
                print(str(n)+"\t"+str(n//2)+"\t"+str(n//2)+"\t"+str(lower_bound )+"\t"+str(upper_bound )+"\t"+"["+str(n//2)+"]"+"\t"+"1"+"\t"+str(K_div_n))
            else:
                # If this message appears then the algorithm is wrong!
                print(str(n)+"\t"+str(n//2)+"\t"+str(n//2)+"\t"+str(lower_bound )+"\t"+str(upper_bound )+"\t"+"ERROR ["+str(n//2)+"]"+"\t"+"1"+"\t"+str(K_div_n))
                break

        else:
            # (p,q) pair was found
            print(str(n)+"\t"+str(acu)+"\t"+str(n-acu)+"\t"+str(lower_bound )+"\t"+str(upper_bound )+"\t"+str(a)+"\t"+str(len(a))+"\t"+str(K_div_n))
           
            # if n > 4000 then we are using an upper bound based on f(x) = K/x for a specific integer K.
            if n > 4000:
                # rule of critical value: if a sixth of the distance from LB to p (d(LB,p)/6) is greater than the distance to the upper bound (d(p,UB)),
                # then jump from the current f(x)=K/x upper bound function to the next one, because the current one is going to collide with the dataset,
                # so it is not safe to use it anymore.
                if int((n-acu-lower_bound)/6) > (upper_bound-(n-acu)):
                    # the bigger n is, the smaller the relative increase of K that is required (an observation based on the calculations that seems to fit very well)
                    # so the new upper bound function is defined; from this n on, the new function will be used to calculate the UB.
                    K = K + int(4000)

    printTime()
       
#cProfile.run('Goldbach_Congruences()')
Goldbach_Congruences()

E.g. I have generated every C(p) for the even numbers in the interval [4,19999998], and for every n the algorithm was able to define a correct [LB,UB] in which a Goldbach prime pair (p,q) is found relatively close to the lower bound. For every even number n, the algorithm looks for the lowest prime p able to generate a congruences path C(p) for n, starting from LB. If the stop condition, UB, is reached, then C(p) does not exist for the current n, and (p,q)=(n/2,n/2).

Here are some results. 

(n,    q,    p,    Lower bound,    Upper bound,    C(p),    Len(C(p)),    (K/n))
(4,   2,    2,    1,    1,    [2],    1,    1)
(6,    3,    3,    1,    2,    [3],    1,    1)
...
(14,    11,    3,    3,    6,    [2, 4, 5],    3,    1)
(16,    11,    5,    5,    7,    [1, 4, 6],    3,    1)
(18,    13,    5,    5,    8,    [3, 2, 8],    3,    1)
(20,    13,    7,    5,    9,    [6, 7],    2,    1)
(22,    11,    11,    7,    10,   [11],    1,    1)
...
(19999998,13333249,666749,6666665,6669998,[6666500, 6666749],2,0.00100000010000001)

Another interesting point: when Len(C(p)) = 3 two consecutive times, at n and n+2, it is because p or q belongs to a twin prime pair. There are more rules like this for detecting the twin primes in the list, depending on the contents of C(p) for n and C(p) for n+2.

E.g.:

(n,    q,    p,    Lower bound,    Upper bound,    C(p),    Len(C(p)),    (K/n))
(24,    17,    7,    7,    11,    [3, 4, 10],    3,    1)
(26,    19,    7,    7,    12,    [5, 2, 12],    3,    1)

I have not found any kind of "congruences path" definition like this on the Internet, so if anybody knows whether this was studied before, any reference would be appreciated.

P.S.: I did a similar review of congruences paths for odd numbers, explained at Mathematics Stack Exchange, but it seems nobody had information about it.

Friday, March 13, 2015

A new kind of simpler 1D cellular automata? Self-replication based cellular automata.

I came across a way of creating very simple 1D cellular automata that might be new; so far I have not found any references to this method in the main sources of information about these kinds of structures.

The type I have been studying is a little different from the ones explored and extended by Stephen Wolfram (and previously by von Neumann, Ulam...), but it could be included in his classification. I call them "self-replication cellular automata". Strictly speaking it is neither a substitution system nor a standard cellular automaton, though in Wolfram's terms it is a one-dimensional, neighbour-dependent structure.

I think it sits in the middle, sharing properties of both types of pattern generation. Initially I did not find any reference to it in Wolfram's atlas of automata, or in other classifications of pattern generation types, so it is possible that my so-called "self-replication" automaton is not yet classified with samples per se.

The results, in terms of the complexity of the behavior of a self-replication automaton as it evolves, are very similar to the ones obtained with the standard replication rules classified by Wolfram (e.g. Rule 30, Rule 90, etc.), but without using those rules.

It is simpler, so the rich complexity that the more elaborate standard replication rules can produce is not available in its full extent in my self-replication version, but it comes very close to some of the behaviors of the standard structures, and closer to what I think could be found in real life.

The replication rule is as follows.

1. The seed: the original state of the 1D cellular automaton, i.e. the first row of the evolution graphic, is a sequence or word, for instance {10*} (= "1000000..."), or whatever sequence we want to study. For instance, "mod 2" is the sequence in which we put a high value '1' at the positions that are multiples of 2, counting position 0 as the first element; the seed for that case is {10101010...} = {(10)*}. We can study any sequence, e.g. prime numbers, perfect squares, arithmetic series, etc.

2. A "High state" will be defined as a continuous sequence of '1's and a "Low state" will be a continuous sequence of '0's. A stable state is defined as a "High state" or a "Low state", independently of the quantity of elements of the high or low state (an isolated 0 or an isolated 1 is also a stable state).

3. The next evolution of the automaton is performed as follows: from left to right, in the last generated row of the automaton (in the first step, the seed), each high state {1*} is replaced with the seed again (from left to right), and the same is done for the low states. That is all.
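One way to read that rule (my interpretation: each stable state of length L is replaced by the first L symbols of the seed) can be sketched as follows; with the {10*} seed it reproduces the Sierpinski gasket:

```python
def evolve(row, seed):
    """One step of the self-replication rule: each maximal run of equal
    symbols (a stable state) is replaced, left to right, by a prefix of
    the seed with the same length as the run."""
    out = []
    i = 0
    while i < len(row):
        j = i
        while j < len(row) and row[j] == row[i]:
            j += 1                       # find the end of the current run
        run_len = j - i
        # the seed, repeated and truncated to the run length
        out.append((seed * (run_len // len(seed) + 1))[:run_len])
        i = j
    return "".join(out)

width = 32
seed = "1" + "0" * (width - 1)           # the {10*} seed
rows = []
row = seed
for _ in range(8):
    rows.append(row)
    row = evolve(row, seed)
print("\n".join(rows))                   # prints a Sierpinski-like triangle
```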

And here are some results (click to enlarge). 

It is possible to reproduce, for instance, the famous Sierpinski gasket (Pascal's triangle mod 2) by using the seed {10*}, and also other sequences that visualize turbulence in boundaries, like the ones generated by the standard Wolfram rules. The main visual difference is that avoiding the substitution rules makes the orientation of the self-replication patterns a little different from the standard cellular automaton (which is usually symmetrical), and the results are not as wide as the ones obtained with the standard rules: they form a reduced subset of those results.

It is also possible to use more than 2 states (1,0): for instance, replacing the stable states of the seed with a natural number sequence {0123...} and using the same replication rule, it is possible to obtain a very complex cellular automaton with (in)finitely many states. Applying a different color to each value provides very interesting patterns (for instance, replacing the {0,1,2,3...} values of the sequence with a black-to-red-to-yellow-to-white gradient for growing values shows a nice fractal-like visual pattern).

The main advantage of the self-replication patterns is that they are simpler than the rule-generated automata and (I guess) closer to the ones that could be generated biologically. For instance, mollusc shells are always shown as a sample of this kind of pattern, but depending on a rule (e.g. the above-mentioned Rule 30) for the replication of the patterns seems more complex than a replication simply based on the two different states (high, low) of the original pattern seed. I think that self-replication is closer to a biological pattern, more organic, because biochemically it would probably be easier to launch a chemical reaction depending on an original chemical or electrical high or low state inside a cell or group of cells. The organism would not need "to store" a more complex replication rule genetically; it would just need to replicate the original pattern, so stable states of the pattern would lead to a replication of the original seed.

If somebody has already heard about this simpler kind of "self-replication" automaton patterns, please let me know; I would like to learn more about them if they have been defined before.

Hobbymath's wanderings #4

Fibonacci word.

Pseudoprimes appearing as primes in the GNU GMP library.

Prime Tests.

Pseudo-Fibonacci numbers.

Brun's constant.

Index of biographies of mathematicians.

Mathpuzzle.

Binary Plot.

Catalan numbers.

Elementary cellular automaton.

Fibonacci word fractal (pdf).

Wednesday, March 11, 2015

A prime number generator algorithm based on x^2+(x-1)^2 that generates only primes.

(I wrote also a question about this in Mathematics Stack Exchange, here)

Update: a senior fellow at the Mathematics Stack Exchange clarified the reason why it works. It is just a trivial brute-force algorithm: for each shift s in [1..s], the algorithm goes through f2(k+s) mod k until the first 0-congruent value is found, which turns out to be a factor of f2(s), and then jumps to the next s. So basically it finds, for every f2(s), the first prime number that is a factor of f2(s). I did not see the forest for the trees. I am learning so much from them!

I think I may have found a prime number generator algorithm, but I am still not very sure; maybe this is an already known property of perfect squares, maybe not, but it looks amazing, so I wanted to share the idea and the code here.

The Python Code

# Elementary number theory
# -- Modular arithmetic (perfect square numbers)
# --- Computational number theory
# ----- Prime number generator function

# created by David @ Hobbymaths
# http://hobbymaths.blogspot.jp/

#import cProfile

def prime_number_generator():
    from gmpy2 import xmpz, mul, add, sub, is_prime

    # the execution takes around 50 minutes, depending on the computer
    def printTime():
        import time, datetime
        ts = time.time()
        st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
        print(st+"\n")

    printTime()

    testlimit = xmpz(3000000)

    # insert all the f2 values in resfunc
    resfunc = []
    for n in range(1, testlimit):
        resfunc.append(add(sub(mul(50,mul(n,n)),mul(10,n)),1))

    totlist = []
    # shift S
    for s in range (0,50000):
        found = False
        # range K
        for k in range (1,testlimit-s-1):
            if (k!=1) and (resfunc[s+k]%k == 0 ):
                if (is_prime(k)):
                    totlist.append(str(k))
                else:
                    totlist.append("If this message appears the algorithm is wrong")
                found = True
                break
               
        #if not found:
            # if no prime number was found in the current range K[1..k] then append 0 if you want to know which shift s
            # does not find a prime number in the range K
            #totlist.append("0")
           
    mystr=""
    for l in totlist:
        mystr = mystr + l + "\t"

    # Will print the generated prime numbers
    print(mystr)

    printTime()
       
#cProfile.run('prime_number_generator()')
prime_number_generator()

       

1. How did I find it?

While I was studying the properties of perfect squares, I came across an observation about how some perfect squares are able to generate prime numbers when added to the previous or next perfect square. Specifically, only the squares of numbers n = 5k, k in N — in other words, the perfect squares ending in "00" or "25" — are able to generate prime numbers when added to the previous or next perfect square, and of course that happens only sometimes.

I focused on those that are able to generate two primes, one with the previous and one with the next perfect square.

For instance, these two samples:

n = 5k

n=5:  5^2 + 4^2 = 25 + 16 = 41 (prime, previous square)
n=5:  5^2 + 6^2 = 25 + 36 = 61 (prime, next square)

n=30: 30^2 + 29^2 = 900 + 841 = 1741 (prime, previous square)
n=30: 30^2 + 31^2 = 900 + 961 = 1861 (prime, next square)

...the distance between those two primes being 4n (20 and 120 in the samples).

Unfortunately, there is no clear rule for why only some of the perfect squares ending in "00" or "25" are able to produce prime numbers.

For that reason, I wanted to learn more about the functions generating the above results, which are f1=x^2+(x+1)^2 and f2=x^2+(x-1)^2 applied to x = 5k, and there was a great surprise in the results of that study.

2. The prime number generator algorithm (only generates primes):

First I wrote the same functions replacing x = 5*k:

f1(k)=50k^2+10k+1 and f2(k)=50k^2-10k+1.
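The substitution can be double-checked quickly (a trivial sketch, using the two worked samples above):

```python
def f1(k):
    return 50 * k * k + 10 * k + 1   # x^2 + (x+1)^2 with x = 5k

def f2(k):
    return 50 * k * k - 10 * k + 1   # x^2 + (x-1)^2 with x = 5k

x = 5    # k = 1
assert f2(1) == x**2 + (x - 1)**2 == 41
assert f1(1) == x**2 + (x + 1)**2 == 61
x = 30   # k = 6
assert f2(6) == x**2 + (x - 1)**2 == 1741
assert f1(6) == x**2 + (x + 1)**2 == 1861
print("substitution checks out")
```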

I just wanted to know the relationship between k, f1(k) and f2(k), so I started to play with modular arithmetic. It was easier to start with f2, calculating some congruence series in the following way:

1) (shift s=1) I calculated the congruence series [f2(k+1) mod k] for a range K [1..k], and I found that there are only two values of k whose congruence is 0: one of them is k=1, and the other was a prime number, k=41.

2) (shift s=2) Then I did the same congruence calculation but shifting one number to the right, so now I calculated the series [f2(k+2) mod k] for all k, and again the two values of k congruent to 0 were k=1 and, in this case, a new prime, k=181.

3) (shift s) I extended the calculation of the congruence series, shifting from 3 up to s=50000, and in the congruence series associated with the current s, the non-trivial value of k congruent to 0 (I consider k=1 the trivial solution) is always a prime number, whenever I have been able to find that value within the limits of the range K [1..k] used for the tests(*).

(*) Sometimes I cannot find the prime number because, depending on the shift s, the first 0-congruent prime number appears later: it might be very big and not appear in my currently defined range K[1..k] (in my Python test code), or it could also be that for certain values of s there is no 0-congruent number in the congruence series at all.

My tested ranges were K[1..50000] and S[1..50000], and the algorithm was able to find 40357 prime numbers inside the K range; for the remaining values of S, the prime number does not yet appear inside the K range (in this test case S=50000, so 50000 - 40357 = 9643 primes were not found for their shift values s).

The greatest prime found is 489061 and the smallest is 13, and they appear unordered along the shifts of S.

So, according to the results, it seems possible to generate a list of only prime numbers by calculating f2(k+s) over a specific range K [1..k] and finding the first non-trivial 0-congruent k, which turns out to be a prime number, applying this over a shift range S [1..s] (**). Each congruence series [f2(k+s) mod k] contributes one prime number to the final list (whenever the 0-congruent prime number is found inside K).

(**) I did not make the setup for f1 because it is easier to work with f2.

The generated prime numbers list has repeated elements and the primes appear unordered, but it seems to provide only primes inside the range K[1..k].

The first elements of the list (for s=1, s=2, s=3...) are:

41, 181, 421, 761, 1201, 1741, 2381, 3121, 17, 13, 13, 73, 53, 9661, 17, 12641, 14281, etc.
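The observation can be reproduced with a short, self-contained sketch (plain Python instead of gmpy2; the helper names are mine):

```python
def f2(k):
    return 50 * k * k - 10 * k + 1   # x^2 + (x-1)^2 with x = 5k

def is_prime(m):
    """Simple trial-division primality test."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def first_zero_congruence(s, k_limit=100000):
    """Smallest non-trivial k with f2(k+s) mod k == 0, or None if it
    does not show up below k_limit."""
    for k in range(2, k_limit):
        if f2(k + s) % k == 0:
            return k
    return None

# the first shifts reproduce the beginning of the list above
for s in range(1, 6):
    k = first_zero_congruence(s)
    print(s, k, is_prime(k))   # 41, 181, 421, 761, 1201 -- all prime
```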

I did not find any reference in OEIS.

I am trying to understand why it works, but it is very promising. If somebody has some ideas about why the algorithm works (or a counterexample I did not see), please let me know.

I wonder if this is new, or if it was somehow already known, or if I am wrong...
David