Saturday, February 21, 2015

Fibonacci - Catalan deterministic prime number sieve

This text is related with the following topics:

# Elementary number theory
# - Combinatorial Number Theory
# -- Sieve methods (Fibonacci, Catalan)
# --- Computational number theory
# ----- Primality-testing (deterministic)


As I was learning about the properties of the Pascal's triangle, I came across a probably well known property of the prime numbers, but I did not find much information about it on Internet, online papers, or books. Basically, playing with the properties of the triangle, I was able to prepare a prime number sieve based on the Fibonacci and Catalan numbers, both included in the triangle that seems to be a deterministic sieve, this means that the set of pseudoprimes generated by the Fibonacci primality rule and the Catalan primality rule are disjoint sets, so the combination of both rules provides a deterministic prime sieve. I will write more about the reasons why I decided to prepare this type of sieve in a former post. 

There are two well known primality tests using Fibonacci and Catalan numbers. It is possible to find several online papers of several universities about the primality properties of Fibonacci (Lucas) and Catalan sequences and general information about them in the Wikipedia and other sources of information. Both primality tests are able to generate prime numbers but also there are sets of pseudoprimes (false candidates) that verifiy the primality rules in both tests.

E.g.: "Catalan numbers, primes and twin primes" paper of Christian Aebi and Grant Cairns,
of College Calvin and the Department of Mathematics of La Trobe University (Melbourne, Australia) respectively.
(In Google usually the pdf is available in the first link that appears in the results page)

Basically my observation (computational, non theoretical) is that the Fibonacci pseudoprimes and the Catalan pseudoprimes are disjoint sets, so it is possible to create a sieve by using both primality rules as a two step primality filter. 

For that purpose I have created a prime number sieve (in Python) able to obtain the prime numbers applying the Fibonacci and Catalan primality rules. I have run tests in the range N = (4, 10,000,000) and no pseudoprimes appeared, all the candidates that verify both rules are true prime numbers. So the sieve seems to be a deterministic sieve.

It is not competitive in terms of computing time as the usual probabilistic sieves, the point here is to verify that the pseudoprimes associated to the Catalan prime sieve and the pseudoprimes associated to the Fibonacci (Lucas) prime sieve are disjoint sets. About this point I did not find a clear source of information in Internet, and I find it very interesting.

The following is the explanation of the algorithm and the Python code, so anyone can run it. 
Python 3.4.2, gmpy2 and Cython are required to run the tests.


THE ALGORITHM


It is based on two properties of the prime numbers.

It sieves the primes inside a selected N range = (4, test_limit),
(2,3,5) are included by default.

1. Fibonacci numbers primality, calculation of the Fibonacci number associated to the current integer n.

For every n in N, the following association is done (is a shifting of Fibonacci to optimize the computing time):

n=4, fib = 2
n=5, fib = 3
n=6, fib = 5
n=7, fib = 8
n=8, fib = 13
etc.

(where 2,3,5,8,13 is the Fibonacci sequence)

PRIMALITY RULE (1):
n is a Fibonacci prime (candidate) if its associated Fibonacci number mod n is = 0 or = 1 (is congruent 0 or 1 mod n) 

(the current Python version uses gmpy2.c_mod function, so the congruences are negative, so some operation is performed to recover the corrent number,
but the applied rule is the above mentioned rule)

2. Catalan numbers primality, (related with Wilson Theorem)
http://en.wikipedia.org/wiki/Wilson%27s_theorem

2.1 For every n ODD inside N:

The Catalan number C(((n-5)/2)+1) is associated to n.

The calculation of the associated Catalan number recursively is as follows:

The Catalan number C(((n-5)/2)+1) is associated to n and C(((n-5)/2)+1) = (Catalan_number_associated_to_n-2_in_previous_recursive_step * (n-4) * 4)//(n-1)
// means integer division

The first odd number inside N is 5 and for this value its associated Catalan number is 1.

E.g, here are some samples of the recursive calculation:

(n,(((n-5)/2)+1))

n=5, will be associated to the Catalan number 1
n=7, will be associated to the Catalan number 2
n=9, will be associated to the Catalan number 5
n=11, will be associated to the Catalan number 14
n=13, will be associated to the Catalan number 42
etc.

(where 1,2,5,14,42 is the Catalan Numbers sequence)

PRIMALITY RULE (2):
n is a Catalan prime number (candidate) if its associated Catalan Number mod n when is multiplied by 4 is equal exactly to (n-1) or (n+1), so it is (n-1) or (n+1) congruent.
In other words: 4*Catalan_number_associated_to_n is (n-1) o (n+1) congruent (mod n).
In the code I am appliying the (Catalan mod n)*4 approach.


FINAL PRIMALITY FILTER

A n candite of (4,N) is a prime number if and only if PRIMALITY RULE (1) and PRIMALITY RULE (2) are passed. 

This is because the Fibonacci pseudprimes (false candidates passing rule 1) and the Catalan pseudoprimes (passin rule 2) are disjoint sets (there are not pseudoprimes n that at the same time are Fibonaaci and Catalan pseudoprimes) 

The Python code verifies first the Fibonnaci primality rule because the computing time is lower than the Catalan rule computing time, so in case that a candidate n does not verify the Fibonacci rule it is not neccessary to calculate the Catalan primality rule. 


The FibCat Algorithm:

Basically is as follows:

Loop n in (4, test_limit)

a) Calculate the associated Fibonacci number of the current candidate n

b) If n is even continue loop (next n)

c) Calculate the associated Catalan number of the current candidate n (n is a odd number)

d) Avoid basic prime composites: if n is a multiple of 3,5,7,11,13,17,19, or the next odd number after a cuple of twin primes located in (n-4,n-2), or a perfect square continue loop (next n)

e) If primality rule (1) is verified and primality rule (2) is verified, then the candidate n is prime. 

In the Python code there are more details and optimizations, some of them derived of using Python as the programming language, but with the above explanation, it is possible to rewrite the code in any language.


THE PYTHON CODE

# This code was created by David
# http://hobbymaths.blogspot.jp/

# Elementary number theory
# - Combinatorial Number Theory
# -- Sieve methods (Fibonacci, Catalan)
# --- Computational number theory
# ----- Primality-testing (deterministic)

#import cProfile

def fibcatSieve():
    from gmpy2 import xmpz, c_mod, mul, is_odd, is_prime, add, sub, divexact, set_cache, is_square, digits

    set_cache(1000,16384)

    def printTime():
        import time, datetime
        ts = time.time()
        st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
        print(st+"\n")

    printTime()

    test_init, test_limit = xmpz(4), xmpz(1000000)

    def Multiple(n,a):
        return (not c_mod(n,a)) and (n>a)

    falsecand = []

    catalan_number, prcounter, totalcounter, tpflag, quarterlimit, divisible_four = xmpz(1),xmpz(3),xmpz(3),0,test_limit>>2, True
    quarterlist, quartercheck = [0,mul(quarterlimit, 3),mul(quarterlimit, 2)], quarterlimit

    fv = [xmpz(0),xmpz(1),xmpz(1)]

    #prlist = [xmpz(2),xmpz(3),xmpz(5)]
    #print(2)
    #print(3)
    #print(5)

    for n in range(test_init, test_limit):

        fv[0] = fv[1]
        fv[1] = fv[2]
        fv[2] = fv[0]+fv[1]

        if (not n & 1):
            if (n == quartercheck):
                print("Current position: "+str(n))
                quartercheck = quarterlist.pop()
            continue

        catalan_number = divexact(mul(catalan_number, sub(n,4)), xmpz(n)>>2) if divisible_four else divexact(mul(catalan_number, sub(n,4)), xmpz(n)>>1)<<1
        divisible_four = not divisible_four

        if (Multiple(n,3) or (n>5 and str(n)[-1:]=="5") or Multiple(n,7) or Multiple(n,11) or Multiple(n,13) or Multiple(n,17) or Multiple(n,19) or (tpflag == 2) or (is_square(n))):
            tpflag = 0
            continue

        fmod = c_mod(fv[2],n)

        if ( (not fmod or (fmod == -~-n)) and (abs(add(c_mod(catalan_number, n)<<2,add(n<<1,n))) == 1) ):
            #print(str(n))
            #prlist.append(n)

            totalcounter = add(totalcounter, 1)

            if is_prime(n):
                prcounter = add(prcounter, 1)
                tpflag += 1
            else:
                falsecand.append(n)
                print("If this message appears then the algorithm is wrong!")

            continue

        tpflag = 0

    print("Test Checkout"+"\n"+"n"+"\t"+"Total :"+"\t"+"Primes: "+"\t"+"False candidates (pseudoprimes):"+

    "\n"+str(test_limit)+"\t"+str(totalcounter)+"\t"+str(prcounter)+"\t"+str(totalcounter - prcounter))

    strfalse = ""
    for myfkey in falsecand:
        strfalse = strfalse + str(myfkey)+"\t"
    print("Pseudoprimes list: "+"\n"+strfalse)

    printTime()

#cProfile.run('fibcatSieve()')
fibcatSieve()


I am comparing my results with gmpy2.is_prime(n) function, which uses a probabilistic primality test. To be sure that there are no errors, I am double checking the results versus a Excel list of prime numbers taken from the famous "The Prime Pages" website. Special thanks to my friend Ignacio for the good tips to enhance the computing time of the algorithm.

If somebody has some ideas to enhance the code, or some comments about this please write freely, probably the properties above mentioned are well known, or are direct result of the pseudoprime classification of the Catalan and Fibonacci pseudoprimes. I am very interested in learning more about this.

4 comments:

  1. It is deterministic **to 10M**. The last part is important. Using a table of Fibonacci pseudoprimes to 2*10^12, I have verified it holds below 1.3 * 10^10 as well. The main problem is that doing the Catalan test is extremely slow, especially for isolated large values (does any method allow solving 355455324377 in less than a minute?). Not only does this make it impractical to run, but we have a very limited range of known results compared to other tests.

    Since we only need to check the 3 known Catalan pseudoprimes for up to 13M, many other tests would work as well as the Fibonacci test. PSP-3, PSP-5, Lucas, Strong Lucas, AES Lucas, ES Lucas, Frobenius(1,-1), Bruckman Lucas, Pell, Perrin -- all don't have overlap. Interestingly, PSP-2 and SPSP-2 do.

    More importantly, you can combine a SPSP-2 and strong Lucas test and have deterministic results for all 64-bit numbers. It also runs many orders of magnitude faster. This is the BPSW test, and is used by most math software. An input of size 100,000 *digits* is not unreasonable for BPSW (it's a probable prime test at that size, but one with no known counterexamples). Imagine trying to do a Catalan pseudoprime test on an input that size.

    Back to the Catalan pseudoprimes, I am not aware of any published test results for the unrestricted search space (i.e. how far has someone checked all odd composites). Vardi apparently searched to 13M. I've searched to 90M (I'm sure someone has searched farther but not published, which isn't helpful). Not counting the three known examples, Aebi and Cairns 2008 note that there are no semiprime Catalan pseudoprimes below 10^10, so any below that range would have to have at least 3 factors; based on the Wieferich search, any Catalan pseudoprime of form p^2 would have to be > 10^34.

    ReplyDelete
    Replies
    1. Dana, hello! and thank you for your explanation, I have enjoyed very much your text. I agree with you about this algorithm being impractical to run, the purpose of the exercise was not finding a fast algorithm but testing as much as possible if it is deterministic or not, trying to find a counterexample, imho it is theoretically interesting because it would mean that Fibonacci and Catalan pseudo primes are disjoint sets. Best Regards! David

      Delete
    2. I'm also fascinated by them, as can be seen by my spending CPU years on enumerating various pseudoprimes. I know they're not practical, but I'm still looking at Perrin and Catalan pseudoprimes :). It looks like the Fibonacci and Pell tests have overlap with many other tests. The Bruckman Lucas pseudoprimes (A005845) are *correlated* with the Fermat pseudoprimes, compared to the Lucas-Selfridge (A217120 and A217255) pseudoprimes which are anti-correlated with the PSP-2s. Other interesting ones include the quadratic Frobenius test and Paul Underwood's fast QFT.

      Delete
    3. Again simply awesome, thank you very much for sharing all that information!!! I was also interested in the Fibonacci pseudo primes, I wrote something about it at MSE:

      http://math.stackexchange.com/questions/1161649/are-there-infinite-fibonacci-primes-if-and-only-if-there-are-infinite-fibonacci

      By the way I suppose you are DanaJ from MSE, am I right? :) you tried to add comments on other post of the blog but for some reason they were not saved! I am sorry about that.

      May I abuse from your knowledge and ask you about a new conjecture I posted recently at MSE? If it is not a topic that you like please disregard, I just find it quite interesting, still did not make a post of it in my blog. I hope to have some feedback from the fellows at MSE because my knowledge is very poor.

      http://math.stackexchange.com/questions/1287585/conjecture-the-set-of-maximum-order-elements-to-mod-n-contains-one-pair-of-gold

      Delete