Monday, April 11, 2016

Can we replace the upper limit condition of the Sieve of Eratosthenes sqrt(n) with the value sqrt(p) where p is the closest prime < n ?

By chance I stumbled upon the OEIS list A033677 of the smallest divisor of n greater or equal to sqrt(n). Roughly speaking if we use the classic enhanced sieve of Eratosthenes, sqrt(n) is the typical upper limit to find a divisor d>1 of n, so if n is composite then it is expected that exists d in [2,sqrt(n)].

(*This is a mirror of my same question at MSE.)

I did the same exercise but instead of sqrt(n) using as the limit the square root of the previous closest prime p to n and made the calculation of the smallest divisor of n greater or equal to sqrt(p).

Testing with Python and if I did not make an error, curiously I obtained exactly the same list than A033677 for every n in [3,6*10^6] (after that point it gets slower, I will try with PARI/GP later). So I wonder if it is possible to use the above mentioned sqrt(p) upper limit instead of sqrt(n) to find a divisor of n when sieving.

This is the (very basic) Python code that compares one by one the elements of the sequence A033677 with the sequence of smallest divisor of n greater or equal to sqrt(p), please feel free to use it:

    from sympy import prevprime, divisors
        for n in range (3,10000000):
            A0033677_Value=0
            for d in divisors(n):
                if d>=sqrt(n):
                    A0033677_Value=d
                    break
                   
            for d in divisors(n):
                if d>int(sqrt(prevprime(n))):
                    if A0033677_Value == d:
                        print("SUCCESS "+str(n))
                    else:
                        print("ERROR " + str(n) + "\t" + str(A0033677_Value)+"\t"+str(prevprime(n))+"\t"+str(d))
                        return
                    break


If the observation is true, the property would be something like this:

for all n (composite) exists d: d|n, d in [2,sqrt(p1)], p1 is prime, p1<n, and not exists p2: p1 < p2 < n.

For instance for n=20,21,22,23 we would look for a divisor of n in the interval [2,sqrt(19)] because 19 is the closest prime lower than those n. And we already sieved 19 in a former step so it is in our list of already sieved primes and thus available to be used (meaning we already know at this point that 19 is prime). Then applying Eratosthenes as usual it would be detected that n=20,21,22 are composite and n=23 is the next prime.

In some cases (e.g when n is a perfect square) having the same divisor d over sqrt(n) and sqrt(p) does not imply that we can find a divisor at the interval [2,sqrt(p)] and it is required to go up to [2,sqrt(p)+1], meaning that the first divisor is exactly sqrt(p)+1.

If true I am not sure if it might be significant. By Bertrand's postulate, there should be a prime between [n/2,n], so in the best of cases, the previous prime would be p=n/2, so it would imply for that best of cases the use of sqrt(n/2) as the upper limit for the sieve instead of sqrt(n).

So I wonder, would it be possible to use sqrt(p) instead of sqrt(n)? Was already in use that approach? Is there a counterexample (probably very big)? If not, what could be the reason behind it?

The code below tests the Sieve of Eratosthenes using the modified algorithm as explained above, including the special condition for perfect squares. It seems to work so far up to 10^5.
    from gmpy2 import is_prime, is_square
        # modified Eratosthenes, upper sieving limit sqrt(p)
            lopM=[2] #list of already sieved primes
            lp=2 # last prime sieved
            for pos in range(3,1000008):
                ul=int(sqrt(lp)) #upper limit
                if is_square(pos):
                    ul=ul+1 # correction for perfect squares
                    # basically: if is a perfect square we would continue for
                if ul==1:
                    ul=2 # exclude 1 of possible divisors
                comp=False #composite detected
                for j in lopM: # using the list of already known primes
                    if j in range(2,ul+1): # while j is under the upper limit
                        if (pos%j)==0:
                            comp=True
                            break
                    else: # we arrived to the upper limit
                        break
                    if comp==False:
                        lopM.append(pos)
                        lp=pos
           
            for elem in lopM:
                if not is_prime(elem): # sanity check
                    print("ERROR, not prime " + str(elem))
                    return
            print("Primes sieved: " + str(len(lopM)) + " and last prime was " + str(lp))


So heuristically it seems correct, but is there a demonstration or a counterexample of it?

No comments:

Post a Comment