快速算法将所有数字分解为给定数字 [英] Fast Algorithm to Factorize All Numbers Up to a Given Number

查看:71
本文介绍了快速算法将所有数字分解为给定数字的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找一种算法,该算法可以根据已经分解的数字分解因子.换句话说,我正在寻找一种将所有数字分解为给定数字的快速算法,并将它们存储在(我想这是最容易使用的数据结构)元组列表/元组中.我正在寻找最多n"个算法,因为我需要所有不超过"n"的数字,而且我猜它比仅逐一检查要快.

I am looking for an algorithm that could factorize numbers based on numbers it already factorized. In other words, I am searching for a fast algorithm to factorise all numbers up to a given number, and store them in a (I guess this is the easiest data structure to use) list / tuple of tuples. I am looking for an "up to n" algorithm because I need all numbers up to "n", and I guess it's faster than just checking one by one.

对于我正在运行的程序,我希望该算法能够在合理的时间内(不到一个小时)工作2 * 10 ^ 8.我已经尝试了python中一种较幼稚的方法,发现所有素数都达到"n".首先,然后对于每个数字"k"通过检查每个素数直到除以它(我们称它为p),发现它是素数分解,然后它的因式分解就是k/p + p的因式分解.

I want this algorithm to work within a reasonable time (less than an hour) for 2*10^8, for a program I am running. I have tried one of the more naive approaches in python, finding all primes up to "n" first, and then for each number "k" finding it's prime factorization by checking each prime until one divides it (we will call it p), then it's factorization is the factorization of k/p + p.

from math import *
max=1000000 # We will check all numbers up to this number, 

lst = [True] * (max - 2) # This is an algorithm I found online that will make the "PRIMES" list all the primes up to "max", very efficent
for i in range(2, int(sqrt(max) + 1)):
  if lst[i - 2]:
    for j in range(i ** 2, max, i):
      lst[j - 2] = False

PRIMES = tuple([m + 2 for m in range(len(lst)) if lst[m]]) # (all primes up to "max")

FACTORS = [(0,),(1,)] #This will be a list of tuples where FACTORS[i] = the prime factors of i
for c in range(2,max): #check all numbers until max
  if c in PRIMES:
    FACTORS.append((c,)) #If it's a prime just add it in
  else: #if it's not a prime...
    i=0
    while PRIMES[i]<= c: #Run through all primes until you find one that divides it,
      if c%PRIMES[i] ==0: 
        FACTORS.append(FACTORS[c//PRIMES[i]] + (PRIMES[i],)) #If it does, add the prime with the factors of the division
        break
      i+=1

根据测试,在检查候选者是否是素数之后,绝大部分时间都浪费在else部分上.对于max = 200000000,这需要花费比我们更多的时间

From testing, the vast majority of time is wasted on the else section AFTER checking if the candidate is prime or not. This takes more than an our for max = 200000000

我正在为此运行的程序是找到最小的"n".这样对于某个"a"这样(2n)!/(((n + a)!^ 2)是一个整数.基本上,我定义a_n =最小k,使得(2k)!/((k + n)!^ 2)是整数.结果是,a_1 = 0,a_2 = 208,a_3 = 3475,a_4 = 8174,a_5 = 252965,a_6 = 3648835,a_7 = 72286092.数学上.使用Legendre的公式: https://en.wikipedia.org/wiki/Legendre%27s_formula,我编写了这段代码:

The program I am running this for is to find the smallest "n" such that for a certain "a" such that (2n)!/((n+a)!^2) is a whole number. Basically, I defined a_n = smallest k such that (2k)!/((k+n)!^2) is an integer. turns out, a_1 =0, a_2 = 208, a_3 = 3475, a_4 = 8174, a_5 = 252965, a_6 = 3648835, a_7 = 72286092. By the way, I noticed that a_n + n is squarefree, although can't prove it mathematically. Using Legendre's formula: https://en.wikipedia.org/wiki/Legendre%27s_formula, I wrote this code:

from math import *
from bisect import bisect_right
max=100000000 # We will check all numbers up to this number, 

lst = [True] * (max - 2) # This is an algorithm I found online that will make the "PRIMES" list all the primes up to "max", very efficent
for i in range(2, int(sqrt(max) + 1)):
  if lst[i - 2]:
    for j in range(i ** 2, max, i):
      lst[j - 2] = False

PRIMES = tuple([m + 2 for m in range(len(lst)) if lst[m]]) # (all primes up to "max")
print("START")

def v(p,m):
  return sum([ (floor(m/(p**i))) for i in range(1,1+ceil(log(m,p)))]) #This checks for the max power of prime p, so that p**(v(p,m)) divides factorial(m)

def check(a,n): #This function checks if a number n competes the criteria for a certain a
  if PRIMES[bisect_right(PRIMES, n)]<= n + a: #First, it is obvious that if there is a prime between n+1 and n+a the criteria isn't met
    return False
  i=0
  while PRIMES[i] <= n: #We will run through the primes smaller than n... THIS IS THE ROOM FOR IMPROVEMENT - instead of checking all the primes, check all primes that divide (n+1),(n+2),...,(n+a)
    if v(PRIMES[i],2*n)<2*v(PRIMES[i],n+a): # If any prime divides the denominator more than the numerator, the fraction is obviously not a whole number
      return False
    i+=1
  return True #If for all primes less than n, the numerator has a bigger max power of p than the denominator, the fraction is a whole number.

#Next, is a code that will just make sure that the program runs all numbers in order, and won't repeat anything.

start = 0 #start checking from this value
for a in range(1,20): #check for these values of a.
  j=start
  while not check(a,j):
    if j%100000==0:
      print("LOADING ", j) #just so i know how far the program has gotten.
    j+=1
  print("a-",a," ",j) #We found a number. great. print the result.
  start=j #start from this value again, because the check obviously won't work for smaller values with a higher "a"

推荐答案

您可以使用脚本的第一部分来做到这一点!

You can use the first part of your script in order to do that!

代码:

from math import *
import time

MAX = 40000000

t = time.time()
# factors[i] = all the prime factors of i
factors = {}
# Running over all the numbers smaller than sqrt(MAX) since they can be the factors of MAX
for i in range(2, int(sqrt(MAX) + 1)):
    # If this number has already been factored - it is not prime
    if i not in factors:
        # Find all the future numbers that this number will factor
        for j in range(i * 2, MAX, i):
            if j not in factors:
                factors[j] = [i]
            else:
                factors[j].append(i)
print(time.time() - t)

for i in range(3, 15):
    if i not in factors:
        print(f"{i} is prime")
    else:
        print(f"{i}: {factors[i]}")

结果:

3:是素数
4:[2]
5:是素数
6:[2,3]
7:是素数
8:[2]
9:[3]
10:[2,5]
11:是素数
12:[2,3]
13:是素数
14:[2,7]

3: is prime
4: [2]
5: is prime
6: [2, 3]
7: is prime
8: [2]
9: [3]
10: [2, 5]
11: is prime
12: [2, 3]
13: is prime
14: [2, 7]

说明:

如评论中所述,它是对筛网筛网算法的修改.br/>对于每个数字,我们都会找到将来可以分解的所有数字.
如果数字未出现在结果字典中,则为质数,因为没有数字将其分解.我们使用的是字典而不是列表,因此根本不需要保存质数-这对内存更友好,但也更慢.

As mentioned in the comments it is a modification of the Sieve of Eratosthenes algorithm.
For each number we find all the numbers it can factorize in the future.
If the number does not appear in the result dictionary it is a prime since no number factorize it. We are using dictionary instead of list so the prime numbers will not need to be saved at all - which is a bit more memory friendly but also a bit slower.

时间:

根据带有 time.time() MAX = 40000000 的简单检查: 110.14351892471313 秒.
对于 MAX = 1000000 : 1.0785243511199951 秒.
对于带有 time.time() MAX = 200000000 :1.5小时后未完成...在6325个项目中,它已达到主循环中的第111个项目(这是还不错,因为循环越远,它们变得越短.

According to a simple check for MAX = 40000000 with time.time(): 110.14351892471313 seconds.
For MAX = 1000000: 1.0785243511199951 seconds.
For MAX = 200000000 with time.time(): Not finished after 1.5 hours... It has reached the 111th item in the main loop out of 6325 items (This is not so bad since the farther the loops go they become shorter).

但是,我确实相信编写良好的C代码可以在半小时内完成(如果您愿意考虑的话,我可能会写另一个答案).可以完成的更多优化工作是使用多线程和一些Primality测试,例如Miller–Rabin.当然,值得一提的是,这些结果在我的笔记本电脑上,也许在PC或专用计算机上,运行速度会更快或更慢.

I do believe however that a well written C code could do it in half an hour (If you are willing to consider it I might write another answer). Some more optimization that can be done is use multithreading and some Primality test like Miller–Rabin. Of course it is worth mentioning that these results are on my laptop and maybe on a PC or a dedicated machine it will run faster or slower.

这篇关于快速算法将所有数字分解为给定数字的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆