Spark with Cython
Question
I recently wanted to use Cython with Spark, and I followed a reference on how to do so.
I wrote the programs below as that reference describes, but I get the following error:
TypeError: fib_mapper_cython() takes exactly 1 argument (0 given)
spark_tools.py
def spark_cython(module, method):
    def wrapped(*args, **kwargs):
        global cython_function_
        try:
            return cython_function_(*args, **kwargs)
        except:
            import pyximport
            pyximport.install()
            cython_function_ = getattr(__import__(module), method)
        return cython_function_(*args, **kwargs)
    return wrapped()
fib.pyx
def fib_mapper_cython(n):
    '''
    Return the first Fibonacci number >= n.
    '''
    cdef int a = 0
    cdef int b = 1
    cdef int j = int(n)
    while b < j:
        a, b = b, a + b
    return b, 1
main.py
from spark_tools import spark_cython
import pyximport
import os
from pyspark import SparkContext
from pyspark import SparkConf
pyximport.install()
os.environ["SPARK_HOME"] = "/home/spark-1.6.0"
conf = (SparkConf().setMaster('local').setAppName('Fibo'))
# Pass the configuration so the master and app name actually take effect.
sc = SparkContext(conf=conf)
# Ship the Cython source and the helper module to the workers.
sc.addPyFile('file:///home/Cythonize/fib.pyx')
sc.addPyFile('file:///home/Cythonize/spark_tools.py')
lines = sc.textFile('file:///home/Cythonize/nums.txt')
mapper = spark_cython('fib', 'fib_mapper_cython')
fib_frequency = lines.map(mapper).reduceByKey(lambda a, b: a+b).collect()
print(fib_frequency)
I get this TypeError whenever I run the program. Any ideas?
Recommended answer
This is neither a Cython nor a PySpark issue. You added an extra function call in the definition of spark_cython: the inner function that wraps the call to cython_function_ is itself called, with no arguments, in the return statement:
return wrapped() # call made, no args supplied.
As a result, spark_cython does not return the wrapped function. Instead it calls wrapped with no *args or **kwargs, and wrapped in turn calls fib_mapper_cython with no arguments (since none were supplied), hence the TypeError.
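The difference between returning a function and returning the result of calling it can be reproduced without Spark or Cython. The sketch below is only an illustration: fib_mapper, make_broken and make_fixed are hypothetical stand-ins, not names from the question.

```python
def fib_mapper(n):
    """Stand-in for fib_mapper_cython: requires exactly one argument."""
    return int(n), 1


def make_broken():
    def wrapped(*args, **kwargs):
        return fib_mapper(*args, **kwargs)
    # Bug: this calls wrapped right away, with no arguments at all.
    return wrapped()


def make_fixed():
    def wrapped(*args, **kwargs):
        return fib_mapper(*args, **kwargs)
    # Correct: hand back the function object so the caller can invoke it later.
    return wrapped


try:
    make_broken()
    broken_error = None
except TypeError as exc:
    # fib_mapper was invoked with zero arguments inside make_broken.
    broken_error = exc

mapper = make_fixed()
result = mapper("7")  # arguments are forwarded to fib_mapper
```

Here make_broken fails before any mapper is handed back, which matches the question: the error appears as soon as spark_cython is called, before Spark maps anything.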
You should instead return the function itself:
return wrapped
With that change, the TypeError should no longer occur.