Python: Pass a pandas Dataframe as an argument to subprocess


Problem Description

How can I send a DataFrame as an argument to a Python script launched with spark-submit via subprocess? I have tried the code below, but it did not work because a string cannot be concatenated with a DataFrame object.

```python
import subprocess

def spark_submit(self, test_cases, email):
    # test_cases is a DataFrame here, so this concatenation raises a TypeError
    command = 'spark-submit TestRunner.py ' + test_cases + " " + email
    print(command)
    process = subprocess.Popen(command, shell=True,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    output, error = process.communicate()
    status = process.returncode
    print(status)
```

Recommended Answer

You can't concatenate anything that isn't a string (or cast to one). A DataFrame can't be passed directly as a command-line argument, so I suggest writing it to a file and passing the file path instead of the DataFrame itself.

```python
df.to_csv('mydf.csv')
command = 'spark-submit TestRunner.py mydf.csv ' + email
```
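Putting this together, a minimal sketch of the caller might look like the following. It assumes TestRunner.py takes the CSV path as its first argument and the email as its second; the temporary-file handling and argument order are illustrative additions, not part of the original answer.

```python
import subprocess
import tempfile

def spark_submit(self, test_cases, email):
    # Write the DataFrame to a temporary CSV; a file path is a plain string,
    # so it can safely be placed on the command line.
    with tempfile.NamedTemporaryFile(suffix='.csv', delete=False) as tmp:
        csv_path = tmp.name
    test_cases.to_csv(csv_path, index=False)

    # Build the argument list explicitly; no shell=True needed.
    command = ['spark-submit', 'TestRunner.py', csv_path, email]
    print(' '.join(command))
    process = subprocess.Popen(command,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    output, error = process.communicate()
    print(process.returncode)
```

On the receiving side, TestRunner.py can rebuild the DataFrame from that path (again assuming it reads its arguments positionally):

```python
# TestRunner.py (sketch): reconstruct the DataFrame from the CSV path argument.
import sys
import pandas as pd

csv_path, email = sys.argv[1], sys.argv[2]
test_cases = pd.read_csv(csv_path)
```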

