Pandas DataFrame to Hive Table
Problem description
I'm new to Python and Hive.
I was hoping I might get some advice. Does anyone have any tips on how to turn a Python pandas DataFrame into a Hive table?
Solution
Your script should run on a machine where Hive can load data using the "load data local inpath" method.
1.- Query the pandas DataFrame to build a list of column names and their data types.
2.- Compose a valid HQL (DDL) create table statement using Python string operations (basically concatenation).
3.- Issue the create table statement in Hive.
4.- Write the pandas DataFrame as a CSV file separated by "\t", with headers and the index turned off (check the header and index parameters of to_csv()).
5.- From your Python script, call the system console to run hive -e, for instance:
import subprocess

# str_command_list holds the HQL to execute, for instance a load data local inpath statement
p = subprocess.Popen(['hive', '-e', str_command_list], stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE)
out, err = p.communicate()
This calls the Hive console and executes, for instance, load data local inpath, inserting your CSV data into the created table.
Then you are all set.
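The five steps above can be sketched end to end roughly as follows. This is only a minimal sketch under several assumptions that are not in the answer: the helper names (hive_type, build_create_table, build_load_statement, run_hive, dataframe_to_hive) are made up for illustration, the dtype-to-Hive type mapping is a guess you would likely need to extend, and the table is assumed to be a tab-delimited text table so that it matches the file written by to_csv().

```python
import subprocess

# Assumed mapping from pandas dtype names to Hive column types; extend as needed.
PANDAS_TO_HIVE = {
    "int64": "BIGINT",
    "int32": "INT",
    "float64": "DOUBLE",
    "float32": "FLOAT",
    "bool": "BOOLEAN",
    "datetime64[ns]": "TIMESTAMP",
}


def hive_type(dtype_name):
    """Map a pandas dtype name to a Hive type, falling back to STRING."""
    return PANDAS_TO_HIVE.get(dtype_name, "STRING")


def build_create_table(table, columns):
    """Steps 1-2: compose the DDL from (column name, dtype name) pairs."""
    cols = ", ".join(f"`{name}` {hive_type(dtype)}" for name, dtype in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' "
        "STORED AS TEXTFILE"
    )


def build_load_statement(table, csv_path):
    """Step 5: the statement that loads the local file into the table."""
    return f"LOAD DATA LOCAL INPATH '{csv_path}' INTO TABLE {table}"


def run_hive(hql):
    """Run one HQL string through `hive -e` and return (stdout, stderr)."""
    proc = subprocess.Popen(
        ["hive", "-e", hql], stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    out, err = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(err.decode(errors="replace"))
    return out, err


def dataframe_to_hive(df, table, csv_path):
    """Steps 1-5 for a pandas DataFrame `df` (hypothetical helper)."""
    # Step 1: list of (column name, dtype name) pairs
    columns = [(str(name), str(dtype)) for name, dtype in df.dtypes.items()]
    # Steps 2-3: build and issue the create table statement
    run_hive(build_create_table(table, columns))
    # Step 4: tab-separated file, no header row, no index column
    df.to_csv(csv_path, sep="\t", header=False, index=False)
    # Step 5: load the local file into the new table
    run_hive(build_load_statement(table, csv_path))
```

With a DataFrame df in hand, a call such as dataframe_to_hive(df, "demo", "/tmp/demo.tsv") would run the whole sequence; the table name and path here are placeholders. Passing the command as a list to Popen avoids shell quoting issues with the '\t' delimiter in the DDL.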