Most efficient I/O setup between Stata and Python (Pandas)
Question
I am using Stata to process some data, export it to a csv file, and load it in Python using the pandas read_csv function.
The problem is that everything is so slow. Exporting from Stata to a csv file takes ages (exporting in Stata's dta format is much faster), and loading the data via read_csv is also very slow. Using the pandas read_stata function is even worse.
I wonder if there are any other options, like exporting to a format other than csv? My csv dataset is approx. 6-7 GB.
Any help is appreciated.
Thanks!
Recommended answer
pd.read_stata() and DataFrame.to_stata() are quite efficient; see here.
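As a rough illustration of that suggestion, here is a minimal sketch of a dta round trip in pandas. It assumes the file was saved from Stata with something like `save "example.dta", replace`; the file name, the column names `id` and `value`, and the helper names `write_dta` and `read_dta` are all hypothetical. For a file of several GB, the `columns` and `chunksize` arguments of `pd.read_stata` may help keep memory use and load time down, though whether it beats read_csv on your data is something you would need to time yourself.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def write_dta(df, path):
    """Write a DataFrame to Stata's dta format (hypothetical helper)."""
    # write_index=False avoids storing the pandas index as an extra variable.
    df.to_stata(path, write_index=False)


def read_dta(path, columns=None, chunksize=None):
    """Read a dta file, optionally only some columns and/or in chunks."""
    if chunksize is None:
        return pd.read_stata(path, columns=columns)
    # With chunksize set, read_stata returns an iterator of DataFrames.
    # Here the chunks are simply concatenated; on a 6-7 GB file you would
    # more likely process or filter each chunk before keeping it.
    with pd.read_stata(path, columns=columns, chunksize=chunksize) as reader:
        return pd.concat(list(reader), ignore_index=True)


# Small stand-in for data that would normally come from Stata.
demo = pd.DataFrame({"id": np.arange(1000), "value": np.linspace(0.0, 1.0, 1000)})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.dta")
    write_dta(demo, path)
    whole = read_dta(path)                                # everything at once
    chunked = read_dta(path, columns=["id"], chunksize=250)  # one column, in chunks

print(whole.shape, chunked.shape)
```

Reading only the columns you need is often the cheapest win, since the rest of the file is never converted into pandas objects.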