How do I write a scikit-learn dataset to a CSV file
Question
I can load a data set from scikit-learn using
from sklearn import datasets
data = datasets.load_boston()
print(data)
What I'd like to do is write this data set to a flat file (.csv).
Using the open() function
f = open('boston.txt', 'w')
f.write(str(data))
works, but the output includes the description of the data set.
I'm wondering if there is some way to generate a simple .csv with headers from this Bunch object, so I can move it around and use it elsewhere.
Recommended answer
data = datasets.load_boston()
returns a Bunch, which is a dictionary-like object. To write the data to a .csv file you need the actual values, data['data'], and the column names, data['feature_names']. You can use these to build a pandas DataFrame and then call to_csv() to write it to a file:
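from sklearn import datasets
import pandas as pd

# Note: load_boston() was removed in scikit-learn 1.2, so this needs an older version.
data = datasets.load_boston()
print(data)  # shows the whole Bunch, including the DESCR text

# Keep only the feature matrix and use the feature names as column headers.
df = pd.DataFrame(data=data['data'], columns=data['feature_names'])

# index=False leaves the DataFrame's row index out of the file.
df.to_csv('boston.txt', sep=',', index=False)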
The output file boston.txt should look like this:
CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT
0.00632,18.0,2.31,0.0,0.538,6.575,65.2,4.09,1.0,296.0,15.3,396.9,4.98
0.02731,0.0,7.07,0.0,0.469,6.421,78.9,4.9671,2.0,242.0,17.8,396.9,9.14
0.02729,0.0,7.07,0.0,0.469,7.185,61.1,4.9671,2.0,242.0,17.8,392.83,4.03
...
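
Note that the file above contains only the feature columns; the values in data['target'] are not written. As a minimal sketch of one way to include them, the helper below writes the header, the feature rows, and the target (if present) using only the standard csv module. The function name bunch_to_csv, the target_name parameter, and the stand-in dictionary toy are hypothetical names introduced for illustration, and the numbers in toy are illustrative values rather than a real dataset.

```python
import csv


def bunch_to_csv(bunch, path, target_name="target"):
    """Write the features (and the target, if present) of a Bunch-like mapping to CSV.

    Hypothetical helper: `bunch` only needs the keys 'data' and
    'feature_names', plus an optional 'target'.
    """
    header = list(bunch["feature_names"])
    target = bunch.get("target")
    if target is not None:
        header.append(target_name)

    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for i, row in enumerate(bunch["data"]):
            values = list(row)
            if target is not None:
                values.append(target[i])
            writer.writerow(values)


# Stand-in for a real Bunch (assumption: same keys, made-up values).
toy = {
    "data": [[0.00632, 18.0, 2.31], [0.02731, 0.0, 7.07]],
    "feature_names": ["CRIM", "ZN", "INDUS"],
    "target": [24.0, 21.6],
    "DESCR": "free-text description that should not end up in the file",
}

bunch_to_csv(toy, "toy.csv")
```

Because a Bunch supports dictionary-style access, the same call should work on a loaded dataset, for example bunch_to_csv(datasets.load_iris(), 'iris.csv'). The DESCR text is never written, since only the data, feature_names, and target keys are read.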