How to choose which columns to write to a .csv file in Python
Question
import csv
f = csv.reader(open('lmt.csv','r')) # open input file for reading
Date, Open, Hihh, mLow, Close, Volume = zip(*f) # split it into separate columns
ofile = open("MYFILEnew1.csv", "wb") # output csv file
c = csv.writer(ofile)
item = Date
item2 = Volume
rows = zip(item, item)
i = 0
for row in item2:
    print row
    writer = csv.writer(ofile, delimiter='\t')
    writer.writerow([row])
ofile.close()
Above is what I have produced so far.
As you can see in the 3rd line, I have extracted 6 columns from a spreadsheet.
I want to create a .csv file named MYFILEnew1.csv which has only two columns, Date and Volume.
What I have above creates a .csv that only writes the Volume column into the first column of the new .csv file.
How would you go about placing Date into the first column and Volume into the second?
For example,
Date Open High Low Close Volume
17-Feb-16 210 212.97 209.1 212.74 1237731
is what I have, and I'd like to produce a new csv file that contains
Date Volume
17-Feb-16 1237731
Recommended answer
If I understand your question correctly, you can achieve that very easily using pandas' read_csv and to_csv (@downvoter: could you please explain your downvote?); the final solution to your problem can be found below under EDIT2:
import pandas as pd
# this assumes that your file is comma separated
# if it is e.g. tab separated you should use pd.read_csv('data.csv', sep = '\t')
df = pd.read_csv('data.csv')
# select desired columns
df = df[['Date', 'Volume']]
# write to the file (tab separated)
df.to_csv('MYFILEnew1.csv', sep='\t', index=False)
So, if your data.csv file looks like this:
Date,Open,Hihh,mLow,Close,Volume
1,5,9,13,17,21
2,6,10,14,18,22
3,7,11,15,19,23
4,8,12,16,20,24
then MYFILEnew1.csv would look like this after running the script above:
Date Volume
1 21
2 22
3 23
4 24
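For readers who would rather not depend on pandas, the same name-based selection can probably be done with the standard csv module alone. The following is only a sketch for Python 3, not part of the original answer: the helper name select_columns and its parameters are hypothetical, and it assumes the header row of the input contains the requested names exactly.

```python
import csv

def select_columns(in_path, out_path, columns, in_delim=",", out_delim="\t"):
    """Copy only the named columns from in_path to out_path (hypothetical helper)."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        # DictReader uses the first row of the input as the field names
        reader = csv.DictReader(src, delimiter=in_delim)
        # extrasaction="ignore" drops every field that is not listed in `columns`
        writer = csv.DictWriter(dst, fieldnames=columns, delimiter=out_delim,
                                extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
```

Calling select_columns('data.csv', 'MYFILEnew1.csv', ['Date', 'Volume']) should write a tab-separated file with just those two columns. Note that a requested name missing from the header is not reported as an error here; that column would simply be written empty.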
EDIT
Using your data (tab separated, stored in the file data3.csv):
Date Open Hihh mLow Close Volume
17-Feb-16 210 212.97 209.1 212.74 1237731
then
import pandas as pd
df = pd.read_csv('data3.csv', sep='\t')
# select desired columns
df = df[['Date', 'Volume']]
# write to the file (tab separated)
df.to_csv('MYFILEnew1.csv', sep='\t', index=False)
gives the desired output:
Date Volume
17-Feb-16 1237731
EDIT2
Since the header in your input csv file seems to be messed up (as discussed in the comments), you have to rename the first column. The following now works fine for me using your entire dataset:
import pandas as pd
df = pd.read_csv('lmt.csv', sep=',')
# get rid of the wrongly formatted column name
df.rename(columns={df.columns[0]: 'Date' }, inplace=True)
# select desired columns
df = df[['Date', 'Volume']]
# write to the file (tab separated)
df.to_csv('MYFILEnew1.csv', sep='\t', index=False)
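When the header names cannot be trusted, one alternative to renaming is to ignore them and pick the columns by position. The sketch below uses only the standard csv module (Python 3) and is not part of the original answer: the function name copy_columns_by_index is hypothetical, and the positions 0 and 5 used in the usage note assume that Date is the first column and Volume the sixth, as in the sample data.

```python
import csv

def copy_columns_by_index(in_path, out_path, indices, header=None,
                          in_delim=",", out_delim="\t"):
    """Copy the columns at the given positions (hypothetical helper)."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=in_delim)
        writer = csv.writer(dst, delimiter=out_delim)
        original_header = next(reader, None)  # consume the header row
        if original_header is None:
            return  # empty input: nothing to write
        # Write the replacement header if one was given,
        # otherwise keep the original names at the selected positions
        writer.writerow(header if header is not None
                        else [original_header[i] for i in indices])
        for row in reader:
            writer.writerow([row[i] for i in indices])
```

For the file in the question this might be called as copy_columns_by_index('lmt.csv', 'MYFILEnew1.csv', [0, 5], header=['Date', 'Volume']). A row with fewer than six fields would raise an IndexError, so this approach only suits files whose rows all have the same layout.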