Why is the hashed content not getting saved into the CSV file?
Problem description
I have a column in a CSV file, Original.csv, which has a bunch of user IDs, some of which repeat, like the following:

    udid
    d0155049772de9
    8b57d8c7f1e5a31e4adaef5fe6c52df1ada7fcd5
    8b57d8c7f1e5a31e4adaef5fe6c52df1ada7fcd5
    465088425ceb38c62bf8d1d9cc33bcfab4fe4293
    3eabe40461773086
    3eabe40461773086
    e24356719f086021
    212b5b0415560be3
    1c046451a3761ef51fbf52759748f66c98b02313
I want to process them in MATLAB later, so I wanted to hash and convert them into integers and store them in a new file, New.csv. This is my code:

    require 'csv'

    udids = []
    id = []
    CSV.foreach('Original.csv', :headers=>true).map do |row|
      udids << row[0]
    end
    udids = udids.uniq
    arrayHash = []
    for i in 0..udids.size-1
      arrayHash << udids
      arrayHash << i
    end
    hash = Hash[arrayHash.each_slice(2).to_a]
    id = hash.values_at *udids
    for i in 0..id.size-1
      logfile = File.new('New.csv', "w")
      logfile.print("#{id[i]}\n")
      logfile.close
    end
For some reason I haven't been able to figure out, the New.csv file is empty after running the code. What's the issue?

Edit: Is hashing for this program going to perform faster than simply comparing and checking if a user ID has been repeated before? Something like this:
    CSV.open('New.csv', "wb") do |csv|
      CSV.foreach('Original.csv', :headers=>true).map do |row|
        unless udids.include?(row[0])
          udids << row[0]
        end
        csv << udids.index(row[56]) + 1
      end
    end
In either case, could you please explain why one would perform faster than the other? My CSV has 60 million records, if that matters.
Solution

Without a deeper look at your complete code:

With

    for i in 0..id.size-1
      logfile = File.new('New.csv', "w")
      logfile.print("#{id[i]}\n")
      logfile.close
    end
you open the file id.size times (once per ID), write one line, and close it. Because mode "w" truncates the file every time it is opened, in the end the file holds only the last entry.

It seems you want something like this:
    File.open('New.csv', "w") do |logfile|   # open the file once
      id.each { |one_id|                     # loop over all ids
        logfile.print("#{one_id}\n")         # write one id per line
      }
    end                                      # the block closes the file
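
To put the pieces together, here is a minimal sketch of the whole conversion, assuming the goal is one integer per input row, numbered from 1 in order of first appearance (as the `+ 1` in the question's second snippet suggests). The helper names `intern_udid` and `convert_file` are made up for illustration and are not part of the original code. It streams the input instead of collecting all rows first, which may matter with 60 million records.

```ruby
require 'csv'

# Hypothetical helper: return the integer ID for a UDID, assigning the
# next free number (starting at 1) the first time the UDID is seen.
def intern_udid(id_for, udid)
  id_for[udid] ||= id_for.size + 1
end

# Hypothetical helper: read the first column of a CSV that has a header
# row and write one integer ID per data row to out_path.
def convert_file(in_path, out_path)
  id_for = {}                           # UDID string => integer ID
  File.open(out_path, 'w') do |out|     # output is opened exactly once
    CSV.foreach(in_path, headers: true) do |row|
      out.puts(intern_udid(id_for, row[0]))
    end
  end                                   # the block closes the file
end

# Usage (assumes Original.csv exists in the current directory):
#   convert_file('Original.csv', 'New.csv')
```

On the edit in the question: a Hash lookup takes roughly constant time on average, while `Array#include?` and `Array#index` scan the array from the start, so the array-based version slows down as the number of distinct IDs grows. Separately, as far as I can tell, `arrayHash << udids` in the original pushes the whole array rather than `udids[i]`, so `hash.values_at(*udids)` returns only `nil` values and the single line that survives in New.csv is blank, which would explain why the file looks empty.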