python ordered dict issue
Problem description
If I have a CSV file that has a dictionary value for each line (with columns being ["Location"], ["MovieDate"], ["Formatted_Address"], ["Lat"], ["Lng"]), I have been told to use OrderedDict if I want to group by Location and append all the MovieDate values that share the same Location value.
Example data:
Location,MovieDate,Formatted_Address,Lat,Lng
"Edgebrook Park, Chicago ",Jun-7 A League of Their Own,"Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
"Edgebrook Park, Chicago ","Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
For every row that has the same location (as in this example), I'd like to produce output like this, so that there are no duplicate locations:
"Edgebrook Park, Chicago ","Jun-7 A League of Their Own Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
What's wrong with my code using OrderedDict to do this?
from collections import OrderedDict
od = OrderedDict()
import csv
with open("MovieDictFormatted.csv") as f, open("MoviesCombined.csv", "w") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    header = next(r)
    for row in r:
        loc, rest = row[0], row[1]
        od.setdefault(loc, []).append(rest)
    wr.writerow(header)
    for loc, vals in od.items():
        wr.writerow([loc] + vals)
What I end up with is something like this:
['Edgebrook Park, Chicago ', 'Jun-7 A League of Their Own']
['Gage Park, Chicago ', "Jun-9 It's a Mad, Mad, Mad, Mad World"]
['Jefferson Memorial Park, Chicago ', 'Jun-12 Monsters University ', 'Jul-11 Frozen ', 'Aug-8 The Blues Brothers ']
['Commercial Club Playground, Chicago ', 'Jun-12 Despicable Me 2']
The issue is that the other columns aren't showing up in this case; how would I best do that? I would also prefer to make the MovieDate values just one long string, like this:
'Jun-12 Monsters University Jul-11 Frozen Aug-8 The Blues Brothers '
instead of:
'Jun-12 Monsters University ', 'Jul-11 Frozen ', 'Aug-8 The Blues Brothers '
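Turning that list into one long string is a str.join away; since each value already carries a trailing space, stripping and re-joining with a single space gives the flattened form:

```python
movie_dates = ['Jun-12 Monsters University ', 'Jul-11 Frozen ', 'Aug-8 The Blues Brothers ']
# strip each entry's stray trailing space, then re-join with one space
combined = " ".join(s.strip() for s in movie_dates)
```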
Thanks guys, appreciate it. I'm a Python noob.
Changing row[0], row[1] to row[0], row[1:] unfortunately doesn't give me what I want. I only want to be adding the values in the second column (MovieDate), not replicating all the other columns, like this:
['Jefferson Memorial Park, Chicago ', ['Jun-12 Monsters University ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353'], ['Jul-11 Frozen ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353'], ['Aug-8 The Blues Brothers ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353']]
Answer
You just need a couple of changes: to remove the duplicate lat and lng values, we also need to use them as part of the key:
od = OrderedDict()
with open("data.csv") as f, open("new.csv", "w") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    header = next(r)
    for row in r:
        od.setdefault((row[0], row[-2], row[-1]), []).append(" ".join(row[1:-2]))
    wr.writerow(header)
    for loc, vals in od.items():
        wr.writerow([loc[0]] + vals + list(loc[1:]))
Output:
Location,MovieDate,Formatted_Address,Lat,Lng
"Edgebrook Park, Chicago ","Jun-7 A League of Their Own Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA","Jun-9 It's a Mad, Mad, Mad, Mad World Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
A League of Their Own is first because it comes before the mad, mad line; row[1:-2] gets everything bar the location, lat and lng, and we store the lat and lng in our key tuple to avoid duplicating them at the end of each row.
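To make the slicing concrete, here is what row[1:-2] and the key tuple pick out of one sample row (values copied from the data above):

```python
row = [
    "Edgebrook Park, Chicago ",
    "Jun-7 A League of Their Own",
    "Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",
    "41.9998876",
    "-87.7627672",
]
middle = row[1:-2]                    # drops row[0] (location) and the last two fields (lat, lng)
key = (row[0], row[-2], row[-1])      # location plus lat/lng, used to dedupe rows
```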
Using names and unpacking might make it a little easier to follow:
od = OrderedDict()
with open("data.csv") as f, open("new.csv", "w") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    header = next(r)
    for row in r:
        loc, mov, form, lat, long = row
        od.setdefault((loc, lat, long), []).append("{} {}".format(mov, form))
    wr.writerow(header)
    for loc, vals in od.items():
        wr.writerow([loc[0]] + vals + list(loc[1:]))
Using csv.DictWriter to keep five columns:
from collections import OrderedDict
import csv

od = OrderedDict()
with open("data.csv") as f, open("new.csv", "w") as out:
    r = csv.DictReader(f, fieldnames=['Location', 'MovieDate', 'Formatted_Address', 'Lat', 'Lng'])
    wr = csv.DictWriter(out, fieldnames=r.fieldnames)
    for row in r:
        od.setdefault(row["Location"], dict(Location=row["Location"], Lat=row["Lat"], Lng=row["Lng"],
                                            MovieDate=[], Formatted_Address=row["Formatted_Address"]))
        od[row["Location"]]["MovieDate"].append(row["MovieDate"])
    for loc, vals in od.items():
        od[loc]["MovieDate"] = ", ".join(od[loc]["MovieDate"])
        wr.writerow(vals)
Output:
"Edgebrook Park, Chicago ","Jun-7 A League of Their Own, Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
So the five columns remain intact: we joined the MovieDate values into a single string, and Formatted_Address is always the same for a given Location so we don't need to update it.
It turns out that to match what you wanted, all we needed to do was concatenate the MovieDate values and remove the duplicate entries for Location, Lat, Lng and Formatted_Address.
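As an aside, on Python 3.7+ plain dicts preserve insertion order, so the same grouping can be sketched without OrderedDict at all (shown here against an in-memory copy of the sample data rather than data.csv, for illustration):

```python
import csv
import io

# in-memory stand-in for data.csv, copied from the sample above
data = io.StringIO(
    "Location,MovieDate,Formatted_Address,Lat,Lng\n"
    '"Edgebrook Park, Chicago ",Jun-7 A League of Their Own,"Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672\n'
    '"Edgebrook Park, Chicago ","Jun-9 It\'s a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672\n'
)

grouped = {}  # plain dicts keep insertion order on Python 3.7+
for row in csv.DictReader(data):
    # copy the row but reset MovieDate to a list on first sight of this Location
    entry = grouped.setdefault(row["Location"], dict(row, MovieDate=[]))
    entry["MovieDate"].append(row["MovieDate"])
```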