Python OrderedDict issue

Views: 144 | Posted: 2017/5/24 20:56:02 | Tags: python, dictionary, ordereddictionary


Question

I have a CSV file with one record per line (the columns are Location, MovieDate, Formatted_Address, Lat and Lng). I have been told to use OrderedDict if I want to group by Location and combine all the MovieDate values that share the same Location value.

Example data:

Location,MovieDate,Formatted_Address,Lat,Lng
    "Edgebrook Park, Chicago ",Jun-7 A League of Their Own,"Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
    "Edgebrook Park, Chicago ","Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672

For every group of rows that share the same location (as in the example above), I'd like to produce a single output row like this, so that there are no duplicate locations:

 "Edgebrook Park, Chicago ","Jun-7 A League of Their Own Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672

What's wrong with my code using OrderedDict to do this?

from collections import OrderedDict

od = OrderedDict()
import csv
with open("MovieDictFormatted.csv") as f,open("MoviesCombined.csv" ,"w") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    header = next(r)
    for row in r:
        loc,rest = row[0], row[1]
        od.setdefault(loc, []).append(rest)
    wr.writerow(header)
    for loc,vals in od.items():
        wr.writerow([loc]+vals)

What I end up with is something like this:

['Edgebrook Park, Chicago ', 'Jun-7 A League of Their Own']
['Gage Park, Chicago ', "Jun-9 It's a Mad, Mad, Mad, Mad World"]
['Jefferson Memorial Park, Chicago ', 'Jun-12 Monsters University ', 'Jul-11 Frozen ', 'Aug-8 The Blues Brothers ']
['Commercial Club Playground, Chicago ', 'Jun-12 Despicable Me 2']

The issue is that the other columns don't show up in this case. How would I best do that? I would also prefer to make the MovieDate values one long string, like 'Jun-12 Monsters University Jul-11 Frozen Aug-8 The Blues Brothers ', instead of:

'Jun-12 Monsters University ', 'Jul-11 Frozen ', 'Aug-8 The Blues Brothers '

Thanks guys, I appreciate it. I'm a Python noob.

Changing row[0], row[1] to row[0], row[1:] unfortunately doesn't give me what I want. I only want to combine the values in the second column (MovieDate), not replicate all the other columns like this:

['Jefferson Memorial Park, Chicago ', ['Jun-12 Monsters University ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353'], ['Jul-11 Frozen ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353'], ['Aug-8 The Blues Brothers ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353']]

Recommended answer

You just need a couple of changes: join the columns that sit between the location and the lat/long, and use the lat and long as part of the key so that they are not duplicated:

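import csv
from collections import OrderedDict

od = OrderedDict()

with open("data.csv") as f, open("new.csv", "w") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    header = next(r)
    for row in r:
        # Key on (Location, Lat, Lng); collect the middle columns joined into one string.
        od.setdefault((row[0], row[-2], row[-1]), []).append(" ".join(row[1:-2]))
    wr.writerow(header)
    for loc, vals in od.items():
        # Location first, then the collected strings, then Lat and Lng from the key.
        wr.writerow([loc[0]] + vals + list(loc[1:]))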
Output:

Location,MovieDate,Formatted_Address,Lat,Lng
"Edgebrook Park, Chicago ","Jun-7 A League of Their Own Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA","Jun-9 It's a Mad, Mad, Mad, Mad World Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672

A League of Their Own comes first because its row appears before the Mad, Mad line in the input. row[1:-2] takes everything except the location, lat and long. We store the lat and long in the key tuple to avoid writing them again at the end of each row.
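To make the slicing and the key tuple concrete, here is a minimal sketch that runs the same grouping on the two sample rows from the question. It reads from an in-memory io.StringIO instead of data.csv, and the names sample, groups, key and middle are made up for this illustration.

```python
import csv
import io
from collections import OrderedDict

# The two sample rows from the question, held in memory instead of a file.
sample = io.StringIO(
    'Location,MovieDate,Formatted_Address,Lat,Lng\n'
    '"Edgebrook Park, Chicago ",Jun-7 A League of Their Own,'
    '"Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672\n'
    '"Edgebrook Park, Chicago ","Jun-9 It\'s a Mad, Mad, Mad, Mad World",'
    '"Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672\n'
)

reader = csv.reader(sample)
header = next(reader)  # skip the header row
groups = OrderedDict()
for row in reader:
    key = (row[0], row[-2], row[-1])  # (Location, Lat, Lng)
    middle = row[1:-2]                # [MovieDate, Formatted_Address]
    groups.setdefault(key, []).append(middle)

# One key for both rows, with the middle columns collected in input order.
for key, middles in groups.items():
    print(key)
    for middle in middles:
        print("   ", middle)
```

Both rows land under a single key because Location, Lat and Lng are identical, and the collected values keep the order in which the rows were read.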

Using names and unpacking might make it a little easier to follow:

import csv
from collections import OrderedDict

od = OrderedDict()

with open("data.csv") as f, open("new.csv", "w") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    header = next(r)
    for row in r:
        # Unpack the five columns by name.
        loc, mov, form, lat, lng = row
        od.setdefault((loc, lat, lng), []).append("{} {}".format(mov, form))
    wr.writerow(header)
    for loc, vals in od.items():
        wr.writerow([loc[0]] + vals + list(loc[1:]))

Using csv.DictReader and csv.DictWriter to keep five columns:

import csv
from collections import OrderedDict

od = OrderedDict()

with open("data.csv") as f, open("new.csv", "w") as out:
    # DictReader takes the field names from the header row.
    r = csv.DictReader(f)
    wr = csv.DictWriter(out, fieldnames=r.fieldnames)
    wr.writeheader()
    for row in r:
        # The first time a Location is seen, store its columns with an empty MovieDate list.
        od.setdefault(row["Location"], dict(Location=row["Location"], Lat=row["Lat"], Lng=row["Lng"],
                                        MovieDate=[], Formatted_Address=row["Formatted_Address"]))

        od[row["Location"]]["MovieDate"].append(row["MovieDate"])
    for loc, vals in od.items():
        # Collapse the collected MovieDate values into a single string.
        vals["MovieDate"] = ", ".join(vals["MovieDate"])
        wr.writerow(vals)

Output:

"Edgebrook Park, Chicago ","Jun-7 A League of Their Own, Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672

So the five columns remain intact. We joined the MovieDate values into a single string, and Formatted_Address is always the same for a given Location, so we don't need to update it.

It turns out that, to match what you wanted, all we needed to do was concatenate the MovieDate values and remove the duplicate entries for Location, Lat, Lng and Formatted_Address.
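As a self-contained illustration of that summary, the sketch below wraps the grouping in a hypothetical helper, combine_movie_dates, and runs it on the two sample rows from the question. It is not the answer's exact code. It assumes Python 3.7 or later, where a plain dict preserves insertion order and can stand in for OrderedDict, and it joins with a space (the sep parameter, also an assumption) to match the single-string output asked for in the question.

```python
import csv
import io


def combine_movie_dates(lines, sep=" "):
    """Group rows by Location and join their MovieDate values (hypothetical helper)."""
    reader = csv.DictReader(lines)
    grouped = {}  # plain dicts keep insertion order on Python 3.7+
    for row in reader:
        # First sighting of a Location: copy the row, but start an empty MovieDate list.
        entry = grouped.setdefault(row["Location"], dict(row, MovieDate=[]))
        entry["MovieDate"].append(row["MovieDate"])
    for entry in grouped.values():
        # Collapse the collected MovieDate values into one string.
        entry["MovieDate"] = sep.join(entry["MovieDate"])
    return reader.fieldnames, list(grouped.values())


# The two sample rows from the question, held in memory instead of a file.
sample = io.StringIO(
    'Location,MovieDate,Formatted_Address,Lat,Lng\n'
    '"Edgebrook Park, Chicago ",Jun-7 A League of Their Own,'
    '"Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672\n'
    '"Edgebrook Park, Chicago ","Jun-9 It\'s a Mad, Mad, Mad, Mad World",'
    '"Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672\n'
)

fieldnames, rows = combine_movie_dates(sample)

# Write the combined rows back out as CSV, keeping the original five columns.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Keying on Location alone mirrors the DictWriter version above. If two different addresses could ever share a Location string, keying on a (Location, Lat, Lng) tuple as in the earlier snippets would be the safer choice.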
