Python CSV: write rows according to dict mapping

Problem description

I have a dict that describes a mapping I want applied to every row in a CSV file.

dict1 = {"key1":["value1", "value2"], "key2":["value3"]}

My program should read one row and map the key in a specific column to the value(s) provided by the dict. If there's only one value per key, then the script should write to a new file the row containing the new value. If there are multiple values to a key, then there should be one new row written per value.

For example, csvin contains 2 rows. One row has a column in which key1 is present, and the other has key2. In this case, the output file csvout should contain more rows than csvin, in effect 3. Two of the rows (associated with key1) will be identical except for one single value.

My current script is this:

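import csv

def convSan(sfin, cfout):
    # Python 3: open CSV files in text mode with newline="" (not "rb"/"wb").
    with open(sfin, newline="") as fin, open(cfout, "w", newline="") as fout:
        csvin = csv.reader(fin)
        csvout = csv.writer(fout, delimiter=",")
        fline = next(csvin)  # header row
        csvout.writerow(fline)

        for row in csvin:
            # Problem: this stores the whole list of values in one field.
            row[25] = dict1[row[25]]
            csvout.writerow(row)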

This produces an output file with the same number of columns as the input file, but populates every field with the correct new values (some fields are now lists of values).

The answer provided by @sr2222 works in the case of simple lists, but I cannot get it to work in my particular case.

Help is appreciated.

Solution

First:

for index, value in enumerate(list1):
    list1[index] = list2[index]

is a cleaner way to write your first loop. However, when the two lists are the same length, that leaves list1 with the same contents as list1 = copy.copy(list2). I think what you are trying to do is:

normalized_values = ['123', '456']
content = ['a123', '123', 'b456', '789']
for index, value in enumerate(content):
    for normalized_value in normalized_values:
        if normalized_value in value:
            content[index] = normalized_value

Which will leave you with:

content = ['123', '123', '456', '789']
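
The same normalization can be sketched as a function that returns a new list instead of modifying content in place. The name normalize is hypothetical, and this version keeps the first normalized value found in each entry, whereas the loop above keeps the last one; the two only differ when an entry contains more than one normalized value.

```python
def normalize(content, normalized_values):
    """Return a copy of content with each entry replaced by the first
    normalized value it contains; entries with no match are kept as-is."""
    result = []
    for value in content:
        # next() falls back to the original value when nothing matches.
        match = next((nv for nv in normalized_values if nv in value), value)
        result.append(match)
    return result


normalized = normalize(['a123', '123', 'b456', '789'], ['123', '456'])
```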

Edit after question update:

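replacement_map = {'123': ('a123', '1234'), '456': ('00456',)}
values = ['123', '456', '234', '123', '789']  # renamed from "input" to avoid shadowing the builtin
output = []
for value in values:
    try:
        # A key with several replacements contributes several items.
        output.extend(replacement_map[value])
    except KeyError:
        # Values without a mapping are kept unchanged.
        output.append(value)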

The try/except is equivalent to:

if value in replacement_map:
    output.extend(replacement_map[value])
else:
    output.append(value)
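
Applied to the CSV case in the question, the same idea might look like the sketch below: each input row is written once per mapped value. This is only an illustration under some assumptions: it targets Python 3, expand_rows is a hypothetical helper name, the mapped column is index 1 rather than 25, in-memory io.StringIO buffers stand in for the real files, and keys missing from the dict are passed through unchanged (mirroring the try/except above).

```python
import csv
import io


def expand_rows(rows, mapping, column):
    """Yield one output row per mapped value for the key found in `column`."""
    for row in rows:
        key = row[column]
        # Assumption: keys missing from the mapping pass through unchanged.
        for value in mapping.get(key, [key]):
            new_row = list(row)      # copy so each output row is independent
            new_row[column] = value
            yield new_row


dict1 = {"key1": ["value1", "value2"], "key2": ["value3"]}

# In-memory stand-ins for the input and output files.
fin = io.StringIO("id,code\n1,key1\n2,key2\n")
fout = io.StringIO()

csvin = csv.reader(fin)
csvout = csv.writer(fout, lineterminator="\n")
csvout.writerow(next(csvin))                          # copy the header row
csvout.writerows(expand_rows(csvin, dict1, column=1))

result = fout.getvalue()
```

With real files, the two StringIO objects would be replaced by open(..., newline="") handles, and column would be set to the index that holds the key.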

In response to comment on building the map from 2 lists as described above (note this will only behave correctly if you can always assume list1 and list2 are the same length):

replacement_map = {}
for key, value in zip(list1, list2):
    try:
        replacement_map[key].append(value)
    except KeyError:
        replacement_map[key] = [value]
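
As an alternative sketch, collections.defaultdict from the standard library removes the need for the try/except when building the map. The sample contents of list1 and list2 here are made up for illustration.

```python
from collections import defaultdict

list1 = ["123", "456", "123"]        # keys (sample data)
list2 = ["a123", "00456", "1234"]    # values, paired with keys by position

grouped = defaultdict(list)
for key, value in zip(list1, list2):
    # A missing key starts out as an empty list automatically.
    grouped[key].append(value)

# Convert back to a plain dict so that later lookups of unknown keys
# raise KeyError instead of silently creating empty entries.
replacement_map = dict(grouped)
```

The conversion back to dict matters if the map is later used with the try/except KeyError pattern shown earlier, since a defaultdict would never raise KeyError on lookup.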
