Create a dict of lists from a CSV using Python
Problem description
I have a CSV file with the following data:
XPATH,ColumName,CSV_File_Name,ParentKey
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/id,id,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/attachments/attachment[]/id,aid,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/Internalid,Internalid,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/isDelete,FormId,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/fields/field[]/id,SupplierFormRecordFieldId,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/fields/field[]/value,SupplierFormRecordFieldValue,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/integrationTrackingNumber,integrationTrackingNumber,integrationEntityDetailsForms.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityHeader/referenceCodeForEntity,referenceCodeForEntity,integrationEntityDetailsForms.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/requestId,requestId,integrationEntityDetailsForms.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/id,supplier_id,integrationEntityDetailsForms.csv,Y
I want to create a dictionary of lists like the one below. Each XPATH is split on [], and for every row the part before the first [] goes into the first list, the part up to the second [] into the second list, and so on. Records that do not contain [] are discarded. This gives the list of tags at each level.
{ 1 : ['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form', 'integration-outbound:IntegrationEntity.integrationEntityHeader.attachments.attachment'] , 2 : ['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record'] , 3 : ['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record.fields.field'] }
So far I have managed to split the string on [], convert / to ., and accumulate the split pieces. But I am stuck on putting the results back into a dictionary of lists, which would tell me the level of each tag.
df_process_sub_explode_Level gives the individual list for each row in the CSV, but I still need to remove duplicates and populate the dict.
CSV_File_Name = []
with open(process_config_csv, newline='') as csvfile:
    DataCaptured = csv.DictReader(csvfile)
    for row in DataCaptured:
        if row['CSV_File_Name'] not in CSV_File_Name:
            CSV_File_Name.append(row['CSV_File_Name'])
df_process = []
df_process_all_col = []
df_process_explode_Level = dict()
for items in CSV_File_Name:
    df_subset_process = []
    df_subset_list_all_cols = []
    with open(process_config_csv, newline='') as csvfile:
        DataCaptured = csv.DictReader(csvfile)
        for row in DataCaptured:
            df_process_sub_explode_Level = []
            if row['CSV_File_Name'] in items:
                df_subset_process.append(row['XPATH'].replace("/",".").split('[]')[0].replace(".","",1))
                df_subset_list_all_cols.append(row['XPATH'].replace("/",".").replace("[]","").replace(".","",1))
                if "[]" in row['XPATH']:
                    print(row['XPATH'])
                    df_process_sub_explode_Level=row['XPATH'].replace("/",".").replace(".","",1).split('[]')
                    del df_process_sub_explode_Level[-1]
                    df_process_sub_explode_Level = list(accumulate(df_process_sub_explode_Level))
                    for explodeitems in range(len(df_process_sub_explode_Level)):
                        df_process_explode_Level[explodeitems].append(df_process_sub_explode_Level[explodeitems])
Error:
Traceback (most recent call last):
File "<stdin>", line 17, in <module>
KeyError: 0
Please advise how to get the dictionary of lists populated.
Recommended answer
Try this:
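from csv import DictReader
from collections import defaultdict

with open('data.csv', newline='') as fp:
    csv_reader = DictReader(fp)
    # Normalise each XPATH to dotted form, then split it at every '[]' marker.
    data = [row['XPATH'].strip('/').replace('/', '.').split('[]') for row in csv_reader]

# Level (number of '[]' markers) -> set of tag paths; the set removes duplicates.
res = defaultdict(set)
for x in data:
    if len(x) > 1:  # skip rows that contain no '[]'
        res[len(x) - 1].add(''.join(x[:-1]))

res = {k: list(v) for k, v in res.items()}
print(res)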
Output:
{1: ['integration-outbound:IntegrationEntity.integrationEntityHeader.attachments.attachment',
'integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form'],
2: ['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record'],
3: ['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record.fields.field']}
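The KeyError: 0 in the question appears to come from calling .append() on a key that a plain dict does not contain yet. As a minimal sketch, assuming the accumulate-based approach from the question should be kept, a defaultdict(list) avoids that error and numbers the levels from 1. The function name levels_from_xpaths and the two-row sample are hypothetical and used only for illustration.

```python
import csv
import io
from collections import defaultdict
from itertools import accumulate


def levels_from_xpaths(xpaths):
    """Map each nesting level (1-based) to the tag paths found at that level."""
    levels = defaultdict(list)  # a missing level starts as an empty list, so no KeyError
    for xpath in xpaths:
        parts = xpath.strip('/').replace('/', '.').split('[]')
        # Drop the piece after the last '[]'; accumulate() rebuilds every prefix.
        for level, prefix in enumerate(accumulate(parts[:-1]), start=1):
            if prefix not in levels[level]:  # keep first-seen order, skip duplicates
                levels[level].append(prefix)
    return dict(levels)


# Hypothetical two-row sample in the question's column layout.
sample = io.StringIO(
    "XPATH,ColumName,CSV_File_Name,ParentKey\n"
    "/a/b[]/c[]/d,d,out.csv,\n"
    "/a/e,e,out.csv,Y\n"
)
result = levels_from_xpaths(row['XPATH'] for row in csv.DictReader(sample))
print(result)  # {1: ['a.b'], 2: ['a.b.c']}
```

Unlike the set-based answer, this variant records a level-1 path even when it only occurs as the prefix of a deeper path, and it keeps the order in which paths first appear. For the data in the question both approaches should give the same groups.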