How to create a nested dictionary from a csv file with N rows in Python

Views: 238 · Published: 2018/6/4 13:46:23 · Tags: python, csv, dictionary, hashmap, nested

Problem description

I was looking for a way to read a csv file with an unknown number of columns into a nested dictionary. That is, for input of the form

file.csv:
1,  2,  3,  4
1,  6,  7,  8
9, 10, 11, 12

I want a dictionary of the form:

{1:{2:{3:4}, 6:{7:8}}, 9:{10:{11:12}}}

This is to allow O(1) lookup of a value from the csv file. It is acceptable for creating the dictionary to take a relatively long time, since my application creates it only once but searches it millions of times.
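To make the lookup cost concrete, here is a minimal sketch of how such a nested dictionary might be queried. The helper name `lookup` and its `default` parameter are assumptions made for illustration and are not part of the original post. Each step is one hash lookup, so the cost depends on the number of columns rather than on the number of rows.

```python
def lookup(nested, *keys, default=None):
    """Walk a nested dictionary one key per level (hypothetical helper)."""
    node = nested
    for key in keys:
        # Stop early if we ran past a leaf value or the key is absent
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# The dictionary from the example above
table = {1: {2: {3: 4}, 6: {7: 8}}, 9: {10: {11: 12}}}

print(lookup(table, 1, 6, 7))   # 8
print(lookup(table, 9, 10))     # {11: 12}
print(lookup(table, 5, 5, 5))   # None
```

Passing fewer keys than there are levels returns the remaining sub-dictionary, which can be handy for partial queries.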

I also wanted an option to name the relevant columns, so that I can ignore the unnecessary ones.
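As a rough illustration of the column-naming idea, the sketch below picks columns by name with `csv.DictReader`. It assumes the file starts with a header line. The column names `a` to `d`, the in-memory `sample` text, and the helper `select_columns` are all hypothetical and only serve the example.

```python
import csv
import io

# Hypothetical sample: the data from the question plus an assumed header line
sample = "a,b,c,d\n1,2,3,4\n1,6,7,8\n9,10,11,12\n"


def select_columns(csv_text, params):
    """Return rows reduced to the named columns, in the order given by params."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [name for name in params if name not in reader.fieldnames]
    if missing:
        raise ValueError('Could not find {} in csv file'.format(missing))
    # Convert each selected cell to a number; unselected columns are ignored
    return [[float(record[name]) for name in params] for record in reader]


rows = select_columns(sample, ['a', 'c', 'd'])
print(rows)  # [[1.0, 3.0, 4.0], [1.0, 7.0, 8.0], [9.0, 11.0, 12.0]]
```

The order of `params` decides the nesting order later on, so `['c', 'a', 'd']` would produce a dictionary keyed by column `c` first.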
Solution
Here is what I came up with. Feel free to comment and suggest improvements.
    import csv
    import itertools


    def list_to_dict(lst):
        # Takes a list and recursively turns it into a nested dictionary, where
        # the first element is a key whose value is the dictionary created from
        # the rest of the list. The last element in the list will be the value
        # of the innermost dictionary.
        # INPUTS:
        #   lst - a list (e.g. of strings or floats)
        # OUTPUT:
        #   A nested dictionary
        # EXAMPLE RUN:
        #   >>> lst = [1, 2, 3, 4]
        #   >>> list_to_dict(lst)
        #   {1: {2: {3: 4}}}
        if len(lst) == 1:
            return lst[0]
        # Recurse on the tail; the caller's list is left untouched
        return {lst[0]: list_to_dict(lst[1:])}


    def dict_combine(d1, d2):
        # Combines two nested dictionaries into one.
        # INPUTS:
        #   d1, d2: Two nested dictionaries. The function changes d1 and may
        #           share sub-dictionaries with d2, therefore if the input
        #           dictionaries are not to be mutated, you should pass
        #           copies of d1 and d2.
        #           Note that the function works more efficiently if d1 is
        #           the bigger dictionary.
        # OUTPUT:
        #   The combined dictionary
        # EXAMPLE RUN:
        #   >>> d1 = {1: {2: {3: 4, 5: 6}}}
        #   >>> d2 = {1: {2: {7: 8}, 9: {10: 11}}}
        #   >>> dict_combine(d1, d2)
        #   {1: {2: {3: 4, 5: 6, 7: 8}, 9: {10: 11}}}
        for key in d2:
            if (key in d1 and isinstance(d1[key], dict)
                    and isinstance(d2[key], dict)):
                d1[key] = dict_combine(d1[key], d2[key])
            else:
                # New key, or a leaf value on either side: d2's value wins
                d1[key] = d2[key]
        return d1


    def csv_to_dict(csv_file_path, params=None, n_row_max=None):
        # NAME: csv_to_dict
        #
        # DESCRIPTION: Reads a csv file and turns relevant columns into a nested
        #              dictionary.
        #
        # INPUTS:
        #   csv_file_path: The full path to the data file
        #   params: A list of relevant column names. The resulting dictionary
        #           will be nested in the same order as parameters in 'params'.
        #           Default is None (read all columns)
        #   n_row_max: The maximum number of rows to read. Default is None
        #              (read all rows)
        #
        # OUTPUT:
        #   A nested dictionary containing all the relevant csv data

        csv_dictionary = {}

        with open(csv_file_path, 'r', newline='') as csv_file:
            csv_data = csv.reader(csv_file, delimiter=',')
            names = next(csv_data)  # Read title line (the file must start with one)

            if not params:
                # A list of column indices to read from csv: every column
                relevant_param_indices = list(range(len(names)))
            else:
                # A list of column indices to read from csv
                relevant_param_indices = []
                for name in params:
                    if name not in names:
                        # Parameter name is not found in title line
                        raise ValueError('Could not find {} in csv file'.format(name))
                    # Get indices of the relevant columns
                    relevant_param_indices.append(names.index(name))

            # The title line is already consumed, so start at the first data row
            for row in itertools.islice(csv_data, n_row_max):
                # Get a list containing relevant columns only
                relevant_cols = [row[i] for i in relevant_param_indices]
                # Turn the strings to numbers. Not necessary
                float_row = [float(element) for element in relevant_cols]
                # Build nested dictionary
                csv_dictionary = dict_combine(csv_dictionary, list_to_dict(float_row))

        return csv_dictionary
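For comparison, the same nesting can also be built in a single pass without creating one dictionary per row and merging it. The sketch below is an alternative, not the approach of the answer above. The name `build_nested` is hypothetical, and it assumes every row has the same length of at least two cells.

```python
def build_nested(rows):
    """Fold rows into one nested dict (hypothetical alternative helper).

    All cells but the last act as keys; the last cell is the stored value.
    Assumes rows of equal length with at least two cells each.
    """
    root = {}
    for row in rows:
        *keys, leaf_key, value = row
        node = root
        for key in keys:
            # Reuse the existing branch or create an empty one
            node = node.setdefault(key, {})
        # A repeated key path overwrites the earlier value
        node[leaf_key] = value
    return root


rows = [[1, 2, 3, 4], [1, 6, 7, 8], [9, 10, 11, 12]]
nested = build_nested(rows)
print(nested)  # {1: {2: {3: 4}, 6: {7: 8}}, 9: {10: {11: 12}}}
```

Because `dict.setdefault` walks into branches that already exist, rows sharing a prefix (here the two rows starting with 1) land under the same key without a separate merge step.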
