Search a single column for a particular value in a CSV file and return an entire row


Problem description

Issue

The code does not correctly identify the input (item). It simply dumps to my failure message even if such a value exists in the CSV file. Can anyone help me determine what I am doing wrong?

Background

I am working on a small program that asks for user input (function not given here), searches a specific column in a CSV file (Item) and returns the entire row. The CSV data format is shown below. I have shortened the data from the actual amount (49 field names, 18000+ rows).

Code

import csv
from collections import namedtuple
from contextlib import closing

def search():
    item = 1000001
    raw_data = 'active_sanitized.csv'
    failure = 'No matching item could be found with that item code. Please try again.'
    check = False

    with closing(open(raw_data, newline='')) as open_data:
        read_data = csv.DictReader(open_data, delimiter=';')
        item_data = namedtuple('item_data', read_data.fieldnames)
        while check == False:
            for row in map(item_data._make, read_data):
                if row.Item == item:
                    return row
                else:
                    return failure     

CSV structure

active_sanitized.csv
Item;Name;Cost;Qty;Price;Description
1000001;Name here:1;1001;1;11;Item description here:1
1000002;Name here:2;1002;2;22;Item description here:2
1000003;Name here:3;1003;3;33;Item description here:3
1000004;Name here:4;1004;4;44;Item description here:4
1000005;Name here:5;1005;5;55;Item description here:5
1000006;Name here:6;1006;6;66;Item description here:6
1000007;Name here:7;1007;7;77;Item description here:7
1000008;Name here:8;1008;8;88;Item description here:8
1000009;Name here:9;1009;9;99;Item description here:9

Notes

My experience with Python is relatively little, but I thought this would be a good problem to start with in order to learn more.

I determined the methods to open (and wrap in a close function) the CSV file, read the data via DictReader (to get the field names), and then create a named tuple to be able to quickly select the desired columns for the output (Item, Cost, Price, Name). Column order is important, hence the use of DictReader and namedtuple.

While there is the possibility of hard-coding each of the field names, I felt that if the program can read them on file open, it would be much more helpful when working on similar files that have the same column names but different column organization.

Research

Solution

You have three problems with this:

  • You return on the first failure, so it will never get past the first line.
  • You are reading strings from the file, and comparing to an int.
  • _make iterates over the dictionary keys, not the values, producing the wrong result (item_data(Item='Name', Name='Price', Cost='Qty', Qty='Item', Price='Cost', Description='Description')).

    for row in (item_data(**data) for data in read_data):
        if row.Item == str(item):
            return row
    return failure
    

This fixes the issues at hand - we check against a string, and we only return if none of the items matched (although you might want to begin converting the strings to ints in the data rather than this hackish fix for the string/int issue).
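As a rough sketch of the alternative mentioned above, the comparison can convert the file's value to an int instead of converting the search term to a string. The version below is only an illustration: the `search(open_data, item)` signature, the `SAMPLE` text, and the `FAILURE` constant are assumptions made so the example runs without the real file, and it assumes every Item value is a numeric string (`int()` raises an error otherwise).

```python
import csv
import io
from collections import namedtuple

# Two rows copied from the question's sample data (stand-in for the real file).
SAMPLE = """Item;Name;Cost;Qty;Price;Description
1000001;Name here:1;1001;1;11;Item description here:1
1000002;Name here:2;1002;2;22;Item description here:2
"""

FAILURE = 'No matching item could be found with that item code. Please try again.'


def search(open_data, item):
    """Return the first row whose Item column equals `item` (an int), else FAILURE."""
    read_data = csv.DictReader(open_data, delimiter=';')
    item_data = namedtuple('item_data', read_data.fieldnames)
    for data in read_data:
        row = item_data(**data)
        # Convert the string read from the file rather than the search term.
        if int(row.Item) == item:
            return row
    # Reached only after every row has been checked.
    return FAILURE


print(search(io.StringIO(SAMPLE), 1000001))
print(search(io.StringIO(SAMPLE), 42))
```

With the real file, the same function could be called as `search(open_data, 1000001)` inside `with open('active_sanitized.csv', newline='') as open_data:`. File objects are already context managers, so the `closing()` wrapper is not needed.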

I have also changed the way you loop. The generator expression builds each named tuple with ordinary keyword-argument construction, item_data(**data), so every dict key is matched to the field of the same name. This is cleaner and more readable than using _make and map(), and it also fixes problem 3.
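To make problem 3 concrete, here is a small standalone comparison of the two construction styles. The three-field tuple and the `data` dict are made up for illustration. On Python 3.7 and later a dict iterates in insertion order, so `_make` fills each field with its own column name; on older versions the order was arbitrary, which would explain the shuffled result quoted above.

```python
from collections import namedtuple

# Hypothetical three-column row, shaped like what DictReader yields.
item_data = namedtuple('item_data', ['Item', 'Name', 'Cost'])
data = {'Item': '1000001', 'Name': 'Name here:1', 'Cost': '1001'}

# _make() accepts any iterable; iterating a dict yields its keys, not its values.
from_make = item_data._make(data)

# Keyword unpacking matches each key to the field of the same name.
from_kwargs = item_data(**data)

print(from_make)    # fields hold column names
print(from_kwargs)  # fields hold the row's values
```

Note that `item_data(**data)` relies on every column name being a valid Python identifier, which holds for the sample header but is worth checking for the full 49-column file.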
