How to remove a line from a CSV if it contains a certain word?

Question

I have a CSV file that looks something like this:

    2014-6-06 08:03:19, 439105, 1053224, Front Entrance
    2014-6-06 09:43:21, 439105, 1696241, Main Exit
    2014-6-06 10:01:54, 1836139, 1593258, Back Archway
    2014-6-06 11:34:26, 845646, external, Exit 
    2014-6-06 04:45:13, 1464748, 439105, Side Exit

How can I delete a line if it includes the word "external"?

I saw another post on SO that addressed a very similar issue, but I don't completely understand it.

I tried to use something like this (as explained in the linked post):

TXT_file = 'whatYouWantRemoved.txt'
CSV_file = 'comm-data-Fri.csv'
OUT_file = 'OUTPUT.csv'

## From the TXT, create a list of domains you do not want to include in output
with open(TXT_file, 'r') as txt:
    domain_to_be_removed_list = []

    ## for each domain in the TXT
    ## remove the return character at the end of line
    ## and add the domain to list domains-to-be-removed list
    for domain in txt:
        domain = domain.rstrip()
        domain_to_be_removed_list.append(domain)


with open(OUT_file, 'w') as outfile:
    with open(CSV_file, 'r') as csv:

        ## for each line in csv
        ## extract the csv domain
        for line in csv:
            csv_domain = line.split(',')[0]

            ## if csv domain is not in domains-to-be-removed list,
            ## then write that to outfile
            if (csv_domain not in domain_to_be_removed_list):
                outfile.write(line)

The text file held just the one word "external", but it didn't work, and I don't understand why.

The program runs and the output file is generated, but nothing changes: no lines with "external" are taken out.

I'm using Windows and Python 3.4, in case it makes a difference.

Sorry if this seems like a really simple question, but I'm new to Python and any help in this area would be greatly appreciated, thanks!

Recommended answer

It looks like you are grabbing the first element after you split the line. According to your example CSV file, that gives you the date, which will never match "external".

What you probably want instead (again, assuming every line follows the layout of the example) is to grab the third element, so something like this:

csv_domain = line.split(',')[2].strip()  # strip() drops the space that follows each comma in the sample
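
As a quick check of what each index returns, here is a minimal sketch using one line copied from the sample data. It assumes the fields are separated by a comma followed by a space, as shown in the question; the variable names are only for illustration.

```python
# One line copied from the sample data in the question.
line = "2014-6-06 11:34:26, 845646, external, Exit \n"

fields = line.split(',')

first_field = fields[0]          # the timestamp, which never equals "external"
third_field = fields[2]          # keeps the space that follows the comma
third_clean = fields[2].strip()  # surrounding whitespace removed

print(repr(first_field), repr(third_field), repr(third_clean))
```

Because the raw third field still carries a leading space, comparing it against a list that contains "external" only matches after the whitespace is stripped.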

But, as one of the comments said, this isn't necessarily foolproof. You are assuming that none of the individual cells contain commas. Based on your example that might be a safe assumption, but in general when working with CSV files I recommend using the Python csv module.
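
A minimal sketch of that approach with the standard-library csv module might look like the following. The function name `remove_rows_with_word` and its parameters are hypothetical, chosen for illustration. It assumes a row should be dropped when any field equals the word after trimming whitespace.

```python
import csv

def remove_rows_with_word(in_path, out_path, word="external"):
    """Copy in_path to out_path, skipping rows where any field equals word."""
    # newline="" lets the csv module handle line endings itself.
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        # skipinitialspace=True ignores the space that follows each comma.
        reader = csv.reader(src, skipinitialspace=True)
        writer = csv.writer(dst)
        for row in reader:
            # Also trim trailing whitespace, e.g. "Exit " in the sample.
            fields = [field.strip() for field in row]
            if word not in fields:
                writer.writerow(fields)
```

Calling `remove_rows_with_word('comm-data-Fri.csv', 'OUTPUT.csv')` would then write every row except those with an "external" field. Note that this sketch writes the trimmed fields, so the output uses plain commas without the spaces found in the input.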
