Searching for a specific phrase in a CSV file using regex in Python

Views: 135 | Published: 2020/7/12 1:45:27 | Tags: python, regex, csv


Question

I have a CSV database of tweets, which I need to search for a list of specific phrases and words. For example, I'm searching for "global warming". I want to find not only "global warming", but also "Global warming", "Global Warming", "#globalwarming", "#Globalwarming", "#GlobalWarming", etc., that is, all the possible forms.

How could I implement regex in my code to do that? Or maybe there's another solution?

import csv

# Phrases to look for; the original list was elided with "etc."
search_terms = ["global warming", "GLOBAL WARMING"]

with open('filedirectory.csv', 'w', newline='') as output_file:
    writer = csv.writer(output_file)

    with open('filedirectory1.csv', 'w', newline='') as output_file2:
        writer2 = csv.writer(output_file2)

        with open('filedirectory2.csv', newline='') as csv_file:
            csv_read = csv.reader(csv_file)

            for row in csv_read:
                # row[2] is the column being searched
                if any(term in row[2] for term in search_terms):
                    writer.writerow(row)
                else:
                    writer2.writerow(row)

Recommended answer

You can keep your own code with one small modification: lowercase the text before checking it, and list each phrase once in lowercase.

...

for row in csv_read:
    # row is a list of fields, so lowercase the searched column (row[2]), not the row itself
    text_lower = row[2].lower()
    search_terms = ["global warming", "globalwarming"]

    if any(term in text_lower for term in search_terms):
        writer.writerow(row)
    else:
        writer2.writerow(row)
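
As a rough illustration of the lowercase check above, here is a minimal, self-contained sketch. The helper name `contains_any` and the sample strings are made up for this example and are not part of the original answer.

```python
def contains_any(text, terms):
    """Return True if any term occurs in text, ignoring case."""
    text_lower = text.lower()
    return any(term.lower() in text_lower for term in terms)


search_terms = ["global warming", "globalwarming"]

# Hypothetical sample tweets
samples = [
    "Global Warming is real",
    "so hot today #GlobalWarming",
    "global  warming",        # two spaces: plain substrings do not catch this
    "nothing relevant here",
]

for sample in samples:
    print(repr(sample), contains_any(sample, search_terms))
```

The third sample shows the limit of this approach: any variation in spacing between the two words needs its own entry in the list, which is where a regex becomes more convenient.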

If you must use regex, or you are worried about missing rows such as "...global(more than one space)warming...", "..global____warming.." or "..global serious warming..", compile a case-insensitive pattern that allows arbitrary text between the two words:

...

import re

global_regex = re.compile(r'global.*?warming', re.IGNORECASE)
for row in csv_read:
    # search the text column (row[2]); row itself is a list, not a string
    if global_regex.search(row[2]):
        writer.writerow(row)
    else:
        writer2.writerow(row)

I compiled the regex outside the loop for better performance.
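
Putting the pieces together, the following is a minimal end-to-end sketch of the same idea. It assumes the searched text sits in column index 2, as in the question; the function name `split_rows` and the in-memory sample data are hypothetical and stand in for the real files.

```python
import csv
import io
import re

# Compiled once, outside any loop, as in the answer
GLOBAL_REGEX = re.compile(r'global.*?warming', re.IGNORECASE)


def split_rows(rows, pattern, column=2):
    """Partition rows into (matching, non_matching) by a regex search on one column."""
    matching, non_matching = [], []
    for row in rows:
        # Rows too short to have the column are treated as non-matching
        text = row[column] if len(row) > column else ""
        if pattern.search(text):
            matching.append(row)
        else:
            non_matching.append(row)
    return matching, non_matching


# Hypothetical sample data standing in for the tweets file
sample_csv = io.StringIO(
    '1,alice,"Global   Warming is here"\n'
    '2,bob,"Lovely weather, today"\n'
    '3,carol,"#GlobalWarming"\n'
)

matched, unmatched = split_rows(csv.reader(sample_csv), GLOBAL_REGEX)
print(matched)
print(unmatched)
```

With real files, the two lists could be passed to `csv.writer(...).writerows(...)` on the two output files instead of being printed.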

Here you can see the regex in action.
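
For a quick local check of what the pattern matches, a small sketch like the one below may help. The `strict` pattern is an assumed alternative that is not part of the original answer: it only allows whitespace or underscores between the two words, which trades recall for fewer false positives.

```python
import re

# Pattern from the answer: any characters (lazily) between the two words
loose = re.compile(r'global.*?warming', re.IGNORECASE)
# Assumed alternative, not from the answer: only whitespace/underscores in between
strict = re.compile(r'global[\s_]*warming', re.IGNORECASE)

# Hypothetical sample strings
samples = [
    "Global Warming",
    "#GlobalWarming",
    "global____warming",
    "global serious warming",
    "global markets rally while the soup is warming",
]

for text in samples:
    print(f"{text!r}: loose={bool(loose.search(text))}, strict={bool(strict.search(text))}")
```

The last sample illustrates the cost of `.*?`: any line containing "global" followed later by "warming" is accepted, however unrelated the words are. Which pattern fits better depends on how many false positives the data can tolerate.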
