Search Multiple Strings (from File) in a file and print the line

Question

Apologies again for being a noob here. I am trying the code below to read multiple search strings from a keywords file, search for them in f, and print each matching line. It works if I have only one keyword, but not if I have more than one.

keywords = input("Please Enter keywords path as c:/example/ \n :")
keys = open((keywords), "r").readline()
with open("c:/saad/saad.txt") as f:
    for line in f:
        if (keys) in line:
            print(line)

Recommended answer

One of the challenges of looking for keywords is defining what you mean by a keyword and how a file's contents should be parsed to find the full set of keywords. If "aa" is a keyword, should it match "aaa" or maybe "aa()"? Can a keyword have numbers in it?
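To make those questions concrete, here is a minimal sketch comparing three possible definitions of a "match". The function names are hypothetical and chosen only for illustration; they are not part of the original answer.

```python
import re

def substring_match(key, text):
    # Plain substring test: "aa" is found anywhere inside the text.
    return key in text

def word_boundary_match(key, text):
    # Whole-word test using \b; digits and underscores count as word characters.
    return re.search(r'\b' + re.escape(key) + r'\b', text) is not None

def alpha_token_match(key, text):
    # Split the text into runs of letters only and compare each run exactly.
    return key in re.findall(r'[A-Za-z]+', text)

for sample in ("aaa", "aa()", "aa1"):
    print(sample,
          substring_match("aa", sample),
          word_boundary_match("aa", sample),
          alpha_token_match("aa", sample))
```

The three definitions disagree on inputs such as "aa1", which is why the matching rule has to be chosen before the search is written.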

A simple solution is to say that keywords are alphabetic only and should match contiguous alphabetic strings exactly, ignoring case. Further, matches should be considered line by line, not sentence by sentence. We can use a regex to extract the words on each line and a set intersection to check whether any keyword is among them, like so:

keys.txt

aa bb 

test.txt

aa is good
AA is good
bb is good
cc is not good
aaa is not good

test.py

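import re

keyfile = "keys.txt"
testfile = "test.txt"

# Keywords are whitespace-separated on the first line of the key file.
# Note: \w+ also matches digits and underscores; use [A-Za-z]+ for
# strictly alphabetic keywords.
with open(keyfile) as kf:
    keys = set(key.lower() for key in re.findall(r'\w+', kf.readline()))

with open(testfile) as f:
    for line in f:
        # Lower-case every word on the line so the comparison ignores case.
        words = set(word.lower() for word in re.findall(r'\w+', line))
        if keys & words:  # at least one keyword appears as a whole word
            print(line, end='')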

Result:

aa is good
AA is good
bb is good

Add more rules for what counts as a match and it gets more complicated.
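As one illustration of how extra rules change the code, the sketch below assumes keywords may contain digits, spaces, or punctuation and must still match as whole units, ignoring case. The helper name build_matcher and the sample keywords are hypothetical. It swaps the set intersection for a single compiled regex with lookarounds, because \b does not behave well next to keys such as "c++".

```python
import re

def build_matcher(keys):
    # Escape each key so punctuation is literal; try longer keys first.
    ordered = sorted(keys, key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in ordered)
    # The lookarounds require that no word character touches either end.
    return re.compile(r'(?<!\w)(?:' + alternatives + r')(?!\w)', re.IGNORECASE)

matcher = build_matcher(["aa", "error 42", "c++"])
lines = ["AA is good", "aaa is not good", "got Error 42 today",
         "I like C++", "error 421"]
matched = [line for line in lines if matcher.search(line)]
print(matched)
```

This sketch assumes the key list is non-empty; an empty list would produce a pattern that matches almost every line, so a real script would need to guard against that.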

Edit

Suppose you have one keyword per line and you just want a substring match (that is, "aa" matches "aaa") instead of a keyword search. You could do:

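keyfile = "keys.txt"
testfile = "test.txt"

# One keyword per line; strip whitespace and skip blank lines.
with open(keyfile) as kf:
    keys = [key for key in (line.strip() for line in kf) if key]

with open(testfile) as f:
    for line in f:
        for key in keys:
            if key in line:  # plain, case-sensitive substring test
                print(line, end='')
                break  # print each line once, even if several keys match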

But I'm just guessing what your criteria are.
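Since the criteria are uncertain, one option is to wrap the substring search in a small function with a switch for case sensitivity. This is only a sketch: the function name lines_containing_any and the ignore_case parameter are hypothetical, and the sample lines stand in for the contents of test.txt.

```python
def lines_containing_any(lines, keys, ignore_case=False):
    """Yield each line that contains at least one key as a substring."""
    if ignore_case:
        keys = [k.lower() for k in keys]
    for line in lines:
        haystack = line.lower() if ignore_case else line
        if any(key in haystack for key in keys):
            yield line

sample_lines = ["aa is good\n", "AA is good\n", "bb is good\n",
                "cc is not good\n", "aaa is not good\n"]

for line in lines_containing_any(sample_lines, ["aa", "bb"]):
    print(line, end='')
```

Because the function accepts any iterable of lines, an open file object can be passed in place of sample_lines.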
