Speed of many regular expressions in Python

Question

I'm writing a Python program that deals with a fair number of strings and files. My problem is that I'll be given a fairly short piece of text, and I need to search it for instances of a fairly broad range of words and phrases.

I think I'll need to compile regular expressions to match these words and phrases in the text. My concern, however, is that this will take a lot of time.

My question is: how fast is the process of repeatedly compiling regular expressions and then searching a small body of text for matches? Would I be better off using a string method?

As an example: how expensive would it be to compile and search with one regular expression, compared with evaluating 'if "word" in string' five times?

Recommended answer

If speed is of the essence, you are better off running some tests before you decide how to code your production application.

First of all, you said that you are searching for words, which suggests that you may be able to use split() to break the string up on whitespace and then use simple string comparisons to do your search.
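A minimal sketch of the split() approach, assuming the search terms are single lowercase words rather than multi-word phrases. The function name find_words and the sample data are made up for illustration; stripping punctuation and lowercasing are extra assumptions that the question does not specify.

```python
import string


def find_words(text, wanted):
    """Return the members of `wanted` that occur as whole words in `text`.

    `wanted` is assumed to be a set of lowercase single words.
    """
    # Split on whitespace, then strip surrounding punctuation and lowercase
    # each token so that "speed," and "Speed" both count as "speed".
    tokens = {tok.strip(string.punctuation).lower() for tok in text.split()}
    # Set intersection does all the comparisons in one step.
    return tokens & wanted


wanted = {"regex", "compile", "speed"}
found = find_words("Does compile speed matter? Measure it.", wanted)
```

Putting the tokens in a set keeps each lookup cheap even when the word list is long. This approach does not handle phrases that span whitespace, which would need a different strategy.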

Definitely compile your regular expressions, and run a timing test comparing them with the plain string functions. Check the documentation for the str type for a full list of string methods.
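One way such a timing test might look, using the standard timeit module. The sample text, the word list, and the function names are placeholders chosen for illustration, and the numbers will vary by machine and input, so treat this as a starting point rather than a benchmark result. Note that the re module keeps a cache of recently compiled patterns, so recompiling the same pattern is usually cheaper than a first compile.

```python
import re
import timeit

# Placeholder inputs; substitute text and words from the real workload.
text = "I'm going to be presented with a fairly short piece of text."
words = ["short", "text", "regex", "python", "piece"]

# One alternation pattern, compiled once. re.escape guards against
# regex metacharacters appearing in the search terms.
pattern = re.compile("|".join(re.escape(w) for w in words))


def with_compiled_regex():
    return pattern.search(text) is not None


def with_recompile():
    # Calls re.compile on every invocation.
    source = "|".join(re.escape(w) for w in words)
    return re.compile(source).search(text) is not None


def with_in_operator():
    # Plain substring tests, one per word.
    return any(w in text for w in words)


def time_all(number=10000):
    """Time each candidate and return a name -> seconds mapping."""
    candidates = (with_compiled_regex, with_recompile, with_in_operator)
    return {f.__name__: timeit.timeit(f, number=number) for f in candidates}


if __name__ == "__main__":
    for name, seconds in time_all().items():
        print(f"{name}: {seconds:.4f}s")
```

All three candidates do substring matching, so they answer the same question and the comparison is like for like. Adding word boundaries to the pattern would change what the regex matches and should be timed separately.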
