Which is faster, XPath or Regexp?


Problem description

I am making an add-on for Firefox that loads an HTML page using Ajax (the add-on has its own XUL panel).

At this point, I have not looked into creating a document object, placing the Ajax response into it, and then using XPath to find what I need.
Instead, I am loading the content and parsing it as text with a regular expression.

But I have a question. Which would be better to use, XPath or a regular expression? Which is faster to run?

The HTML page would consist of hundreds of elements that contain the same text, and what I basically want to do is count how many such elements there are.

I want my add-on to work as fast as possible, and I do not know the mechanics behind regexp or XPath, so I don't know which is more effective.

I hope that was clear. Thanks.

Solution

Whenever you are dealing with XML, use XPath (or XSLT, XQuery, SAX, DOM or any other XML-aware method to go through your data). Never use regular expressions for this task.

Why? XML processing is intricate. Dealing with all its oddities (external, parsed and unparsed entities, DTDs, processing instructions, whitespace handling and collapsing, Unicode normalization, CDATA sections and so on) makes it very hard to get your data reliably with a regex. Just consider that it has taken the industry years to learn how best to parse XML; that should be reason enough not to try to do it yourself.
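To make the reliability point concrete, here is a minimal sketch in Python. It uses the standard library's `xml.etree.ElementTree`, which supports a limited XPath subset, as a stand-in for a browser's XPath engine. The sample document, the element name `item`, and both function names are hypothetical and exist only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: three real <item> elements (one uses single quotes),
# plus one inside a comment and one inside CDATA that must NOT be counted.
SAMPLE = """<root>
  <item class="hit">same text</item>
  <item class="hit">same text</item>
  <item class='hit'>same text</item>
  <!-- <item class="hit">same text</item> -->
  <note><![CDATA[<item class="hit">same text</item>]]></note>
</root>"""


def count_with_xpath(xml_text):
    """Parse the text, then count matching elements with an XPath expression."""
    root = ET.fromstring(xml_text)
    return len(root.findall(".//item[@class='hit']"))


def count_with_regex(xml_text):
    """Naive text-level count; it cannot tell markup from comments or CDATA."""
    return len(re.findall(r'<item class="hit">', xml_text))


print(count_with_xpath(SAMPLE))  # 3: only the real elements
print(count_with_regex(SAMPLE))  # 4: misses the single-quoted one, counts two fakes
```

The regex is wrong in both directions here: it skips a real element because of quoting, and it counts text that only looks like markup. A more elaborate pattern could patch each case, but every patch is another piece of the parser being rebuilt by hand.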

To answer your question: when it comes to speed (which should not be your primary concern here), it depends heavily on the implementation of the XPath or regex compiler/processor. Sometimes XPath will be faster (e.g., when using keys, if possible, or compiled XSLT); other times regexes will be faster (if you can use a precompiled regex and your query is simple). But regexes are never easy with HTML/XML, simply because of the problem of matching nested tags (the same problem as matching nested parentheses), which cannot be reliably solved with regexes alone.

If the input is huge, regex will tend to be faster, unless the XPath implementation can do streaming processing (which I believe is not how Firefox does it).
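Since the outcome depends on the implementation, the only dependable way to settle the speed question is to measure it in the target environment. The sketch below shows the shape of such a measurement in Python with the standard library's `timeit`, `re` and `xml.etree.ElementTree`. The document generator, the element name and the function names are assumptions, and the numbers it produces say nothing about Firefox's own engines.

```python
import re
import timeit
import xml.etree.ElementTree as ET


def build_document(n):
    """Build a hypothetical well-formed document with n identical elements."""
    items = "".join('<item class="hit">same text</item>' for _ in range(n))
    return "<root>" + items + "</root>"


# Compiled once, outside the timed code, so only matching is measured.
ITEM_RE = re.compile(r'<item class="hit">')


def count_regex(text):
    """Count occurrences of the opening tag with the precompiled regex."""
    return sum(1 for _ in ITEM_RE.finditer(text))


def count_xpath(text):
    """Parse and count; parsing is included because the input is raw text."""
    return len(ET.fromstring(text).findall(".//item[@class='hit']"))


def benchmark(n=500, repeats=200):
    """Time both approaches on the same input and return seconds for each."""
    text = build_document(n)
    # Both approaches must agree on this clean input before timing means anything.
    assert count_regex(text) == count_xpath(text) == n
    return {
        "regex_seconds": timeit.timeit(lambda: count_regex(text), number=repeats),
        "xpath_seconds": timeit.timeit(lambda: count_xpath(text), number=repeats),
    }
```

Calling `benchmark()` returns the two timings as a dictionary. The XPath figure deliberately includes parsing, because an add-on that starts from the raw response text pays that cost too.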

You wrote:

"which is more effective"

The one that gets you most quickly to a reliable and stable implementation that is also comparatively fast. Use XPath. It is what is used inside Firefox and other browsers, so it is also available if you need your code to run from a browser.
