Would you implement a lightweight XML parser with <regex>?
Question
If you had to implement a lightweight XML parser, would you choose to use regex?
The XML parsing in my case would be heavily simplified: only tags and text content. No namespaces, no attributes, no schema support (certainly not at the beginning, but maybe later...).
I think it would be a good exercise for me to learn the new C++0x <regex> library. However, I was wondering whether XML parsing would be beyond what regexes can reasonably handle.
Recommended answer
In a word: no. XML is not a regular language.
UPDATE (to expand, based on the discussion in the comments below)
XML is not regular, so you cannot hope to use regexes to perform some sort of one-hit parse/split operation on the entire file/string.
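To make the nesting problem concrete, here is a minimal sketch of what goes wrong with a one-hit regex. The helper name first_a_content and the sample input are made up for illustration; only std::regex and std::regex_search come from the standard library.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): tries to grab the
// content of the first <a> element with a single non-greedy regex.
std::string first_a_content(const std::string& xml) {
    static const std::regex element("<a>(.*?)</a>");
    std::smatch m;
    // The non-greedy group stops at the FIRST "</a>" it sees, so it has no
    // way of knowing whether that closing tag belongs to a nested <a>.
    return std::regex_search(xml, m, element) ? m[1].str() : std::string();
}
```

For the input <a><a>x</a>y</a> this returns <a>x, whereas the real content of the outer element is <a>x</a>y. A greedy group fails the other way round on siblings such as <a>x</a><a>y</a>. Matching a closing tag to its opening tag needs some form of counting or a stack, which a regular expression on its own does not have.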
Whilst you could write a state-machine-based parser that uses regexes to perform the lexing/tokenisation, IMHO this would be less efficient, and more error-prone, than using a tool that's meant for the job. As others have said, Flex/Bison is one option.
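For what it's worth, the "regex for lexing, state for nesting" approach could look roughly like the sketch below. It assumes the simplified XML from the question (tags and text only, no attributes or namespaces). The function name is_well_nested and the tag-name pattern are my own choices for illustration, and it only checks nesting rather than building a tree.

```cpp
#include <cstddef>
#include <regex>
#include <stack>
#include <string>

// Hypothetical helper: returns true if `xml` consists only of properly
// nested <tag>...</tag> pairs and text. This is a sketch, not a full parser.
bool is_well_nested(const std::string& xml) {
    // Each match is one token: closing tag (group 1), opening tag (group 2),
    // or a run of text (group 3).
    static const std::regex token(
        R"re(</([A-Za-z_][A-Za-z0-9_.-]*)>|<([A-Za-z_][A-Za-z0-9_.-]*)>|([^<]+))re");

    std::stack<std::string> open;   // names of the currently open elements
    std::ptrdiff_t pos = 0;         // where the next token must start

    for (auto it = std::sregex_iterator(xml.begin(), xml.end(), token);
         it != std::sregex_iterator(); ++it) {
        const std::smatch& m = *it;
        // If the lexer had to skip characters, the input contains something
        // this simplified grammar does not cover (e.g. attributes).
        if (m.position(0) != pos) return false;
        pos += m.length(0);

        if (m[1].matched) {                       // closing tag
            if (open.empty() || open.top() != m[1].str()) return false;
            open.pop();
        } else if (m[2].matched) {                // opening tag
            open.push(m[2].str());
        }                                         // text: nothing to track
    }
    // Everything must have been tokenised and every element closed.
    return static_cast<std::size_t>(pos) == xml.size() && open.empty();
}
```

The regex only ever recognises one flat token at a time; the std::stack is what tracks nesting. That is exactly the part a generated lexer/parser (e.g. Flex/Bison) would handle for you, which is why I'd still lean towards a dedicated tool.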