Split a string ignoring quoted sections
Question
Given a string like this:
a,"string, with",various,"values, and some",quoted
What is a good algorithm to split this based on commas while ignoring the commas inside the quoted sections?
The output should be an array:
[ "a", "string, with", "various", "values, and some", "quoted" ]
Recommended answer
If my language of choice didn't offer a built-in way to do this, I would initially consider two options as the easy way out:
- Pre-parse the string and replace the commas inside the quoted sections with a control character, split on the remaining commas, then post-process the array to turn the control character back into commas.
- Alternatively, split on every comma, then post-process the resulting array into another array: check each entry for a leading quote and concatenate entries until a terminating quote is reached.
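Both options can be sketched roughly as follows. This is only an illustration in Python, since the question names no language; the function names `split_with_placeholder` and `split_then_rejoin` are made up for this example, and the code assumes double quotes as the only quote character, no escaped quotes inside a field, and a placeholder character (`\x00`) that never occurs in the input.

```python
def split_with_placeholder(text, placeholder="\x00"):
    """Option 1: hide quoted commas behind a placeholder, split, then restore."""
    masked = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes   # toggle state; the quote itself is dropped
        elif ch == "," and in_quotes:
            masked.append(placeholder)  # protect commas inside quotes
        else:
            masked.append(ch)
    parts = "".join(masked).split(",")
    return [part.replace(placeholder, ",") for part in parts]


def split_then_rejoin(text):
    """Option 2: split on every comma, then glue quoted fields back together."""
    fields = []
    pending = None  # pieces of a quoted field that is still open
    for piece in text.split(","):
        if pending is not None:
            pending.append(piece)
            if piece.endswith('"'):
                # Closing quote found: rejoin and strip the surrounding quotes.
                fields.append(",".join(pending)[1:-1])
                pending = None
        elif piece.startswith('"') and not (piece.endswith('"') and len(piece) > 1):
            pending = [piece]  # quoted field that continues in the next piece
        else:
            fields.append(piece.strip('"'))
    if pending is not None:
        fields.append(",".join(pending))  # unterminated quote: keep the text as-is
    return fields


line = 'a,"string, with",various,"values, and some",quoted'
print(split_with_placeholder(line))
print(split_then_rejoin(line))
```

For the sample string in the question, both functions return the five fields the question asks for. Neither handles escaped quotes or whitespace around fields, which is part of why they count as hacks.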
These are hacks, however, and if this is a purely mental exercise I suspect they will prove unhelpful. If this is a real-world problem, it would help to know the language so that we could offer some specific advice.
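As an example of what language-specific advice might look like, and assuming purely for illustration that the language is Python (the question does not say), the standard library's `csv` module already understands quoted fields. The wrapper name `split_quoted` is hypothetical.

```python
import csv


def split_quoted(line):
    # csv.reader accepts any iterable of strings, so a one-element list
    # parses a single line; the default dialect uses ',' and '"'.
    return next(csv.reader([line]))


print(split_quoted('a,"string, with",various,"values, and some",quoted'))
```

Unlike the two hacks, this also copes with doubled quotes (`""`) inside a quoted field, because the parser tracks quoting state itself.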