Parse string with whitespace and quotation mark (with quotation mark retained)


Question

If I have a string like this

create myclass "56, 'for the better or worse', 54.781"

How can I parse it so that the result is three string "words" with the following content:

[0] create
[1] myclass
[2] "56, 'for the better or worse', 54.781"

Edit 2: Note that the quotation marks are to be retained.

At first, I tried string.Split(' '), but I noticed that it breaks the third string into several other strings.
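To make the problem concrete, here is a minimal sketch of what a plain split on spaces does to the sample input. The class and method names (SplitDemo, NaiveSplit) are hypothetical and only serve the illustration.

```csharp
using System;

public static class SplitDemo
{
    // Hypothetical helper: splits on every single space, exactly like string.Split(' ').
    public static string[] NaiveSplit(string input) => input.Split(' ');

    public static void Main()
    {
        string input = "create myclass \"56, 'for the better or worse', 54.781\"";

        // The quoted part contains spaces, so it is cut into several pieces.
        foreach (string part in NaiveSplit(input))
            Console.WriteLine(part);
    }
}
```

The quoted argument is spread over seven of the nine resulting pieces, which is why a plain split is not enough here.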

I tried to solve this by limiting the Split result, passing 3 as its count argument. That works for this case, but when the given string is

create myclass false "56, 'for the better or worse', 54.781" //or
create myclass "56, 'for the better or worse', 54.781" false

then the Split fails because the last two words are combined.
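A short sketch of that failure, assuming the second variant of the input (with a trailing false). SplitCountDemo and SplitLimited are hypothetical names; the only library call is the string.Split(char[], int) overload.

```csharp
using System;

public static class SplitCountDemo
{
    // Hypothetical helper: splits on spaces but returns at most `count` pieces.
    // The last piece keeps the unsplit remainder of the input.
    public static string[] SplitLimited(string input, int count) =>
        input.Split(new[] { ' ' }, count);

    public static void Main()
    {
        string input = "create myclass \"56, 'for the better or worse', 54.781\" false";

        // With count = 3, the quoted argument and the trailing word
        // end up together in the third piece.
        foreach (string part in SplitLimited(input, 3))
            Console.WriteLine(part);
    }
}
```

With the count fixed at 3, the third piece is the quoted argument followed by the trailing false, so the limit only helps when the quoted argument is the last token.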

I also created something like ReadInBetweenSameDepth to get the string in between the quotation marks.

Here is my ReadInBetweenSameDepth:

//Examples:
//[1] (2 + 1) * (5 + 6) will return 2 + 1
//[2] (2 * (5 + 6) + 1) will return 2 * (5 + 6) + 1
public static string ReadInBetweenSameDepth(string str, char delimiterStart, char delimiterEnd) {
  if (delimiterStart == delimiterEnd || string.IsNullOrWhiteSpace(str) || str.Length <= 2)
    return null;
  int delimiterStartFound = 0;
  int delimiterEndFound = 0;
  int posStart = -1;
  for (int i = 0; i < str.Length; ++i) {
    if (str[i] == delimiterStart) {
      if (i >= str.Length - 2) //delimiter start is found in any of the last two characters
        return null; //it means, there isn't anything in between the two
      if (delimiterStartFound == 0) //first time
        posStart = i + 1; //assign the starting position only the first time...
      delimiterStartFound++; //increase the number of delimiter start count to get the same depth
    }
    if (str[i] == delimiterEnd) {
      delimiterEndFound++;
      if (delimiterStartFound == delimiterEndFound && i - posStart > 0)
        return str.Substring(posStart, i - posStart); //only successful if both delimiters are found in the same depth
    }
  }
  return null;
}



However, although this function works, I find it rather hard to combine its result with string.Split to get the parsing I want.

Edit 2: In my poor solution, I need to re-add the quotation marks later on.

Is there any better way to do this? If we use Regex, how do we do this?

Edit:

I was honestly unaware that this problem could be solved the same way as CSV-formatted text. Nor did I know that this problem does not necessarily need Regex to be solved (which is why I tagged it as such). My sincere apologies to those who see this as a duplicate post.

Edit 2:

After working more on my project, I realized that there was something wrong with my question (that is, I did not include the quotation marks). My apologies to the previously best answerer, Mr. Tim Schmelter. Then, after looking at the dupe link, I noticed that it does not provide an answer for this either.

Recommended answer

Regex demo: https://regex101.com/r/xL8lO7/1

(\w+|"[^"]*")

Get the matches in the first capture group.


  1. \w+: Matches alphanumeric characters and underscores, one or more times
  2. "[^"]*": Matches anything that is wrapped in double quotes, including the quotes themselves
  3. |: OR condition in regex
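The pattern above could be applied in C# roughly as follows. This is a minimal sketch, not the answerer's code: the QuotedTokenizer class and its Tokenize method are hypothetical names, and only Regex.Matches and Match.Groups from System.Text.RegularExpressions are used.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedTokenizer
{
    // Pattern from the answer: a run of word characters, or a double-quoted span.
    private static readonly Regex TokenPattern = new Regex(@"(\w+|""[^""]*"")");

    // Hypothetical helper: returns every match of the first capture group, in order.
    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        foreach (Match match in TokenPattern.Matches(input))
            tokens.Add(match.Groups[1].Value);
        return tokens;
    }

    public static void Main()
    {
        string input = "create myclass false \"56, 'for the better or worse', 54.781\"";

        // Quoted spans come back as one token with their quotation marks retained.
        foreach (string token in Tokenize(input))
            Console.WriteLine(token);
    }
}
```

Because the quoted span is matched as a whole, the quotation marks stay in the token and nothing has to be re-added afterwards. One caveat: \w+ does not cover characters such as '.' or '-', so an unquoted token like 54.781 would come back as two tokens. Replacing \w+ with [^\s"]+ is one possible adjustment, but that goes beyond what the answer proposes.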
