Parse without string split

Question

This is a spin-off from another question.

Suppose I've got to parse a huge number of very long strings. Each string contains a sequence of doubles (in text representation, of course) separated by whitespace. I need to parse the doubles into a List<double>.

The standard parsing technique (using string.Split + double.TryParse) seems to be quite slow: for each of the numbers we need to allocate a string.
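
For reference, the baseline described above might look roughly like the following. This is only a sketch: the class and method names are made up for illustration, and the use of CultureInfo.InvariantCulture is an assumption about the input format.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SplitParser
{
    // Hypothetical helper: the "standard" approach from the question.
    // Split allocates one string per number, which is the cost being discussed.
    public static List<double> ParseDoubles(string s)
    {
        var result = new List<double>();
        string[] tokens = s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string token in tokens)
        {
            // Assumption: numbers use '.' as the decimal separator.
            if (double.TryParse(token, NumberStyles.Float, CultureInfo.InvariantCulture, out double value))
            {
                result.Add(value);
            }
        }
        return result;
    }
}
```

Tokens that fail to parse are silently skipped here; whether that is acceptable depends on the data.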

I tried to do it the old C-like way: compute the start and end indices of each substring containing a number and parse it "in place", without creating an additional string. (See http://ideone.com/Op6h0; the relevant part is shown below.)

int startIdx, endIdx = 0;
while(true)
{
    startIdx = endIdx;
    // no find_first_not_of in C#
    while (startIdx < s.Length && s[startIdx] == ' ') startIdx++;
    if (startIdx == s.Length) break;
    endIdx = s.IndexOf(' ', startIdx);
    if (endIdx == -1) endIdx = s.Length;
    // how to extract a double here?
}

There is an overload of string.IndexOf that searches only within a given substring, but I failed to find a method for parsing a double from a substring without actually extracting that substring first.

Does anyone have an idea?

Recommended answer

There is no managed API to parse a double from a substring. My guess is that allocating the string will be insignificant compared to all the floating point operations in double.Parse.
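
As an aside for readers on newer runtimes: .NET Core 2.1 and later add a double.Parse overload that accepts a ReadOnlySpan<char>, so the loop from the question can be completed without allocating a string per number. The sketch below assumes such a runtime; the class and method names are hypothetical, and CultureInfo.InvariantCulture is an assumption about the input format.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SpanParser
{
    // Hypothetical helper: the question's index loop, with each number
    // parsed from a span over the original string (no substring is created).
    public static List<double> ParseDoubles(string s)
    {
        var result = new List<double>();
        int startIdx, endIdx = 0;
        while (true)
        {
            startIdx = endIdx;
            // Skip the spaces in front of the next number.
            while (startIdx < s.Length && s[startIdx] == ' ') startIdx++;
            if (startIdx == s.Length) break;
            endIdx = s.IndexOf(' ', startIdx);
            if (endIdx == -1) endIdx = s.Length;
            // AsSpan creates a view over s[startIdx..endIdx) without copying.
            ReadOnlySpan<char> token = s.AsSpan(startIdx, endIdx - startIdx);
            result.Add(double.Parse(token, NumberStyles.Float, CultureInfo.InvariantCulture));
        }
        return result;
    }
}
```

On older .NET Framework versions this overload is not available, which is the situation the answer describes.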

Anyway, you can save the allocation by creating a "buffer" string once, of length 100 and consisting only of whitespace. Then, for every number you want to parse, copy its characters into this buffer string using unsafe code and fill the rest of the buffer with whitespace. For parsing, use NumberStyles.AllowTrailingWhite, which causes trailing whitespace to be ignored.

Getting a pointer to a string is actually a fully supported operation:

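    // Allocate a private buffer; don't write to a shared string!
    string l_pos = new string(' ', 100);
    unsafe
    {
        // 'fixed' pins the string so the GC cannot move it while the pointer is in use
        fixed (char* l_pSrc = l_pos)
        {
            // do some work, e.g. write characters through l_pSrc
        }
    }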

C# has special syntax to bind a string to a char*.
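
Putting the pieces of this answer together, the buffer approach could be sketched as follows. Treat it as an illustration under stated assumptions rather than a definitive implementation: the names BufferParser and ParseSlice are hypothetical, the code must be compiled with unsafe code enabled, and it relies on the buffer being a privately allocated string. The C# language specification leaves it to the programmer to ensure that characters of a fixed string are not modified, so writing through the pointer works in practice on a private buffer but is outside what the language guarantees.

```csharp
using System;
using System.Globalization;

public static class BufferParser
{
    // Hypothetical helper: parses s[start..end) by copying it into a reusable
    // buffer string, so no new string is allocated per number.
    // 'buffer' must be a private string, e.g. new string(' ', 100).
    public static unsafe double ParseSlice(string s, int start, int end, string buffer)
    {
        int length = end - start;
        if (length > buffer.Length)
            throw new ArgumentException("Number is longer than the buffer.");

        fixed (char* dst = buffer)
        {
            // Copy the digits, then pad the rest with spaces to erase
            // whatever the previous call left behind.
            for (int i = 0; i < length; i++) dst[i] = s[start + i];
            for (int i = length; i < buffer.Length; i++) dst[i] = ' ';
        }

        // NumberStyles.Float includes AllowTrailingWhite, so the padding is ignored.
        // Assumption: numbers use '.' as the decimal separator.
        return double.Parse(buffer, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

A caller would allocate the buffer once with new string(' ', 100) and pass the startIdx and endIdx values computed by the loop in the question on every iteration.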
