Parse string to double really fast


Problem description



Hi,
I would like to know whether someone knows a library or function that
parses a string containing 3 double numbers in the form like
xxxx.yyyyyyyyy xxxx.yyyyyyyyy xxxx.yyyyyyyyy really fast.
Currently I am using "sscanf(line.c_str(), "%lf %lf %lf", &x, &y,
&z);" which is kind of slow.

Thanks in advance,
Thomas Kowalski

Recommended answer

"Thomas Kowalski" <th***@gmx.de> wrote in message
news:11*********************@o5g2000hsb.googlegroups.com...



Hi,
I would like to know whether someone knows a library or function that
parses a string containing 3 double numbers in the form like
xxxx.yyyyyyyyy xxxx.yyyyyyyyy xxxx.yyyyyyyyy really fast.
Currently I am using "sscanf(line.c_str(), "%lf %lf %lf", &x, &y,
&z);" which is kind of slow.

Thanks in advance,
Thomas Kowalski




#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::string line("123.456 987.654 3.1416");
    std::istringstream iss(line);
    double x(0);
    double y(0);
    double z(0);
    // The stream tests false if any of the three extractions fails.
    if(!(iss >> x >> y >> z))
        std::cerr << "Conversion error\n";
    else
        std::cout << x << '\n' << y << '\n' << z << '\n';
    return 0;
}

-Mike


On Jun 6, 2:18 pm, Thomas Kowalski <t...@gmx.de> wrote:



Hi,
I would like to know whether someone knows a library or function that
parses a string containing 3 double numbers in the form like
xxxx.yyyyyyyyy xxxx.yyyyyyyyy xxxx.yyyyyyyyy really fast.
Currently I am using "sscanf(line.c_str(), "%lf %lf %lf", &x, &y,
&z);" which is kind of slow.

Thanks in advance,
Thomas Kowalski




If they are exactly of the form xxxx.yyyyyyyyy then you can do
something like:

double convert( const char * str )
{
    // str[4] is the decimal point, so it is skipped.
    return convert_char( str[0], 1000 ) +
           convert_char( str[1], 100 ) +
           convert_char( str[2], 10 ) +
           convert_char( str[3], 1 ) +
           convert_char( str[5], 0.1 )
           /* + .... get the rest ? */ ;
}

Where convert_char checks for ' ' or isdigit and does the appropriate
thing.

G
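A minimal sketch of how G's idea might be filled in, assuming every field is exactly 4 integer characters, a '.', and 9 fraction digits, with optional blank padding on the left. The helper name digit_value is hypothetical and stands in for convert_char. Instead of summing weighted doubles, this variant accumulates all digits into one integer and divides once, which avoids adding up rounding errors from weights such as 0.1.

```cpp
#include <cassert>

// Hypothetical helper (not from the thread): numeric value of one character.
// A blank is treated as zero padding; anything else must be a decimal digit.
// Note: with NDEBUG defined the assert vanishes and bad input is not detected.
inline int digit_value(char c)
{
    if (c == ' ')
        return 0;
    assert(c >= '0' && c <= '9');
    return c - '0';
}

// Converts exactly 14 characters of the form "xxxx.yyyyyyyyy".
// The caller must guarantee that str has at least 14 readable characters.
double convert(const char* str)
{
    assert(str[4] == '.');
    long long scaled = 0;              // the value multiplied by 10^9
    for (int i = 0; i < 14; ++i) {
        if (i == 4)
            continue;                  // skip the decimal point
        scaled = scaled * 10 + digit_value(str[i]);
    }
    // 13 decimal digits fit exactly in a double, so only the division rounds.
    return static_cast<double>(scaled) / 1e9;
}
```

Because the field width is fixed, the next number on the line would start at str + 15, so no searching for the separator is needed.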



Hi Mike,
thank you for your quick answer. I used the stream approach already.
My question is rather a request for a hint towards a truly custom parser.
More about what I tried:
1. approach) Using stringstreams to parse. For my file (about 400,000
lines with 3 doubles each) it took about 30 s to parse.
2. approach) Using sscanf, which took about 13 s.
3. approach) Using atof and strchr to parse, which took about 8 s.
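For reference, the third approach might look roughly like the sketch below. The post does not show the actual code, so the function name parse_line_atof and the exact structure are assumptions. Note that atof reports no errors, so malformed numbers silently become 0.

```cpp
#include <cstdlib>
#include <cstring>

// Sketch of the atof + strchr approach (hypothetical reconstruction).
// Expects a NUL-terminated line with three numbers separated by single spaces.
// Returns false only when a separator is missing; atof itself cannot fail visibly.
bool parse_line_atof(const char* line, double& x, double& y, double& z)
{
    x = std::atof(line);                       // parses up to the first space
    const char* p = std::strchr(line, ' ');    // find the first separator
    if (!p)
        return false;
    y = std::atof(p + 1);
    p = std::strchr(p + 1, ' ');               // find the second separator
    if (!p)
        return false;
    z = std::atof(p + 1);
    return true;
}
```

Each number is effectively scanned twice here, once by atof and once by strchr, which is part of the overhead a custom parser can avoid.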

In my opinion, the improvements show that I/O is not yet the
limiting factor. The CPU is still busy at 100% during the parsing.

Since atof uses locale information (we always use "." as the
separator) and also has to be able to parse different representations
of floating-point numbers, I guess there is plenty of room for improvement. In
the case of a custom parser the search is also not necessary, since we
know that the next double will follow exactly one char after the end
of the last.
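A custom parser along these lines could be sketched as follows. It is only an illustration under stated assumptions: plain decimal notation with "." as the separator, no exponents, at most 15 digits in total, and exactly one space between numbers. The names parse_plain_double and parse_line are hypothetical. The parser ignores the locale and leaves the pointer right after each number, so the next one starts one character later and no strchr search is required.

```cpp
// Parses "[-]digits[.digits]" at p and advances p past the characters consumed.
// Returns false if no digit was found or if there are more than 15 digits
// (up to 15 digits are exactly representable in a double).
inline bool parse_plain_double(const char*& p, double& out)
{
    static const double pow10[] = {
        1e0, 1e1, 1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
        1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15
    };
    bool negative = false;
    if (*p == '-') {
        negative = true;
        ++p;
    }
    unsigned long long mantissa = 0;   // all digits, decimal point removed
    int digits = 0;                    // total number of digits seen
    int frac_digits = 0;               // digits after the decimal point
    while (*p >= '0' && *p <= '9') {
        mantissa = mantissa * 10 + static_cast<unsigned>(*p - '0');
        ++p;
        ++digits;
    }
    if (*p == '.') {
        ++p;
        while (*p >= '0' && *p <= '9') {
            mantissa = mantissa * 10 + static_cast<unsigned>(*p - '0');
            ++p;
            ++digits;
            ++frac_digits;
        }
    }
    if (digits == 0 || digits > 15)
        return false;
    // One exact integer-to-double conversion followed by one rounded division.
    const double value = static_cast<double>(mantissa) / pow10[frac_digits];
    out = negative ? -value : value;
    return true;
}

// Parses three numbers separated by exactly one space each.
bool parse_line(const char* p, double& x, double& y, double& z)
{
    if (!parse_plain_double(p, x) || *p != ' ')
        return false;
    ++p;                               // next number starts one char later
    if (!parse_plain_double(p, y) || *p != ' ')
        return false;
    ++p;
    return parse_plain_double(p, z);
}
```

Whether this actually beats atof on a given platform would have to be measured on the real file; the sketch only removes the locale handling, the exponent handling, and the separate separator search.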

Does anyone have experience with such optimizations?

Regards,
Thomas Kowalski

