In C++11, what is the most performant way to return a reference/pointer to a position in a std::string?


Problem description


I'm building a text parser that uses std::string as the core storage for strings.

I know this is not optimal and that parsers inside compilers use optimized approaches for this. In my project I don't mind losing some performance in exchange for more clarity and easier maintenance.

At the beginning I read a huge text into memory and then scan each character to build an ordered set of tokens; it's a simple lexer. Currently I'm using std::string to represent the text of a token, but I would like to improve this a bit by using a reference/pointer into the original text.

From what I have read, it is bad practice to return and hold on to iterators, and it is also bad practice to refer to the std::string internal buffer.

Any suggestions on how to accomplish this in a "clean" way?

Solution

There are proposals to add string_view to C++ in an upcoming standard.

A string_view is a non-owning iterable range over characters with many of the utilities and properties you'd expect of a string class, except you cannot insert/delete characters (and editing characters is often blocked in some subtypes).

I would advise trying that approach -- write your own (in your own utility namespace). (You should have your own utility namespace for reusable code snippets anyhow).

The core data is a pair of char*s or std::string::iterators (or the const versions). If the user needs a null-terminated buffer, a to_string method allocates one. I would start with non-mutable (const) character data. Do not forget begin and end: they make your view iterable with for(:) loops.
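To make the description above concrete, here is a minimal C++11 sketch of such a view. The namespace `util`, the `substr` member and the helper `count_upper` are names I made up for illustration; they are not taken from any proposal or library, and a real implementation would want comparison, searching and so on.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace util {  // hypothetical utility namespace

// Minimal non-owning, read-only view over a range of characters.
class string_view {
public:
    string_view() : first_(nullptr), last_(nullptr) {}
    string_view(const char* first, const char* last) : first_(first), last_(last) {}
    // View over a whole std::string; the string must outlive the view.
    string_view(const std::string& s) : first_(s.data()), last_(s.data() + s.size()) {}

    // begin/end make the view usable in range-based for loops.
    const char* begin() const { return first_; }
    const char* end() const { return last_; }
    std::size_t size() const { return static_cast<std::size_t>(last_ - first_); }
    bool empty() const { return first_ == last_; }

    // Sub-view of at most `count` characters starting at `pos` (clamped to the range).
    string_view substr(std::size_t pos, std::size_t count) const {
        if (pos > size()) pos = size();
        if (count > size() - pos) count = size() - pos;
        return string_view(first_ + pos, first_ + pos + count);
    }

    // Allocates an owning, null-terminated copy for callers that need one.
    std::string to_string() const {
        return empty() ? std::string() : std::string(first_, last_);
    }

private:
    const char* first_;
    const char* last_;
};

}  // namespace util

// Usage: iterate over a view with a range-based for loop.
inline std::size_t count_upper(util::string_view v) {
    std::size_t n = 0;
    for (char c : v) {
        if (c >= 'A' && c <= 'Z') ++n;
    }
    return n;
}
```

A lexer could then store one such view per token instead of a std::string copy, as long as the source text is kept alive and unmodified.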

This design has the danger that the original std::string has to persist long enough to outlast all of the views.

If you are willing to give up some performance for safety, have the view own a std::shared_ptr<const std::string> that it can move a std::string into. As a first step, move the entire buffer into it, and then start chopping/parsing it down (child views make a new shared pointer to the same data). Your view class is then more like a non-mutable string with shared storage.

The upsides to the shared_ptr<const> version include safety, longer lifetime of the views (there is no more lifetime dependency), and you can easily forward your const "substring" type methods to the std::string so you can write less code.
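A rough sketch of the shared-ownership variant might look like the following. Again, `util`, `shared_string_view` and `first_token` are hypothetical names of my own, and storing an offset plus a length (rather than pointers) is just one way to do it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

namespace util {  // hypothetical utility namespace

// Read-only view that shares ownership of the underlying text.
class shared_string_view {
public:
    // Takes the text by value and moves it into shared storage.
    explicit shared_string_view(std::string text)
        : data_(std::make_shared<std::string>(std::move(text))),
          pos_(0), len_(data_->size()) {}

    std::size_t size() const { return len_; }
    const char* begin() const { return data_->data() + pos_; }
    const char* end() const { return begin() + len_; }

    // Child view: copies the shared_ptr, so it refers to the same buffer.
    shared_string_view substr(std::size_t pos, std::size_t count) const {
        if (pos > len_) pos = len_;
        if (count > len_ - pos) count = len_ - pos;
        return shared_string_view(data_, pos_ + pos, count);
    }

    // Forwards to std::string::substr to produce an owning copy.
    std::string to_string() const { return data_->substr(pos_, len_); }

private:
    shared_string_view(std::shared_ptr<const std::string> data,
                       std::size_t pos, std::size_t len)
        : data_(std::move(data)), pos_(pos), len_(len) {}

    std::shared_ptr<const std::string> data_;  // keeps the text alive
    std::size_t pos_;                          // offset of the view in *data_
    std::size_t len_;                          // length of the view
};

}  // namespace util

// Usage: the token stays valid after the view it was cut from is destroyed.
inline util::shared_string_view first_token() {
    util::shared_string_view whole(std::string("alpha beta"));
    return whole.substr(0, 5);
}
```

Every copy of a view bumps an atomic reference count, which is the performance cost mentioned here.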

Downsides include possible incompatibility with the incoming standard one [1], and lower performance because you are dragging a shared_ptr around.

I suspect views and ranges are going to be increasingly important in modern C++ with the upcoming and recent improvements to the language.

boost::string_ref is apparently an implementation of a proposal to the C++1y standard.


[1] However, given how simple it is to add capabilities in template metaprogramming, having a "resource owner" template argument to a view type might be a good design decision. Then you can have owning and non-owning string_views with otherwise identical semantics...
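One possible shape for that "resource owner" idea, purely as a sketch: the view carries an extra `Owner` member that is either an empty struct or a shared pointer. All names here (`basic_view`, `no_owner`, `shared_owner`, `make_view`, `make_owning_view`) are invented for this example and do not come from any standard proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

namespace util {  // hypothetical utility namespace

// Owner policy for a plain view: keeps nothing alive.
struct no_owner {};

// Owner policy that keeps the underlying std::string alive.
typedef std::shared_ptr<const std::string> shared_owner;

template <class Owner>
class basic_view {
public:
    basic_view(Owner owner, const char* first, const char* last)
        : owner_(std::move(owner)), first_(first), last_(last) {}

    const char* begin() const { return first_; }
    const char* end() const { return last_; }
    std::size_t size() const { return static_cast<std::size_t>(last_ - first_); }

    // Sub-views copy the owner, so both policies share the same code.
    basic_view substr(std::size_t pos, std::size_t count) const {
        if (pos > size()) pos = size();
        if (count > size() - pos) count = size() - pos;
        return basic_view(owner_, first_ + pos, first_ + pos + count);
    }

    std::string to_string() const { return std::string(first_, last_); }

private:
    Owner owner_;  // empty struct or shared_ptr, depending on the policy
    const char* first_;
    const char* last_;
};

typedef basic_view<no_owner> string_view;             // non-owning
typedef basic_view<shared_owner> owning_string_view;  // shared ownership

// Non-owning view: the caller must keep `s` alive.
inline string_view make_view(const std::string& s) {
    return string_view(no_owner(), s.data(), s.data() + s.size());
}

// Owning view: the text is moved into shared storage first.
inline owning_string_view make_owning_view(std::string s) {
    shared_owner p = std::make_shared<std::string>(std::move(s));
    const char* b = p->data();
    return owning_string_view(p, b, b + p->size());
}

}  // namespace util
```

Both typedefs expose the same members, so parsing code written as a template over the view type would work with either.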
