What is the best way to manipulate strings?


Question

Hello,

I'm writing an application which needs a lot of string manipulation.
I tried to use std::string, std::wstring (Unicode) and CString. Sometimes, I use them together.

- What are the pros and cons of using each of them?
- I'm curious about CString::GetBuffer(nMinBufLength). What is nMinBufLength actually for? Is it for some memory optimization?
- What is the difference between the old functions and the new functions suffixed with _s? For example, strcpy and strcpy_s.
- I'm trying to write a log to a local file. Which is the fastest and most efficient way: CFile, FILE*, or HANDLE (ReadFile/WriteFile)?

There are too many standards in the C++ language. I'm wondering which one to choose for best speed.

Thank you.

Recommended answer

At the lowest level you have the core C library functions like strcpy and memmove, which deal with bytes and are both fast and unsafe: there is little checking of anything, you have to get the parameters right to avoid crashing, and you have to manage things like null-terminating strings yourself, very carefully.

Then there are slightly more advanced C library functions like _tcsncmp and _tcscpy_s, which deal with 2-byte or even variable-byte characters forming UTF-8 or Unicode strings. In the case of the _s functions, extra checking is added to try to prevent buffer overruns and other horrible accidents that result from bad parameters. The _s functions originated with Microsoft; C11 later standardized them as the optional Annex K, but few other C libraries implement them, so you generally won't find them on Linux or OS X.

Next up the scale are proper C++ classes for dealing with strings. The C++ standard library has std::string, MFC and ATL have CString and its descendants, Qt has QString, and most other frameworks have their own. These may build on std::string or ignore it, and may use the C library functions internally (as CString used to, and may still) or reimplement the same sort of things their own way.

Which is best?

That depends on what you're doing and what you want in terms of performance vs safety vs advanced functionality.

Are you working on an embedded system where your strings are really short arrays of bytes being transferred to and from devices and files? Use the core C library functions.

Are you managing strings for a complex UI where you'll need translations in 18 languages? Qt will make this much easier with the advanced features of QString and related classes.

Are you working on an existing MFC/ATL application? Stick with CString and its ATL/WTL relatives for consistency. It works and the performance is not terrible as long as you're careful.

Are you working on a new C++ application where performance is important but you also want to get it done in this lifetime? Use std::string. It's safe, it's fast and it does most stuff OK.
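To give a feel for what "does most stuff OK" looks like in practice, here is a minimal sketch of two everyday operations written with nothing but std::string. The helper names split and replace_all are made up for this illustration; they are not standard library functions.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split `text` on a single-character delimiter.
// Empty fields are kept, so "a,,b" yields three parts.
std::vector<std::string> split(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = text.find(delim, start);
        if (pos == std::string::npos) {
            parts.push_back(text.substr(start));  // last (or only) field
            return parts;
        }
        parts.push_back(text.substr(start, pos - start));
        start = pos + 1;  // continue after the delimiter
    }
}

// Hypothetical helper: replace every occurrence of `from` with `to`.
std::string replace_all(std::string text, const std::string& from,
                        const std::string& to) {
    if (from.empty()) return text;  // nothing sensible to search for
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // skip the inserted text so it is not rescanned
    }
    return text;
}
```

Neither helper needs manual buffer sizing or termination, which is the safety argument for std::string over the raw C functions.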


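The same sort of considerations apply to files, except that on Windows the C library FILE functions are themselves implemented on top of the Win32 functions. Calling the Win32 file functions directly, or a wrapper like CFile if your framework provides one, therefore removes a layer; it is generally at least as fast and efficient, and also easy to use, although FILE's built-in buffering still helps when you write many small records.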
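For the logging case in the question, a rough sketch of the FILE* route is shown below. It uses only portable C stdio, not CFile or WriteFile, and it makes no measured claim about which API is fastest. The class name LogFile and the 64 KiB buffer size are assumptions chosen for illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical RAII wrapper around a buffered FILE* opened for appending.
class LogFile {
public:
    explicit LogFile(const char* path) : file_(std::fopen(path, "a")) {
        // Request a 64 KiB fully buffered stream so that many small writes
        // are batched into fewer operating system calls.
        if (file_) std::setvbuf(file_, nullptr, _IOFBF, 64 * 1024);
    }
    ~LogFile() {
        if (file_) std::fclose(file_);  // fclose flushes the buffer
    }
    LogFile(const LogFile&) = delete;
    LogFile& operator=(const LogFile&) = delete;

    bool is_open() const { return file_ != nullptr; }

    // Appends one line; returns false if the file is not open or a write fails.
    bool write_line(const std::string& line) {
        if (!file_) return false;
        return std::fwrite(line.data(), 1, line.size(), file_) == line.size()
            && std::fputc('\n', file_) != EOF;
    }

private:
    std::FILE* file_;
};
```

Keeping one LogFile open for the lifetime of the program, rather than reopening the file for every message, is usually what matters most for logging throughput, whichever API sits underneath.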



Solution #1 is good (but IMHO didn't answer all the questions).

I have used both CString and std::string in my travels, and my feeling is that std::string is newer than CString but, in my observation, not as robust. CString is Microsoft-only (so not portable, but if you move your code base to another platform, you could write a complete Linux CString class in about 30 minutes). Both classes package an array of characters that can be manipulated in various ways (CString wins on features).

CString converts implicitly to const char* out of the box, but with std::string you have to ask for it with the c_str() method.

CString also allows you to get a "char*" (i.e. write access) with the GetBuffer() API. If you just use CString::GetBuffer(), then you are handed the current object's buffer as a non-const string. It is your responsibility to know how long it is. If you give a value to GetBuffer(), the buffer is resized to hold at least that many characters, so nMinBufLength is a minimum capacity rather than a speed optimization. Call ReleaseBuffer() when you finish writing so that the CString updates its length. In other words:

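CString str1 = "123";             // MBCS build assumed: 3 chars plus the terminator
CString str2 = "123";             // same contents as str1
char* p1 = str1.GetBuffer();      // writable, but only within the current length
char* p2 = str2.GetBuffer(256);   // buffer grown to hold at least 256 chars
// ... write through p1 / p2 ...
str1.ReleaseBuffer();             // recompute the length before using str1 again
str2.ReleaseBuffer();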
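For comparison, std::string can play the same role when a C-style API needs to write into a buffer: size the string first, let the API write into its storage, then trim it to the real length. The sketch below assumes C++11 or later, where the storage is guaranteed to be contiguous; the function name format_id and the 32-character capacity are made up for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: let a C API (snprintf) write directly into the
// string's own storage, in the spirit of GetBuffer()/ReleaseBuffer().
std::string format_id(int id) {
    std::string text(32, '\0');  // like GetBuffer(32): reserve writable room
    const int written = std::snprintf(&text[0], text.size(), "id=%d", id);
    if (written < 0) return std::string();  // snprintf reported an error
    // "id=" plus any int is far shorter than 32 chars, so nothing is truncated.
    assert(static_cast<std::string::size_type>(written) < text.size());
    text.resize(static_cast<std::string::size_type>(written));  // like ReleaseBuffer()
    return text;
}
```

As with GetBuffer(), the caller is responsible for never writing past the size that was requested.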






CString has better internationalization characteristics than std::string.

std::string is considered more modern and is not tied to MFC and ATL, so you will get a lot of people telling you that CString is "bad" without very good explanations. [Also, CString was originally bundled with MFC, but was separated out into a template class shared by MFC and ATL (CStringT) around Visual C++ 7.0 (Visual Studio .NET). If anybody tells you that CString is an MFC-only class, they have been wrong since about 2001.]

Joseph Newcomer has written an excellent CString resource.

I'm not sure about std::string, but CString is reference counted, so you can do several interesting things with it, e.g.:

CString GimmeAString()
{
   CString str = "Blah Blah";  // on the stack
   return str;
}






Note that str is created on the stack, then returned by value. Returning a pointer or reference to a local would be bad juju, but returning by value is fine: the stack object points to reference-counted chars on the heap, so the copy bumps the ref count to 2, the stack object is destroyed (but not the heap allocation), and the caller gets a valid string even though the stack object is gone. The practical benefit of the reference count is that the copy is cheap, since no characters are duplicated.
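Returning a local std::string by value is likewise safe, although the mechanism is a move or in-place construction rather than a reference count. A minimal sketch, with make_greeting as a made-up function name:

```cpp
#include <string>

// Returns a local string by value. The caller receives its own string
// (moved or constructed in place), so nothing dangles after the return.
std::string make_greeting(const std::string& name) {
    std::string text = "Hello, ";  // the object is local, its chars are owned by it
    text += name;
    return text;                   // safe: returned by value
}

// By contrast, returning text.c_str() (a const char*) from a function like
// this would dangle, because the pointer refers to storage owned by the
// destroyed local object.
```

The rule of thumb is the same for both classes: return the string object itself, never a raw pointer into a local one.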

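As for the strcpy() vs strcpy_s() functions, the original CRT provides strcpy(), which is standard and portable. The "_s" functions originated with Microsoft (C11 later added them as the optional Annex K, which few other C libraries implement) and take the destination size so that the copy can be length-checked. For example: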

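char s1[5];
strcpy(s1, "Hello World");  // no compile-time or run-time check: stack overrun
strcpy_s(s1, "Hello World");  // C++ template overload deduces size 5; fails at run time via the invalid-parameter handler

char* p1 = new char[5];
strcpy(p1, "Hello World");  // no compile-time or run-time check: heap overrun
strcpy_s(p1, 5, "Hello World");  // fails at run time via the invalid-parameter handler (terminates by default)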
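On platforms without the _s functions, the same idea can be approximated in a few lines. The sketch below is a hypothetical helper named checked_copy, not part of any library. It reports failure through its return value instead of invoking a handler, and it leaves the destination as an empty string when the source does not fit.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy src into dest only if it fits, terminator included.
// Returns true on success; on failure dest is left as an empty string.
bool checked_copy(char* dest, std::size_t capacity, const char* src) {
    if (dest == nullptr || capacity == 0) return false;
    if (src == nullptr) {
        dest[0] = '\0';
        return false;
    }
    const std::size_t length = std::strlen(src);
    if (length >= capacity) {  // no room for the characters plus '\0'
        dest[0] = '\0';
        return false;
    }
    std::memcpy(dest, src, length + 1);  // copies the terminator too
    return true;
}

// Overload that deduces the capacity of a real array, similar in spirit to
// the two-argument C++ form of strcpy_s.
template <std::size_t N>
bool checked_copy(char (&dest)[N], const char* src) {
    return checked_copy(dest, N, src);
}
```

With char s1[5], checked_copy(s1, "Hello World") returns false and leaves s1 empty, while checked_copy(s1, "Hey") succeeds.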

