How to strip all non-alphanumeric characters from a string in C++?
Problem description
I am writing a piece of software, and it requires me to handle data I get from a webpage with libcurl. When I get the data, for some reason it has extra line breaks in it. I need to figure out a way to only allow letters, numbers, and spaces, and remove everything else, including line breaks. Is there any easy way to do this? Thanks.
Recommended answer
Write a function that takes a char and returns true if you want to remove that character or false if you want to keep it:
bool my_predicate(char c);
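For the requirement in the question (keep letters, digits, and spaces), one possible body for that predicate is sketched below. The name my_predicate comes from the declaration above; treating only ' ' as a space worth keeping is an assumption, and so is the cast to unsigned char, which is there because passing a negative char value to std::isalnum is undefined behavior.

```cpp
#include <cctype>

// Returns true for characters that should be REMOVED:
// anything that is not a letter, a digit, or a plain space.
bool my_predicate(char c) {
    // <cctype> functions expect a value representable as unsigned char (or EOF).
    const unsigned char uc = static_cast<unsigned char>(c);
    return !(std::isalnum(uc) || uc == ' ');
}
```

With this definition, line breaks and punctuation are flagged for removal while 'a', '7', and ' ' are kept.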
Then use the std::remove_if algorithm to remove the unwanted characters from the string:
std::string s = "my data";
s.erase(std::remove_if(s.begin(), s.end(), my_predicate), s.end());
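Put together, the two steps might look like the following sketch. The helper name keep_alnum_and_spaces is hypothetical, and the lambda (which needs C++11) stands in for a hand-written predicate; it keeps only letters, digits, and plain spaces, which is an assumption about what "spaces" should mean here.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of `s` that contains only
// letters, digits, and plain spaces.
std::string keep_alnum_and_spaces(std::string s) {
    // Predicate returns true for characters to remove. Taking the argument
    // as unsigned char avoids handing negative values to std::isalnum.
    auto unwanted = [](unsigned char c) {
        return !(std::isalnum(c) || c == ' ');
    };
    // remove_if moves the kept characters to the front and returns the new
    // logical end; erase then trims the leftover tail.
    s.erase(std::remove_if(s.begin(), s.end(), unwanted), s.end());
    return s;
}
```

For example, keep_alnum_and_spaces("my\r\n data!") returns "my data".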
Depending on your requirements, you may be able to use one of the Standard Library predicates, like std::isalnum, instead of writing your own predicate (you said you needed to match alphanumeric characters and spaces, so perhaps this doesn't exactly fit what you need).
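If you want to pass the Standard Library std::isalnum function directly, you will need a cast to disambiguate between the std::isalnum function in the C Standard Library header <cctype> (which is the one you want to use) and the std::isalnum in the C++ Standard Library header <locale> (which is not the one you want to use, unless you want to perform locale-specific string processing). Keep in mind that std::remove_if removes the characters for which the predicate returns true, so std::isalnum passed as-is removes the alphanumeric characters; to strip the non-alphanumeric ones, negate it:
s.erase(std::remove_if(s.begin(), s.end(), (int(*)(int))std::isalnum), s.end()); // removes the alphanumeric characters
s.erase(std::remove_if(s.begin(), s.end(), [](unsigned char c) { return !std::isalnum(c); }), s.end()); // removes the non-alphanumeric characters (C++11 lambda)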
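A quick way to check which direction a predicate points is to run both forms on a small input. In the sketch below the helper names are hypothetical: the first passes std::isalnum through the cast, and the second wraps it in a negating lambda. The lambda also converts to unsigned char first, an added precaution because the <cctype> functions have undefined behavior for negative char values.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Predicate is std::isalnum itself, so the ALPHANUMERIC characters are removed.
// Only safe for input whose chars are non-negative (e.g. plain ASCII).
std::string drop_alnum(std::string s) {
    s.erase(std::remove_if(s.begin(), s.end(), (int(*)(int))std::isalnum), s.end());
    return s;
}

// Negated predicate, so everything that is NOT alphanumeric is removed.
std::string drop_non_alnum(std::string s) {
    s.erase(std::remove_if(s.begin(), s.end(),
                           [](unsigned char c) { return !std::isalnum(c); }),
            s.end());
    return s;
}
```

On the input "a b\nc1!", drop_alnum leaves " \n!" while drop_non_alnum leaves "abc1".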
This works equally well with any of the sequence containers (including std::string, std::vector and std::deque). This idiom is commonly referred to as the "erase/remove" idiom. The std::remove_if algorithm will also work with ordinary arrays, although an array cannot shrink, so there you keep track of the returned new end instead of calling erase. std::remove_if makes only a single pass over the sequence, so it has linear time complexity.
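To illustrate the same idiom outside std::string, here is a small sketch on a std::vector and on an ordinary array. The predicate is_odd and both helper names are made up for the illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical predicate for the illustration: true for values to remove.
inline bool is_odd(int x) { return x % 2 != 0; }

// Sequence container: remove_if compacts the kept elements to the front,
// erase then shrinks the container to the new logical end.
std::vector<int> drop_odd(std::vector<int> v) {
    v.erase(std::remove_if(v.begin(), v.end(), is_odd), v.end());
    return v;
}

// Ordinary array: there is nothing to erase, so the number of kept elements
// is returned. Elements past that count are left in an unspecified state.
std::size_t drop_odd_in_place(int* data, std::size_t n) {
    int* new_end = std::remove_if(data, data + n, is_odd);
    return static_cast<std::size_t>(new_end - data);
}
```

In C++20, std::erase_if (declared alongside the containers, for example in <string> and <vector>) performs both steps in a single call.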