C ++字符串面试问题 [英] C++ String Interview Question

查看：138 发布时间：2016/10/23 13:55:20 c++ optimization string processing-efficiency

本文介绍了C ++字符串面试问题的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我最近在一个C ++技术面试中，我被给了一些简单的字符串操作代码，这是为了接受一个字符串并返回一个由第一个和最后一个n字符组成的字符串，然后继续纠正任何错误，并使功能尽可能高效，我想出了下面的解决方案，但访问者声称有一个更快更好的方法：

原始代码：

  std :: string first_last_n（int n，std :: string s）
 {
 std :: string first_n = s.substr（0，n）; 
 std :: string last_n = s.substr（s.size（） -  n-1，n）; 
 return first_n + last_n; 
}

我的代码：

  bool first_last_n（const std :: size_t& n，const std :: string& s，std :: string& r）
 {
 if size（）< n）
 return false; 
 r.reserve（2 * n）; 
 r.resize（0）; 
 r.append（s.data（），s.data（）+ n）; 
 r.append（s.data（）+（s.size（） -  n），s.data（）+ s.size（））; 
 return true; 
}

我的更改摘要：

更改接口以接受返回字符串作为引用（假设RVO和rvalues尚未可用）

修复了last_n字符串中的off-by-1错误

将每个字符触及一次的次数减少了一次，或两次（在重叠情况下）

在字符串的大小小于n的情况下检查，返回false失败。

假设只允许使用本地C ++，还有其他方法可以更有效或更好地执行上述操作吗？ p>

注意1：原始输入字符串实例不可修改。

b>注意2：所有解决方案必须通过以下测试用例，否则它们无效。

  void test （）
 {
 {
 std :: string s =0123456789; 
 std :: string r = first_last_n（10，s）; 
 assert（r ==01234567890123456789）; 
} 
 
 {
 std :: string s =0123456789ABC0123456789; 
 std :: string r = first_last_n（10，s）; 
 assert（r ==01234567890123456789）; 
} 
 
 {
 std :: string s =1234321; 
 std :: string r = first_last_n（5，s）; 
 assert（r ==1234334321）; 
} 
 
}

解决方案

此实现应该是快速的：

  inline std :: string first_last_n（std :: string :: size_type n，const std :: string& s）
 {
n = std :: min（n，s.size（））; 
 std :: string ret; 
 ret.reserve（2 * n）; 
 ret.append（s.begin（），s.begin（）+ n）; 
 ret.append（s.end（） -  n，s.end（））; 
 return ret; 
}

它通过所有三个单元测试。

当使用GNU libstdc ++时，声明& initializes ret 非常快，因为libstdc ++使用全局空字符串变量。因此，它只是一个指针副本。调用开始和结束在 s 它们将解析为 begin 和 end ， begin（）const 和 end（）const ，因此 s 的内部表示不是leaked。使用libstdc ++， std :: string :: const_iterator 是 const char * ，它是一个指针类型和随机访问迭代器。因此，当 std :: string :: append< const char *>（const char *，const char *）调用 std :: distance 以获取输入范围的长度，是指针差分运算。另外， std :: string :: append< const char *>（const char *，const char *）会导致类似 memmove 。最后， reserve 操作确保有足够的内存可用于返回值。

 
 
   EDIT： / strong> 
对于好奇，这里是MinGW g ++ 4.5.0的汇编输出中 ret 的初始化：
  movl $ __ ZNSs4_Rep20_S_empty_rep_storageE + 12，（％ebx）
  
 
 $ b b 
它只是将指针复制到全局的空表示。
 
 
   EDIT2： 
好吧。我现在测试了四个变体与g ++ 4.5.0和Visual C + + 16.00.30319.01：
 
 
 变体1（c_str变体）：
  inline std :: string first_last_n（std :: string :: size_type n，const std :: string& s）
 {
 std :: string :: size_type s_size = s.size（）; 
 n = std :: min（n，s_size）; 
 std :: string ret; 
 ret.reserve（2 * n）; 
 const char * s_cStr = s.c_str（），* s_cStr_end = s_cStr + s_size; 
 ret.append（s_cStr，s_cStr + n）; 
 ret.append（s_cStr_end  -  n，s_cStr_end）; 
 return ret; 
} 
  
变体2（数据字符串变体）：
  inline std :: string first_last_n（std :: string :: size_type n，const std :: string& s）
 {
 std :: string :: size_type s_size = s.size（）; 
 n = std :: min（n，s_size）; 
 std :: string ret; 
 ret.reserve（2 * n）; 
 const char * s_data = s.data（），* s_data_end = s_data + s_size; 
 ret.append（s_data，s_data + n）; 
 ret.append（s_data_end  -  n，s_data_end）; 
 return ret; 
} 
  
变式3：
  inline std :: string first_last_n（std :: string :: size_type n，const std :: string& s）
 {
 std :: string :: size_type s_size = s.size（）; 
 n = std :: min（n，s_size）; 
 std :: string ret（s）; 
 std :: string :: size_type d = s_size  -  n; 
 return ret.replace（n，d，s，d，n）; 
} 
  
变体4（我的原始代码）：
 内联std :: string first_last_n（std :: string :: size_type n，const std :: string& s）
 {
n = std :: min（n，s.size（））; 
 std :: string ret; 
 ret.reserve（2 * n）; 
 ret.append（s.begin（），s.begin（）+ n）; 
 ret.append（s.end（） -  n，s.end（））; 
 return ret; 
} 
  
 g ++ 4.5.0的结果是：
 
 
  
 变体4是最快的
 
 变体3是第二个（比变体4慢5％）
 
 变体1是第三个（比变体3慢2％）
 
 变体2是第四个（比变体1慢0.2％）
 
 
 
 
  VC ++ 16.00.30319.01的结果是：
 
 
  
 变体1是最快的
 
 
变体2是第二个变体（比变体1慢3％）
 
 变体4是第三个$ b 
变体3是第四种（比变体4慢17％）
 
 
 
 
 不出意料，编译器。但是，不知道使用哪个编译器我认为我的变体是最好的，因为它是一个熟悉的风格的C ++，它是使用g ++的最快，并且它没有比使用VC ++变体1或2慢得多。 / p> 
 
 
 VC ++结果有趣的一点是，使用 c_str 而不是 data 更快。也许这就是为什么你的访问者说你的实现方式比实现更快。
 
 
   EDIT3： 
 
 
 其实，我只是想到另一个变种：
 
 
 变式5：
  inline std :: string first_last_n（std :: string :: size_type n，const std :: string& s）
 {
n = std :: min（n，尺寸（））; 
 std :: string ret; 
 ret.reserve（2 * n）; 
 std :: string :: const_iterator s_begin = s.begin（），s_end = s.end（）; 
 ret.append（s_begin，s_begin + n） 
 ret.append（s_end  -  n，s_end）; 
 return ret; 
} 
  就像变体4，除了<$ c的开始和结束迭代器$ c> s 保存。
 
 
 当测试变体5时，使用VC ++时，它实际上是变异2 ：
 
 
  
 变体1是最快的
 
 变体5是第二个）
 
 变体2是第三种（比变体5慢1.4％）
 
 变体4是第三种> 
 
变体3是第四种（比变体4慢17％）
 
 
 
I was recently in a C++ technical interview, where I was given a bit of simple string manipulation code, which is intended to take a string and return a string that is comprised of the first and last n-characters, and then proceed to correct any bugs and to also make the function as efficient as possible, I came up with the solution below, however the interviewer claimed there was an even faster more optimal way:

Original code:
std::string first_last_n(int n, std::string s)
{
   std::string first_n = s.substr(0,n);
   std::string last_n = s.substr(s.size()-n-1,n);
   return first_n + last_n;
}
My code:
bool first_last_n(const std::size_t& n, const std::string& s, std::string& r)
{
   if (s.size() < n)
      return false;
   r.reserve(2 * n);
   r.resize(0);
   r.append(s.data(),s.data() + n);
   r.append(s.data() + (s.size() - n), s.data() + s.size());
   return true;
}
Summary of my changes:


Changed the interface to take a return string as reference (assuming RVO and rvalues are not available yet)
Removed temporary strings being constructed via substr
Passed input string as a const reference inorder to get around temporary instantiation of input
Fixed off-by-1 error in last_n string
Reduced the number of times each character is touched down to once, or twice (in the event of an overlapping scenario)
Placed a check in the event the string s's size is less than n, returning false for failure.


Assuming only native C++ is allowed, is there some other way of doing the above more efficiently or optimally?

Note 1: The original input string instance is not to be modified.

Note 2: All solutions must pass the following test case, otherwise they are not valid.
void test()
{
   {
      std::string s = "0123456789";
      std::string r = first_last_n(10,s);
      assert(r == "01234567890123456789");
   }

   {
      std::string s = "0123456789ABC0123456789";
      std::string r = first_last_n(10,s);
      assert(r == "01234567890123456789");
   }

   {
      std::string s = "1234321";
      std::string r = first_last_n(5,s);
      assert(r == "1234334321");
   }

}

 解决方案 
This implementation should be fast:
inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
    n = std::min(n, s.size());
    std::string ret;
    ret.reserve(2*n);
    ret.append(s.begin(), s.begin() + n);
    ret.append(s.end() - n, s.end());
    return ret;
}
It passes all three unit tests.

When using GNU libstdc++, the line that declares & initializes ret is extremely fast because libstdc++ uses a global "empty string" variable. Thus, it's simply a pointer copy. Calls to begin and end on s are also fast because they will resolve to the const versions of begin and end, begin() const and end() const, so the internal representation of s is not "leaked". With libstdc++, std::string::const_iterator is const char*, which is a pointer type and random access iterator. Thus, when std::string::append<const char*>(const char*, const char*) calls std::distance to obtain the length of the input range, it is a pointer difference operation. Also, std::string::append<const char*>(const char*, const char*) results in something like a memmove. Finally, the reserve operation ensures that enough memory is available for the return value.

EDIT:
For the curious, here is the initialization of ret in the assembly output of MinGW g++ 4.5.0:
    movl    $__ZNSs4_Rep20_S_empty_rep_storageE+12, (%ebx)
It's simply copying the pointer to the global "empty representation".

EDIT2:
Okay. I have now tested four variants with g++ 4.5.0 and Visual C++ 16.00.30319.01:

Variant 1 (the "c_str variant"):
inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
   std::string::size_type s_size = s.size();
   n = std::min(n, s_size);
   std::string ret;
   ret.reserve(2*n);
   const char *s_cStr = s.c_str(), *s_cStr_end = s_cStr + s_size;
   ret.append(s_cStr, s_cStr + n);
   ret.append(s_cStr_end - n, s_cStr_end);
   return ret;
}
Variant 2 (the "data string" variant):
inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
   std::string::size_type s_size = s.size();
   n = std::min(n, s_size);
   std::string ret;
   ret.reserve(2*n);
   const char *s_data = s.data(), *s_data_end = s_data + s_size;
   ret.append(s_data, s_data + n);
   ret.append(s_data_end - n, s_data_end);
   return ret;
}
Variant 3:
inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
   std::string::size_type s_size = s.size();
   n = std::min(n, s_size);
   std::string ret(s);
   std::string::size_type d = s_size - n;
   return ret.replace(n, d, s, d, n);
}
Variant 4 (my original code):
inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
   n = std::min(n, s.size());
   std::string ret;
   ret.reserve(2*n);
   ret.append(s.begin(), s.begin() + n);
   ret.append(s.end() - n, s.end());
   return ret;
}
The results for g++ 4.5.0 are:


Variant 4 is the fastest
Variant 3 is second (5% slower than variant 4)
Variant 1 is third (2% slower than variant 3)
Variant 2 is fourth (0.2% slower than variant 1)


The results for VC++ 16.00.30319.01 are:


Variant 1 is the fastest
Variant 2 is second (3% slower than variant 1)
Variant 4 is third (4% slower than variant 2)
Variant 3 is fourth (17% slower than variant 4)


Unsurprisingly, the variant that is fastest depends on the compiler. However, not knowing which compiler will be used I think that my variant is best because it is a familiar style of C++, it is the fastest when using g++, and it is not that much slower than variants 1 or 2 when using VC++.

One thing interesting from the VC++ results is that using c_str rather than data is faster. Perhaps that is why your interviewer said that there is a faster way than your implementation.

EDIT3:

Actually, I just thought about another variant:

Variant 5:
inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
   n = std::min(n, s.size());
   std::string ret;
   ret.reserve(2*n);
   std::string::const_iterator s_begin = s.begin(), s_end = s.end();
   ret.append(s_begin, s_begin + n);
   ret.append(s_end - n, s_end);
   return ret;
}
It's just like variant 4 except that the begin and end iterators for s are saved.

When variant 5 is tested, it actually beats out variant 2 (the data string variant) when using VC++:


Variant 1 is the fastest
Variant 5 is second (1.6% slower than variant 1)
Variant 2 is third (1.4% slower than variant 5)
Variant 4 is third (4% slower than variant 2)
Variant 3 is fourth (17% slower than variant 4)


                        这篇关于C ++字符串面试问题的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！


                    
                        查看全文

C ++字符串面试问题 [英] C++ String Interview Question

问题描述

相关文章

C/C++开发最新文章

热门教程

热门工具

登录关闭

C ++字符串面试问题 [英] C++ String Interview Question

问题描述

相关文章

C/C++开发最新文章

热门教程

热门工具

登录 关闭

登录关闭