C ++字符串面试问题 [英] C++ String Interview Question

查看:138
本文介绍了C ++字符串面试问题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我最近在一个C ++技术面试中,我被给了一些简单的字符串操作代码,这是为了接受一个字符串并返回一个由第一个和最后一个n字符组成的字符串,然后继续纠正任何错误,并使功能尽可能高效,我想出了下面的解决方案,但访问者声称有一个更快更好的方法:



原始代码:

  std :: string first_last_n(int n,std :: string s)
{
std :: string first_n = s.substr(0,n);
std :: string last_n = s.substr(s.size() - n-1,n);
return first_n + last_n;
}

我的代码:

  bool first_last_n(const std :: size_t& n,const std :: string& s,std :: string& r)
{
if size()< n)
return false;
r.reserve(2 * n);
r.resize(0);
r.append(s.data(),s.data()+ n);
r.append(s.data()+(s.size() - n),s.data()+ s.size());
return true;
}

我的更改摘要:




  • 更改接口以接受返回字符串作为引用(假设RVO和rvalues尚未可用)


  • p>

  • 修复了last_n字符串中的off-by-1错误


  • 将每个字符触及一次的次数减少了一次,或两次(在重叠情况下)


  • 在字符串的大小小于n的情况下检查,返回false失败。




假设只允许使用本地C ++,还有其他方法可以更有效或更好地执行上述操作吗? p>

注意1:原始输入字符串实例不可修改。



b>注意2:所有解决方案必须通过以下测试用例,否则它们无效。

  void test ()
{
{
std :: string s =0123456789;
std :: string r = first_last_n(10,s);
assert(r ==01234567890123456789);
}

{
std :: string s =0123456789ABC0123456789;
std :: string r = first_last_n(10,s);
assert(r ==01234567890123456789);
}

{
std :: string s =1234321;
std :: string r = first_last_n(5,s);
assert(r ==1234334321);
}

}


解决方案

此实现应该是快速的:

  inline std :: string first_last_n(std :: string :: size_type n,const std :: string& s)
{
n = std :: min(n,s.size());
std :: string ret;
ret.reserve(2 * n);
ret.append(s.begin(),s.begin()+ n);
ret.append(s.end() - n,s.end());
return ret;
}

通过所有三个单元测试



当使用GNU libstdc ++时,声明& initializes ret 非常快,因为libstdc ++使用全局空字符串变量。因此,它只是一个指针副本。调用开始结束在 s 它们将解析为 begin end begin()const end()const ,因此 s 的内部表示不是leaked。使用libstdc ++, std :: string :: const_iterator const char * ,它是一个指针类型和随机访问迭代器。因此,当 std :: string :: append< const char *>(const char *,const char *)调用 std :: distance 以获取输入范围的长度,是指针差分运算。另外, std :: string :: append< const char *>(const char *,const char *)会导致类似 memmove 。最后, reserve 操作确保有足够的内存可用于返回值。



EDIT: / strong>
对于好奇,这里是MinGW g ++ 4.5.0的汇编输出中 ret 的初始化:

  movl $ __ ZNSs4_Rep20_S_empty_rep_storageE + 12,(%ebx)


$ b b

它只是将指针复制到全局的空表示。



EDIT2:
好​​吧。我现在测试了四个变体与g ++ 4.5.0和Visual C + + 16.00.30319.01:



变体1(c_str变体):

  inline std :: string first_last_n(std :: string :: size_type n,const std :: string& s)
{
std :: string :: size_type s_size = s.size();
n = std :: min(n,s_size);
std :: string ret;
ret.reserve(2 * n);
const char * s_cStr = s.c_str(),* s_cStr_end = s_cStr + s_size;
ret.append(s_cStr,s_cStr + n);
ret.append(s_cStr_end - n,s_cStr_end);
return ret;
}

变体2(数据字符串变体):

  inline std :: string first_last_n(std :: string :: size_type n,const std :: string& s)
{
std :: string :: size_type s_size = s.size();
n = std :: min(n,s_size);
std :: string ret;
ret.reserve(2 * n);
const char * s_data = s.data(),* s_data_end = s_data + s_size;
ret.append(s_data,s_data + n);
ret.append(s_data_end - n,s_data_end);
return ret;
}

变式3:

  inline std :: string first_last_n(std :: string :: size_type n,const std :: string& s)
{
std :: string :: size_type s_size = s.size();
n = std :: min(n,s_size);
std :: string ret(s);
std :: string :: size_type d = s_size - n;
return ret.replace(n,d,s,d,n);
}

变体4(我的原始代码):

 内联std :: string first_last_n(std :: string :: size_type n,const std :: string& s)
{
n = std :: min(n,s.size());
std :: string ret;
ret.reserve(2 * n);
ret.append(s.begin(),s.begin()+ n);
ret.append(s.end() - n,s.end());
return ret;
}

g ++ 4.5.0的结果是:




  • 变体4是最快的

  • 变体3是第二个(比变体4慢5%)

  • 变体1是第三个(比变体3慢2%)

  • 变体2是第四个(比变体1慢0.2%)



VC ++ 16.00.30319.01的结果是:




  • 变体1是最快的

  • 变体2是第二个变体(比变体1慢3%)

  • 变体4是第三个$ b
  • 变体3是第四种(比变体4慢17%)



不出意料,编译器。但是,不知道使用哪个编译器我认为我的变体是最好的,因为它是一个熟悉的风格的C ++,它是使用g ++的最快,并且它没有比使用VC ++变体1或2慢得多。 / p>

VC ++结果有趣的一点是,使用 c_str 而不是 data 更快。也许这就是为什么你的访问者说你的实现方式比实现更快。



EDIT3:



其实,我只是想到另一个变种:



变式5:

  inline std :: string first_last_n(std :: string :: size_type n,const std :: string& s)
{
n = std :: min(n,尺寸());
std :: string ret;
ret.reserve(2 * n);
std :: string :: const_iterator s_begin = s.begin(),s_end = s.end();
ret.append(s_begin,s_begin + n)
ret.append(s_end - n,s_end);
return ret;
}

就像变体4,除了<$ c的开始和结束迭代器$ c> s 保存。



当测试变体5时,使用VC ++时,它实际上是变异2 :




  • 变体1是最快的

  • 变体5是第二个)

  • 变体2是第三种(比变体5慢1.4%)

  • 变体4是第三种>
  • 变体3是第四种(比变体4慢17%)


I was recently in a C++ technical interview, where I was given a bit of simple string manipulation code, which is intended to take a string and return a string that is comprised of the first and last n-characters, and then proceed to correct any bugs and to also make the function as efficient as possible, I came up with the solution below, however the interviewer claimed there was an even faster more optimal way:

Original code:

std::string first_last_n(int n, std::string s)
{
   std::string first_n = s.substr(0,n);
   std::string last_n = s.substr(s.size()-n-1,n);
   return first_n + last_n;
}

My code:

bool first_last_n(const std::size_t& n, const std::string& s, std::string& r)
{
   if (s.size() < n)
      return false;
   r.reserve(2 * n);
   r.resize(0);
   r.append(s.data(),s.data() + n);
   r.append(s.data() + (s.size() - n), s.data() + s.size());
   return true;
}

Summary of my changes:

  • Changed the interface to take a return string as reference (assuming RVO and rvalues are not available yet)

  • Removed temporary strings being constructed via substr

  • Passed input string as a const reference inorder to get around temporary instantiation of input

  • Fixed off-by-1 error in last_n string

  • Reduced the number of times each character is touched down to once, or twice (in the event of an overlapping scenario)

  • Placed a check in the event the string s's size is less than n, returning false for failure.

Assuming only native C++ is allowed, is there some other way of doing the above more efficiently or optimally?

Note 1: The original input string instance is not to be modified.

Note 2: All solutions must pass the following test case, otherwise they are not valid.

void test()
{
   {
      std::string s = "0123456789";
      std::string r = first_last_n(10,s);
      assert(r == "01234567890123456789");
   }

   {
      std::string s = "0123456789ABC0123456789";
      std::string r = first_last_n(10,s);
      assert(r == "01234567890123456789");
   }

   {
      std::string s = "1234321";
      std::string r = first_last_n(5,s);
      assert(r == "1234334321");
   }

}

解决方案

This implementation should be fast:

inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
    n = std::min(n, s.size());
    std::string ret;
    ret.reserve(2*n);
    ret.append(s.begin(), s.begin() + n);
    ret.append(s.end() - n, s.end());
    return ret;
}

It passes all three unit tests.

When using GNU libstdc++, the line that declares & initializes ret is extremely fast because libstdc++ uses a global "empty string" variable. Thus, it's simply a pointer copy. Calls to begin and end on s are also fast because they will resolve to the const versions of begin and end, begin() const and end() const, so the internal representation of s is not "leaked". With libstdc++, std::string::const_iterator is const char*, which is a pointer type and random access iterator. Thus, when std::string::append<const char*>(const char*, const char*) calls std::distance to obtain the length of the input range, it is a pointer difference operation. Also, std::string::append<const char*>(const char*, const char*) results in something like a memmove. Finally, the reserve operation ensures that enough memory is available for the return value.

EDIT: For the curious, here is the initialization of ret in the assembly output of MinGW g++ 4.5.0:

    movl    $__ZNSs4_Rep20_S_empty_rep_storageE+12, (%ebx)

It's simply copying the pointer to the global "empty representation".

EDIT2: Okay. I have now tested four variants with g++ 4.5.0 and Visual C++ 16.00.30319.01:

Variant 1 (the "c_str variant"):

inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
   std::string::size_type s_size = s.size();
   n = std::min(n, s_size);
   std::string ret;
   ret.reserve(2*n);
   const char *s_cStr = s.c_str(), *s_cStr_end = s_cStr + s_size;
   ret.append(s_cStr, s_cStr + n);
   ret.append(s_cStr_end - n, s_cStr_end);
   return ret;
}

Variant 2 (the "data string" variant):

inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
   std::string::size_type s_size = s.size();
   n = std::min(n, s_size);
   std::string ret;
   ret.reserve(2*n);
   const char *s_data = s.data(), *s_data_end = s_data + s_size;
   ret.append(s_data, s_data + n);
   ret.append(s_data_end - n, s_data_end);
   return ret;
}

Variant 3:

inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
   std::string::size_type s_size = s.size();
   n = std::min(n, s_size);
   std::string ret(s);
   std::string::size_type d = s_size - n;
   return ret.replace(n, d, s, d, n);
}

Variant 4 (my original code):

inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
   n = std::min(n, s.size());
   std::string ret;
   ret.reserve(2*n);
   ret.append(s.begin(), s.begin() + n);
   ret.append(s.end() - n, s.end());
   return ret;
}

The results for g++ 4.5.0 are:

  • Variant 4 is the fastest
  • Variant 3 is second (5% slower than variant 4)
  • Variant 1 is third (2% slower than variant 3)
  • Variant 2 is fourth (0.2% slower than variant 1)

The results for VC++ 16.00.30319.01 are:

  • Variant 1 is the fastest
  • Variant 2 is second (3% slower than variant 1)
  • Variant 4 is third (4% slower than variant 2)
  • Variant 3 is fourth (17% slower than variant 4)

Unsurprisingly, the variant that is fastest depends on the compiler. However, not knowing which compiler will be used I think that my variant is best because it is a familiar style of C++, it is the fastest when using g++, and it is not that much slower than variants 1 or 2 when using VC++.

One thing interesting from the VC++ results is that using c_str rather than data is faster. Perhaps that is why your interviewer said that there is a faster way than your implementation.

EDIT3:

Actually, I just thought about another variant:

Variant 5:

inline std::string first_last_n(std::string::size_type n, const std::string& s)
{
   n = std::min(n, s.size());
   std::string ret;
   ret.reserve(2*n);
   std::string::const_iterator s_begin = s.begin(), s_end = s.end();
   ret.append(s_begin, s_begin + n);
   ret.append(s_end - n, s_end);
   return ret;
}

It's just like variant 4 except that the begin and end iterators for s are saved.

When variant 5 is tested, it actually beats out variant 2 (the data string variant) when using VC++:

  • Variant 1 is the fastest
  • Variant 5 is second (1.6% slower than variant 1)
  • Variant 2 is third (1.4% slower than variant 5)
  • Variant 4 is third (4% slower than variant 2)
  • Variant 3 is fourth (17% slower than variant 4)

这篇关于C ++字符串面试问题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆