How to truncate a string in PHP to the word closest to a certain number of characters?

Question

I have a code snippet written in PHP that pulls a block of text from a database and sends it to a widget on a webpage. The original block of text can be a lengthy article or just a sentence or two, but for this widget I can't display more than, say, 200 characters. I could use substr() to chop the text off at 200 characters, but the result would be cut in the middle of a word. What I really want is to cut the text at the end of the last word before 200 characters.

Recommended answer

Use the wordwrap function. It splits the text into multiple lines so that no line is wider than the width you specify, breaking at word boundaries. After splitting, you simply take the first line:

substr($string, 0, strpos(wordwrap($string, $your_desired_width), "\n"));

One thing this one-liner doesn't handle is the case where the text itself is shorter than the desired width. To handle this edge case, do something like:

if (strlen($string) > $your_desired_width) 
{
    $string = wordwrap($string, $your_desired_width);
    $string = substr($string, 0, strpos($string, "\n"));
}
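
The two snippets above can be combined into a single helper. This is a minimal sketch, not part of the original answer: the function name truncateAtWord and the sample sentence are assumptions made for illustration, and the explicit check for a missing line break is an added guard.

```php
<?php
// Hypothetical helper (name is an assumption); wraps the wordwrap-based approach.
function truncateAtWord(string $text, int $width): string
{
    if (strlen($text) <= $width) {
        return $text; // already short enough, nothing to cut
    }

    // wordwrap() replaces the space at each break point with "\n".
    $wrapped = wordwrap($text, $width, "\n");
    $cut = strpos($wrapped, "\n");

    // No line break means the text is one long word with no place to cut.
    return $cut === false ? '' : substr($wrapped, 0, $cut);
}

echo truncateAtWord('The quick brown fox jumps over the lazy dog', 20), PHP_EOL;
// prints "The quick brown fox"
```

Note that wordwrap does not break long words by default, so if the first word alone is longer than the width and is followed by a space, this sketch returns that word whole.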

The above solution cuts the text prematurely if it contains a newline before the actual cut point. Here is a version that solves this problem:

function tokenTruncate($string, $your_desired_width) {
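  // Split on runs of whitespace and keep the delimiters so the text can be reassembled.
  // A limit of -1 means "no limit"; passing null is deprecated as of PHP 8.1.
  $parts = preg_split('/([\s\n\r]+)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);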
  $parts_count = count($parts);

  $length = 0;
  $last_part = 0;
  for (; $last_part < $parts_count; ++$last_part) {
    $length += strlen($parts[$last_part]);
    if ($length > $your_desired_width) { break; }
  }

  return implode(array_slice($parts, 0, $last_part));
}
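
To see the problem that motivates tokenTruncate, here is a minimal sketch of the wordwrap approach applied to text that already contains a newline. The sample string and variable names are assumptions chosen for illustration.

```php
<?php
// Illustrative input only: a newline sits before the 10-character cut point.
$text = "1 3\n5 7 9 11 14";
$width = 10;

// The wordwrap-based approach: wrap, then keep everything before the first "\n".
$wrapped = wordwrap($text, $width, "\n");
$firstLine = substr($wrapped, 0, strpos($wrapped, "\n"));

// The first "\n" found is the one already in the text, so only "1 3" survives,
// even though "1 3\n5 7 9 " would still fit within 10 characters.
var_dump($firstLine);
```

tokenTruncate avoids this because it counts whitespace tokens, including newlines, toward the length instead of treating them as cut points.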

Also, here is the PHPUnit test class used to test the implementation:

class TokenTruncateTest extends PHPUnit_Framework_TestCase {
  public function testBasic() {
    $this->assertEquals("1 3 5 7 9 ",
      tokenTruncate("1 3 5 7 9 11 14", 10));
  }

  public function testEmptyString() {
    $this->assertEquals("",
      tokenTruncate("", 10));
  }

  public function testShortString() {
    $this->assertEquals("1 3",
      tokenTruncate("1 3", 10));
  }

  public function testStringTooLong() {
    $this->assertEquals("",
      tokenTruncate("toooooooooooolooooong", 10));
  }

  public function testContainingNewline() {
    $this->assertEquals("1 3\n5 7 9 ",
      tokenTruncate("1 3\n5 7 9 11 14", 10));
  }
}

EDIT: Special UTF-8 characters such as 'à' are not handled. Add the 'u' modifier at the end of the regex to handle them:

$parts = preg_split('/([\s\n\r]+)/u', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
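
The 'u' modifier only changes how the regex reads the string. strlen still counts bytes, so a character like 'à' (two bytes in UTF-8) counts as two toward the limit. The following is a hedged sketch of a variant that counts characters instead. It is not from the original answer: the name tokenTruncateUtf8 is hypothetical, and it assumes UTF-8 input and that the mbstring extension is available.

```php
<?php
// Hypothetical variant of tokenTruncate(); assumes UTF-8 input and ext-mbstring.
function tokenTruncateUtf8(string $string, int $width): string
{
    // With the "u" modifier the pattern and subject are treated as UTF-8.
    $parts = preg_split('/(\s+)/u', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        return ''; // preg_split() failed, e.g. on invalid UTF-8 input
    }

    $length = 0;
    $kept = [];
    foreach ($parts as $part) {
        // Count characters rather than bytes.
        $length += mb_strlen($part, 'UTF-8');
        if ($length > $width) {
            break;
        }
        $kept[] = $part;
    }

    return implode('', $kept);
}

echo tokenTruncateUtf8("à la carte où ça", 10), PHP_EOL;
// prints "à la carte"
```

With byte counting via strlen, the same call would stop after "à la ", because "à" contributes two bytes toward the limit.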
