Trim Not Working with Array from MySQL fetched String


Problem description


What I'm trying to do is take a block of html, strip out all the html tags, and put each line of text into a PHP array.

I'm just trying it with one block to test (hence the WHERE ID = '2409' in my MySQL query).

The HTML portion for ID 2409 looks like this:

<table class="description-table">
<tbody>
<tr><td>Saepe Encomia 2.aD NEC Mirum Populo Soluni Iis 8679-1370 Status Error Sed 9.9</td></tr>
<tr><td>Description</td></tr>
<tr><td></td>
<td><br>
<br><p></p><p></p>
<strong><br></strong> <strong><br></strong> <strong>Donec Rem </strong><br>
<br>
<strong>Animam Urgebat<br>
<br></strong> <strong><br>
<br>
Rerum Sed 8613 - 3669 8358 & 6699<br>
<br>
1.mE (magNA) QUO Ad Nominum Statum Massa<br>
ab SEM Autem Reddet Habitu Sit<br>
<br></strong> <strong> PRAEDAM ACCUMSAN PERSONARUM DENEGARE AC DUORUM</strong> <strong><br></strong> <strong><br></strong> <strong>Lius typi sit nec quo adversis cras ministri oppressa, versus class hic rem quos colubros ullo commune!economy!</strong><strong><br></strong><strong>                                                           ad Quisque Modeste</strong><strong>                                                           ac Rem Wisi</strong><strong>                                                           ex Hac Congue mus Leo</strong><strong>                                                           ab 7/92" Alias</strong><strong>                                                           ad 2/73" Adverso & Erat</strong><strong>                                                           me Personom Eget</strong><strong>                                                           ad Viribus Fuga Fuga</strong><strong>                                                           ab Louor-Sit Molles</strong><strong class="c2">                                                           3x Block-Off Plates</strong><strong class="c2">                                                           ad Facunda</strong><strong class="c2">                                                           ab Personas Diam<br>
NUNC<br>
ex Teniet te Palmam Eaque<br>
me Teniet in Versus Urna<br></strong> <strong><br></strong><br>
<strong class="c3">**CONDEMNENDUS REM CUM MAGNORUM**</strong><strong></strong><br>
</td>
</table>

And here's my PHP script designed to parse this

//connect to mysqli

$results = $mysqli->query("SELECT ID, post_content
FROM wp_posts
WHERE ID = '2409';");

while($row = $results->fetch_array()) {
    $htmlarray2 = preg_split('/<.+?>/', $row['post_content']);
    $htmlarray = array_values(array_filter(array_map('trim', $htmlarray2)));
    echo '<pre>';
        print_r($htmlarray);
    echo '</pre>';
    . . . 
}

This produces an output like this

Array
(
[0] => Saepe Encomia 2.aD NEC Mirum Populo Soluni Iis 8679-1370 Status Error Sed 9.9
[1] => Donec Rem 
[2] => Animam Urgebat
[3] => Rerum Sed 8613 - 3669 8358 & 6699
[4] => 1.mE (magNA) QUO Ad Nominum Statum Massa
[5] => ab SEM Autem Reddet Habitu Sit
[6] =>  PRAEDAM ACCUMSAN PERSONARUM DENEGARE AC DUORUM
[7] => Lius typi sit nec quo adversis cras ministri oppressa, versus class hic rem quos colubros ullo commune!
[8] =>                                                            ad Quisque Modeste
[9] =>                                                            ac Rem Wisi
[10] =>                                                            ex Hac Congue mus Leo
[11] =>                                                            ab 7/92" Alias
[12] =>                                                            ad 2/73" Adverso & Erat
[13] =>                                                            me Personom Eget
[14] =>                                                            ad Viribus Fuga Fuga
[15] =>                                                            ea Totam Poenam
[16] =>                                                            ab Louor-Sit Molles
[17] =>                                                            ad Facunda
[18] =>                                                            ab Personas Diam
[19] => NUNC
[20] => ex Teniet te Palmam Eaque
[21] => me Teniet in Versus Urna
[22] => **CONDEMNENDUS REM CUM MAGNORUM**
)

This is okay, but now I'm having an issue removing the white-space before and after the strings in the array.

Let's take node 8 in the array as an example

. . .
$arrayvalue = $htmlarray2['8'];

which echoes like this

                                                       ad Quisque Modeste

Now, what I'm trying to do is obviously trim each element of the array, but for testing I'm just working with this one variable $arrayvalue.

My issue is that trim() isn't working with this MySQL fetched variable. Meaning that adding trim($arrayvalue); has no effect, and the value echoes out the same way as above.

I know this has something to do with me fetching the array via my query, because if I just test this variable out normally in its own PHP script

$string = '                                                            ad Quisque Modeste  ';
echo trim($string);

It works fine, and echo outputs simply ad Quisque Modeste, with no white-space before or after the string, as desired.

Why isn't trim() working in my while loop? What's the trick to trimming the leading and trailing white-spaces from the elements?

Edit: Here's my full while loop as requested. It's a bit different than the above example (I've been doing a lot of modifications trying to solve this myself, so it's constantly changing), but here is what I have right now in full:

while($row = $results->fetch_array()) {
    $id = $row['ID'];
    echo 'ID: ' . $id;
    echo '<br  />';

    //replace &nbsp; with white space
    $converted = strtr($row['post_content'],array_flip(get_html_translation_table(HTML_ENTITIES, ENT_QUOTES))); 
    trim($converted, chr(0xC2).chr(0xA0));

    //remove html elements
    $htmlarray = preg_split('/<.+?>/', $converted);

    // remove empty array elements and re-index array
    $htmlarray2 = array_values(array_filter(array_map('trim', $htmlarray)));

    // test by getting single value from array
    $arrayvalue = $htmlarray2['9'];

    // my attempt to trim string in while loop
    trim($arrayvalue);

    // doesn't trim
    echo '<hr>' . $arrayvalue . '<hr>';

    // put this here so I can see the full array
    echo '<pre>';
        print_r($htmlarray2);
    echo '</pre>';
}

As requested, here are the results of var_export($row['post_content']);

'<table class="product-description-table">
<tbody>
<tr>
<td class="item" colspan="3">Saepe Encomia 2.aD NEC Mirum Populo Soluni Iis 8679-1370 Status Error Sed 9.9</td>
</tr>
<tr>
<td class="title" colspan="3"></td>
</tr>
<tr>
<td class="content"><br>
<br>
<p class="c1"></p>
<p class="c1"></p>
<strong><br></strong> <strong><br></strong> <strong>Donec Rem&nbsp;</strong><br>
<br>
<strong>Animam Urgebat<br>
<br></strong> <strong><br>
<br>
Rerum Sed 8613 - 3669 8358 & 6699<br>
<br>
1.mE (magNA) QUO Ad Nominum Statum Massa<br>
ab SEM Autem Reddet Habitu Sit<br>
<br></strong> <strong>&nbsp;PRAEDAM ACCUMSAN PERSONARUM DENEGARE AC DUORUM</strong> <strong><br></strong> <strong><br></strong> <strong>Lius typi sit nec quo adversis cras ministri oppressa, versus class hic rem quos colubros ullo commune!economy!</strong><strong><br></strong><strong>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;ad Quisque Modeste</strong><strong>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;ac Rem Wisi</strong><strong>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;ex Hac Congue mus Leo</strong><strong>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;ab 7/92" Alias</strong><strong>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;ad 2/73" Adverso & Erat</strong><strong>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;me Personom Eget</strong><strong>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;ad Viribus Fuga Fuga</strong><strong>&nbsp; 
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;ab Louor-Sit Molles</strong><strong class="c2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;3x Block-Off Plates</strong><strong class="c2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;ad Facunda</strong><strong class="c2">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;ab Personas Diam<br>
NUNC<br>
ex Teniet te Palmam Eaque<br>
me Teniet in Versus Urna<br></strong> <strong><br></strong><br>
<strong class="c3">**CONDEMNENDUS REM CUM MAGNORUM**</strong><strong>&nbsp;</strong><br></td>
<td class="product-content-border"></td>
</tr>
<tr>
<td class="gallery" colspan="3">
<table>
<tbody>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td></td>
</tr>
<tr>
<td class="spacer" colspan="3"></td>
</tr>
<tr>
<td class="product-content-border"></td>
</tr>
</tbody>
</table>
<br>
<br>
<br>
<p class="c4"></p>'

Final Edit :):

Posted a solution below. Not going to accept my own answer.

If anyone familiar with regex can help explain the tribulation behind all this and why this regex formula : /[\s]+/mu or rather $clean_htmlarray = preg_replace('/[\s]+/mu', ' ', $htmlarray); fixed this issue I'll gladly accept that as a proper answer and explanation.

Solution

Here's your requested explanation on the regex pattern that solved your issue:

/[\s]+/ says "look for one or more white-space characters" (this includes: ' ', '\r', '\n', '\t', '\f', '\v'). The multi-line modifier/flag is not necessary because you are not using anchors (^ $) in your pattern. The unicode modifier/flag is absolutely critical in your case because your string of html text contains many little devils called...

"NO-BREAK SPACE". It is a single Unicode character, U+00A0 (written \x{00A0} in a pattern), which UTF-8 encodes as the two bytes 194 and 160 (0xC2 0xA0). See them highlighted here.

Without the u flag, the NO-BREAK SPACE characters remain and additional filtering will be required to remove them.
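To make this concrete, here is a minimal sketch. The sample string is made up for illustration and assumes the data is UTF-8. It shows two things: trim() returns the trimmed string rather than changing its argument, and its default character list does not contain the no-break space bytes, whereas a /u pattern does treat U+00A0 as white-space.

```php
<?php
// Hypothetical sample: text padded with UTF-8 NO-BREAK SPACE (bytes 0xC2 0xA0).
$nbsp  = "\xC2\xA0";
$value = $nbsp . ' ' . $nbsp . 'ad Quisque Modeste' . $nbsp;

// trim() returns a new string; calling it without using the result changes nothing.
// Its default list only covers ASCII white-space, so the no-break spaces stay.
$trimmed = trim($value);

// With the u flag the pattern is Unicode-aware and \s also matches U+00A0.
$clean = preg_replace('/^\s+|\s+$/u', '', $value);

var_dump($trimmed === $value);   // the value is unchanged by trim()
echo bin2hex($value), "\n";      // the c2a0 pairs are visible in the hex dump
echo $clean, "\n";
```

Dumping a suspicious value with bin2hex() (or var_export()) is usually the quickest way to see that the "spaces" are not ordinary spaces.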


While you eventually got your code to produce the right output, I'm happy to offer a leaner single-step pattern that will get you there faster using only preg_split().

while ($row = $results->fetch_array()) {
    $texts = preg_split('/\s*<[^>]+>\s*/u', $row['post_content'], null, PREG_SPLIT_NO_EMPTY);
    var_export($texts);
}

Here is a working demo.

This new splitting pattern still looks for your tags, but it is more efficient because between the < and >, I merely ask to match all characters that are "not >" by using [^>]+. This is much simpler for the engine than asking it to match from the long list of characters that . represents.
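Apart from speed, there is one practical difference between the two patterns that is easy to check: without the s flag, . does not match a newline, while [^>] does. The fragment below is a made-up example (not taken from the post content) of a tag whose attributes wrap onto a second line.

```php
<?php
// Hypothetical fragment: the opening tag contains a line break.
$html = "<td\n class=\"item\">Saepe</td>";

// Lazy dot: "." stops at the newline, so the opening tag is never matched.
$lazy = preg_split('/<.+?>/', $html, -1, PREG_SPLIT_NO_EMPTY);

// Negated class: [^>] happily crosses the newline, so both tags are matched.
$negated = preg_split('/<[^>]+>/', $html, -1, PREG_SPLIT_NO_EMPTY);

var_export($lazy);
echo "\n";
var_export($negated);
echo "\n";
```

With the lazy dot the opening tag is left in the output, while the negated class returns only the text.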

Furthermore, I included matching for your unicode-extended white-space characters. \s* will match zero or more white-space characters before AND after each tag.

Finally, I should explain the additional parameters on preg_split(). The null says "find unlimited matches". This is the default behavior, but I must pass null or -1 to hold its place so that the final parameter can be used (-1 is the safer choice on PHP 8.1 and later, where passing null for an integer parameter is deprecated). PREG_SPLIT_NO_EMPTY spares you the extra step of using array_filter() later. It omits any empty elements generated from the split, so you only get the good stuff.
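Here is a self-contained sketch of the whole step on a small, made-up fragment standing in for $row['post_content']. It assumes the content is UTF-8 and still contains &nbsp; entities, so it decodes them with html_entity_decode() first. That decoding step is my assumption about the stored data and is not part of the pattern itself.

```php
<?php
// Hypothetical stand-in for $row['post_content'].
$html = "<strong>&nbsp; &nbsp;ad Quisque Modeste</strong>"
      . "<strong>&nbsp;ac Rem Wisi</strong><br>\nNUNC<br>";

// Turn &nbsp; into the real U+00A0 character so that \s (with /u) can match it.
$decoded = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// -1 = no limit; PREG_SPLIT_NO_EMPTY drops the empty pieces between adjacent tags.
$texts = preg_split('/\s*<[^>]+>\s*/u', $decoded, -1, PREG_SPLIT_NO_EMPTY);

var_export($texts);
echo "\n";
```

The surrounding white-space, including the no-break spaces, is consumed by the \s* on either side of each tag, so no separate trim or array_filter pass is needed.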

I hope you found this helpful/educational. Good luck with your project.
