如何用段落标签包围所有文本片段? [英] How do I surround all text pieces with paragraph tags?

查看:35
本文介绍了如何用段落标签包围所有文本片段?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想在任何文本项周围放置段落标签.因此,它应该避免表格和其他元素.我怎么做?我想它可以用 preg_replace 来制作吗?

I want to put paragraph tags around any text items. It should therefore avoid tables and other elements. How do I do that? I guess it somehow can be made with preg_replace?

推荐答案

这里有几个函数可以帮助你做你想做的事:

Here are a couple of functions that should help you to do what you want to do:

// nl2p
// This function will convert newlines to HTML paragraphs
// without paying attention to HTML tags. Feed it a raw string and it will
// simply return that string sectioned into HTML paragraphs

function nl2p($str) {
    $arr=explode("\n",$str);
    $out='';

    for($i=0;$i<count($arr);$i++) {
        if(strlen(trim($arr[$i]))>0)
            $out.='<p>'.trim($arr[$i]).'</p>';
    }
    return $out;
}



// nl2p_html
// This function will add paragraph tags around textual content of an HTML file, leaving
// the HTML itself intact
// This function assumes that the HTML syntax is correct and that the '<' and '>' characters
// are not used in any of the values for any tag attributes. If these assumptions are not met,
// mass paragraph chaos may ensue. Be safe.

function nl2p_html($str) {

    // If we find the end of an HTML header, assume that this is part of a standard HTML file. Cut off everything including the
    // end of the head and save it in our output string, then trim the head off of the input. This is mostly because we don't
    // want to surrount anything like the HTML title tag or any style or script code in paragraph tags. 
    if(strpos($str,'</head>')!==false) {
        $out=substr($str,0,strpos($str,'</head>')+7);
        $str=substr($str,strpos($str,'</head>')+7);
    }

    // First, we explode the input string based on wherever we find HTML tags, which start with '<'
    $arr=explode('<',$str);

    // Next, we loop through the array that is broken into HTML tags and look for textual content, or
    // anything after the >
    for($i=0;$i<count($arr);$i++) {
        if(strlen(trim($arr[$i]))>0) {
            // Add the '<' back on since it became collateral damage in our explosion as well as the rest of the tag
            $html='<'.substr($arr[$i],0,strpos($arr[$i],'>')+1);

            // Take the portion of the string after the end of the tag and explode that by newline. Since this is after
            // the end of the HTML tag, this must be textual content.
            $sub_arr=explode("\n",substr($arr[$i],strpos($arr[$i],'>')+1));

            // Initialize the output string for this next loop
            $paragraph_text='';

            // Loop through this new array and add paragraph tags (<p>...</p>) around any element that isn't empty
            for($j=0;$j<count($sub_arr);$j++) {
                if(strlen(trim($sub_arr[$j]))>0)
                    $paragraph_text.='<p>'.trim($sub_arr[$j]).'</p>';
            }

            // Put the text back onto the end of the HTML tag and put it in our output string
            $out.=$html.$paragraph_text;
        }

    }

    // Throw it back into our program
    return $out;
}

其中的第一个 nl2p() 将字符串作为输入并将其转换为数组,只要有换行符 ("\n") 字符.然后它遍历每个元素,如果找到一个不为空的元素,它将用 <p></p> 标签包裹它并将其添加到一个字符串中,该字符串在函数结束.

The first of these, nl2p(), takes a string as an input and converts it to an array wherever there is a newline ("\n") character. Then it goes through each element and if it finds one that isn't empty, it will wrap <p></p> tags around it and add it to a string, which is returned at the end of the function.

第二个,nl2p_html(),是前者更复杂的版本.将 HTML 文件的内容作为字符串传递给它,它将用 <p></p> 标签包裹任何非 HTML 文本.它通过将一个字符串分解成一个数组来实现这一点,其中分隔符是 < 字符,它是任何 HTML 标签的开始.然后,遍历这些元素中的每一个,代码将查找 HTML 标记的末尾并将其后的任何内容放入一个新字符串中.这个新字符串本身将分解成一个数组,其中分隔符是一个换行符 ("\n").循环遍历这个新数组,代码查找非空元素.当它找到一些数据时,它会将其包装在段落标记中并将其添加到输出字符串中.当这个循环结束时,这个字符串将被添加回 HTML 代码,这将一起被修改为一个输出缓冲区字符串,该字符串在函数完成后返回.

The second, nl2p_html(), is a more complicated version of the former. Pass an HTML file's contents to it as a string and it will wrap <p> and </p> tags around any non-HTML text. It does this by exploding a string into an array where the delimiter is the < character, which is the start of any HTML tag. Then, iterating through each of these elements, the code will look for the end of the HTML tag and take anything that comes after it into a new string. This new string will itself be exploded into an array where the delimiter is a newline ("\n"). Looping through this new array, the code looks for elements that are not empty. When it finds some data, it will wrap it in paragraph tags and add it to an output string. When this loop is finished, this string will be added back onto the HTML code and this together will be amended to an output buffer string which is returned once the function has completed.

tl;dr: nl2p() 将字符串转换为 HTML 段落而不留下任何空段落,nl2p_html() 将段落标记包裹在 HTML 文档正文的内容周围.

tl;dr: nl2p() will convert a string to HTML paragraphs without leaving any empty paragraphs and nl2p_html() will wrap paragraph tags around the contents of the body of an HTML document.

我在几个小的示例 HTML 文件上对此进行了测试,以确保间距和其他东西不会破坏输出.由 nl2p_html() 生成的代码也可能不符合 W3C,因为它将围绕段落等包裹锚点,而不是相反.

I tested this on a couple of small example HTML files to make sure that spacing and other things don't ruin the output. The code that's generated by nl2p_html() may not be W3C-compliant, either, as it will wrap anchors around paragraphs and the like rather than the other way around.

希望这会有所帮助.

这篇关于如何用段落标签包围所有文本片段?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆