Extract the text out of HTML string using JavaScript

Views: 112 | Published: 2018/6/15 10:35:11 | Tags: javascript html string text extract

Problem description

I am trying to get the inner text of an HTML string using a JS function (the string is passed as an argument). Here is the code:
function extractContent(value) {
    var content_holder = "";

    for(var i=0;i<value.length;i++) {
        if(value.charAt(i) === '>') {
            continue;
            while(value.charAt(i) != '<') {
                content_holder += value.charAt(i);
            }
        }

    }
    console.log(content_holder);
}

extractContent("<p>Hello</p><a href='http://w3c.org'>W3C</a>");
The problem is that nothing gets printed on the console (content_holder stays empty). I think the problem is caused by the "===" operator.
Solution
Create an element, store the HTML in it, and get its textContent:



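function extractContent(s) {
  // Let the browser parse the HTML inside a detached element
  var span = document.createElement('span');
  span.innerHTML = s;
  // textContent is the standard property; innerText is a fallback for old browsers
  return span.textContent || span.innerText;
}

alert(extractContent("<p>Hello</p><a href='http://w3c.org'>W3C</a>"));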







Here's a version that allows you to have spaces between nodes, although you'd probably want that for block-level elements only:



function extractContent(s, space) {
  var span = document.createElement('span');
  span.innerHTML = s;
  if (space) {
    // Append a space to the text of every descendant element
    var children = span.querySelectorAll('*');
    for (var i = 0; i < children.length; i++) {
      if (children[i].textContent)
        children[i].textContent += ' ';
      else
        children[i].innerText += ' ';
    }
  }
  // Collapse runs of spaces into a single space
  return [span.textContent || span.innerText].toString().replace(/ +/g, ' ');
}
    
console.log(extractContent("<p>Hello</p><a href='http://w3c.org'>W3C</a>.  Nice to <em>see</em><strong><em>you!</em></strong>"));

console.log(extractContent("<p>Hello</p><a href='http://w3c.org'>W3C</a>.  Nice to <em>see</em><strong><em>you!</em></strong>",true));
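
As for why the loop in the question prints nothing: the "===" operator is not the cause. The `continue` statement jumps straight to the next iteration, so the `while` loop below it is never reached. Even without `continue`, that `while` never advances `i`, so it would append '>' forever. If you do want a character-by-character scan, a minimal sketch could look like the one below. The name `extractContentByScanning` and the `insideTag` flag are illustrative and not part of the original answer, and the sketch assumes simple markup: it does not decode entities such as `&amp;`, and it would be confused by a '<' or '>' inside attribute values, comments or scripts. The DOM-based approach above is more robust for real HTML.

```javascript
// Hypothetical helper (not from the original answer): keeps only the
// characters that lie outside of <...> tags.
function extractContentByScanning(value) {
  var content = "";
  var insideTag = false; // true while we are between '<' and '>'

  for (var i = 0; i < value.length; i++) {
    var ch = value.charAt(i);
    if (ch === '<') {
      insideTag = true;       // a tag starts here
    } else if (ch === '>') {
      insideTag = false;      // the tag ends here
    } else if (!insideTag) {
      content += ch;          // plain text between tags
    }
  }
  return content;
}

// prints "HelloW3C"
console.log(extractContentByScanning("<p>Hello</p><a href='http://w3c.org'>W3C</a>"));
```

Returning the string instead of logging it inside the function keeps the helper reusable; the single `for` loop advances `i` on every pass, so it always terminates.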




