将HTML映射到JSON [英] Map HTML to JSON

查看:74
本文介绍了将HTML映射到JSON的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图将HTML映射到JSON,并保持结构完整。有没有这样做的图书馆,还是我需要写我自己的图书馆?我想,如果没有html2json库,我可以将xml2json库作为开始。毕竟,html只是xml的一个变种吗?



更新:好的,我应该举个例子。我想要做的是以下几点。解析一个html字符串:

 < div> 
< span>文字< / span> Text2
< / div>

放入像这样的json对象:

 {
type:div,
content:[
{
type:span,
content:[
Text2
]
},
Text2
]
}

注意:如果您没有注意到标签,我正在寻找Javascript解决方案

解决方案

我只是写了这个函数,做你想做的,试试看,让我知道它是否无法正确工作你:

  //用一个元素进行测试。 
var initElement = document.getElementsByTagName(html)[0];
var json = mapDOM(initElement,true);
console.log(json);

//用字符串测试。
initElement =< div>< span>文字< / span> Text2< / div>;
json = mapDOM(initElement,true);
console.log(json);

函数mapDOM(element,json){
var treeObject = {};

//如果字符串转换为文档节点
if(typeof element ===string){
if(window.DOMParser){
parser = new的DOMParser();
docNode = parser.parseFromString(element,text / xml);
} else {// Microsoft再次触发
docNode = new ActiveXObject(Microsoft.XMLDOM);
docNode.async = false;
docNode.loadXML(element);
}
element = docNode.firstChild;
}

//递归循环DOM元素并将属性赋值给对象
函数treeHTML(element,object){
object [type] = element。节点名称;
var nodeList = element.childNodes;
if(nodeList!= null){
if(nodeList.length){
object [content] = [];
for(var i = 0; i< nodeList.length; i ++){
if(nodeList [i] .nodeType == 3){
object [content]。push (节点列表[I] .nodeValue);
} else {
object [content]。push({});
treeHTML(nodeList [i],object [content] [object [content]。length -1]);



$ b if(element.attributes!= null){
if(element.attributes.length){
object [attributes] = {};
for(var i = 0; i< element.attributes.length; i ++){
object [attributes] [element.attributes [i] .nodeName] = element.attributes [i] .nodeValue;
}
}
}
}
treeHTML(element,treeObject);

return(json)? JSON.stringify(treeObject):treeObject;
}

工作示例: http://jsfiddle.net/JUSsf/ (在Chrome中测试过,我无法保证完整的浏览器支持 - 您将不得不测试它)。



它创建一个包含HTML页面树形结构的对象,格式为您请求的格式,然后使用 JSON.stringify()这是包括在大多数现代浏览器(IE8 +,Firefox 3 +等);如果您需要支持旧版浏览器,则可以添加 json2.js



它可以包含有效的XHTML作为参数的 DOM元素字符串 (我相信,我不确定在某些情况下 DOMParser()是否会因为设置为text / xml code>或者它只是不提供错误处理。不幸的是,text / html浏览器支持不佳。)



通过传递不同的值作为元素,您可以轻松更改此函数的范围。无论您传递什么值,都将是您的JSON地图的根源。



享受

I'm attempting map HTML into JSON with structure intact. Are there any libraries out there that do this or will I need to write my own? I suppose if there are no html2json libraries out there I could take an xml2json library as a start. After all, html is only a variant of xml anyway right?

UPDATE: Okay, I should probably give an example. What I'm trying to do is the following. Parse a string of html:

<div>
  <span>text</span>Text2
</div>

into a json object like so:

{
  "type" : "div",
  "content" : [
    {
      "type" : "span",
      "content" : [
        "Text2"
      ]
    },
    "Text2"
  ]
}

NOTE: In case you didn't notice the tag, I'm looking for a solution in Javascript

解决方案

I just wrote this function that does what you want, try it out let me know if it doesn't work correctly for you:

// Test with an element.
var initElement = document.getElementsByTagName("html")[0];
var json = mapDOM(initElement, true);
console.log(json);

// Test with a string.
initElement = "<div><span>text</span>Text2</div>";
json = mapDOM(initElement, true);
console.log(json);

function mapDOM(element, json) {
    var treeObject = {};

    // If string convert to document Node
    if (typeof element === "string") {
        if (window.DOMParser) {
              parser = new DOMParser();
              docNode = parser.parseFromString(element,"text/xml");
        } else { // Microsoft strikes again
              docNode = new ActiveXObject("Microsoft.XMLDOM");
              docNode.async = false;
              docNode.loadXML(element); 
        } 
        element = docNode.firstChild;
    }

    //Recursively loop through DOM elements and assign properties to object
    function treeHTML(element, object) {
        object["type"] = element.nodeName;
        var nodeList = element.childNodes;
        if (nodeList != null) {
            if (nodeList.length) {
                object["content"] = [];
                for (var i = 0; i < nodeList.length; i++) {
                    if (nodeList[i].nodeType == 3) {
                        object["content"].push(nodeList[i].nodeValue);
                    } else {
                        object["content"].push({});
                        treeHTML(nodeList[i], object["content"][object["content"].length -1]);
                    }
                }
            }
        }
        if (element.attributes != null) {
            if (element.attributes.length) {
                object["attributes"] = {};
                for (var i = 0; i < element.attributes.length; i++) {
                    object["attributes"][element.attributes[i].nodeName] = element.attributes[i].nodeValue;
                }
            }
        }
    }
    treeHTML(element, treeObject);

    return (json) ? JSON.stringify(treeObject) : treeObject;
}

Working example: http://jsfiddle.net/JUSsf/ (Tested in chrome, I can't guarantee full browser support - you will have to test this).

​It creates an object that contains the tree structure of the HTML page in the format you requested and then uses JSON.stringify() which is included in most modern browsers (IE8+, Firefox 3+ .etc); If you need to support older browsers you can include json2.js.

It can take either a DOM element or a string containing valid XHTML as an argument (I believe, I'm not sure whether the DOMParser() will choke in certain situations as it is set to "text/xml" or whether it just doesn't provide error handling. Unfortunately "text/html" has poor browser support).

You can easily change the range of this function by passing a different value as element. Whatever value you pass will be the root of your JSON map.

Enjoy

这篇关于将HTML映射到JSON的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆