How to robustly parse a document for any headings and build a <ul> tree of just those headings

Question

I parse through a document to grab all the headings with stackHeadings(), and then use buildNav() to build a Microsoft Word style document map from them. This currently works OK, but it's not very robust and breaks any time the headings do not follow a strict order. For example, it breaks if the document starts with an H2, or if an H3 is nested directly under an H1.

I can't quite figure out the best way to fix this (make it more robust). I'm taking advantage of jQuery's `nextUntil` function to find all the h2s between two h1s.

One possibility is replacing:

elem.nextUntil( 'h' + cur, 'h' + next )

with:

elem.nextUntil( 'h' + cur, 'h' + next + ',h' + (next + 1) + ',h' + (next + 2) ... )

to find ALL subheadings between two headings of the same level. But then the h3 children of an h1 would only be nested one level deep rather than two.
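
The widened selector above does not have to be written out by hand. Here is a minimal sketch of a helper that builds it; the name deeperHeadingSelector is made up for this illustration and is not part of jQuery or of my code above.

```javascript
// Hypothetical helper (not part of jQuery): builds a selector that matches
// every heading level from `from` down to h6, e.g. 2 -> "h2,h3,h4,h5,h6".
function deeperHeadingSelector(from) {
  const parts = [];
  for (let level = from; level <= 6; level++) {
    parts.push("h" + level);
  }
  // Returns an empty string when `from` is already past h6.
  return parts.join(",");
}
```

It could then be passed as the filter argument, as in elem.nextUntil( 'h' + cur, deeperHeadingSelector( next ) ). The nesting-depth problem described here would still remain.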

So you'd then have to compare the current heading level with the parent heading level, and if there's a jump of more than one level (h1 -> h3), you'd have to create an empty child between them as a nesting placeholder for the missing h2.
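
To make the placeholder idea concrete, here is a rough sketch that works on a flat list instead of the DOM. The { level, text } item shape and the withPlaceholders name are assumptions made only for this illustration.

```javascript
// Sketch only: each input item is assumed to look like { level: 1..6, text: "..." }.
// Returns a new list in which every skipped level is filled with an empty
// placeholder entry, so no entry is more than one level deeper than the one before it.
function withPlaceholders(headings) {
  const out = [];
  let prev = 0; // level of the previous entry; 0 means "above h1"
  for (const h of headings) {
    // Fill each level that was skipped between the previous entry and this one.
    for (let lv = prev + 1; lv < h.level; lv++) {
      out.push({ level: lv, text: "", placeholder: true });
    }
    out.push({ level: h.level, text: h.text, placeholder: false });
    prev = h.level;
  }
  return out;
}
```

For the sequence h1, h3, h2 this yields h1, an empty h2, h3, h2. A list that starts with an h2 gets an empty h1 in front of it.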

Any ideas or solutions would be greatly appreciated!

stackHeadings = (items, cur, counter) ->

    cur = 1 if cur == undefined
    counter ?= 1
    next = cur + 1
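    for elem, index in items
      elem = $(elem)
      # NOTE: `d` is never defined in the code as posted. The next line is an
      # assumed reconstruction, based on buildNav() reading child.id.
      d = { id: elem.attr( 'id' ) }
      children  =  filterHeadlines( elem.nextUntil( 'h' + cur, 'h' + next ) )
      d.children = stackHeadings( children, next, counter ) if children.length > 0
      d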


filterHeadlines = ( $hs ) ->
    # keep only headings that contain at least one non-whitespace character
    _.filter( $hs, ( h ) -> $(h).text().match(/\S/) )

buildNav = ( ul, items ) ->
    for child, index in items
        li = $( "<li>" )
        $( ul ).append( li )
        $a = $("<a/>")
        $a.attr( "id", "nav-title-" + child.id )

        li.append( $a )

        if child.children
            subUl = document.createElement( 'ul' )
            li.append( subUl )
            buildNav( subUl, child.children )

items = stackHeadings( filterHeadlines( source.find( 'h1' ) ) )
ul = $('<ul>')
buildNav( ul, items)

Recommended answer

I threw together some JavaScript that will do what you want: http://jsfiddle.net/fA4EW/

It's a fairly straightforward recursive function that consumes an array of elements (nodes) and builds the UL structure accordingly. To be consistent with the question, I add placeholder (empty) list elements when you go from an H1 straight to an H3, etc.

function buildRec(nodes, elm, lv) {
    var node;
    // filter
    do {
        node = nodes.shift();
    } while(node && !(/^h[123456]$/i.test(node.tagName)));
    // process the next node
    if(node) {
        var ul, li, cnt;
        var curLv = parseInt(node.tagName.substring(1), 10);
        if(curLv == lv) { // same level: append an li
            cnt = 0;
        } else if(curLv < lv) { // walk up, then append an li
            cnt = 0;
            do {
                elm = elm.parentNode.parentNode;
                cnt--;
            } while(cnt > (curLv - lv));
        } else if(curLv > lv) { // create children, then append an li
            cnt = 0;
            do {
                li = elm.lastChild;
                if(li == null)
                    li = elm.appendChild(document.createElement("li"));
                elm = li.appendChild(document.createElement("ul"));
                cnt++;
            } while(cnt < (curLv - lv));
        }
        li = elm.appendChild(document.createElement("li"));
        // replace the next line with anchor tags or whatever you want
        li.innerHTML = node.innerHTML;
        // recursive call
        buildRec(nodes, elm, lv + cnt);
    }
}
// example usage
var all = document.getElementById("content").getElementsByTagName("*");
var nodes = []; 
for(var i = all.length; i--; nodes.unshift(all[i]));
var result = document.createElement("ul");
buildRec(nodes, result, 1);
document.getElementById("outp").appendChild(result);
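
buildRec calls itself once per heading, so its call depth grows with the number of headings in the document. If that is a concern, the same nesting rules can be written as a loop with an explicit stack. The following is only a sketch under assumptions: it works on a flat list of { level, text } objects rather than DOM nodes and returns a plain tree, and the names buildTree and children are invented for this example.

```javascript
// Sketch only: builds a nested tree from a flat, document-ordered list of
// items assumed to look like { level: 1..6, text: "..." }.
function buildTree(headings) {
  const root = { level: 0, text: "", children: [] };
  const stack = [root]; // stack[i] is the currently open node at level i
  for (const h of headings) {
    // Close every open node that is at the same level as h or deeper.
    while (stack.length > h.level) {
      stack.pop();
    }
    // Open an empty placeholder for each level that was skipped.
    while (stack.length < h.level) {
      const filler = { level: stack.length, text: "", children: [] };
      stack[stack.length - 1].children.push(filler);
      stack.push(filler);
    }
    const node = { level: h.level, text: h.text, children: [] };
    stack[stack.length - 1].children.push(node);
    stack.push(node);
  }
  return root.children;
}
```

For h1, h3, h2 this gives one top-level node whose children are an empty placeholder (holding the h3) followed by the h2, which matches the list structure buildRec produces. Every node gets a children array, possibly empty, so a renderer should check children.length rather than just children.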
