Possible to get HtmlNode's position & length within original input?

Question

Consider the following HTML fragment (_ stands for a whitespace character):

<head>
    ...
    <link ... ___/>
    <!-- ... -->
    ...
</head>

I'm using Html Agility Pack (HAP) to read HTML files/fragments and strip out links. What I want to do is find the LINK elements (and some others) and replace each of them with whitespace of the same length, like so:

<head>
    ...
    ____________
    <!-- ... -->
    ...
</head>

The parsing part seems to be working so far: I get the nodes I'm looking for. However, HAP tries to fix the HTML content, while I need everything to stay exactly the same except for the changes I'm making. In addition, HAP seems to have quite a few bugs when writing back content it has previously read. So the approach I want to take is to let HAP parse the input, then go back to the original input myself and replace the content I don't want.

The problem is that HtmlNode doesn't seem to have an input length property. It has StreamPosition, which seems to indicate where reading of the node's content started within the input, but I couldn't find a length property that tells me how many characters were consumed to build the node.

I tried using the OuterHtml property but, unfortunately, HAP tries to fix the LINK by removing the ___/ part (a LINK element is not supposed to be closed). Because of this, OuterHtml.Length does not match the length of the element in the original input.

Is there a way in HAP to get this information?

Recommended answer

I ended up modifying the HtmlAgilityPack source code to expose a new property on HtmlNode that returns its private _outerlength field:

public virtual int OuterLength
{
    get
    {
        return ( _outerlength );
    }
}

This seems to be working fine so far.
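
Once a start offset and a length are known for a node, the replacement step described in the question can be done on the original string alone. The following is only a minimal sketch of that step, not part of the answer: the SpanBlanker class and its BlankOut method are hypothetical names, and it assumes that the start offset (for example StreamPosition) and the length (for example the OuterLength property above) are both character offsets into the same input string, which should be verified against your HAP version and input encoding.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of Html Agility Pack.
public static class SpanBlanker
{
    // Overwrites `length` characters of `input`, starting at `start`,
    // with spaces. Line breaks inside the span are kept so that line
    // numbers in the output still match the original input.
    public static string BlankOut(string input, int start, int length)
    {
        if (input == null)
            throw new ArgumentNullException(nameof(input));
        if (start < 0 || length < 0 || start > input.Length - length)
            throw new ArgumentOutOfRangeException(nameof(start));

        var buffer = new StringBuilder(input);
        for (int i = start; i < start + length; i++)
        {
            if (buffer[i] != '\r' && buffer[i] != '\n')
                buffer[i] = ' ';
        }
        return buffer.ToString();
    }
}
```

Because each span is overwritten in place, the total length of the input never changes, so the offsets of several nodes can be applied in any order without adjusting them.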
