Is there a "binary dump" or "get binary representation" function in LibXML2?


Problem description


    I need to access the internal binary representation of a loaded XML DOM. There are some dump functions, but I do not see anything like a "binary buffer" (there are only "XML buffers").

    My final objective is to compare the same document byte by byte, before and after some black-box procedure, directly using its binary (current and cached) representations, without conversion to an XML text representation. So, the question:

    Is there a binary representation (in-memory structures) in LibXML2 that can be used to compare a dump with the current representation?

    I only need to check whether the current and dumped DOMs are equivalent.


    Details

    This is not a problem of comparing two distinct DOM objects but something easier: the IDs, etc. do not change, so no canonical representation is needed (!). I only need access to the internal representation, because that is much faster than converting to text.

    Between "before" and "after" there is a black-box procedure, e.g. an XSLT identity transform that affects (or does not affect) some nodes or attributes.

    Alternative solutions

    1. Develop a C function for LibXML2 that compares the two trees node by node and returns false if they differ: during the tree traversal, as soon as the tree structure or some nodeValue differs, the algorithm stops the comparison (returning false).

    2. Not ideal, but helpful for some other algorithms: access (in LibXML2) the total number of nodes, the total length or size, or an MD5 or SHA-1 digest. This would only optimize the frequent cases (in my application) where the comparison returns false, avoiding the complete comparison procedure.
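Alternative 1 can be sketched as a recursive walk that returns at the first difference. This is a minimal sketch, not LibXML2 code: `Node` is a hypothetical stand-in reduced to four fields whose names mirror the `name`, `content`, `children` and `next` members of libxml2's `xmlNode`, so that the example compiles without the library. A real version would also have to compare node types, namespaces and the `properties` (attribute) list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for libxml2's xmlNode, reduced to the fields
 * this sketch reads. The real struct has many more members. */
typedef struct Node {
    const char *name;       /* element name, or NULL */
    const char *content;    /* text content, or NULL */
    struct Node *children;  /* first child, or NULL */
    struct Node *next;      /* next sibling, or NULL */
} Node;

/* NULL-safe string equality: two NULLs are equal,
 * NULL never equals a non-NULL string. */
static bool str_equal(const char *a, const char *b) {
    if (a == NULL || b == NULL) {
        return a == b;
    }
    return strcmp(a, b) == 0;
}

/* Compare two sibling chains and everything below them.
 * Returns false as soon as a name, a content string, or the
 * tree structure differs. */
bool trees_equal(const Node *a, const Node *b) {
    while (a != NULL && b != NULL) {
        if (!str_equal(a->name, b->name)) return false;
        if (!str_equal(a->content, b->content)) return false;
        if (!trees_equal(a->children, b->children)) return false;
        a = a->next;
        b = b->next;
    }
    /* Equal only if both chains ended at the same time. */
    return a == NULL && b == NULL;
}
```

Calling `trees_equal(root_before, root_after)` on the two root nodes gives the early exit described above: the walk stops at the first mismatching node instead of visiting the whole tree.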


    NOTES

    Warning for readers using the answered solutions

    The problem is about comparing "before" with "after" a black-box operation, but there are two kinds of black boxes here:

    • Well-known and controllable ones, such as XSLT transforms or the use of a known library. You must know that your black box will not change attribute order or ID content, denormalize spaces, etc.
    • Fully free ones, such as the use of an external editor (e.g. an online editor changing an XHTML document), where user and software can do anything.

    I will use a solution in the context of a "well-known" black box, so my comments in the "Details" section above are valid.

    In the context of "fully free" black boxes, you cannot use a comparison of binary dumps, because only a canonical representation (C14N) is valid for comparison. To compare by C14N criteria, only the "Alternative solutions" above are possible. For alternative 1, you must, among other things, sort each set of attribute nodes before comparing. For alternative 2, you must generate the C14N dumps.
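For alternative 2, one way to avoid keeping the whole cached dump is to store only a small fingerprint of it. The following is a sketch under assumptions: it treats the dump as an opaque byte buffer (for instance the output of libxml2's `xmlDocDumpMemory` or `xmlC14NDocDumpMemory`), and it swaps the MD5/SHA-1 idea for the simpler, non-cryptographic 64-bit FNV-1a hash so that the example needs only the C standard library. The names `Fingerprint`, `make_fingerprint` and `fingerprints_differ` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a over a byte buffer (non-cryptographic). */
uint64_t fnv1a64(const unsigned char *data, size_t len) {
    uint64_t h = 14695981039346656037ULL;  /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ULL;             /* FNV prime */
    }
    return h;
}

/* Hypothetical summary of a serialized document. */
typedef struct {
    size_t   len;   /* length of the dump in bytes */
    uint64_t hash;  /* FNV-1a hash of the dump */
} Fingerprint;

Fingerprint make_fingerprint(const unsigned char *dump, size_t len) {
    Fingerprint fp = { len, fnv1a64(dump, len) };
    return fp;
}

/* true  -> the two dumps are certainly different;
 * false -> they are probably equal (a hash collision is still possible). */
bool fingerprints_differ(Fingerprint a, Fingerprint b) {
    return a.len != b.len || a.hash != b.hash;
}
```

A differing fingerprint proves that the documents differ; an equal fingerprint only makes equality likely, so an application that needs certainty would still fall back to a full comparison in that case.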


    PS: of course, the use of the C14N criteria is subjective and depends on the application: if, for example, a change in attribute order is a valid/important change for your application, then a comparison that detects it is valid (!).

    Solution

    Here are the relevant libxml2 methods:

    There is a base64 encoding method:

    Function: xmlTextWriterWriteBase64
    
    int xmlTextWriterWriteBase64    (xmlTextWriterPtr writer, 
                         const char * data, 
                         int start, 
                         int len)
    
    Write a base64-encoded XML text.
    writer: the xmlTextWriterPtr
    data:   binary data
    start:  the position within the data of the first byte to encode
    len:    the number of bytes to encode
    Returns:    the bytes written (may be 0 because of buffering) or -1 in case of error
    

    and a BinHex encoding method:

    Function: xmlTextWriterWriteBinHex
    int xmlTextWriterWriteBinHex    (xmlTextWriterPtr writer, 
                         const char * data, 
                         int start, 
                         int len)
    
    Write a BinHex-encoded XML text.
    writer: the xmlTextWriterPtr
    data:   binary data
    start:  the position within the data of the first byte to encode
    len:    the number of bytes to encode
    Returns:    the bytes written (may be 0 because of buffering) or -1 in case of error
    
