Exact same file and code. So why does the binary of my docx file always end differently?


Problem description

We take a (non-corrupted) .docx file from our server and post it via an HTTP request to an API. When we download it from the API, it comes out corrupted. I'm 99% sure that this is down to the code that posts the file, not the API.

It turns out the corrupted file had some extra characters in the binary. I thought it would be pretty easy to find out where they came from and remove them. Boy, was I wrong.

I've since realised that every time we post the file, the binary ending is slightly different. We're using the exact same file and the exact same code.

What could explain this difference?

Example binary endings

0015 e88a 5060 0700 00da 3b00 000f 0000
0000 0000 0000 0000 0000 0060 1d00 0077
6f72 642f 7374 796c 6573 2e78 6d6c 504b
0506 0000 0000 0b00 0b00 

30 seconds later:

0015 e88a 5060 0700 00da 3b00 000f 0000
0000 0000 0000 0000 0000 0060 1d00 0077
6f72 642f 7374 796c 6573 2e78 6d6c 504b
0506 0000 0000 0b00 0b00 c102 00

Another 30 seconds later:

0015 e88a 5060 0700 00da 3b00 000f 0000
0000 0000 0000 0000 0000 0060 1d00 0077
6f72 642f 7374 796c 6573 2e78 6d6c 504b
0506 0000 0000 0b00 0b00 c102 0000 ed24
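
One observation that may help when reading these dumps: a .docx file is a ZIP archive, and 50 4b 05 06 is the signature of the ZIP end-of-central-directory record, which is at least 22 bytes long. The three samples show 12, 15 and 18 bytes from that signature to the end, so if the dumps show the true end of each download, the files look cut short at varying points. A minimal sketch of a check, assuming the bytes come from an ADODB.Stream Read (the helper names BytesFromEocd and TailHex are hypothetical, not part of the original code):

```vbscript
' Hypothetical helper: number of bytes from the last ZIP end-of-central-directory
' signature (50 4B 05 06) to the end of the data, or -1 if no signature is found.
Function BytesFromEocd(bytes)
    Dim total, i
    total = LenB(bytes)
    BytesFromEocd = -1
    ' Scan backwards; a complete file exits after a few iterations.
    For i = total - 3 To 1 Step -1
        If AscB(MidB(bytes, i, 1)) = &H50 Then
            If AscB(MidB(bytes, i + 1, 1)) = &H4B And _
               AscB(MidB(bytes, i + 2, 1)) = &H05 And _
               AscB(MidB(bytes, i + 3, 1)) = &H06 Then
                BytesFromEocd = total - i + 1
                Exit For
            End If
        End If
    Next
End Function

' Hypothetical helper: the last "count" bytes of a byte array as hex pairs.
Function TailHex(bytes, count)
    Dim total, n, i, out
    total = LenB(bytes)
    n = count
    If n > total Then n = total
    out = ""
    For i = total - n + 1 To total
        out = out & Right("0" & Hex(AscB(MidB(bytes, i, 1))), 2) & " "
    Next
    TailHex = Trim(out)
End Function
```

A result of 22 from BytesFromEocd means a complete record with no archive comment. A smaller number suggests truncation, and a larger one suggests an archive comment or trailing bytes. Running both helpers on the source file and on the downloaded copy shows which side differs.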

Posting code

' Assumes the ADO constants (adTypeBinary, adTypeText, adModeReadWrite)
' are already defined, e.g. by including adovbs.inc.
Sub PostTheFile(CVFile, fullFilePath, PostToURL)
    Dim strBoundary, strRequestStart, strRequestEnd
    Dim stream, httpRequest, BINARYPOST

    ' Multipart framing that wraps the file bytes.
    strBoundary = "---------------------------9849436581144108930470211272"
    strRequestStart = "--" & strBoundary & vbCrLf & _
        "Content-Disposition: attachment; name=""file""; filename=""" & CVFile & """" & vbCrLf & vbCrLf
    strRequestEnd = vbCrLf & "--" & strBoundary & "--"

    ' Assemble header + file + footer in a single binary stream.
    Set stream = Server.CreateObject("ADODB.Stream")
    stream.Type = adTypeBinary
    stream.Mode = adModeReadWrite
    stream.Open
    stream.Write StringToBinary(strRequestStart)
    stream.Write ReadBinaryFile(fullFilePath)
    stream.Write StringToBinary(strRequestEnd)
    stream.Position = 0
    BINARYPOST = stream.Read
    stream.Close
    Set stream = Nothing

    ' Send the assembled body synchronously.
    Set httpRequest = Server.CreateObject("MSXML2.ServerXMLHTTP.6.0")
    httpRequest.Open "PATCH", PostToURL, False, "username", "pw"
    httpRequest.setRequestHeader "Content-Type", "multipart/form-data; boundary=""" & strBoundary & """"
    httpRequest.Send BINARYPOST
    Response.Write "httpRequest.status: " & httpRequest.status
    Set httpRequest = Nothing
End Sub


' Converts a string to a byte array via a text-mode stream.
' Note: with Charset "UTF-8", ADODB.Stream prefixes the output
' with a 3-byte byte order mark (EF BB BF).
Function StringToBinary(input)
    Dim stream
    Set stream = Server.CreateObject("ADODB.Stream")
    stream.Charset = "UTF-8"
    stream.Type = adTypeText
    stream.Mode = adModeReadWrite
    stream.Open
    stream.WriteText input
    stream.Position = 0
    stream.Type = adTypeBinary
    StringToBinary = stream.Read
    stream.Close
    Set stream = Nothing
End Function

' Reads a whole file from disk as a byte array.
Function ReadBinaryFile(fullFilePath)
    Dim stream
    Set stream = Server.CreateObject("ADODB.Stream")
    stream.Type = adTypeBinary
    stream.Open
    stream.LoadFromFile fullFilePath
    ReadBinaryFile = stream.Read
    stream.Close
    Set stream = Nothing
End Function

Update

We played with a few different boundaries and charsets.

With UTF-8, an additional byte order mark (BOM) was being written into the request body.

http://wikipedia.org/wiki/Byte_order_mark
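
For reference, a common way to keep the BOM out of the converted bytes is to skip the first three bytes after switching the stream to binary. This is a sketch under assumptions, not the code we ended up with: the name StringToBinaryNoBOM is hypothetical, and the Const lines should be dropped if adovbs.inc already defines them.

```vbscript
' ADO constants; remove these if adovbs.inc (or a typelib) already defines them.
Const adTypeBinary = 1
Const adTypeText = 2

' Hypothetical variant of StringToBinary that omits the UTF-8 byte order mark.
Function StringToBinaryNoBOM(text)
    Dim stream
    Set stream = Server.CreateObject("ADODB.Stream")
    stream.Type = adTypeText
    stream.Charset = "utf-8"
    stream.Open
    stream.WriteText text
    ' Switch to binary at position 0, then skip the 3 BOM bytes (EF BB BF).
    stream.Position = 0
    stream.Type = adTypeBinary
    stream.Position = 3
    StringToBinaryNoBOM = stream.Read   ' Null if text was empty
    stream.Close
    Set stream = Nothing
End Function
```

Because the multipart header and footer here contain only ASCII characters, writing them with a single-byte charset such as "us-ascii" would be another way to avoid the BOM.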

Now the issue is clearly the addition of a seemingly random amount of NULL (zero) padding.

For example, the first time it adds 13 "00" bytes. Hit refresh and the second time it adds 8. The third time it adds 7. Each time with the exact same file and code.
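
One way to narrow down where the extra bytes appear is to compare the size of the assembled body with the sum of its parts before sending. If the two numbers agree on every request, the body leaving the script is stable and the variation is introduced later. A sketch, assuming the three parts are kept in variables inside PostTheFile (the function name PayloadSizeReport is hypothetical):

```vbscript
' Hypothetical diagnostic: compares the assembled multipart body with the
' combined size of its header, file and footer parts (all byte arrays).
Function PayloadSizeReport(startBytes, fileBytes, endBytes, payload)
    Dim expected, actual
    expected = LenB(startBytes) + LenB(fileBytes) + LenB(endBytes)
    actual = LenB(payload)
    PayloadSizeReport = "expected " & expected & " bytes, got " & actual & " bytes"
End Function
```

Writing this report to the response on each refresh, alongside httpRequest.status, shows whether the body size changes between otherwise identical requests.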

Suggestions? How can this be?

The destination URL for the post is https, so a friend suggested that our server may have recognised this and added random padding as part of the encryption. This sounds unlikely to me, but I don't have any better suggestions.

Recommended answer

I have found a similar question:

Error downloading PDF file - ASP Classic

Here are some tips from there:


  • Set the stream's .Mode property to 3 (adModeReadWrite).
  • Set Response.ContentType to "xxx/xxx", i.e. the MIME type of the file being sent.
  • Before you start adding response headers, call Response.Clear, just to be sure you're not sending extra markup. (This seems very similar to your problem.)
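
Put together, the tips above might look like the following on the download side. This is a sketch under assumptions rather than code from the linked question: the Sub name SendFile is hypothetical, response buffering is assumed to be on (the IIS default) so that Response.Clear is allowed, and the Const lines should be dropped if adovbs.inc already defines them.

```vbscript
' ADO constants; remove these if adovbs.inc (or a typelib) already defines them.
Const adTypeBinary = 1
Const adModeReadWrite = 3

' Hypothetical helper: streams a file to the client with no stray markup.
Sub SendFile(fullFilePath, contentType)
    Dim stream
    Set stream = Server.CreateObject("ADODB.Stream")
    stream.Mode = adModeReadWrite      ' tip 1: Mode = 3, set before Open
    stream.Type = adTypeBinary
    stream.Open
    stream.LoadFromFile fullFilePath

    Response.Clear                     ' tip 3: discard any buffered output first
    Response.ContentType = contentType ' tip 2: explicit MIME type
    Response.AddHeader "Content-Length", CStr(stream.Size)
    Response.BinaryWrite stream.Read

    stream.Close
    Set stream = Nothing
    Response.End                       ' stop before any trailing page output
End Sub
```

For a .docx file the content type would be "application/vnd.openxmlformats-officedocument.wordprocessingml.document".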

Hope this helps :-)
