How can I debug a corrupt docx file?


Problem description

I have an issue where .doc and .pdf files are coming out OK but a .docx file is coming out corrupt.

In order to solve that I am trying to debug why the .docx is corrupt.

I learned that the docx format is much stricter with regard to extra characters than either .pdf or .doc. Therefore I have searched the various xml files WITHIN the docx file looking for invalid XML. But I can't find any. It all validates fine.
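That per-part check can be scripted. Below is a minimal Python sketch, assuming the .docx can still be opened as a ZIP archive. `check_docx` is a hypothetical helper name, not part of any library. It reads every part (which makes `zipfile` verify the stored CRC-32) and tests each `.xml` and `.rels` part for well-formedness only, not for schema validity.

```python
import xml.etree.ElementTree as ET
import zipfile
import zlib


def check_docx(path):
    """Return a list of problems found while reading the package (hypothetical helper)."""
    problems = []
    try:
        archive = zipfile.ZipFile(path)
    except zipfile.BadZipFile as exc:
        # The ZIP directory itself could not be read.
        return [f"{path}: not a readable ZIP archive ({exc})"]
    with archive:
        for name in archive.namelist():
            try:
                # read() decompresses the member and checks its CRC-32.
                data = archive.read(name)
            except (zipfile.BadZipFile, zlib.error) as exc:
                problems.append(f"{name}: cannot be decompressed ({exc})")
                continue
            if name.endswith((".xml", ".rels")):
                try:
                    # Well-formedness check only; no schema validation.
                    ET.fromstring(data)
                except ET.ParseError as exc:
                    problems.append(f"{name}: not well-formed XML ({exc})")
    return problems
```

An empty list only means that every part decompressed and parsed. It does not rule out problems that Word still objects to, such as schema violations or broken references between parts.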

Could anyone suggest directions for me to investigate now?

UPDATE:

The full listing of files inside the folder is as follows:

/_rels
    .rels

/customXml
    /_rels
        .rels
    item1.xml
    itemProps1.xml

/docProps
    app.xml
    core.xml

/word
    /_rels
        document.xml.rels
    /media
        image1.jpeg
    /theme
        theme1.xml
    document.xml
    fontTable.xml
    numbering.xml
    settings.xml
    styles.xml
    stylesWithEffects.xml
    webSettings.xml

[Content_Types].xml

UPDATE 2:

I should also have mentioned that the reason for the corruption is almost certainly a bad binary file POST on my part.

Why are .docx files corrupted by a binary POST, while .doc and .pdf files are fine?

UPDATE 3:

I have tried the demo versions of various docx repair tools. They all seem to repair the file OK but give no clue as to the cause of the error.

My next step is to compare the contents of the corrupted file with the repaired version.
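One way to make that comparison is part by part. This is a rough Python sketch, assuming the corrupted file still has a readable ZIP directory. `member_index` and `compare_packages` are hypothetical names. It compares the size and CRC-32 recorded for each part rather than diffing the XML itself.

```python
import zipfile


def member_index(path):
    """Map each part name to (uncompressed size, CRC-32) from the ZIP directory."""
    with zipfile.ZipFile(path) as archive:
        return {info.filename: (info.file_size, info.CRC)
                for info in archive.infolist()}


def compare_packages(corrupt_path, repaired_path):
    """List parts that exist in only one package or whose size/CRC differ."""
    corrupt = member_index(corrupt_path)
    repaired = member_index(repaired_path)
    return {
        "only_in_corrupt": sorted(corrupt.keys() - repaired.keys()),
        "only_in_repaired": sorted(repaired.keys() - corrupt.keys()),
        "changed": sorted(
            name for name in corrupt.keys() & repaired.keys()
            if corrupt[name] != repaired[name]
        ),
    }
```

A repair tool may rewrite parts it did not need to fix, so a name under "changed" is a place to look rather than proof of where the damage was.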

If anybody knows of a docx repair tool that gives a decent error message I'd appreciate hearing about it. In fact I might post that as a separate question.

UPDATE 4 (2017)

I never solved this problem. I have tried all the tools suggested in the answers below but none of them worked for me.

I have since progressed a little further and found a block of 0000 missing when opening the .docx in Sublime Text. More details are in the new question: "What could be causing this corruption in .docx files during httpwebrequest?"
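To pin down where such a block goes missing, the bytes that were sent can be compared with the bytes that arrived. Here is a minimal Python sketch, assuming both copies are available on disk. `first_difference` is a hypothetical helper that reports the offset of the first byte at which the two files diverge.

```python
def first_difference(path_a, path_b, chunk_size=65536):
    """Return the offset of the first differing byte, or None if the files are identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                # One chunk is a prefix of the other: the files differ in length.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                # Both files ended at the same point with no difference.
                return None
            offset += len(chunk_a)
```

Looking at that offset in a hex view of both files shows whether bytes were dropped, altered, or appended during the transfer.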

Solution

I used the "Open XML SDK 2.5 Productivity Tool" (http://www.microsoft.com/en-us/download/details.aspx?id=30425) to find a problem with a broken hyperlink reference.

You have to download/install the SDK first, then the tool. The tool will open and analyze the document for problems.
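For the specific case of a broken reference, a narrow check can also be scripted. The Python sketch below is not what the Productivity Tool does internally. It only looks for relationship ids that `word/document.xml` uses but `word/_rels/document.xml.rels` does not declare. It assumes the transitional namespace URIs shown in the code, and `dangling_relationship_ids` is a hypothetical name.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace of attributes such as r:id and r:embed (transitional documents assumed).
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
# Namespace used inside .rels parts.
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def dangling_relationship_ids(path,
                              part="word/document.xml",
                              rels="word/_rels/document.xml.rels"):
    """Return relationship ids referenced in `part` but not declared in `rels`."""
    with zipfile.ZipFile(path) as archive:
        document = ET.fromstring(archive.read(part))
        relationships = ET.fromstring(archive.read(rels))
    declared = {
        rel.get("Id")
        for rel in relationships.iter(f"{{{PKG_REL_NS}}}Relationship")
    }
    referenced = set()
    for element in document.iter():
        for attr_name, value in element.attrib.items():
            # ElementTree spells namespaced attributes as "{uri}local".
            if attr_name.startswith(f"{{{R_NS}}}"):
                referenced.add(value)
    return sorted(referenced - declared)
```

An empty result covers only this one class of problem. Broken targets inside the `.rels` part and schema errors still need a full validator.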
