Download and Unzip XML file


Problem description

I would like to unzip and parse an XML file located here.

This is my code:

HttpClientHandler handler = new HttpClientHandler()
{
    CookieContainer = new CookieContainer(),
    UseCookies = true,
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate,
   // | DecompressionMethods.None,

};

using (var http = new HttpClient(handler))
{

    var response =
         http.GetAsync(@"https://login.tradedoubler.com/report/published/aAffiliateEventBreakdownReportWithPLC_806880712_4446152766894956100.xml.zip").Result;

    Stream streamContent = response.Content.ReadAsStreamAsync().Result;

    using (var gZipStream = new GZipStream(streamContent, CompressionMode.Decompress))
    {
        var settings = new XmlReaderSettings()
        {
            DtdProcessing = DtdProcessing.Ignore
        };

        var reader = XmlReader.Create(gZipStream, settings);
        reader.MoveToContent();

        XElement root = XElement.ReadFrom(reader) as XElement;
    }
}

I get an exception on XmlReader.Create(gZipStream, settings):

The magic number in GZip header is not correct. Make sure you are passing in a GZip stream.

To double-check that I am getting properly formatted data from the web, I grab the response content and save it to a file:

byte[] byteContent = response.Content.ReadAsByteArrayAsync().Result;
File.WriteAllBytes(@"C:\temp\1111.zip", byteContent);

When I examine 1111.zip, it appears to be a well-formed zip file containing the XML that I need.

I was advised here that I do not need GZipStream at all, but if I remove the compression stream from the code completely and pass streamContent directly to the XML reader, I get an exception:

Data at the root level is invalid. Line 1, position 1.

With or without the decompression stream, I still fail to parse this file. What am I doing wrong?

Recommended answer

The file in question is encoded in PKZip format, not GZip format.

You'll need a different library to decompress it, such as System.IO.Compression.ZipFile.
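As a rough illustration of that advice, the sketch below opens the downloaded bytes as a PKZip archive and parses the first XML entry. It uses ZipArchive, the stream-based class in the same System.IO.Compression namespace, rather than ZipFile, which works on file paths. The names ZippedXml, LoadFirstXmlEntry and DownloadAsync are hypothetical, and picking the first entry whose name ends in .xml is an assumption about the archive layout. Note that AutomaticDecompression on HttpClientHandler only undoes HTTP content encoding; it does not unpack a .zip payload, so it is not needed here.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using System.Xml;
using System.Xml.Linq;

public static class ZippedXml
{
    // Hypothetical helper: reads the first ".xml" entry of a PKZip archive.
    public static XElement LoadFirstXmlEntry(Stream zipStream)
    {
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read))
        {
            // Assumption: the report is the first entry with an .xml extension.
            ZipArchiveEntry entry = archive.Entries.FirstOrDefault(
                e => e.FullName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase));
            if (entry == null)
                throw new InvalidDataException("The archive contains no .xml entry.");

            var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };
            using (Stream entryStream = entry.Open())
            using (XmlReader reader = XmlReader.Create(entryStream, settings))
            {
                return XElement.Load(reader);
            }
        }
    }

    // Hypothetical helper: downloads the whole archive into memory, then parses it.
    public static async Task<XElement> DownloadAsync(HttpClient http, string url)
    {
        byte[] bytes = await http.GetByteArrayAsync(url);
        using (var buffer = new MemoryStream(bytes))
        {
            return LoadFirstXmlEntry(buffer);
        }
    }
}
```

Buffering the response in a MemoryStream keeps the sketch simple; for very large reports, saving to a temporary file and opening it with ZipFile.OpenRead may be a better fit.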

You can typically tell the encoding by the file extension. PKZip files often use .zip, while GZip files often use .gz.
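File extensions can be wrong, so a complementary check is to look at the first bytes of the payload, which is what the "magic number" in the exception message refers to. The sketch below is a minimal example of that idea; ArchiveSniffer and Detect are hypothetical names, and only the two most common signatures are covered.

```csharp
using System;

public static class ArchiveSniffer
{
    // Hypothetical helper: classifies a payload by its leading signature bytes.
    public static string Detect(byte[] header)
    {
        if (header == null) throw new ArgumentNullException(nameof(header));

        // GZip streams begin with the two bytes 0x1F 0x8B.
        if (header.Length >= 2 && header[0] == 0x1F && header[1] == 0x8B)
            return "gzip";

        // PKZip archives with at least one entry begin with "PK" 0x03 0x04.
        // (An empty archive starts with "PK" 0x05 0x06 and is not handled here.)
        if (header.Length >= 4 && header[0] == 0x50 && header[1] == 0x4B
            && header[2] == 0x03 && header[3] == 0x04)
            return "zip";

        return "unknown";
    }
}
```

Calling ArchiveSniffer.Detect on the byteContent array saved to 1111.zip above would be one way to confirm which decompressor the data needs before choosing between GZipStream and ZipArchive.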

See: Unzip files programmatically in .net
