Download and Unzip XML file
Question
I would like to unzip and parse an XML file located here.
Here is my code:
HttpClientHandler handler = new HttpClientHandler()
{
    CookieContainer = new CookieContainer(),
    UseCookies = true,
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate,
    // | DecompressionMethods.None,
};
using (var http = new HttpClient(handler))
{
    var response =
        http.GetAsync(@"https://login.tradedoubler.com/report/published/aAffiliateEventBreakdownReportWithPLC_806880712_4446152766894956100.xml.zip").Result;
    Stream streamContent = response.Content.ReadAsStreamAsync().Result;
    using (var gZipStream = new GZipStream(streamContent, CompressionMode.Decompress))
    {
        var settings = new XmlReaderSettings()
        {
            DtdProcessing = DtdProcessing.Ignore
        };
        var reader = XmlReader.Create(gZipStream, settings);
        reader.MoveToContent();
        XElement root = XElement.ReadFrom(reader) as XElement;
    }
}
I get an exception on XmlReader.Create(gZipStream, settings):
The magic number in GZip header is not correct. Make sure you are passing in a GZip stream.
To double-check that I am getting properly formatted data from the web, I grab the response content and save it to a file:
byte[] byteContent = response.Content.ReadAsByteArrayAsync().Result;
File.WriteAllBytes(@"C:\temp\1111.zip", byteContent);
When I examine 1111.zip, it appears to be a well-formed zip file containing the XML that I need.
I was advised here that I do not need GZipStream at all, but if I remove the compression stream from the code completely and pass streamContent directly to the XML reader, I get an exception:
Data at the root level is invalid. Line 1, position 1.
With or without the decompression stream, I still fail to parse this file. What am I doing wrong?
Recommended answer
The file in question is encoded in PKZip format, not GZip format.
You'll need a different API to decompress it, such as System.IO.Compression.ZipFile (for archives on disk) or System.IO.Compression.ZipArchive (for streams).
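As a rough illustration, the following sketch opens a PKZip stream with System.IO.Compression.ZipArchive and parses the first .xml entry it finds. The class name ZipXmlReader and the method name LoadFirstXmlEntry are hypothetical, and picking the first .xml entry is an assumption about the archive layout rather than something the question confirms.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of the original question or answer.
public static class ZipXmlReader
{
    // Opens the stream as a PKZip archive and parses the first entry
    // whose name ends in ".xml" (assumed to be the report we want).
    public static XElement LoadFirstXmlEntry(Stream zipStream)
    {
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read))
        {
            ZipArchiveEntry entry = archive.Entries
                .FirstOrDefault(e => e.FullName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase));
            if (entry == null)
                throw new InvalidDataException("The archive contains no .xml entry.");

            // Same reader settings as in the question.
            var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };
            using (Stream entryStream = entry.Open())
            using (XmlReader reader = XmlReader.Create(entryStream, settings))
            {
                return XElement.Load(reader);
            }
        }
    }
}
```

In the question's code, this would take the place of the GZipStream block: pass streamContent to ZipXmlReader.LoadFirstXmlEntry. Note that AutomaticDecompression on the handler only undoes HTTP-level content encoding; it does not unpack a .zip file that is itself the response body.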
You can typically tell the encoding by the file extension: PKZip files often use .zip, while GZip files often use .gz.
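When the extension is missing or untrustworthy, the leading "magic" bytes are a more direct hint, and they are what the GZipStream exception above is complaining about. Here is a minimal sketch, with a hypothetical ArchiveSniffer.DetectFormat helper, that only distinguishes the two formats discussed here:

```csharp
using System;

// Hypothetical helper for illustration only.
public static class ArchiveSniffer
{
    // Returns "pkzip", "gzip", or "unknown" based on the first bytes.
    public static string DetectFormat(byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));

        // PKZip local file header signature: 'P' 'K' 0x03 0x04.
        // (An empty archive starts with 'P' 'K' 0x05 0x06 and is not covered here.)
        if (data.Length >= 4 && data[0] == 0x50 && data[1] == 0x4B
            && data[2] == 0x03 && data[3] == 0x04)
            return "pkzip";

        // GZip member header starts with 0x1F 0x8B.
        if (data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B)
            return "gzip";

        return "unknown";
    }
}
```

Calling ArchiveSniffer.DetectFormat(byteContent) on the bytes saved to 1111.zip would be one way to confirm which decompressor to use before parsing.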
See: Unzip files programmatically in .net