How to deserialize only part of a large XML file to C# classes?

Views: 270 | Published: 2016/9/23 23:39:19 | Tags: c#, xml, deserialization, xml-deserialization

Question

I've already read some posts and articles on how to deserialize XML, but I still haven't figured out how to write the code to match my needs, so I apologize for yet another question about deserializing XML ))

I have a large (50 MB) XML file that I need to deserialize. I used xsd.exe to get an XSD schema for the document and then auto-generated a C# classes file, which I put into my project. I want to get some (not all) of the data from this XML file and put it into my SQL database.

Here is the hierarchy of the file (simplified; the XSD is very large):

public class yml_catalog 
{
    public yml_catalogShop[] shop { /*realization*/ }
}

public class yml_catalogShop
{
    public yml_catalogShopOffersOffer[][] offers { /*realization*/ }
}

public class yml_catalogShopOffersOffer
{
    // here goes all the data (properties) I want to obtain ))
}

Here is my code.

First attempt:

yml_catalogShopOffersOffer catalog;
var serializer = new XmlSerializer(typeof(yml_catalogShopOffersOffer));
var reader = new StreamReader(@"C:\div_kid.xml");
catalog = (yml_catalogShopOffersOffer) serializer.Deserialize(reader); // exception occurs here
reader.Close();

I get an InvalidOperationException: There is an error in the XML(3,2) document

Second attempt:

XmlSerializer ser = new XmlSerializer(typeof(yml_catalogShopOffersOffer));
yml_catalogShopOffersOffer result;
using (XmlReader reader = XmlReader.Create(@"C:\div_kid.xml"))          
{
    result = (yml_catalogShopOffersOffer)ser.Deserialize(reader); // exception occurs here
}

InvalidOperationException: There is an error in the XML(0,0) document

Third attempt: I tried to deserialize the entire file:

 XmlSerializer ser = new XmlSerializer(typeof(yml_catalog)); // exception occurs here
 yml_catalog result;
 using (XmlReader reader = XmlReader.Create(@"C:\div_kid.xml"))          
 {
     result = (yml_catalog)ser.Deserialize(reader);
 }

And I got the following:

error CS0030: The conversion of type "yml_catalogShopOffersOffer[]" into "yml_catalogShopOffersOffer" is not possible.

error CS0029: The implicit conversion of type "yml_catalogShopOffersOffer" into "yml_catalogShopOffersOffer[]" is not possible.

So, how can I fix (or rewrite) the code so that it does not throw these exceptions?
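One possible way to read only the offers, sketched under assumptions: in the first two attempts the serializer is built for the offer type, but the document's root element is yml_catalog, so the root does not match what the serializer expects. The sketch below moves an XmlReader to each offer element and deserializes just that subtree, so the whole 50 MB file is never held in memory. The Offer class and the OfferStreamer helper are hypothetical stand-ins, not the xsd.exe output, and DtdProcessing.Ignore assumes the DOCTYPE line can simply be skipped.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical, trimmed-down stand-in for the generated offer class.
public class Offer
{
    [XmlElement("price")]
    public string Price { get; set; }

    [XmlElement("picture")]
    public string Picture { get; set; }
}

public static class OfferStreamer
{
    // Settings that skip the DOCTYPE declaration instead of rejecting it.
    public static XmlReaderSettings CreateSettings()
    {
        return new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };
    }

    // Lazily yields one Offer per <offer> element found by the reader.
    public static IEnumerable<Offer> ReadOffers(XmlReader reader)
    {
        // Tell the serializer that each fragment's root element is <offer>.
        var serializer = new XmlSerializer(typeof(Offer), new XmlRootAttribute("offer"));

        while (reader.ReadToFollowing("offer"))
        {
            // ReadSubtree exposes only the current <offer> element.
            using (var subtree = reader.ReadSubtree())
            {
                yield return (Offer)serializer.Deserialize(subtree);
            }
        }
    }
}
```

With the real file, the reader could be created with XmlReader.Create(@"C:\div_kid.xml", OfferStreamer.CreateSettings()). On .NET Core and later, the windows-1251 encoding would likely also need CodePagesEncodingProvider to be registered first.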

Edit: also, when I write:

XDocument doc = XDocument.Parse(@"C:\div_kid.xml");

an XmlException occurs: unpermitted data at the root level, line 1, position 1.

Here is the first line of the XML file:

<?xml version="1.0" encoding="windows-1251"?>
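A brief sketch of the likely cause of that XmlException: XDocument.Parse expects the XML markup itself, so when it is given a file path the parser sees the letter "C" at line 1, position 1 and rejects it. XDocument.Load is the method that accepts a path. The LoadVersusParse class name is only illustrative.

```csharp
using System.Xml.Linq;

public static class LoadVersusParse
{
    // Parse takes the XML text itself, e.g. "<root/>".
    public static XDocument FromText(string xml)
    {
        return XDocument.Parse(xml);
    }

    // Load takes a location (file path or URI) and reads the XML from it.
    public static XDocument FromFile(string path)
    {
        return XDocument.Load(path);
    }
}
```

So for the file in the question, LoadVersusParse.FromFile(@"C:\div_kid.xml") would be the call to try instead of Parse.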

Edit 2: a short example of the XML file:

<?xml version="1.0" encoding="windows-1251"?>
<!DOCTYPE yml_catalog SYSTEM "shops.dtd">
<yml_catalog date="2012-11-01 23:29">
<shop>
   <name>OZON.ru</name>
   <company>?????? "???????????????? ??????????????"</company>
   <url>http://www.ozon.ru/</url>
   <currencies>
     <currency id="RUR" rate="1" />
   </currencies>
   <categories>
      <category id="1126233">base category</category>
      <category id="1127479" parentId="1126233">bla bla bla</category>
      // here goes all the categories
   </categories>
   <offers>
      <offer>
         <price></price>
         <picture></picture>
      </offer>
      // other offers
   </offers>
</shop>
</yml_catalog>

P.S. I've already accepted the answer (it's perfect). But now I need to find the "base category" for each offer using its categoryId. The data is hierarchical, and the base category is the category that has no "parentId" attribute. So I wrote a recursive method to find the base category, but it never finishes. It seems the algorithm is not very fast ))
Here is my code (in the main() method):

var doc = XDocument.Load(@"C:\div_kid.xml");
var offers = doc.Descendants("shop").Elements("offers").Elements("offer");
foreach (var offer in offers.Take(2))
        {
            var category = GetCategory(categoryId, doc);
            // here goes other code
        }

Helper method:

public static string GetCategory(int categoryId, XDocument document)
    {
        var tempId = categoryId;
            var categories = document.Descendants("shop").Elements("categories").Elements("category");
            foreach (var category in categories)
            {
                if (category.Attribute("id").ToString() == categoryId.ToString())
                {
                    if (category.Attributes().Count() == 1)
                    {
                        return category.ToString();
                    }
                    tempId = Convert.ToInt32(category.Attribute("parentId"));
                }
            }
        return GetCategory(tempId, document);
    }

Can I use recursion in such a situation? If not, how else can I find the "base category"?
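A possible explanation and an alternative, offered as a sketch rather than a definitive fix: XAttribute.ToString() returns the whole attribute text (for example id="1126233"), not just the value, so the comparison in the helper never matches, tempId never changes, and the recursion keeps calling itself with the same id. The version below reads the attribute values through the explicit conversions that XAttribute provides, builds a child-to-parent map once, and then walks up with a loop. The CategoryLookup class and its method names are hypothetical, and the map assumes category ids are unique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CategoryLookup
{
    // Maps each category id to its parent id (null for a base category).
    // ToDictionary throws if the same id appears twice.
    public static Dictionary<int, int?> BuildParentMap(XContainer document)
    {
        return document.Descendants("category").ToDictionary(
            c => (int)c.Attribute("id"),
            c => (int?)c.Attribute("parentId")); // null when the attribute is absent
    }

    // Follows parent links until a category without a parent is reached.
    public static int GetBaseCategoryId(int categoryId, Dictionary<int, int?> parents)
    {
        var visited = new HashSet<int>();
        var current = categoryId;
        while (true)
        {
            if (!visited.Add(current))
                throw new InvalidOperationException("Cycle in category hierarchy at id " + current);

            int? parent;
            if (!parents.TryGetValue(current, out parent))
                throw new KeyNotFoundException("Unknown category id " + current);

            if (parent == null)
                return current; // base category: no parentId attribute

            current = parent.Value;
        }
    }
}
```

Building the map once and reusing it for every offer avoids rescanning all category elements per lookup, which matters when the document holds many offers.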

Recommended answer

Give LINQ to XML a try:

XElement result = XElement.Load(@"C:\div_kid.xml");

Querying in LINQ is brilliant but admittedly a little weird at the start. You select nodes from the document in a SQL-like syntax, or using lambda expressions. Then you create anonymous objects (or use existing classes) containing the data you are interested in.

It's best to see it in action:

miscellaneous examples of LINQ to XML
simple sample using xquery and lambdas
sample denoting namespaces

There is a lot more on MSDN. Search for LINQ to XML.
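As a minimal illustration of the two styles mentioned above, the sketch below selects shop names once with the SQL-like query syntax and once with lambda expressions. Both methods should return the same list. The QueryStyles class and method names are made up for this example.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class QueryStyles
{
    // SQL-like query syntax.
    public static List<string> ShopNamesQuerySyntax(XElement root)
    {
        var names = from shop in root.Descendants("shop")
                    select (string)shop.Element("name");
        return names.ToList();
    }

    // The same selection written with lambda expressions.
    public static List<string> ShopNamesMethodSyntax(XElement root)
    {
        return root.Descendants("shop")
                   .Select(shop => (string)shop.Element("name"))
                   .ToList();
    }
}
```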

Based on your sample XML and code, here's a specific example:

var element = XElement.Load(@"C:\div_kid.xml");
var shopsQuery =
    from shop in element.Descendants("shop")
    select new
    {
        Name = (string) shop.Descendants("name").FirstOrDefault(),
        Company = (string) shop.Descendants("company").FirstOrDefault(),
        Categories = 
            from category in shop.Descendants("category")
            select new {
                Id = category.Attribute("id").Value,
                Parent = (string) category.Attribute("parentId"), // null for a base category, which has no parentId attribute
                Name = category.Value
            },
        Offers =
            from offer in shop.Descendants("offer")
            select new { 
                Price = (string) offer.Descendants("price").FirstOrDefault(),
                Picture = (string) offer.Descendants("picture").FirstOrDefault()
            }

    };

foreach (var shop in shopsQuery){
    Console.WriteLine(shop.Name);
    Console.WriteLine(shop.Company);
    foreach (var category in shop.Categories)
    {
        Console.WriteLine(category.Name);
        Console.WriteLine(category.Id);
    }
    foreach (var offer in shop.Offers)
    {
        Console.WriteLine(offer.Price);
        Console.WriteLine(offer.Picture);
    }
}

As an extra, here's how to build the tree of categories from the flat category elements. You need a proper class to hold them, because the list of Children must have a type:

class Category
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public List<Category> Children { get; set; }
    public IEnumerable<Category> Descendants {
        get
        {
            return (from child in Children
                    select child.Descendants).SelectMany(x => x).
                    Concat(new Category[] { this });
        }
    }
}

To create a list containing all distinct categories in the document:

var categories = (from category in element.Descendants("category")
                    orderby int.Parse( category.Attribute("id").Value )
                    select new Category()
                    {
                        Id = int.Parse(category.Attribute("id").Value),
                        ParentId = category.Attribute("parentId") == null ?
                            null as int? : int.Parse(category.Attribute("parentId").Value),
                        Children = new List<Category>()
                    }).Distinct().ToList();

Then organize them into a tree (going from a flat list to a hierarchy):

var lookup = categories.ToLookup(cat => cat.ParentId);
foreach (var category in categories)
{
    category.Children = lookup[category.Id].ToList();
}
var rootCategories = lookup[null].ToList();

To find the root that contains theCategory:

var root = (from cat in rootCategories
            where cat.Descendants.Contains(theCategory)
            select cat).FirstOrDefault();
