Xlsx with OpenXml: Match columns to array


Question
Hi,

Let me explain the title:
I am reading in a *.xlsx file. I need to put the cells into an object[][] before the next step. I had been adding each row's cells in index order, but I have found that if a cell is left blank it is not included in the row object at all:

foreach (OpenXmlWorksheetSelector selector in sheetIndexes)
{
    WorksheetPart wsPart =
        (WorksheetPart)(wbPart.GetPartById(selector.Id));

    if (wsPart != null)
    {
        object[][] cells =
            wsPart.Worksheet.Descendants<Row>()
            .Select(row => row.Descendants<Cell>().ToArray())
            .Select(cellsa => cellsa.Select(cellsaCell => GetCellValue(cellsaCell, wbPart)).ToArray())
            .Where(row => row.Any(c => !c.ToString().IsEmpty()))
            .Select(row => row
                .Concat(new object[] { selector.WorkBookName, selector.Name, selector.Id.ToString() }).ToArray())
            .ToArray();

        var p = Products(columnDetailSelectors, cells);
        if (p != null)
            products.AddRange(p);
    }

}

This means that my columns don't match up. The only identifier for each cell's column position within a row is the cell reference attribute ('A1', 'B1', etc). Without parsing each row, can you think of a way that I could get the columns to match up for every row?

In the meantime I'll be writing a row-by-row parser to fix my issue :S

Thanks in advance
Andy ^_^


PS: in case you don't know what the *.xlsx sheet format looks like, here is an example of row 2:

<row r="2" spans="1:28" x14ac:dyDescent="0.25">
    <c r="A2" t="s">
        <v>28</v>
    </c><c r="B2">
        <v>0</v>
    </c><c r="C2">
        <v>8000</v>
    </c><c r="D2" t="s">
        <v>29</v>
    </c><c r="E2" s="1">
        <v>42313</v>
    </c><c r="G2">
        <v>1</v>
    </c><c r="H2" t="s">
        <v>30</v>
    </c><c r="I2" t="s">
        <v>31</v>
    </c><c r="J2" t="s">
        <v>32</v>
    </c><c r="K2" t="s">
        <v>33</v>
    </c><c r="L2" t="s">
        <v>34</v>
    </c><c r="M2">
        <v>1</v>
    </c><c r="N2">
        <v>15363200</v>
    </c><c r="O2">
        <v>76.13</v>
    </c><c r="P2">
        <v>9.44</v>
    </c><c r="Q2" t="s">
        <v>34</v>
    </c><c r="AB2" t="s">
        <v>35</v>
    </c>
</row>

As you can see, columns F, R-Z and AA are missing
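
Since the cell reference is the only thing that says which column a cell belongs to, one way to line the columns up is to turn the letters of each reference into an index and write the value into that slot of a fixed-width array. The following is a minimal sketch of that idea on plain data, with no OpenXml dependency. The names CellReferences, ColumnIndex and PadRow are hypothetical and are not part of the OpenXml SDK or of the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class CellReferences
{
    // Converts the letter part of a reference such as "AB2" to a
    // zero-based column index (A -> 0, Z -> 25, AA -> 26, AB -> 27).
    public static int ColumnIndex(string cellReference)
    {
        int index = 0;
        foreach (char ch in cellReference)
        {
            char upper = char.ToUpperInvariant(ch);
            if (upper < 'A' || upper > 'Z')
            {
                break; // reached the row number part of the reference
            }
            index = index * 26 + (upper - 'A' + 1);
        }
        return index - 1;
    }

    // Places each value at the position given by its reference.
    // Columns that have no cell in the XML stay null.
    public static object[] PadRow(
        IEnumerable<KeyValuePair<string, object>> cells, int width)
    {
        var padded = new object[width];
        foreach (var cell in cells)
        {
            padded[ColumnIndex(cell.Key)] = cell.Value;
        }
        return padded;
    }
}
```

For the row above, a width of 28 (matching spans="1:28") would leave the slots for F, R to Z and AA as null while every present cell keeps its true column position.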

Recommended answer

OK, so I wrote my row-by-row parser. I'll be dealing with something on the order of thousands of rows, so it's fast enough. Some suggestions on Stack Overflow (see the question "reading Excel Open XML is ignoring blank cells") may be more efficient:


foreach (OpenXmlWorksheetSelector selector in sheetIndexes)
{
  WorksheetPart wsPart = (WorksheetPart)(wbPart.GetPartById(selector.Id));

  if (wsPart != null)
  {
    Row[] rows = wsPart.Worksheet.Descendants<Row>().ToArray();

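    // widest column across all cells (in my case this is set by the first, header, row)
    var allCells = rows.SelectMany(r => r.Descendants<Cell>()).ToArray();
    var maxColumnIndex = allCells.Max(c =>
        // gets index from the column letters of the cell reference
        GetColumnIndexByLetters(c.GetAttribute("r", "").Value));
    // wholly blank rows are omitted from the XML too, so size by the largest row number
    var maxRowNumber = rows.Max(r => int.Parse(r.GetAttribute("r", "").Value));

    object[,] cells = new object[maxRowNumber, maxColumnIndex + 4];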

    foreach (Row row in rows)
    {
      foreach (Cell cell in row.Descendants<Cell>())
      {
        cells[
          int.Parse(row.GetAttribute("r", "").Value) - 1,
          GetColumnIndexByLetters(cell.GetAttribute("r", "").Value)] = GetCellValue(cell,
                                    wbPart);
      }
    }
    // ToEnumerable() is a custom extension that, given an n-dimensional array,
    // returns an enumerable of (n-1)-dimensional arrays (here, one object[] per row)
    object[][] allObjects = cells.ToEnumerable()
        .Where(r => r.Any(c => !(c ?? "").ToString().IsEmpty()))
        .Select(r => r.Concat(new object[]
           {
                selector.WorkBookName,
                selector.Name,
                selector.Id.ToString()
           }).ToArray()).ToArray();

    var p = Products(columnDetailSelectors, allObjects);
    if (p != null)
      products.AddRange(p);
  }
}
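
The ToEnumerable() extension used above is not shown in the answer. As a rough stand-in, assuming only the two-dimensional case is needed, it could look something like the sketch below. The class name ArrayExtensions is hypothetical, and the author's own version is described as handling n dimensions, which this one does not.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the answer's ToEnumerable(); 2-D arrays only.
public static class ArrayExtensions
{
    // Yields each row of a rectangular array as its own one-dimensional array.
    public static IEnumerable<T[]> ToEnumerable<T>(this T[,] source)
    {
        int rowCount = source.GetLength(0);
        int columnCount = source.GetLength(1);
        for (int r = 0; r < rowCount; r++)
        {
            var row = new T[columnCount];
            for (int c = 0; c < columnCount; c++)
            {
                row[c] = source[r, c];
            }
            yield return row;
        }
    }
}
```

Copying each row into a fresh array keeps the later LINQ calls (Where, Concat, ToArray) working on ordinary object[] values, at the cost of one extra allocation per row.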

