OpenXML to Create a DataTable from Excel - Money Cell Value Incorrect


Problem Description

I am attempting to create a DataTable from an Excel spreadsheet using OpenXML. When I read a row's cell value using Cell.CellValue.InnerXml, the value returned for a monetary amount is not the value the user entered and sees on the spreadsheet.

The spreadsheet cell is formatted as Text and the cell value is 570.81. When obtaining the data in OpenXML the value is interpreted as 570.80999999999995.

This method is used for many different Excel imports, where the data type of a cell (looked up by header or column index) is not known when the table is built.

I've seen a few posts about the Ecma Office Open XML File Formats Standard that mention numFmtId. Could this be of value?

I assume that since the data type is text and the number has two decimal places that there must be some assumption that the cell has been rounded (even though no formula exists).

I am hopeful someone can offer a solution for properly interpreting the data.

Below is the GetCellValue method:

private static string GetCellValue(SharedStringTablePart stringTablePart, DocumentFormat.OpenXml.Spreadsheet.Cell cell,DocumentFormat.OpenXml.Spreadsheet.Stylesheet styleSheet)
{
    string value = cell.CellValue.InnerXml;

    if (cell.DataType != null && cell.DataType.Value == DocumentFormat.OpenXml.Spreadsheet.CellValues.SharedString)
    {
        return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;  
    }
    else
    {

        if (cell.StyleIndex != null)
        {
            DocumentFormat.OpenXml.Spreadsheet.CellFormat cellFormat = (DocumentFormat.OpenXml.Spreadsheet.CellFormat)styleSheet.CellFormats.ChildElements[(int)cell.StyleIndex.Value];

            int formatId = (int)cellFormat.NumberFormatId.Value;

            if (formatId == 14) // built-in format 14 is the short date format (mm-dd-yy), not [h]:mm:ss
            {
                DateTime newDate = DateTime.FromOADate(double.Parse(value)); 
                value = newDate.Date.ToString(CultureInfo.InvariantCulture); 
            }
        }
        return value;
    }
}

Solution

As you point out in your question, the format is stored separately from the cell value using number formats in the stylesheet.
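To illustrate why the raw text differs from what Excel shows, here is a minimal sketch that uses only the .NET base library. The class and method names (`RawValueDemo`, `FormatRaw`) are made up for this example. The file holds the stored double written out with up to 17 significant digits, so a cell showing 570.81 can come back as 570.80999999999995; parsing that text and formatting it to two decimals gives the displayed value again.

```csharp
using System;
using System.Globalization;

public static class RawValueDemo
{
    // Parses the raw <v> text from the sheet XML and formats it
    // with a .NET format string (hypothetical helper for illustration).
    public static string FormatRaw(string rawXmlValue, string dotNetFormat)
    {
        // Numbers in the sheet XML use '.' as the decimal separator regardless
        // of the machine's regional settings, so parse with the invariant culture.
        double parsed = double.Parse(rawXmlValue, NumberStyles.Float, CultureInfo.InvariantCulture);
        return parsed.ToString(dotNetFormat, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // prints 570.81
        Console.WriteLine(FormatRaw("570.80999999999995", "0.00"));
    }
}
```

The two-decimal format here is hard-coded; in a real import it would come from the cell's number format, which is what the rest of this answer is about.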

You should be able to extend the code you have for formatting dates to include formatting for numbers. Essentially you need to grab the NumberingFormat that corresponds to the cellFormat.NumberFormatId.Value you are already reading. The NumberingFormat can be found in the styleSheet.NumberingFormats element.

Once you have this, you can access the FormatCode property of the NumberingFormat, which you can then use to format your data as you see fit.

Unfortunately the format is not quite that straightforward to use. Firstly, according to MSDN, not all formats are written to the file (the built-in formats are implied by their ID), so I guess you will have to keep those somewhere accessible and look them up by the NumberFormatId you have.

Secondly, the format of the format string is not compatible with C#, so you'll need to do some manipulation. Details of the format layout are documented on MSDN.

I have knocked together some sample code that handles the currency situation you have in your question, but you may need to give some more thought to the parsing of the Excel format string into a C# one.
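As a rough sketch of keeping the implied formats "somewhere accessible", a small lookup table keyed by format ID could serve as a fallback when the ID is not found in styleSheet.NumberingFormats. The class name `BuiltInNumberFormats` and the `TryGetCode` helper are hypothetical, and the table is only a subset of the built-in IDs listed in the SpreadsheetML numFmt documentation.

```csharp
using System;
using System.Collections.Generic;

public static class BuiltInNumberFormats
{
    // A few of the implied built-in formats (IDs below 164).
    // Illustrative subset only; extend it with the IDs your files use.
    private static readonly Dictionary<uint, string> Codes = new Dictionary<uint, string>
    {
        { 0, "General" },
        { 1, "0" },
        { 2, "0.00" },
        { 3, "#,##0" },
        { 4, "#,##0.00" },
        { 9, "0%" },
        { 10, "0.00%" },
        { 14, "mm-dd-yy" },
        { 49, "@" },
    };

    // Returns the built-in format code, or null when the ID is not in the table.
    public static string TryGetCode(uint numberFormatId)
    {
        string code;
        return Codes.TryGetValue(numberFormatId, out code) ? code : null;
    }
}
```

A caller would look in the stylesheet's custom formats first and only fall back to this table when nothing is found there.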

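// The methods below belong inside a class and assume these usings:
// using System; using System.Globalization; using System.Linq;
// using DocumentFormat.OpenXml; using DocumentFormat.OpenXml.Packaging;
// using DocumentFormat.OpenXml.Spreadsheet;
private static string GetCellValue(SharedStringTablePart stringTablePart, DocumentFormat.OpenXml.Spreadsheet.Cell cell, DocumentFormat.OpenXml.Spreadsheet.Stylesheet styleSheet)
{
    string value = cell.CellValue.InnerXml;

    if (cell.DataType != null && cell.DataType.Value == DocumentFormat.OpenXml.Spreadsheet.CellValues.SharedString)
    {
        return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
    }
    else
    {
        if (cell.StyleIndex != null)
        {
            DocumentFormat.OpenXml.Spreadsheet.CellFormat cellFormat = (DocumentFormat.OpenXml.Spreadsheet.CellFormat)styleSheet.CellFormats.ChildElements[(int)cell.StyleIndex.Value];

            int formatId = (int)cellFormat.NumberFormatId.Value;

            if (formatId == 14) // built-in format 14 is the short date format (mm-dd-yy)
            {
                // dates are stored as OLE Automation serial numbers; the text uses '.' as
                // the decimal separator, so parse with the invariant culture
                DateTime newDate = DateTime.FromOADate(double.Parse(value, CultureInfo.InvariantCulture));
                value = newDate.Date.ToString(CultureInfo.InvariantCulture);
            }
            else
            {
                // find the custom number format; NumberingFormats is null when
                // the workbook defines no custom formats
                NumberingFormat format = styleSheet.NumberingFormats == null
                    ? null
                    : styleSheet.NumberingFormats.Elements<NumberingFormat>()
                        .FirstOrDefault(n => n.NumberFormatId != null && n.NumberFormatId.Value == formatId);
                double temp;

                if (format != null
                    && format.FormatCode != null
                    && format.FormatCode.HasValue
                    && double.TryParse(value, NumberStyles.Float, CultureInfo.InvariantCulture, out temp))
                {
                    //we have a format and a value that can be represented as a double

                    string actualFormat = GetActualFormat(format.FormatCode, temp);
                    value = temp.ToString(actualFormat);
                }
            }
        }
        return value;
    }
}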

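private static string GetActualFormat(StringValue formatCode, double value)
{
    //a format code holds up to 4 sections separated by semi-colons:
    //positive; negative; zero; text (I'm ignoring the 4th section, which is for text)
    string[] formatComponents = formatCode.Value.Split(';');

    //with one section it applies to every number; with two, the first covers
    //positive values and zero and the second covers negative values
    int elementToUse = 0;
    if (value < 0 && formatComponents.Length > 1)
    {
        elementToUse = 1;
    }
    else if (value == 0 && formatComponents.Length > 2)
    {
        elementToUse = 2;
    }

    string actualFormat = formatComponents[elementToUse];

    actualFormat = RemoveUnwantedCharacters(actualFormat, '_');
    actualFormat = RemoveUnwantedCharacters(actualFormat, '*');

    //caveat: ToString still prepends its own minus sign for negative values, so a
    //negative section that already contains '-' or brackets may need Math.Abs at the call site

    //strip the double quotes Excel puts around literal text such as "£";
    //backslash escape characters are left as they are
    return actualFormat.Replace("\"", "");
}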

private static string RemoveUnwantedCharacters(string excelFormat, char character)
{
    /*  The _ and * characters are used to control lining up of characters
        they are followed by the character being manipulated so I'm ignoring
        both the _ and * and the character immediately following them.
        Note that this is buggy as I don't check for the preceding
        backslash escape character which I probably should
        */
    int index = excelFormat.IndexOf(character);
    while (index != -1)
    {
        //drop the control character and the character after it (if there is one);
        //Math.Min stops Substring from throwing when the control character is last
        int resumeAt = Math.Min(index + 2, excelFormat.Length);
        excelFormat = excelFormat.Substring(0, index) + excelFormat.Substring(resumeAt);
        index = excelFormat.IndexOf(character, index);
    }
    return excelFormat;
}

Given a sheet with the value 570.80999999999995 formatted using currency (in the UK) the output I get is £570.81.
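One case the code above does not cover is bracketed tokens in the Excel format, such as a colour (`[Red]`) or a currency-with-locale token (`[$£-809]`). The following is a minimal sketch of stripping them before handing the section to ToString. The class name `ExcelFormatTokens`, the method name and both regular expressions are my own assumptions rather than anything from the OpenXML SDK, and it is only meant for numeric formats (it would also remove elapsed-time tokens like `[h]`).

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExcelFormatTokens
{
    // [$£-809]: a currency symbol followed by an optional hexadecimal locale id
    private static readonly Regex CurrencyToken =
        new Regex(@"\[\$([^\]-]*)(-[0-9A-Fa-f]+)?\]");

    // any other bracketed token, for example [Red] or [>100]
    private static readonly Regex OtherToken = new Regex(@"\[[^\]]*\]");

    public static string StripBracketTokens(string excelSection)
    {
        // Keep the currency symbol as a quoted literal so .NET prints it verbatim.
        // Assumes the symbol itself contains no single quote.
        string result = CurrencyToken.Replace(
            excelSection,
            m => m.Groups[1].Length == 0 ? "" : "'" + m.Groups[1].Value + "'");

        // Colours and conditions have no .NET equivalent here, so drop them.
        return OtherToken.Replace(result, "");
    }
}
```

With this, `[$£-809]#,##0.00` becomes `'£'#,##0.00`, which .NET accepts as a custom numeric format with a literal currency symbol.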
