C# Processing Fixed Width Files - Solution Not Working


Problem description

I have implemented Cuong's solution from here: C# Processing Fixed Width Files

Here is my code:

        var lines = File.ReadAllLines(@fileFull);
        var widthList = lines.First().GroupBy(c => c)
        .Select(g => g.Count())
        .ToList();

        var list = new List<KeyValuePair<int, int>>();

        int startIndex = 0;

        for (int i = 0; i < widthList.Count(); i++)
        {
            var pair = new KeyValuePair<int, int>(startIndex, widthList[i]);
            list.Add(pair);

            startIndex += widthList[i];
        }

        var csvLines = lines.Select(line => string.Join(",",
        list.Select(pair => line.Substring(pair.Key, pair.Value))));

        File.WriteAllLines(filePath + "\\" + fileName + ".csv", csvLines);

@fileFull = file path and name

The problem is that the first line of my input file also contains digits, so it could be AAAAAABBC111111111DD2EEEEEE etc. For some reason, the output from Cuong's code gives me CSV headings like 1111RRRR and 222223333.

Does anyone know why this happens and how I would fix it?

Example header row:

AAAAAAAAAAAAAAAABBBBBBBBBBCCCCCCCCDEFCCCCCCCCCGGGGGGGGHHHHHHHHIJJJJJJJJKKKKLLLLMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOPPPPQQQQ1111RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR222222222333333333444444444555555555666666666777777777888888888999999999S00001111TTTTTTTTTTTTUVWXYZ!"£$$$$$$%&  

Converted header row:

AAAAAAAAAAAAAAAA    BBBBBBBBBB  CCCCCCCCDEFCCCCCC   C   C   C   GGGGGGGG    HHHHHHHH    I   JJJJJJJJ    KKKK    LLLL    MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM  NNNNNNNNNNNNNNNNNNNNNNNNNNNNNN  OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO  PPPP    QQQQ    1111RRRR    RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR2222    222223333   333334444   444445555   555556666   666667777   777778888   888889999   99999S000   0   1111    TTTTTTTTTTTT    U   V   W   X   Y   Z   !   ",�,$$$$$$,%,&,"  


Jodrell - I implemented your suggestion, but the header output looks like this:

BBBBBBBBBBCCCCCC    CCCCCCCCD   DEFCCCC             GGGGGGGG    HHHHHHH IJJJJJJ     KKKKLLL LLL MMM NNNNNNNNNNNNNNNNNNNNNNNNNNNNN   OOOOOOOOOOOOOOOOOOOOOOOOOOOOO   PPPPQQQQ1111RRRRRRRRRRRRRRRRR   QQQ 111 RRR 33333333    44444444    55555555    66666666    77777777    88888888    99999999    S0000111        111 TTT UVWXYZ!"£$$                                       %&

Recommended answer

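As Jodrell already mentioned, your code doesn't work because it assumes that the character representing each column header is distinct. GroupBy(c => c) puts every occurrence of a character into a single group regardless of where it appears in the line, so when a character is reused for a later column (C and 1 in your header), the widths of those columns are merged into one and every column after it is cut at the wrong offset. Changing the code that parses the header widths fixes it.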
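To see the merging in isolation, here is a minimal sketch that runs the question's GroupBy expression on a short, made-up header. The header string and the GroupByDemo / GroupedWidths names are hypothetical and exist only for illustration; the GroupBy call itself is the one from the question.

```csharp
using System;
using System.Linq;

public static class GroupByDemo
{
    // Returns "char:count" pairs, grouping the same way the question's code does.
    public static string[] GroupedWidths(string header) =>
        header.GroupBy(c => c)
              .Select(g => $"{g.Key}:{g.Count()}")
              .ToArray();

    public static void Main()
    {
        // Hypothetical short header: 'A' and '1' each label two separate columns.
        const string header = "AABCCA11D1";
        Console.WriteLine(string.Join(" ", GroupedWidths(header)));
        // prints: A:3 B:1 C:2 1:3 D:1
    }
}
```

The header has seven columns, but only five widths come back, because the two A columns and the two 1 columns are each counted as one.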

Replace:

var widthList = lines.First().GroupBy(c => c)
.Select(g => g.Count())
.ToList();

With:

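var widthList = new List<int>();
var header = lines.First().ToArray();
for (int i = 0; i < header.Length; i++)
{
    // Start a new column whenever the character differs from the previous one.
    if (i == 0 || header[i] != header[i-1])
        widthList.Add(0);
    // Count the current character toward the column being built.
    widthList[widthList.Count-1]++;
}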

Parsed header columns:

AAAAAAAAAAAAAAAA    BBBBBBBBBB  CCCCCCCC    D   E   F   CCCCCCCCC   GGGGGGGG    HHHHHHHH    I   JJJJJJJJ    KKKK    LLLL    MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM  NNNNNNNNNNNNNNNNNNNNNNNNNNNNNN  OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO  PPPP    QQQQ    1111    RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR    222222222   333333333   444444444   555555555   666666666   777777777   888888888   999999999   S   0000    1111    TTTTTTTTTTTT    U   V   W   X   Y   Z   !   "   £   $$$$$$  %   &
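Putting the pieces together, the whole conversion might look roughly like the sketch below. This is an illustration under assumptions, not a drop-in replacement: the FixedWidthSketch class, its method names, and the in-memory sample lines are hypothetical, and the handling of lines shorter than the header (short or empty trailing fields instead of an exception from Substring) is an added assumption that the original code does not make. Fields are not quoted or escaped, so data containing commas or quotes would need extra handling.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FixedWidthSketch
{
    // One width per run of identical adjacent characters in the header.
    public static List<int> ParseWidths(string header)
    {
        var widths = new List<int>();
        for (int i = 0; i < header.Length; i++)
        {
            if (i == 0 || header[i] != header[i - 1])
                widths.Add(0);
            widths[widths.Count - 1]++;
        }
        return widths;
    }

    // Cuts one line into fields at the given widths and joins them with commas.
    // Assumption: a line shorter than the header yields short or empty trailing
    // fields rather than an ArgumentOutOfRangeException from Substring.
    public static string ToCsvLine(string line, IReadOnlyList<int> widths)
    {
        var fields = new List<string>();
        int start = 0;
        foreach (int width in widths)
        {
            int available = Math.Max(0, Math.Min(width, line.Length - start));
            fields.Add(available > 0 ? line.Substring(start, available) : string.Empty);
            start += width;
        }
        return string.Join(",", fields);
    }

    public static void Main()
    {
        // Hypothetical in-memory input standing in for File.ReadAllLines(...).
        var lines = new[] { "AABCCA11D1", "xy1ab9008z", "pq2" };
        var widths = ParseWidths(lines[0]);
        foreach (var line in lines)
            Console.WriteLine(ToCsvLine(line, widths));
    }
}
```

With real files, the array in Main would be replaced by the File.ReadAllLines call from the question, and the resulting lines passed to File.WriteAllLines.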
