Merging and Sorting data in CSV by date and ID

Question

Follow-up to the question: Aggregating different CSV files

I have many files named in the format YYYYMMDD_HHmmss_goals.csv

Now I have a container that holds, for a single day, the list of all files to be merged into one file.

Each CSV contains multiple rows with different IDs and X, Y values. Now I want to merge them per day and per ID: for each day, sum the x and y values of each ID and store the result for that day. I also don't want everything merged into one file; instead, on the fly, each merged result (single day --> ID, x, y) should be saved as one line in CSV format. It is like grouping the rows by day and ID and summing their x and y, but only within that ID.

Update:

public class XY_Values
{
    public int X { get; set; }
    public int Y { get; set; }
}


public class ImageKey
{
    public int mLocationId;
    public int mFormatId;
    public int mEditionId;

    public ImageKey(int LocationId, int FormatId, int EditionId)
    {
        mLocationId = LocationId;
        mFormatId = FormatId;
        mEditionId = EditionId;
    }

    // Override Equals/GetHashCode so that a Dictionary treats keys with the same IDs as equal.
    public override bool Equals(object obj)
    {
        ImageKey other = obj as ImageKey;
        return other != null && mLocationId == other.mLocationId && mFormatId == other.mFormatId && mEditionId == other.mEditionId;
    }

    public override int GetHashCode()
    {
        return mLocationId ^ mFormatId ^ mEditionId;
    }
}


static void MergeFilesForDay(string dir, DateTime date, List<string> files)
{
    // Keyed by the composite ImageKey rather than by a single string ID.
    var entries = new Dictionary<ImageKey, XY_Values>();
    foreach (string fn in files)
    {
        foreach (string line in File.ReadAllLines(fn))
        {
            string[] fields = line.Split(new string[] { "," }, StringSplitOptions.None);
            if (fields.Length < 5) continue; // skip invalid data

            int LocationId, FormatID, EditionId;
            int x, y;
            // Each field is parsed into its own variable.
            bool LocationIdValid = int.TryParse(fields[0].Trim(), out LocationId);
            bool FormatIDValid = int.TryParse(fields[1].Trim(), out FormatID);
            bool EditionIdValid = int.TryParse(fields[2].Trim(), out EditionId);
            bool xValid = int.TryParse(fields[3].Trim(), out x);
            bool yValid = int.TryParse(fields[4].Trim(), out y);

            if (xValid && yValid && LocationIdValid && FormatIDValid && EditionIdValid)
            {
                ImageKey key = new ImageKey(LocationId, FormatID, EditionId);
                bool knownId = entries.ContainsKey(key);
                if (!knownId)
                {
                    entries.Add(key, new XY_Values());
                }

                XY_Values entry = entries[key];
                entry.X += x;
                entry.Y += y;
            }
        }
    }

    // I don't know how to combine them into CSV output like:
    // LocationId, FormatID, EditionID, x, y   ... items
    // Date:
}

Recommended answer

This approach uses a Dictionary<string, XY_Values> to group by ID:

public class XY_Values
{
    public int X { get; set; }
    public int Y { get; set; }
}

static void MergeFilesForDay(string dir, DateTime date, List<string> files)
{
    var idValues = new Dictionary<string, XY_Values>();
    foreach (string fn in files)
    {
        foreach (string line in File.ReadAllLines(fn))
        {
            string[] fields = line.Split(new string[] { "," }, StringSplitOptions.None);
            if (fields.Length < 3) continue; // skip invalid data
            string id = fields[0].Trim();
            int x, y;
            bool xValid = int.TryParse(fields[1].Trim(), out x);
            bool yValid = int.TryParse(fields[2].Trim(), out y);
            if (xValid && yValid)
            {
                bool knownID = idValues.ContainsKey(id);
                if (!knownID) idValues.Add(id, new XY_Values());
                XY_Values values = idValues[id];
                values.X += x;
                values.Y += y;
            }
        }
    }

    string file = Path.Combine(dir, date.ToString("yyyyMMdd") + ".csv");
    using (var stream = File.CreateText(file))
    {
        foreach (KeyValuePair<string, XY_Values> idValue in idValues)
        {
            string line = string.Format("{0},{1},{2}", idValue.Key, idValue.Value.X, idValue.Value.Y);
            stream.WriteLine(line);
        }
    }
}

This method replaces the old one in my last answer.
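For the three-part key in the question's update (LocationId, FormatID, EditionId), the same grouping idea might be adapted roughly as follows. This is only a sketch under some assumptions: the columns are taken to be in the order LocationId,FormatID,EditionId,x,y, and the names DayMerger, SumByKey and Totals are hypothetical and chosen for illustration. It also uses Tuple<int, int, int> as the dictionary key instead of ImageKey, since tuples already compare by value, so no custom Equals/GetHashCode is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

public static class DayMerger
{
    // Running totals for one key (hypothetical helper type).
    private sealed class Totals
    {
        public int X;
        public int Y;
    }

    // Sums x and y per (LocationId, FormatID, EditionId) and returns CSV lines.
    // Assumed column order: LocationId,FormatID,EditionId,x,y
    public static List<string> SumByKey(IEnumerable<string> lines)
    {
        // Tuple<,,> has value equality, so equal IDs share one dictionary entry.
        var sums = new Dictionary<Tuple<int, int, int>, Totals>();
        foreach (string line in lines)
        {
            string[] f = line.Split(',');
            if (f.Length < 5) continue; // skip invalid data

            int loc, fmt, ed, x, y;
            if (!int.TryParse(f[0].Trim(), out loc) ||
                !int.TryParse(f[1].Trim(), out fmt) ||
                !int.TryParse(f[2].Trim(), out ed) ||
                !int.TryParse(f[3].Trim(), out x) ||
                !int.TryParse(f[4].Trim(), out y))
            {
                continue; // skip rows with non-numeric fields
            }

            var key = Tuple.Create(loc, fmt, ed);
            Totals t;
            if (!sums.TryGetValue(key, out t))
            {
                t = new Totals();
                sums.Add(key, t);
            }
            t.X += x;
            t.Y += y;
        }

        // Sort by key so the output order is deterministic.
        return sums
            .OrderBy(p => p.Key.Item1)
            .ThenBy(p => p.Key.Item2)
            .ThenBy(p => p.Key.Item3)
            .Select(p => string.Format(CultureInfo.InvariantCulture, "{0},{1},{2},{3},{4}",
                p.Key.Item1, p.Key.Item2, p.Key.Item3, p.Value.X, p.Value.Y))
            .ToList();
    }

    // Writes one output file per day, named yyyyMMdd.csv as in the method above.
    public static void MergeFilesForDay(string dir, DateTime date, List<string> files)
    {
        IEnumerable<string> allLines = files.SelectMany(fn => File.ReadLines(fn));
        string file = Path.Combine(dir, date.ToString("yyyyMMdd") + ".csv");
        File.WriteAllLines(file, SumByKey(allLines));
    }
}
```

Keeping the summing in SumByKey, separate from the file access, makes it easy to check with a few in-memory lines before pointing it at real files. If the date should appear inside each output line rather than only in the file name, it could be prepended in the Select step.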
