Java: How to sum all the values of one column based on the criteria in a second column using HashMaps

Question

I have a CSV file that contains roughly 500,000 rows and 22 columns of flight data. The 5th column contains the tail number of the plane for each flight. The 22nd column contains the distance traveled on each flight. I'm attempting to sum the total distance traveled (column 22) for each tail number (column 5).

I created a HashMap named map1 that contains all the data. I created a second HashMap named planeMileages to hold each tail number and its total distance traveled. I'm using an if/else statement inside a loop to go through each row of map1 and check whether the tail number is already contained in planeMileages. If it is, I want to add to the accumulated mileage for that key. If it is not, I'd like to insert the key along with its first distance value.

The current code that I've written seems sound to me, but it is producing the wrong result, outputting the incorrect tail number. Can you please take a look and let me know what I am overlooking in my main method? Thanks!

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class FlightData {

    HashMap<String, String[]> dataMap;

    public static void main(String[] args) {

        FlightData map1 = new FlightData();
        map1.dataMap = map1.createHashMap();

        HashMap<String, Integer> planeMileages = new HashMap<>();
        // Accumulate the distance (index 21) per tail number (index 4)
        for (String[] value : map1.dataMap.values()) {
            if (planeMileages.containsKey(value[4])) {
                int accumulatedMileage = planeMileages.get(value[4]) + Integer.parseInt(value[21]);
                planeMileages.remove(value[4]);
                planeMileages.put(value[4], accumulatedMileage);
            } else {
                planeMileages.put(value[4], Integer.parseInt(value[21]));
            }
        }

        // Tail number with the largest accumulated distance
        String maxKey = Collections.max(planeMileages.entrySet(), Map.Entry.comparingByValue()).getKey();

        System.out.println(maxKey);
    }

    public HashMap<String, String[]> createHashMap() {
        File flightFile = new File("flights.csv");
        HashMap<String, String[]> flightsMap = new HashMap<String, String[]>();

        try {
            Scanner s = new Scanner(flightFile);
            while (s.hasNextLine()) {
                String info = s.nextLine();
                String[] piecesOfInfo = info.split(",");

                // Key: TailNum_FlightDate_DepTime
                String flightKey = piecesOfInfo[4] + "_" + piecesOfInfo[2] + "_" + piecesOfInfo[11];
                String[] values = Arrays.copyOfRange(piecesOfInfo, 0, piecesOfInfo.length);

                flightsMap.put(flightKey, values);
            }
            s.close();
        } catch (FileNotFoundException e) {
            System.out.println("Cannot open: " + flightFile);
        }

        return flightsMap;
    }
}

Please see a few lines of my CSV file below:

DayofMonth,DayOfWeek,FlightDate,UniqueCarrier,TailNum,OriginAirportID,Origin,OriginStateName,DestAirportID,Dest,DestStateName,DepTime,DepDelay,WheelsOff,WheelsOn,ArrTime,ArrDelay,Cancelled,CancellationCode,Diverted,AirTime,Distance
3,1,10/3/2016,AA,N786AA,10721,BOS,Massachusetts,12478,JFK,New York,556,-4,623,703,709,-6,0,,0,40,187
4,2,10/4/2016,AA,N794AA,10721,BOS,Massachusetts,12478,JFK,New York,554,-6,615,703,712,-3,0,,0,48,187
1,6,10/1/2016,AA,N783AA,12478,JFK,New York,12892,LAX,California,823,-7,844,1104,1111,-30,0,,0,320,2475
2,7,10/2/2016,AA,N798AA,12478,JFK,New York,12892,LAX,California,847,17,904,1131,1159,18,0,,0,327,2475
3,1,10/3/2016,AA,N786AA,12478,JFK,New York,12892,LAX,California,825,-5,838,1109,1131,-10,0,,0,331,2475
4,2,10/4/2016,AA,N794AA,12478,JFK,New York,12892,LAX,California,826,-4,848,1114,1132,-9,0,,0,326,2475
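
To make the column positions concrete, here is a small sketch that parses the first sample row and rebuilds the key the question's code uses. It assumes the fields are comma-separated in the file, as the question states; the class name `SampleRowDemo` and the helpers `parse` and `flightKey` are hypothetical and exist only for illustration. The `-1` limit passed to `split` keeps empty fields at the end of a line, which plain `split(",")` would drop.

```java
public class SampleRowDemo {
    // 0-based column positions, read off the header row in the question.
    static final int FLIGHT_DATE = 2;
    static final int TAIL_NUM = 4;
    static final int DEP_TIME = 11;
    static final int DISTANCE = 21;

    /** Splits one CSV line; the -1 limit keeps trailing empty fields. */
    static String[] parse(String line) {
        return line.split(",", -1);
    }

    /** Builds the TailNum_FlightDate_DepTime key used in the question. */
    static String flightKey(String[] fields) {
        return String.join("_", fields[TAIL_NUM], fields[FLIGHT_DATE], fields[DEP_TIME]);
    }

    public static void main(String[] args) {
        // First sample row from the question, written with commas.
        String line = "3,1,10/3/2016,AA,N786AA,10721,BOS,Massachusetts,12478,JFK,"
                + "New York,556,-4,623,703,709,-6,0,,0,40,187";
        String[] fields = parse(line);

        System.out.println(fields.length);                        // 22
        System.out.println(flightKey(fields));                    // N786AA_10/3/2016_556
        System.out.println(Integer.parseInt(fields[DISTANCE]));   // 187
    }
}
```

Any two rows that produce the same key string end up as a single entry in a HashMap keyed this way, so only one of their distances would reach the sum.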

Recommended answer

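Could you try this? It does the same aggregation with streams. Note that createMap uses Collectors.toMap without a merge function, so it throws an IllegalStateException as soon as two rows produce the same key, whereas createMapLastDupWins keeps the last row for each key. If createMap throws on your file, some rows share a key, and in your original HashMap those rows overwrite each other.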

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.IntSummaryStatistics;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class FlightData {

    public static void main(String[] args) throws IOException {

        Map<String, String[]> map = createMap();
        // Group rows by tail number (index 4) and sum the distance (index 21)
        Map<String, Long> planeMileages = map
                .entrySet()
                .stream()
                .collect(Collectors.groupingBy(o -> o.getValue()[4],
                        Collectors.collectingAndThen(
                                Collectors.summarizingInt(value ->
                                        Integer.parseInt(value.getValue()[21])), IntSummaryStatistics::getSum
                        )
                ));
        // Tail number with the largest total distance
        String maxKey = planeMileages.entrySet().stream().max(Comparator.comparing(Map.Entry::getValue)).get().getKey();
        System.out.println("max key: " + maxKey);
    }

    // Throws IllegalStateException if two rows produce the same key
    public static Map<String, String[]> createMap() throws IOException {
        try (BufferedReader a = Files.newBufferedReader(Paths.get("flights.csv"))) {
            return a.lines().map(s -> s.split(","))
                    .collect(Collectors.toMap(piecesOfInfo -> String.join("_", piecesOfInfo[4], piecesOfInfo[2], piecesOfInfo[11]), Function.identity()));
        }
    }

    // Same as createMap, but the last row wins when two rows produce the same key
    public static Map<String, String[]> createMapLastDupWins() throws IOException {
        try (BufferedReader a = Files.newBufferedReader(Paths.get("flights.csv"))) {
            return a.lines().map(s -> s.split(","))
                    .collect(Collectors.toMap(piecesOfInfo -> String.join("_", piecesOfInfo[4], piecesOfInfo[2], piecesOfInfo[11]), Function.identity(), (strings, strings2) -> {
                        // If switching to this method helps, then the data contains duplicate keys
                        return strings2;
                    }));
        }
    }
}
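
If duplicate keys do turn out to be the cause, one way to avoid them entirely is to skip the intermediate per-flight map and add each row's distance to its tail number's total as the row is read. The following is a minimal sketch of that idea, not a confirmed fix for the asker's file: the class name `MileageByTail` and the method `sumByTail` are hypothetical, and skipping rows whose distance is not a number (such as a header row) is an assumption about how such rows should be treated.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

public class MileageByTail {
    static final int TAIL_NUM = 4;   // 5th column
    static final int DISTANCE = 21;  // 22nd column

    /** Sums distance per tail number directly from CSV lines. */
    static Map<String, Long> sumByTail(Stream<String> lines) {
        Map<String, Long> totals = new HashMap<>();
        lines.forEach(line -> {
            String[] fields = line.split(",", -1);
            if (fields.length <= DISTANCE) {
                return; // blank or short line: nothing to add
            }
            try {
                long distance = Long.parseLong(fields[DISTANCE].trim());
                // merge inserts the value if the key is new, otherwise adds to it
                totals.merge(fields[TAIL_NUM], distance, Long::sum);
            } catch (NumberFormatException e) {
                // Assumption: a non-numeric distance (e.g. the header row) is skipped
            }
        });
        return totals;
    }

    public static void main(String[] args) {
        // Header plus three sample rows from the question, written with commas.
        Stream<String> lines = Stream.of(
                "DayofMonth,DayOfWeek,FlightDate,UniqueCarrier,TailNum,OriginAirportID,Origin,"
                        + "OriginStateName,DestAirportID,Dest,DestStateName,DepTime,DepDelay,WheelsOff,"
                        + "WheelsOn,ArrTime,ArrDelay,Cancelled,CancellationCode,Diverted,AirTime,Distance",
                "3,1,10/3/2016,AA,N786AA,10721,BOS,Massachusetts,12478,JFK,New York,556,-4,623,703,709,-6,0,,0,40,187",
                "1,6,10/1/2016,AA,N783AA,12478,JFK,New York,12892,LAX,California,823,-7,844,1104,1111,-30,0,,0,320,2475",
                "3,1,10/3/2016,AA,N786AA,12478,JFK,New York,12892,LAX,California,825,-5,838,1109,1131,-10,0,,0,331,2475");

        Map<String, Long> totals = sumByTail(lines);
        String maxKey = Collections.max(totals.entrySet(), Map.Entry.comparingByValue()).getKey();
        System.out.println(maxKey + " " + totals.get(maxKey)); // N786AA 2662
    }
}
```

For the real file, the same method could be fed from `Files.lines(Paths.get("flights.csv"))` inside a try-with-resources block. `Map.merge` also replaces the containsKey/get/remove/put sequence from the question with a single call.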
