Java: How to sum all the values of one column based on the criteria in a second column using HashMaps
Problem description
I have a CSV file that contains roughly 500,000 rows and 22 columns of flight data. The 5th Column contains the tail number of each plane for each flight. The 22nd column contains the distance traveled for each flight. I'm attempting to sum the total distance traveled (column 22) for each tail number (column 5).
I created a HashMap named map1 containing all of the data. I created a second HashMap named planeMileages to hold each tail number and its total distance traveled. I'm using an if/else statement to go through each line of map1 and see whether the tail number is already contained in planeMileages. If it is in planeMileages, then I want to add on to the accumulatedMileage for that key. If it is not contained, I'd like to insert the key along with its first distance value.
The current code that I've written seems sound to me, but it is producing the wrong result, outputting the incorrect tail number. Can you please take a look and let me know what I am overlooking in my main method? Thanks!
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class FlightData {
    HashMap<String, String[]> dataMap;

    public static void main(String[] args) {
        FlightData map1 = new FlightData();
        map1.dataMap = map1.createHashMap();
        HashMap<String, Integer> planeMileages = new HashMap<>();
        // Filling the map with all tail numbers
        for (String[] value : map1.dataMap.values()) {
            if (planeMileages.containsKey(value[4])) {
                int accumulatedMileage = planeMileages.get(value[4]) + Integer.parseInt(value[21]);
                planeMileages.remove(value[4]);
                planeMileages.put(value[4], accumulatedMileage);
            } else {
                planeMileages.put(value[4], Integer.parseInt(value[21]));
            }
        }
        String maxKey = Collections.max(planeMileages.entrySet(), Map.Entry.comparingByValue()).getKey();
        System.out.println(maxKey);
    }

    public HashMap<String, String[]> createHashMap() {
        File flightFile = new File("flights.csv");
        HashMap<String, String[]> flightsMap = new HashMap<String, String[]>();
        try {
            Scanner s = new Scanner(flightFile);
            while (s.hasNextLine()) {
                String info = s.nextLine();
                String[] piecesOfInfo = info.split(",");
                // Setting the key: tail number, flight date, departure time
                String flightKey = piecesOfInfo[4] + "_" + piecesOfInfo[2] + "_" + piecesOfInfo[11];
                String[] values = Arrays.copyOfRange(piecesOfInfo, 0, piecesOfInfo.length);
                flightsMap.put(flightKey, values);
            }
            s.close();
        } catch (FileNotFoundException e) {
            System.out.println("Cannot open: " + flightFile);
        }
        return flightsMap;
    }
}
Please see a few lines of my CSV file below:
DayofMonth,DayOfWeek,FlightDate,UniqueCarrier,TailNum,OriginAirportID,Origin,OriginStateName,DestAirportID,Dest,DestStateName,DepTime,DepDelay,WheelsOff,WheelsOn,ArrTime,ArrDelay,Cancelled,CancellationCode,Diverted,AirTime,Distance
3,1,10/3/2016,AA,N786AA,10721,BOS,Massachusetts,12478,JFK,New York,556,-4,623,703,709,-6,0,,0,40,187
4,2,10/4/2016,AA,N794AA,10721,BOS,Massachusetts,12478,JFK,New York,554,-6,615,703,712,-3,0,,0,48,187
1,6,10/1/2016,AA,N783AA,12478,JFK,New York,12892,LAX,California,823,-7,844,1104,1111,-30,0,,0,320,2475
2,7,10/2/2016,AA,N798AA,12478,JFK,New York,12892,LAX,California,847,17,904,1131,1159,18,0,,0,327,2475
3,1,10/3/2016,AA,N786AA,12478,JFK,New York,12892,LAX,California,825,-5,838,1109,1131,-10,0,,0,331,2475
4,2,10/4/2016,AA,N794AA,12478,JFK,New York,12892,LAX,California,826,-4,848,1114,1132,-9,0,,0,326,2475
Recommended answer
Can you check this?
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.IntSummaryStatistics;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class FlightData {
    public static void main(String[] args) throws IOException {
        Map<String, String[]> map = createMap();
        // Group rows by tail number (index 4) and sum the distances (index 21).
        Map<String, Long> planeMileages = map
                .entrySet()
                .stream()
                .collect(Collectors.groupingBy(o -> o.getValue()[4],
                        Collectors.collectingAndThen(
                                Collectors.summarizingInt(value ->
                                        Integer.parseInt(value.getValue()[21])), IntSummaryStatistics::getSum
                        )
                ));
        String maxKey = planeMileages.entrySet().stream().max(Comparator.comparing(Map.Entry::getValue)).get().getKey();
        System.out.println("max key: " + maxKey);
    }

    // Collectors.toMap without a merge function throws IllegalStateException on a duplicate key.
    public static Map<String, String[]> createMap() throws IOException {
        try (BufferedReader a = Files.newBufferedReader(Paths.get("flights.csv"))) {
            return a.lines().map(s -> s.split(","))
                    .collect(Collectors.toMap(piecesOfInfo -> String.join("_", piecesOfInfo[4], piecesOfInfo[2], piecesOfInfo[11]), Function.identity()));
        }
    }

    public static Map<String, String[]> createMapLastDupWins() throws IOException {
        try (BufferedReader a = Files.newBufferedReader(Paths.get("flights.csv"))) {
            return a.lines().map(s -> s.split(","))
                    .collect(Collectors.toMap(piecesOfInfo -> String.join("_", piecesOfInfo[4], piecesOfInfo[2], piecesOfInfo[11]), Function.identity(), (strings, strings2) -> {
                        // If this variant helps, the data contains duplicate keys; the later row wins.
                        return strings2;
                    }));
        }
    }
}
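If duplicate keys do turn out to be the cause, one option is to skip the intermediate map entirely. The question's key combines tail number, flight date, and departure time, and HashMap.put replaces the earlier row whenever two rows share all three, so that row's distance never reaches the sum. The sketch below is not the answerer's code: the class name TailMileage and the method sumByTail are hypothetical, and it assumes comma-separated rows with the tail number at index 4 and the distance at index 21, as in the question. It also assumes that a row whose distance field is not numeric (for example a header row) can simply be skipped, and the empty CancellationCode field in the sample rows is a guess at the file's layout.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TailMileage {

    // Sums the distance (index 21) per tail number (index 4), one row at a time.
    static Map<String, Long> sumByTail(List<String> csvLines) {
        Map<String, Long> totals = new HashMap<>();
        for (String line : csvLines) {
            // Limit -1 keeps empty fields, so column positions stay stable.
            String[] fields = line.split(",", -1);
            if (fields.length < 22) {
                continue; // assumption: malformed rows are ignored
            }
            long distance;
            try {
                distance = Long.parseLong(fields[21].trim());
            } catch (NumberFormatException e) {
                continue; // assumption: header or non-numeric distance is skipped
            }
            // merge inserts the value if the key is absent, otherwise adds to it.
            totals.merge(fields[4].trim(), distance, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Sample rows modeled on the question's data (layout is an assumption).
        List<String> sample = List.of(
            "3,1,10/3/2016,AA,N786AA,10721,BOS,Massachusetts,12478,JFK,New York,556,-4,623,703,709,-6,0,,0,40,187",
            "1,6,10/1/2016,AA,N783AA,12478,JFK,New York,12892,LAX,California,823,-7,844,1104,1111,-30,0,,0,320,2475",
            "3,1,10/3/2016,AA,N786AA,12478,JFK,New York,12892,LAX,California,825,-5,838,1109,1131,-10,0,,0,331,2475"
        );
        Map<String, Long> totals = sumByTail(sample);
        System.out.println(totals.get("N786AA")); // 187 + 2475 = 2662
        System.out.println(totals.get("N783AA")); // 2475
    }
}
```

To run this against the real file, the sample list could be replaced with Files.readAllLines(Paths.get("flights.csv")). The tail number with the largest total can then be found with Collections.max(totals.entrySet(), Map.Entry.comparingByValue()).getKey(), as in the question.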