Find max number in .CSV file in Python


Question

I have a .csv file that, when opened in Excel, looks like this:

My code:

myfile = open("/Users/it/Desktop/Python/In-Class Programs/countries.csv", "rb")

countries = []
for item in myfile:
    a = item.split(",")
    countries.append(a)

hdi_list = []
for acountry in countries:
    hdi = acountry[3]

    try:
        hdi_list.append(float(hdi))
    except:
        pass

average = round(sum(hdi_list)/len(hdi_list), 2)
maxNumber = round(max(hdi_list), 2)
minNumber = round(min(hdi_list), 2)

This code works well. However, when I find the max, min, or avg, I need to grab the corresponding country name and print that as well.

How can I change my code to grab the country name for the min, max, and avg as well?

Recommended answer

The following approach is close enough to your implementation that I think it might be useful. However, if you start working with larger or more complicated CSV files, you should look into tools like the standard-library csv module (csv.reader) or pandas (as previously mentioned). They are more robust and efficient with complex .csv data. You could also work with Excel files directly through the xlrd package.
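As a minimal sketch of the csv.reader route mentioned above: the sample data, the column positions (name in column 1, HDI in column 3, matching the indexes used in the question) and the helper name read_hdi_rows are all assumptions made for illustration, since the real file's layout is not shown.

```python
import csv
import io

# Hypothetical sample standing in for countries.csv; the real file's columns
# are not shown in the question, so this layout is an assumption.
SAMPLE = io.StringIO(
    "Rank,Country,Region,HDI\n"
    "1,Country A,North,0.91\n"
    "2,Country B,South,0.45\n"
    "3,Country C,East,n/a\n"
    "4,Country D,West,0.78\n"
)


def read_hdi_rows(fileobj, name_col=1, value_col=3):
    """Return (name, value) pairs, skipping rows whose value is not numeric."""
    rows = []
    for fields in csv.reader(fileobj):
        try:
            rows.append((fields[name_col], float(fields[value_col])))
        except (IndexError, ValueError):
            continue  # header row, short row, or non-numeric value
    return rows


rows = read_hdi_rows(SAMPLE)
```

Unlike a plain split(","), csv.reader also copes with quoted fields that contain commas. With a real file on Python 3, the file object would typically come from open(path, newline="").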

In my opinion, the simplest way to keep country names together with their values is to combine your two for loops. Instead of looping through the data twice and building two separate lists, use a single for loop and build a list of dictionaries holding the relevant data (i.e. country name and HDI). You could also use tuples (as previously mentioned), but I think dictionaries are more explicit.

myfile = open("/Users/it/Desktop/Python/In-Class Programs/countries.csv", "r")

countries = []
for line in myfile:
    fields = line.split(",")
    try:
        # Column 1 holds the country name, column 3 the value of interest.
        value_of_interest = float(fields[3])
    except (IndexError, ValueError):
        continue  # skip the header row and rows without a numeric value
    countries.append(
        {"Country Name": fields[1],
         "Value of Interest": value_of_interest})
myfile.close()

values = [country["Value of Interest"] for country in countries]
ave_value = sum(values) / len(values)
max_value = max(values)
min_value = min(values)

print("Country Average == {0}".format(ave_value))
for country in countries:
    # str.format needs keyword arguments to fill the named placeholders.
    if country["Value of Interest"] == max_value:
        print("Max == {country}:{value}".format(
            country=country["Country Name"], value=country["Value of Interest"]))
    if country["Value of Interest"] == min_value:
        print("Min == {country}:{value}".format(
            country=country["Country Name"], value=country["Value of Interest"]))

Note that this method prints more than one country if several share the same min or max value.
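If only one country per extreme is wanted, one alternative (not part of the approach above) is to pass a key function to the built-in max() and min(), which return the whole record instead of just the number. The records below are hypothetical placeholders in the same dictionary shape, and by_value is a name introduced here for illustration.

```python
# Hypothetical records in the same shape as the dictionaries built above.
countries = [
    {"Country Name": "Country A", "Value of Interest": 0.91},
    {"Country Name": "Country B", "Value of Interest": 0.45},
    {"Country Name": "Country D", "Value of Interest": 0.91},
]


def by_value(country):
    """Key function: compare records by their numeric value."""
    return country["Value of Interest"]


# On ties, max() and min() return the first matching record they encounter.
top = max(countries, key=by_value)
bottom = min(countries, key=by_value)

print("Max == {0}:{1}".format(top["Country Name"], top["Value of Interest"]))
print("Min == {0}:{1}".format(bottom["Country Name"], bottom["Value of Interest"]))
```

The trade-off is that ties are silently dropped: only the first country with the extreme value is reported, so the loop-and-compare version is the better fit when every tied country should be listed.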

If you are dead-set on creating separate lists (like your current implementation), you might consider zip() to connect your lists (by index), where

zip(countries, hdi_list) = [(countries[0], hdi_list[0]), (countries[1], hdi_list[1]), ...]

For example:

# max_value is assumed to be max(hdi_list)
for country in zip(countries, hdi_list):
    if country[1] == max_value:
        print("{0} {1}".format(country[0], country[1]))

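Similar logic applies to the min and average. Note that zip() pairs items by position, so the result is only correct if both lists have exactly one entry per row; in your current code, a row whose HDI fails to convert stays in countries but is skipped in hdi_list, which shifts every pair after it. This method works but is less explicit and more difficult to maintain.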
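A minimal sketch of the zip() variant with the two lists kept in step: both lists are appended to in the same loop iteration, so position i in one always matches position i in the other. The rows, the column positions and the names list are assumptions made for illustration.

```python
# Hypothetical rows, already split on commas; the column layout is assumed.
rows = [
    ["Rank", "Country", "Region", "HDI"],
    ["1", "Country A", "North", "0.91"],
    ["2", "Country B", "South", "0.45"],
    ["3", "Country C", "East", ""],
]

names = []
hdi_list = []
for row in rows:
    try:
        hdi = float(row[3])
    except ValueError:
        continue  # skip the header and rows without a number
    # Append to both lists together so their indexes stay aligned.
    names.append(row[1])
    hdi_list.append(hdi)

max_value = max(hdi_list)
max_countries = [name for name, hdi in zip(names, hdi_list) if hdi == max_value]
print(max_countries)
```

Appending to both lists only after the conversion succeeds is what keeps the pairing valid; a row that is skipped is skipped in both.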
