How to read certain columns from Excel using Pandas - Python
Problem description
I am reading from an Excel sheet and I want to read certain columns: column 0 because it is the row-index, and columns 22:37. Now here is what I do:
import pandas as pd
import numpy as np
file_loc = "path.xlsx"
# read columns 0 through 37
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], parse_cols=37)
# keep column 0 plus everything from column 22 onward
df = pd.concat([df[df.columns[0]], df[df.columns[22:]]], axis=1)
But I hope there is a better way to do this. I know that parse_cols=[0, 22,..,37] would work, but writing out every column number by hand does not make sense for large datasets.
I also did this:
# build a Series whose values are 0, 22, 23, ...
s = pd.Series(0)
s[1] = 22
for i in range(2, 14):
    s[i] = s[i-1] + 1
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], parse_cols=s)
But it reads the first 15 columns, which is the length of s.
You can use column indices (letters) like this:
import pandas as pd
import numpy as np
file_loc = "path.xlsx"
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], usecols="A,C:AA")
print(df)
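If you would rather stay with column numbers, as in the question, the list does not have to be typed out by hand. The sketch below builds it with range. It assumes zero-based positions and that column 37 should be included; build_usecols and read_selected are hypothetical helper names, not part of pandas.

```python
def build_usecols(index_col, start, stop_inclusive):
    """Return zero-based column positions: one index column plus a contiguous block."""
    return [index_col] + list(range(start, stop_inclusive + 1))


def read_selected(path, usecols):
    """Read only the requested columns from an Excel file."""
    # pandas is imported here so build_usecols can be used on its own.
    import pandas as pd

    # usecols accepts a list of zero-based column positions.
    return pd.read_excel(path, index_col=None, na_values=["NA"], usecols=usecols)


# Column 0 plus columns 22 through 37 (assumed inclusive).
cols = build_usecols(0, 22, 37)
print(cols)
```

Calling read_selected("path.xlsx", cols) would then load just those columns.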
Corresponding documentation:

usecols : int, str, list-like, or callable, default None
If None, then parse all columns.
If str, then indicates comma separated list of Excel column letters and column ranges (e.g. "A:E" or "A,C,E:F"). Ranges are inclusive of both sides.
If list of int, then indicates list of column numbers to be parsed.
If list of string, then indicates list of column names to be parsed.
New in version 0.24.0.
If callable, then evaluate each column name against it and parse the column if the callable returns True.
Returns a subset of the columns according to behavior above.
New in version 0.24.0.
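For the string form described above, it can help to convert known column positions into Excel letters instead of counting by hand. The following is a minimal sketch, assuming zero-based positions and an inclusive end column; column_letter and letter_range are hypothetical helpers written for this example, not pandas functions.

```python
def column_letter(position):
    """Convert a zero-based column position to Excel letters (0 -> 'A', 26 -> 'AA')."""
    if position < 0:
        raise ValueError("position must be non-negative")
    letters = ""
    n = position + 1  # Excel letters act like a 1-based bijective base-26 number
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def letter_range(index_col, start, stop_inclusive):
    """Build a usecols string such as 'A,C:AA' from zero-based positions."""
    return f"{column_letter(index_col)},{column_letter(start)}:{column_letter(stop_inclusive)}"


# Column 0 plus columns 22 through 37 (assumed inclusive).
spec = letter_range(0, 22, 37)
print(spec)
```

The resulting string can be passed directly as usecols=spec to pd.read_excel.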