Python Pandas dataframe reading exact specified range in an excel sheet

Views: 890 | Posted: 2017/9/4 1:32:09 | Tags: python, excel, pandas


Problem description

I have a lot of different tables (and other unstructured data) in an Excel sheet. I need to create a dataframe from the range 'A3:D20' on 'Sheet2' of the Excel workbook 'data'.

All the examples I come across drill down to sheet level, but none show how to pick out an exact range.

import openpyxl
import pandas as pd

wb = openpyxl.load_workbook('data.xlsx')
sheet = wb.get_sheet_by_name('Sheet2')
range = ['A3':'D20']   #<-- how to specify this?
spots = pd.DataFrame(sheet.range) #what should be the exact syntax for this?

print (spots)

Once I get this, I plan to look up some data in column A and find its corresponding value in column B.

EDIT: I realised that openpyxl takes too long, so I have switched to pandas.read_excel('data.xlsx','Sheet2') instead, and it is much faster, at that stage at least.
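
The read_excel call in the edit above loads the whole sheet. As a rough sketch, the same function can be narrowed to the A3:D20 block with its usecols, skiprows and nrows parameters. The helper name read_block and its arguments are made up for illustration, and header=None assumes the range holds only data rows with no header line.

```python
import pandas as pd


def read_block(path, sheet, usecols, first_row, last_row):
    """Read rows first_row..last_row (1-based, inclusive) of the given columns.

    Hypothetical helper, not part of pandas.
    """
    return pd.read_excel(
        path,
        sheet_name=sheet,
        header=None,                      # every row in the range is data
        usecols=usecols,                  # Excel-style letters, e.g. "A:D"
        skiprows=first_row - 1,           # rows above the range
        nrows=last_row - first_row + 1,   # number of rows in the range
    )


# Example call for the range in the question:
# df = read_block("data.xlsx", "Sheet2", "A:D", 3, 20)
```

With header=None the resulting columns are labelled 0 to 3; if the first row of the range holds column names, passing header=0 instead and reducing nrows by one would be the usual adjustment.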

Edit2: For the time being, I have put my data in just one sheet and removed all other info. I added column names, applied index_col on my leftmost column, and then used wb.loc[], which solves it for me.
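
A minimal sketch of the lookup described in Edit2, assuming the leftmost column holds unique keys. The column names and values below are invented stand-ins for the real sheet, and set_index plays the role that index_col plays in read_excel.

```python
import pandas as pd

# Hypothetical stand-in for the data read from the sheet
df = pd.DataFrame(
    {"name": ["alpha", "beta", "gamma"], "value": [10, 20, 30]}
)

# Index by the leftmost column (similar to index_col=0 in read_excel)
lookup = df.set_index("name")

# Find the key in "column A" and return the matching "column B" value
beta_value = lookup.loc["beta", "value"]
```

If the key column contains duplicates, .loc returns a Series of all matches rather than a single value.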

Solution

One way to do this is to use the openpyxl module.

Here's an example:

from openpyxl import load_workbook

wb = load_workbook(filename='data.xlsx', 
                   read_only=True)

ws = wb['Sheet2']

# Read the cell values into a list of lists
data_rows = []
for row in ws['A3':'D20']:
    data_cols = []
    for cell in row:
        data_cols.append(cell.value)
    data_rows.append(data_cols)

# Transform into dataframe
import pandas as pd
df = pd.DataFrame(data_rows)
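
The same idea can be wrapped in a reusable function. This is only a sketch: range_to_dataframe and its has_header flag are hypothetical names, and it assumes an openpyxl version whose iter_rows accepts values_only (2.6 or later). It uses openpyxl.utils.range_boundaries to turn a string such as 'A3:D20' into row and column limits.

```python
from openpyxl import load_workbook
from openpyxl.utils import range_boundaries
import pandas as pd


def range_to_dataframe(path, sheet_name, cell_range, has_header=False):
    """Load one rectangular range of a worksheet into a DataFrame.

    Hypothetical helper; cell_range is an Excel-style string like "A3:D20".
    """
    min_col, min_row, max_col, max_row = range_boundaries(cell_range)
    wb = load_workbook(filename=path, read_only=True)
    try:
        ws = wb[sheet_name]
        # values_only=True yields tuples of cell values instead of cell objects
        rows = [
            list(row)
            for row in ws.iter_rows(
                min_row=min_row, max_row=max_row,
                min_col=min_col, max_col=max_col,
                values_only=True,
            )
        ]
    finally:
        wb.close()  # read-only workbooks keep the file open until closed
    if has_header and rows:
        # Treat the first row of the range as column names
        return pd.DataFrame(rows[1:], columns=rows[0])
    return pd.DataFrame(rows)


# Example call for the range in the question:
# df = range_to_dataframe("data.xlsx", "Sheet2", "A3:D20")
```

Passing has_header=True is meant for ranges whose first row contains column names; otherwise the columns are numbered from 0.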
