How to read SharePoint Online (Office365) Excel files into Python, specifically pandas, with a Work or School Account?


Question

This question is very similar to an existing one: "How to read SharePoint Online (Office365) Excel files in Python with Work or School Account?"

Essentially, I would like to import an Excel file from SharePoint into pandas for further analysis.

The issue is that when I run the code below, I get the following error.

XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n<!DOCT'

My code:

from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.file import File 

url = 'https://companyname.sharepoint.com/SitePages/Home.aspx'
username = 'fakeaccount@company.com'
password = 'password!'
relative_url = '/Shared%20Documents/Folder%20Number1/Folder%20Number2/Folder3/Folder%20Number%Four/Target_Excel_File_v4.xlsx?d=w8f97c2341898_random_numbers_and_letters_a065c12cbcsf=1&e=KXoU4s'


ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username, password):
  ctx = ClientContext(url, ctx_auth)
  web = ctx.web
  ctx.load(web)
  ctx.execute_query()
  #this gives me a KeyError: 'Title'
  #print("Web title: {0}".format(web.properties['Title']))
  print('Authentication Successful')
else:
  print(ctx_auth.get_last_error())


import io
import pandas as pd

response = File.open_binary(ctx, relative_url)

#save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0) #set file object to start

#read file into pandas dataframe
df = pd.read_excel(bytes_file_obj)

print(df)

Recommended answer

For those of you who ended up at this issue like I did: I found that one has to pass the full URL to File, not just the relative path:

#import all the libraries
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File 
import io
import pandas as pd

#target url taken from sharepoint and credentials
url = 'https://company.sharepoint.com/Shared%20Documents/Folder%20Number1/Folder%20Number2/Folder3/Folder%20Number4/Target_Excel_File_v4.xlsx?cid=_Random_letters_and_numbers-21dbf74c'
username = 'Dumby_account@company.com'
password = 'Password!'

ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username, password):
  ctx = ClientContext(url, ctx_auth)
  web = ctx.web
  ctx.load(web)
  ctx.execute_query()
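  print("Authentication successful")
else:
  #stop here: ctx would be undefined below if authentication failed
  raise RuntimeError(ctx_auth.get_last_error())
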
response = File.open_binary(ctx, url)

#save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0) #set file object to start

#read every sheet of the excel file; sheet_name=None returns a dict of {sheet name: DataFrame}
#(older pandas versions spelled this keyword "sheetname")
df = pd.read_excel(bytes_file_obj, sheet_name=None)
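
The error in the question (`Expected BOF record; found b'\r\n<!DOCT'`) suggests that the bytes handed to pandas were an HTML page rather than a workbook, which is what one would expect if SharePoint answered with a login or error page. A minimal sketch for checking the downloaded bytes before calling `pd.read_excel` is shown below. The helper names `sniff_payload` and `to_excel_stream` are hypothetical and not part of pandas or the Office365 client; the check relies only on the well-known leading bytes of ZIP (.xlsx) and OLE2 (.xls) files.

```python
import io

# Leading bytes ("magic numbers") of the two Excel container formats.
XLSX_MAGIC = b"PK\x03\x04"                       # .xlsx is a ZIP archive
XLS_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls is an OLE2 compound file


def sniff_payload(content: bytes) -> str:
    """Classify downloaded bytes as 'xlsx', 'xls', 'html', or 'unknown'."""
    if content.startswith(XLSX_MAGIC):
        return "xlsx"
    if content.startswith(XLS_MAGIC):
        return "xls"
    # HTML pages may start with whitespace, so strip it before comparing.
    head = content[:512].lstrip().lower()
    if head.startswith(b"<!doctype") or head.startswith(b"<html"):
        return "html"
    return "unknown"


def to_excel_stream(content: bytes) -> io.BytesIO:
    """Return a BytesIO suitable for pd.read_excel, or raise if the payload is not Excel."""
    kind = sniff_payload(content)
    if kind not in ("xlsx", "xls"):
        raise ValueError(
            f"Expected an Excel file, got {kind} content starting with {content[:40]!r}"
        )
    return io.BytesIO(content)
```

In the answer's code this could be used as `bytes_file_obj = to_excel_stream(response.content)`, so that an HTML response fails with a readable message instead of an `XLRDError`.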
