Extracting Hyperlinks From Excel (.xlsx) with Python

Question

I have been looking mostly at the xlrd and openpyxl libraries for Excel file manipulation. However, xlrd currently does not support formatting_info=True for .xlsx files, so I cannot use xlrd's hyperlink_map. So I turned to openpyxl, but have had no luck extracting a hyperlink from an Excel file with it either. Test code below (the test file contains a simple hyperlink to Google with the hyperlink text set to "test"):

import openpyxl

wb = openpyxl.load_workbook('testFile.xlsx')
ws = wb.get_sheet_by_name('Sheet1')

# Row and column of the cell that holds the hyperlink
# (the openpyxl release used here indexed cells from 0).
r = 0
c = 0

cell = ws.cell(row=r, column=c)
print(cell.value)
print(cell.hyperlink)
print(cell.hyperlink_rel_id)

Output:

test

None

I guess openpyxl does not currently support formatting completely either? Is there some other library I can use to extract hyperlink information from Excel (.xlsx) files?

Recommended answer

In my experience, getting good .xlsx interaction requires moving to IronPython. This lets you work with the Common Language Runtime (clr) and interact directly with Excel:

http://ironpython.net/

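import clr
clr.AddReference("Microsoft.Office.Interop.Excel")
import Microsoft.Office.Interop.Excel as Excel

# Start an Excel instance through COM interop (requires Excel to be installed).
excel = Excel.ApplicationClass()

wb = excel.Workbooks.Open('testFile.xlsx')
ws = wb.Worksheets['Sheet1']

# Excel rows and columns are 1-based; A1 is used here as an example.
row, col = 1, 1
address = ws.Cells(row, col).Hyperlinks.Item(1).Address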
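
If running Excel through IronPython is not an option, the link targets can also be read straight from the file, since an .xlsx file is a zip archive of XML parts. The following is only a minimal sketch using the Python standard library, not a replacement for a full Excel library. It assumes the worksheet is stored as xl/worksheets/sheet1.xml (the usual name for the first sheet, but not guaranteed) and only handles external links that are stored through a relationship id. The function name extract_hyperlinks and its sheet_part parameter are made up for this illustration.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the Office Open XML spreadsheet format.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
DOC_REL_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def extract_hyperlinks(xlsx_path, sheet_part="xl/worksheets/sheet1.xml"):
    """Return {cell_ref: url} for external hyperlinks in one worksheet part.

    sheet_part is an assumption: the first sheet is usually stored under
    this name, but a workbook is free to use a different one.
    """
    with zipfile.ZipFile(xlsx_path) as zf:
        sheet_root = ET.fromstring(zf.read(sheet_part))

        # The relationship file sits in a "_rels" folder next to the sheet part.
        folder, name = posixpath.split(sheet_part)
        rels_part = posixpath.join(folder, "_rels", name + ".rels")

        # Map relationship ids (e.g. "rId1") to their targets (the URLs).
        targets = {}
        if rels_part in zf.namelist():
            rels_root = ET.fromstring(zf.read(rels_part))
            for rel in rels_root.iter("{%s}Relationship" % PKG_REL_NS):
                targets[rel.get("Id")] = rel.get("Target")

    # Each <hyperlink> element names a cell (ref) and a relationship id (r:id).
    links = {}
    for link in sheet_root.iter("{%s}hyperlink" % MAIN_NS):
        rel_id = link.get("{%s}id" % DOC_REL_NS)
        if rel_id in targets:
            links[link.get("ref")] = targets[rel_id]
    return links

# Example call (file name taken from the question):
# extract_hyperlinks("testFile.xlsx")
```

Links that point inside the workbook carry a location attribute instead of a relationship id, so this sketch skips them. Links produced by the HYPERLINK() worksheet function are stored as formulas and are not covered either.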
