Read in .xlsx with csv module in Python
Question
I'm trying to read an Excel file in .xlsx format with the csv module, but I'm not having any luck, even with my dialect and encoding specified. Below, I show my attempts and the errors I got with each encoding I tried. If anyone could point me to the correct encoding, syntax or module for reading a .xlsx file in Python, I'd appreciate it.
With the code below, I get the following error: _csv.Error: line contains NULL byte
#!/usr/bin/python
import sys, csv

with open('filelocation.xlsx', "r+", encoding="Latin1") as inputFile:
    csvReader = csv.reader(inputFile, dialect='excel')
    for row in csvReader:
        print(row)
With the code below, I get the following error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcc in position 16: invalid continuation byte
#!/usr/bin/python
import sys, csv

# The reported error comes from the utf-8 codec, so the file is opened as utf-8 here.
with open('filelocation.xlsx', "r+", encoding="utf-8") as inputFile:
    csvReader = csv.reader(inputFile, dialect='excel')
    for row in csvReader:
        print(row)
When I use utf-16 as the encoding, I get the following error: UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 570-571: illegal UTF-16 surrogate
Recommended answer
You cannot use Python's csv library to read xlsx-formatted files. You need to install and use a different library. For example, you could use openpyxl as follows:
import openpyxl

wb = openpyxl.load_workbook("filelocation.xlsx")
ws = wb.active

for row in ws.iter_rows(values_only=True):
    print(row)
This displays every row in the active sheet as a tuple of cell values. The Python Excel website gives other possible examples.
Alternatively, you could create a list of rows:
import openpyxl
wb = openpyxl.load_workbook("filelocation.xlsx")
ws = wb.active
data = list(ws.iter_rows(values_only=True))
print(data)
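If the end goal is still CSV data, the rows read from the worksheet can be written out with the csv module afterwards. The following is only a minimal sketch: the helper name write_rows_to_csv and the output file name in the usage comment are hypothetical, and the function accepts any iterable of row tuples, such as the one returned by ws.iter_rows(values_only=True).

```python
import csv


def write_rows_to_csv(rows, csv_path):
    """Write an iterable of row tuples to csv_path and return the row count."""
    count = 0
    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file, dialect="excel")
        for row in rows:
            # csv.writer writes None (an empty cell) as an empty string.
            writer.writerow(row)
            count += 1
    return count


# Hypothetical usage with the worksheet loaded above:
#   write_rows_to_csv(ws.iter_rows(values_only=True), "filelocation.csv")
```

The resulting file is plain delimited text, so it can then be read back with csv.reader in the usual way.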
Note: If you are using the older Excel format .xls, you could instead use the xlrd library. xlrd no longer supports the .xlsx format, though (support was removed in version 2.0).
import xlrd

# xlrd reads the older .xls format, so the file name uses that extension.
workbook = xlrd.open_workbook("filelocation.xls")
sheet = workbook.sheet_by_index(0)
data = [sheet.row_values(rowx) for rowx in range(sheet.nrows)]
print(data)
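As background on the errors in the question: an .xlsx file is not text but a ZIP archive of XML parts, so its bytes (including NULL bytes) cannot be decoded or parsed as delimited lines, whatever encoding is chosen. A rough way to check what kind of file you have before picking a reader is sketched below using only the standard library. The helper name is_ooxml_package is hypothetical, and the check is deliberately loose: it also matches other Office Open XML files such as .docx.

```python
import zipfile


def is_ooxml_package(path):
    """Return True if path is a ZIP archive containing '[Content_Types].xml'."""
    # Plain text files such as real CSVs are not ZIP archives.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        # Office Open XML packages carry this part at the archive root.
        return "[Content_Types].xml" in archive.namelist()
```

If this returns True for a spreadsheet, openpyxl is the appropriate reader; if it returns False and the file opens as readable text, the csv module is likely the right tool.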