读取Excel表格,而不是简单的命名范围 [英] Read Excel Tables, not simple named ranges
问题描述
为避免重复"关闭请求:我知道如何读取Excel命名范围;示例在下面的代码中给出.这是关于Excel中的实际"表.
To avoid "duplicate" close request: I know how to read Excel named ranges; examples are given in the code below. This is about "real" tables in Excel.
Excel2007及更高版本具有表格的有用概念:您可以将范围转换为表格,并避免在排序和重新排列时出现麻烦.在Excel范围内创建表时,它将获得默认名称(德语版本为Tabelle1
,在以下示例中为TableName
),但您可以另外简单地命名表范围(TableAsRangeName
);如Excel范围名称编辑器中的图标所指示,这两者似乎被区别对待.
Excel2007 and later have the useful concept of tables: you can convert ranges to tables, and avoid hassles when sorting and rearranging. When you create a table in an Excel range, it gets a default name (Tabelle1
in German Version, TableName
in the following example), but you can additionally simply name the range of the table (TableAsRangeName
); as indicated by the icons in the Excel range name editor, these two seem to be treated differently.
我无法从R中读取这些表(严格意义上).唯一已知的解决方法是使用CSV中间体,或将表转换为正常的命名范围,当您使用列时,这具有令人讨厌的不可逆副作用单元格引用中的名称;这些将转换为A1表示法.
I have not been able to read these tables (in the strict sense) from R. The only known workaround is using CSV intermediate, or converting the table to a normal named range, which has nasty irreversible side effects when you use column names in cell references; these are converted to A1 notation.
以下示例显示了该问题.您的里程可能会因32/64位ODBC驱动程序和32/64位Java的不同组合而有所不同
The example below shows the problem. You mileage may vary with different combinations of 32/64 bit ODBC drivers and 32/64 bit Java
# Read Excel Tables (not simply named ranges)
# Test Computer: 64 Bit Windows 7, R 32 bit
# My ODBC drivers are 32 bit
library(RODBC)
# Test file has three ranges
# NonTable Simple named range
# TableName Name of table
# TableAsRangeName Named Range covering the above table
sampleFile = "ExcelTables.xlsx"
if (!file.exists(sampleFile)){
download.file("http://www.menne-biomed.de/uni/ExcelTables.xlsx",sampleFile)
# Or do it manually, if this fails
}
# ODBC
channel = odbcConnectExcel2007(sampleFile)
sqlQuery(channel, "SELECT * from NonTable") # Ok
sqlQuery(channel, "SELECT * from TableName") # Could not find range
sqlQuery(channel, "SELECT * from TableAsRangeName") # Could not find range
close(channel)
# gdata has read.xls, but seems not to support named regions
library(xlsx)
wb = loadWorkbook(sampleFile)
getRanges(wb) # This one fails already with "TableName" does not exist
ws = getSheets(wb)[[1]]
readRange("NonTable",ws) # Invalid range address
readRange("TableName",ws) # Invalid range address
readRange("TableAsRangeName",ws) # Invalid range address
# my machine requires 64 bit for this one; depends on your Java installation
sampleFile = "ExcelTables.xlsx"
library(XLConnect) # requires Java
readNamedRegionFromFile(sampleFile,"NonTable") # OK
readNamedRegionFromFile(sampleFile,"TableName") # "TableName" does not exist
readNamedRegionFromFile(sampleFile,"TableAsRangeName") # NullPointerException
wb <- loadWorkbook(sampleFile)
readNamedRegion(wb,"NonTable") # Ok
readNamedRegion(wb,"TableName") # does not exist
readNamedRegion(wb,"TableAsRangeName") # Null Pointer
推荐答案
我在 XLConnect 中添加了对Excel表的一些初始支持.请在 https://github.com/miraisolutions/xlconnect
I've added some initial support for Excel tables in XLConnect. Please find the latest changes on github at https://github.com/miraisolutions/xlconnect
在下面的一个小示例中:
In the following a small sample:
require(XLConnect)
sampleFile = "ExcelTables.xlsx"
wb = loadWorkbook(sampleFile)
readTable(wb, sheet = "ExcelTable", table = "TableName")
请注意,Excel表已与工作表关联.据我所知,可能有多个具有相同名称的表关联到不同的工作表.因此,对于readTable
有一个sheet
参数.
Note that Excel tables are associated to a sheet. So as far as I can see it's possible to have multiple tables with the same name associated to different sheets. For this reason there is a sheet
-argument to readTable
.
这篇关于读取Excel表格,而不是简单的命名范围的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!