SAS: Define type when importing .xlsx with PROC IMPORT
Problem description
Question: How do I define the type of variables imported from a .xlsx file when using PROC IMPORT?
My work
I am using SAS v9.4. As far as I'm aware, it is vanilla SAS; I do not have SAS/ACCESS etc.
My data looks like this:
ID1 ID2 MONTH YEAR QTR VAR1 VAR2
ABC_1234 1 1 2010 1 869 3988
ABC_1235 12 2 2010 1 639 3144
ABC_1236 13 3 2010 2 698 3714
ABC_1237 45 4 2010 2 630 3213
The procedure I am running is:
proc import out=rawdata
datafile = "c:\rawdata.xlsx"
dbms = xlsx replace;
format ID1 $9. ;
format ID2 $3. ;
format MONTH best2. ;
format YEAR best4. ;
format QTR best1. ;
format VAR1 best3. ;
format VAR2 best4. ;
run;
When I run this step, I get the following log output:
ERROR: You are trying to use the character format $ with the numeric variable ID2 in data set WORK.RAWDATA.
This seems to tell me that SAS assigns the variable type automatically. I want to be able to control it manually, but I cannot find documentation that explains how. INFORMAT, LENGTH, and INPUT statements do not seem to work in PROC IMPORT.
I am using PROC IMPORT because it has been the most reliable way for me to read .xlsx files. Two possible solutions I can think of are 1) convert the .xlsx to .csv and use INFILE in a DATA step, and 2) bring the data in as numeric and convert it to character in a later step. I dislike the first because it requires me to manipulate the data manually, which is a potential source of error (such as leading zeros being removed). I dislike the second because it may also introduce errors unintentionally (again, leading zeros) and adds extra work.
Recommended answer
You can try setting the column type to "Text" in Excel to see whether SAS picks the type up from that. Worth a shot.
If that doesn't work, then unless you use PC Files Server, or have Excel of the same bitness installed on the same machine as the SAS server for direct access to the file, you will need a separate data step to convert the columns.
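proc import
    datafile = "c:\rawdata.xlsx"
    out = _rawdata(rename=(ID2 = _ID2))   /* keep the numeric ID2 under a temporary name */
    dbms = xlsx replace;
run;

data rawdata;
    /* Formats are declared before SET; the $3. format makes the new ID2 a character variable. */
    format ID1 $9. ;
    format ID2 $3. ;
    format MONTH best2. ;
    format YEAR best4. ;
    format QTR best1. ;
    format VAR1 best3. ;
    format VAR2 best4. ;
    set _rawdata;
    ID2 = cats(_ID2);   /* convert the numeric value to character */
    drop _: ;           /* drop every variable whose name starts with an underscore */
run;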
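If ID2 is supposed to carry leading zeros (the concern raised in the question), note that they are already gone once the column has been read as numeric, so cats() cannot bring them back. A minimal sketch, assuming the IDs are meant to be zero-padded to a fixed width of 3 (an assumption, not something the question states), uses put() with the Zw. format instead. The data set names _demo and demo_fixed and the sample values are made up for illustration:

```sas
/* Hypothetical stand-in for the imported data: _ID2 arrived as numeric. */
data _demo;
    input _ID2;
    datalines;
1
12
45
;
run;

data demo_fixed;
    length ID2 $3;            /* declare the character type and length explicitly */
    set _demo;
    ID2 = put(_ID2, z3.);     /* numeric to character, zero-padded to 3 digits */
    drop _: ;                 /* drop the temporary numeric column */
run;
```

Declaring the variable with LENGTH before SET makes the intended type and width explicit rather than leaving them to be inferred from a FORMAT statement.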
If you do have SAS/ACCESS to Excel, you can control the variable types directly, for example with the DBSASTYPE= data set option on the Excel libname engine:
libname myxlsx Excel 'C:\rawdata.xlsx';
data rawdata;
    set myxlsx.'Sheet1$'n(dbsastype=(ID2='CHAR(3)'));
run;
The problem occurs because the xlsx engine in proc import is internal to SAS and is separate from the Excel engine. The Excel engine uses Microsoft Jet or ACE, whereas the xlsx engine uses a proprietary system that does not offer as much control as Microsoft's. Why this is the case, I have no idea.
When proc import is run, SAS tries to guess the type of each column (which you can influence for xls files using the guessingrows option). If it detects only numbers, it assumes a numeric variable. Unfortunately, without SAS/ACCESS to Excel or PC Files Server installed, you cannot control the variable type directly.
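Whichever route is taken, it helps to confirm what type each variable actually ended up with. A minimal sketch, assuming the imported data set is WORK.RAWDATA as in the question:

```sas
/* List the name, type, and length of every variable in WORK.RAWDATA. */
proc contents data=work.rawdata;
run;

/* The same information as a query; libname and memname are stored in uppercase. */
proc sql;
    select name, type, length
    from dictionary.columns
    where libname = 'WORK' and memname = 'RAWDATA';
quit;
```

In the query output the type column reads char or num, which makes it easy to spot a column such as ID2 that was guessed as numeric.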