将空格分隔的文本文件读取到SAS中 [英] Reading space delimited text file into SAS
问题描述
我有以下.txt文件:
Mark1[Country1]
type1=1 type2=5
type1=1.50 EUR type2=21.00 EUR
Mark2[Country2]
type1=2 type2=1 type3=1
type1=197.50 EUR type2=201.00 EUR type3= 312.50 EUR
....
我正在尝试将其输入到SAS程序中,以使其看起来像这样:
Mark Country Type Count Price
1 Mark1 Country1 type1 1 1.50
2 Mark1 Country1 type2 5 21.00
3 Mark1 Country1 type3 NA NA
4 Mark2 Country2 type1 2 197.50
5 Mark2 Country2 type2 2 201.00
6 Mark2 Country2 type3 1 312.50
或者也许还有其他东西,但是我需要能够打印双向报告
Country1 Country2
Type1 ... ...
Type2 ... ...
Type3 ... ...
但是问题是如何读取这种txt文件:
- 读取Mark1 [Country1]并将其分隔为两列Mark和Country;
- 保留标记和国家/地区并读取每种类型的信息(+忽略type1 =,可能使用格式),然后将其输入表格中. 也许有一种方法可以使用某种输入模板来实现该要求或简化查询.
您有3个名称/值对,但是这些对被分为两行.需要输入创意的异常文本文件. INPUT
语句具有行控制功能#
,可读取隐式DATA Step循环中的相对将来的行.
示例(执行报告)
从当前行(相对行#1)读取mark
和country
,使用#2
从相对行#2读取count
,从相对行#3
中读取price
.在为给定的mark
country
输入名称/值之后,执行基于数组的数据透视,将两个变量(count
和price
)一次转换为分类(type
)数据形式. /p>
Proc REPORT生成双向"列表.该清单实际上是一个汇总报告(计数和价格之下的单元格是默认的SUM汇总),但是每个单元格只有一个贡献值,因此SUM是原始的单个值.
data have(keep=Mark Country Type Count Price);
attrib mark country length=$10;
infile cards delimiter='[ ]' missover;
input mark country;
input #2 @'type1=' count_1 @'type2=' count_2 @'type3=' count_3;
input #3 @'type1=' price_1 @'type2=' price_2 @'type3=' price_3;
array counts count_:;
array prices price_:;
do _i_ = 1 to dim(counts);
Type = cats('type',_i_);
Count = counts(_i_);
Price = prices(_i_);
output;
end;
datalines;
Mark1[Country1]
type1=1 type2=5
type1=1.50 EUR type2=21.00 EUR
Mark2[Country2]
type1=2 type2=1 type3=1
type1=197.50 EUR type2=201.00 EUR type3= 312.50 EUR
;
ods html file='twoway.html';
proc report data=have;
column type country,(count price);
define type / group;
define country / ' ' across;
run;
ods html close;
输出图像
组合汇总
proc means nway data=have noprint;
class type country;
var count price;
output out=stats max(price)=price_max sum(count)=count_sum;
run;
data cells;
set stats;
if not missing(price_max) then
cell = cats(price_max,'(',count_sum,')');
run;
proc transpose data=cells out=twoway(drop=_name_);
by type;
id country;
var cell;
run;
proc print noobs data=twoway;
run;
I have a following .txt file:
Mark1[Country1]
type1=1 type2=5
type1=1.50 EUR type2=21.00 EUR
Mark2[Country2]
type1=2 type2=1 type3=1
type1=197.50 EUR type2=201.00 EUR type3= 312.50 EUR
....
I am trying to input it in my SAS program, so that it would look something like that:
Mark Country Type Count Price
1 Mark1 Country1 type1 1 1.50
2 Mark1 Country1 type2 5 21.00
3 Mark1 Country1 type3 NA NA
4 Mark2 Country2 type1 2 197.50
5 Mark2 Country2 type2 2 201.00
6 Mark2 Country2 type3 1 312.50
Or maybe something else, but i need it to be possible to print two way report
Country1 Country2
Type1 ... ...
Type2 ... ...
Type3 ... ...
But the question is how to read that kind of txt file:
- read and separate Mark1[Country1] to two columns Mark and Country;
- retain Mark and Country and read info for each Type (+somehow ignoring type1=, maybe using formats) and input it in a table. Maybe there is a way to use some kind of input templates to achive that or nasted queries.
You have 3 name/value pairs, but the pairs are split between two rows. An unusual text file requiring creative input. The INPUT
statement has a line control feature #
to read relative future rows within the implicit DATA Step loop.
Example (Proc REPORT)
Read the mark
and country
from the current row (relative row #1), the count
s from relative row #2 using #2
and the price
s from relative row #3
. After the name/value inputs are made for a given mark
country
perform an array based pivot, transposing two variables (count
and price
) at a time into a categorical (type
) data form.
Proc REPORT produces a 'two-way' listing. The listing is actually a summary report (cells under count and price are a default SUM aggregate), but each cell has only one contributing value so the SUM is the original individual value.
data have(keep=Mark Country Type Count Price);
attrib mark country length=$10;
infile cards delimiter='[ ]' missover;
input mark country;
input #2 @'type1=' count_1 @'type2=' count_2 @'type3=' count_3;
input #3 @'type1=' price_1 @'type2=' price_2 @'type3=' price_3;
array counts count_:;
array prices price_:;
do _i_ = 1 to dim(counts);
Type = cats('type',_i_);
Count = counts(_i_);
Price = prices(_i_);
output;
end;
datalines;
Mark1[Country1]
type1=1 type2=5
type1=1.50 EUR type2=21.00 EUR
Mark2[Country2]
type1=2 type2=1 type3=1
type1=197.50 EUR type2=201.00 EUR type3= 312.50 EUR
;
ods html file='twoway.html';
proc report data=have;
column type country,(count price);
define type / group;
define country / ' ' across;
run;
ods html close;
Output image
Combined aggregation
proc means nway data=have noprint;
class type country;
var count price;
output out=stats max(price)=price_max sum(count)=count_sum;
run;
data cells;
set stats;
if not missing(price_max) then
cell = cats(price_max,'(',count_sum,')');
run;
proc transpose data=cells out=twoway(drop=_name_);
by type;
id country;
var cell;
run;
proc print noobs data=twoway;
run;
这篇关于将空格分隔的文本文件读取到SAS中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!