将空格分隔的文本文件读取到SAS中 [英] Reading space delimited text file into SAS

查看:212
本文介绍了将空格分隔的文本文件读取到SAS中的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下.txt文件:

Mark1[Country1]
type1=1 type2=5 
type1=1.50 EUR type2=21.00 EUR 
Mark2[Country2]
type1=2 type2=1 type3=1 
type1=197.50 EUR type2=201.00 EUR type3= 312.50 EUR
....

我正在尝试将其输入到SAS程序中,以使其看起来像这样:

  Mark  Country   Type  Count   Price

1 Mark1 Country1  type1   1     1.50 
2 Mark1 Country1  type2   5     21.00 
3 Mark1 Country1  type3   NA     NA 
4 Mark2 Country2  type1   2     197.50 
5 Mark2 Country2  type2   2     201.00 
6 Mark2 Country2  type3   1     312.50 

或者也许还有其他东西,但是我需要能够打印双向报告

       Country1   Country2 
Type1    ...        ...  
Type2    ...        ...   
Type3    ...        ...  

但是问题是如何读取这种txt文件:

  1. 读取Mark1 [Country1]并将其分隔为两列Mark和Country;
  2. 保留标记和国家/地区并读取每种类型的信息(+忽略type1 =,可能使用格式),然后将其输入表格中. 也许有一种方法可以使用某种输入模板来实现该要求或简化查询.

解决方案

您有3个名称/值对,但是这些对被分为两行.需要输入创意的异常文本文件. INPUT语句具有行控制功能#,可读取隐式DATA Step循环中的相对将来的行.

示例(执行报告)

从当前行(相对行#1)读取markcountry,使用#2从相对行#2读取count,从相对行#3中读取price.在为给定的mark country输入名称/值之后,执行基于数组的数据透视,将两个变量(countprice)一次转换为分类(type)数据形式. /p>

Proc REPORT生成双向"列表.该清单实际上是一个汇总报告(计数和价格之下的单元格是默认的SUM汇总),但是每个单元格只有一个贡献值,因此SUM是原始的单个值.

data have(keep=Mark Country Type Count Price);
  attrib mark country length=$10;

  infile cards delimiter='[ ]' missover; 

  input mark country;

  input #2 @'type1=' count_1 @'type2=' count_2 @'type3=' count_3;
  input #3 @'type1=' price_1 @'type2=' price_2 @'type3=' price_3;

  array counts count_:;
  array prices price_:;

  do _i_ = 1 to dim(counts);
    Type = cats('type',_i_);
    Count = counts(_i_);
    Price = prices(_i_);
    output;
  end;
datalines;
Mark1[Country1]
type1=1 type2=5 
type1=1.50 EUR type2=21.00 EUR
Mark2[Country2]
type1=2 type2=1 type3=1 
type1=197.50 EUR type2=201.00 EUR type3= 312.50 EUR
;

ods html file='twoway.html';

proc report data=have;
  column type country,(count price);
  define type / group;
  define country / ' ' across;
run;

ods html close;

输出图像

组合汇总

proc means nway data=have noprint;
  class type country;
  var count price;
  output out=stats max(price)=price_max sum(count)=count_sum;
run;

data cells;
  set stats;
  if not missing(price_max) then 
    cell = cats(price_max,'(',count_sum,')');
run;

proc transpose data=cells out=twoway(drop=_name_);
  by type;
  id country;
  var cell;
run;

proc print noobs data=twoway;
run;

I have a following .txt file:

Mark1[Country1]
type1=1 type2=5 
type1=1.50 EUR type2=21.00 EUR 
Mark2[Country2]
type1=2 type2=1 type3=1 
type1=197.50 EUR type2=201.00 EUR type3= 312.50 EUR
....

I am trying to input it in my SAS program, so that it would look something like that:

  Mark  Country   Type  Count   Price

1 Mark1 Country1  type1   1     1.50 
2 Mark1 Country1  type2   5     21.00 
3 Mark1 Country1  type3   NA     NA 
4 Mark2 Country2  type1   2     197.50 
5 Mark2 Country2  type2   2     201.00 
6 Mark2 Country2  type3   1     312.50 

Or maybe something else, but i need it to be possible to print two way report

       Country1   Country2 
Type1    ...        ...  
Type2    ...        ...   
Type3    ...        ...  

But the question is how to read that kind of txt file:

  1. read and separate Mark1[Country1] to two columns Mark and Country;
  2. retain Mark and Country and read info for each Type (+somehow ignoring type1=, maybe using formats) and input it in a table. Maybe there is a way to use some kind of input templates to achive that or nasted queries.

解决方案

You have 3 name/value pairs, but the pairs are split between two rows. An unusual text file requiring creative input. The INPUT statement has a line control feature # to read relative future rows within the implicit DATA Step loop.

Example (Proc REPORT)

Read the mark and country from the current row (relative row #1), the counts from relative row #2 using #2 and the prices from relative row #3. After the name/value inputs are made for a given mark country perform an array based pivot, transposing two variables (count and price) at a time into a categorical (type) data form.

Proc REPORT produces a 'two-way' listing. The listing is actually a summary report (cells under count and price are a default SUM aggregate), but each cell has only one contributing value so the SUM is the original individual value.

data have(keep=Mark Country Type Count Price);
  attrib mark country length=$10;

  infile cards delimiter='[ ]' missover; 

  input mark country;

  input #2 @'type1=' count_1 @'type2=' count_2 @'type3=' count_3;
  input #3 @'type1=' price_1 @'type2=' price_2 @'type3=' price_3;

  array counts count_:;
  array prices price_:;

  do _i_ = 1 to dim(counts);
    Type = cats('type',_i_);
    Count = counts(_i_);
    Price = prices(_i_);
    output;
  end;
datalines;
Mark1[Country1]
type1=1 type2=5 
type1=1.50 EUR type2=21.00 EUR
Mark2[Country2]
type1=2 type2=1 type3=1 
type1=197.50 EUR type2=201.00 EUR type3= 312.50 EUR
;

ods html file='twoway.html';

proc report data=have;
  column type country,(count price);
  define type / group;
  define country / ' ' across;
run;

ods html close;

Output image

Combined aggregation

proc means nway data=have noprint;
  class type country;
  var count price;
  output out=stats max(price)=price_max sum(count)=count_sum;
run;

data cells;
  set stats;
  if not missing(price_max) then 
    cell = cats(price_max,'(',count_sum,')');
run;

proc transpose data=cells out=twoway(drop=_name_);
  by type;
  id country;
  var cell;
run;

proc print noobs data=twoway;
run;

这篇关于将空格分隔的文本文件读取到SAS中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆