Macro that outputs table with testing results of SAS table


Problem

I'm not a very experienced SAS user, but unfortunately the lab where I can access data is restricted to SAS. Also, I don't currently have access to the data since it is only available in the lab, so I've created simulated data for testing.

I need to create a macro that gets the values and dimensions from a PROC MEANS table and performs some tests that check whether or not the top two values from the data make up 90% of the results.

As an example, assume I have panel data that lists firms' revenue, costs, and profits. I've created a table that lists n, sum, mean, median, and std. Now I need to check whether the top two firms make up 90% of the results and, if so, flag whether it is profit, revenue, or costs where they make up 90%.

I'm not sure how to get started.

Here are the steps:

  1. Read the data

  2. Read the PROC MEANS table created, get dimensions, and variables.

  3. Get top two firms in each variable and perform check

  4. Create new table that lists variable, value from read table, largest and second largest, and flag.

  5. Then print table

Simulated data:

https://www.dropbox.com/s/ypmri8s6i8irn8a/dataset.csv?dl=0

PROC MEANS Table

proc import datafile="/folders/myfolders/dataset.csv"
     out=dt
     dbms=csv
     replace;
     getnames=yes;
run;

TITLE "Macro Project Sample";
PROC MEANS data=dt n sum mean median std;
    VAR V1 V2 V3;
RUN;

Desired results:

Variable   Value        Largest    Sec. Largest   Flag
V1         463138.09    9888.09    9847.13
V2         148.92       1.99       1.99
V3         11503375     9999900    1000000        Y
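
As a quick sanity check of the desired results, the share of the top two values in each total can be computed directly. The following is only a minimal sketch with the three rows typed in by hand; the data set name check and the variable share are illustrative assumptions, not part of the original post.

```sas
/* Hypothetical check: the rows are copied by hand from the desired-results table. */
data check;
    length name $8 flag $1;
    input name $ value largest seclargest;
    /* Share of the total (sum) held by the two largest values. */
    share = (largest + seclargest) / value;
    /* Flag only when the top two exceed 90% of the total. */
    if share > 0.9 then flag = 'Y';
    format share percent8.1;
    datalines;
V1 463138.09 9888.09 9847.13
V2 148.92 1.99 1.99
V3 11503375 9999900 1000000
;
run;

proc print data=check noobs;
run;
```

With these numbers the shares are roughly 4.3% for V1, 2.7% for V2, and 95.6% for V3, so only V3 is flagged, which matches the table.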

Solution

At the moment I can't open your simulated dataset, but I can give you some advice that I hope will help.

You can add the n most extreme values of given variables to the output data set by using the IDGROUP option of the 'output out=' statement.

Here is an example using the Charity dataset (run the code at this link to create it: http://support.sas.com/documentation/cdl/en/proc/65145/HTML/default/viewer.htm#p1oii7oi6k9gfxn19hxiiszb70ms.htm)

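/* SUM= with no names stores each total under the analysis variable's own name.   */
/* One IDGROUP per variable: MAX() with several variables orders observations on  */
/* the combined key (like PROC SORT with multiple BY variables), so a single      */
/* IDGROUP would not return each variable's own two largest values.               */
proc means data=Charity;
   var MoneyRaised HoursVolunteered;
   output out=try sum=
      idgroup(max(MoneyRaised) out[2] (MoneyRaised)=max1)
      idgroup(max(HoursVolunteered) out[2] (HoursVolunteered)=max2);
run;

/* Split the single wide row of TRY into one data set per variable, using common names. */
data var1 (keep=name1 _freq_ MoneyRaised max1_1 max1_2
           rename=(MoneyRaised=value max1_1=largest max1_2=seclargest name1=name))
     var2 (keep=name2 _freq_ HoursVolunteered max2_1 max2_2
           rename=(HoursVolunteered=value max2_1=largest max2_2=seclargest name2=name));
   length name1 name2 $4;
   set try;
   name1='VAR1';
   name2='VAR2';
run;

/* Stack the two data sets and flag rows where the top two values exceed 90% of the sum. */
data finalmerge;
   length flag $1;
   set var1 var2;
   if largest+seclargest > value*0.9 then flag='Y';
run;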

In the PROC MEANS step I chose the variables MoneyRaised and HoursVolunteered; you will use your own variables (V1, V2, V3) instead and adjust the rest of the program accordingly.

IDGROUP with MAX outputs the maximum of the variable(s) named in the parentheses; with OUT[2] it outputs two values, namely the largest and the second largest.
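
To make the IDGROUP behaviour concrete, here is a small self-contained sketch on made-up data. The data set demo, its variables firm and revenue, and the output names total, top_rev, and top_firm are all hypothetical and chosen only for illustration. It also shows that an identifying variable (here the firm) can be carried along with the extreme values.

```sas
/* Hypothetical toy data: four firms and their revenue. */
data demo;
    input firm $ revenue;
    datalines;
A 10
B 70
C 25
D 5
;
run;

/* OUT[2] keeps the two observations with the largest revenue.          */
/* Each name after the "=" gets the suffixes _1 (largest) and _2        */
/* (second largest), so this should create top_rev_1, top_rev_2,        */
/* top_firm_1 and top_firm_2 next to the overall sum in TOTAL.          */
proc means data=demo noprint;
    var revenue;
    output out=top2 sum=total
        idgroup(max(revenue) out[2] (revenue firm)=top_rev top_firm);
run;

proc print data=top2 noobs;
run;
```

The output data set has a single row, which is why a reshaping step is needed afterwards if you want one row per variable.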

You must name the output variables. I chose max1 and max2, and SAS automatically appends _1 and _2 for the largest and second-largest values.

All of the output is on a single row, so I use a DATA step that writes two output data sets (data var1 var2), keeping only the variables needed and renaming them to common names so that the data sets can be combined. I also chose a simple naming scheme, as you can see.

Finally, I combine the two data sets (the SET statement stacks them one after the other) and add the flag.
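
Since the question asks for a macro, the steps above could be wrapped up roughly as follows. This is only a sketch under stated assumptions: the macro name top2_check, its parameters, the work data set _stats, and the max1, max2, ... prefixes are hypothetical, and it assumes the input data set has no variables already called name, value, largest, seclargest, or flag.

```sas
/* Hypothetical macro, not from the original answer. */
%macro top2_check(data=, vars=, threshold=0.9, out=top2_result);
    %local i n var;
    %let n = %sysfunc(countw(&vars, %str( )));

    /* One summary row: the sum of each variable plus its two largest values. */
    /* One IDGROUP clause is generated per variable in the list.              */
    proc means data=&data noprint;
        var &vars;
        output out=_stats sum=
            %do i = 1 %to &n;
                %let var = %scan(&vars, &i, %str( ));
                idgroup(max(&var) out[2] (&var)=max&i)
            %end;
        ;
    run;

    /* Turn the single wide row into one row per variable and add the flag. */
    data &out (keep=name value largest seclargest flag);
        length name $32 value largest seclargest 8 flag $1;
        set _stats;
        %do i = 1 %to &n;
            %let var = %scan(&vars, &i, %str( ));
            name       = "&var";
            value      = &var;
            largest    = max&i._1;
            seclargest = max&i._2;
            if largest + seclargest > value * &threshold then flag = 'Y';
            else flag = ' ';
            output;
        %end;
    run;

    proc print data=&out noobs;
    run;
%mend top2_check;

/* Example call using the data set and variables from the question. */
%top2_check(data=dt, vars=V1 V2 V3);
```

Generating the DATA step assignments in a %DO loop avoids writing one output data set per variable by hand, so the same macro should work for any number of numeric variables.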
