Calculate mean and std of a variable, in a datastep in SAS


Question

I have a dataset where each observation is a student, and a variable holding each student's test score. I need to standardize these scores like this:

newscore = (oldscore - mean of all scores) / std of all scores

So I am thinking of using a DATA step where I create a new dataset with 'newscore' added for each student. But I don't know how to calculate the mean and std of the entire dataset inside the DATA step. I know I can calculate them with proc means and then type them in manually. But I need to do this many times, and may drop variables and make other changes along the way. So I would like to be able to calculate everything in the same step.

Data example:

__VAR testscore newscore
Student1 5 x
Student2 8 x
Student3 5 x

Code I tried:

data new;
set old;
newscore=(oldscore-(mean of testscore))/(std of testscore)
run;

(Can't post any of the real data, can't remove it from the server)

How can I do this?

Recommended answer

Method 1: An efficient way to solve this is proc stdize. With its default method it centers each variable on its mean and scales it by its standard deviation, so you don't need to calculate the mean and standard deviation yourself.

data have;
input var $ testscore;
cards;
student1 5
student2 8
student3 5
;
run;

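/* Copy testscore into newscore: proc stdize overwrites the variables
   listed in VAR, so the original testscore column stays untouched. */
data have;
set have;
newscore = testscore;
run;

/* Default method (method=std): subtract the mean, divide by the standard deviation. */
proc stdize data=have out=want;
   var newscore;
run;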

Method 2: As you suggested, take the mean and standard deviation from proc means, store their values in macro variables, and use them in the calculation.

proc means data=have;
var testscore;
output out=have1 mean = m stddev=s;
run;

data _null_;
set have1;
call symputx("mean",m);
call symputx("std",s);
run;

data want;
set have;
newscore=(testscore-&mean.)/&std.;
run;
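Since the question asks for a single step, here are two possible sketches that are not part of the original answer. The first uses proc sql, which remerges summary statistics with the detail rows (SAS writes a NOTE about this to the log). The second reads the dataset twice inside one DATA step. The output names want_sql and want_ds and the helper variables _n, _sum, _ss, _mean and _std are made up for illustration. Both keep only var and testscore from have, because Method 1 may already have added a newscore column to it.

```sas
/* Sketch 1: one proc sql step. mean() and std() are computed over all
   rows and remerged onto each row. std() uses the n-1 denominator. */
proc sql;
    create table want_sql as
    select *,
           (testscore - mean(testscore)) / std(testscore) as newscore
    from have(keep=var testscore);
quit;

/* Sketch 2: one DATA step that reads the data twice. */
data want_ds;
    /* First pass: accumulate count, sum and sum of squares. */
    do until (eof1);
        set have(keep=var testscore) end=eof1;
        if not missing(testscore) then do;
            _n + 1;
            _sum + testscore;
            _ss + testscore**2;
        end;
    end;

    _mean = _sum / _n;
    /* Sample standard deviation (n-1 denominator), as proc means reports. */
    _std = sqrt((_ss - _n * _mean**2) / (_n - 1));

    /* Second pass: standardize each score and write the row. */
    do until (eof2);
        set have(keep=var testscore) end=eof2;
        newscore = (testscore - _mean) / _std;
        output;
    end;

    drop _n _sum _ss _mean _std;
    stop;
run;
```

On the three-row example both sketches should give the same newscore values as Method 2. The sum-of-squares shortcut in the second sketch can lose precision when the scores are large relative to their spread, so treat it as a starting point rather than a finished implementation.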

My output:

var           testscore  newscore
student1      5          -0.577350269   
student2      8          1.1547005384   
student3      5          -0.577350269
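As a hand check of these numbers (added here, not part of the original answer), the arithmetic can be written out. It assumes the sample standard deviation with an n-1 denominator, which is what proc means reports by default:

```latex
\begin{aligned}
\bar{x} &= \frac{5 + 8 + 5}{3} = 6 \\
s &= \sqrt{\frac{(5-6)^2 + (8-6)^2 + (5-6)^2}{3-1}} = \sqrt{3} \approx 1.7321 \\
z_{\text{student1}} = z_{\text{student3}} &= \frac{5 - 6}{\sqrt{3}} \approx -0.5774 \\
z_{\text{student2}} &= \frac{8 - 6}{\sqrt{3}} \approx 1.1547
\end{aligned}
```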

Let me know in case of any queries.
