Shuffle One Variable Within Group


Question

This question extends the excellent answer that Robert Picard gave to an earlier question: How to Randomly Assign to Groups of Different Sizes

We have this dataset, which is the same as in the previous question except that it adds a year variable:

sysuse census, clear
keep state region pop
order state pop region
decode region, gen(reg)
replace reg="NCntrl" if reg=="N Cntrl"
drop region
gen year=20 
replace year=30 if _n>15
replace year=40 if _n>35

If I just wanted to randomly reassign the values of reg across all observations (without regard to group), I could use the answer to the previous question:

tempfile orig
save `orig'
keep reg
rename reg reg_new
set seed 234
gen double u = runiform()
sort u reg_new
merge 1:1 _n  using `orig', nogen

How should the code be modified so that reg is shuffled only within year? For example, there are 15 observations where year==20. These observations should be shuffled separately from those in the other years.

Recommended answer

Shuffling one variable doesn't require any file choreography. The code below can probably be shortened:

sysuse auto, clear 
set seed 2803 

gen double shuffle = runiform() 

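* example 1: shuffle mpg across all observations
* which is each observation's random position
sort shuffle
gen long which = _n
* after sorting on mpg, take the value found at that random position
sort mpg
gen mpg_new = mpg[which]
list which mpg*

* example 2: shuffle mpg separately within each category of foreign
* under by:, _n and the subscript [which2] are relative to the by-group
bysort foreign (shuffle) : gen long which2 = _n
bysort foreign (mpg) : gen mpg2 = mpg[which2]
list which2 mpg mpg2, sepby(foreign)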
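
Applied to the dataset in the question, the second example might look like the sketch below. It is an adaptation, not part of the original answer: the variable names shuffle, which and reg_new and the seed are arbitrary choices, and year simply takes the place of foreign. It relies on by: evaluating explicit subscripts relative to each by-group.

```stata
* Rebuild the dataset from the question
sysuse census, clear
keep state region pop
order state pop region
decode region, gen(reg)
replace reg = "NCntrl" if reg == "N Cntrl"
drop region
gen year = 20
replace year = 30 if _n > 15
replace year = 40 if _n > 35

* Arbitrary seed, chosen only to make the sketch reproducible
set seed 234
gen double shuffle = runiform()

* Random position of each observation within its year
bysort year (shuffle) : gen long which = _n

* Sort reg within year, then take the value at the random position
bysort year (reg) : gen reg_new = reg[which]

* Visual check: both tables should show the same counts in every cell
tabulate reg year
tabulate reg_new year
```

Because which is a permutation of 1 to _N within each year, every value of reg in a given year is used exactly once, so the regions are only rearranged within years and never moved between them.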

All that said, I think sample does this, so long as you specify a sample size equal to the number of observations in the dataset. It's overkill, though, because you get all the variables.
