Shuffle One Variable Within Group
Question
This question is an extension of the excellent answer provided by Robert Picard here: How to Randomly Assign to Groups of Different Sizes
We have this dataset, which is the same as in the previous question but adds a year variable:
sysuse census, clear
keep state region pop
order state pop region
decode region, gen(reg)
replace reg="NCntrl" if reg=="N Cntrl"
drop region
gen year=20
replace year=30 if _n>15
replace year=40 if _n>35
If I just wanted to randomly reassign reg across all observations (without regard to group), I could use the answer to the previous post:
tempfile orig
save `orig'
keep reg
rename reg reg_new
set seed 234
gen double u = runiform()
sort u reg_new
merge 1:1 _n using `orig', nogen
How would the code be modified so that reg is shuffled, but only within year? For example, there are 15 observations where year==20. These observations should be shuffled separately from the other years.
Recommended answer
Shuffling one variable doesn't require any file choreography. This can probably be shortened:
sysuse auto, clear
set seed 2803
gen double shuffle = runiform()
* example 1: shuffle mpg across all observations
sort shuffle
gen long which = _n
sort mpg
gen mpg_new = mpg[which]
list which mpg*
* example 2: shuffle mpg separately within each level of foreign
bysort foreign (shuffle) : gen long which2 = _n
bysort foreign (mpg) : gen mpg2 = mpg[which2]
list which2 mpg mpg2, sepby(foreign)
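Applied to the dataset in the question, example 2 might look like the sketch below. It is an adaptation, not part of the original answer: the names u, which and reg_new are illustrative, the seed is reused from the question, and sorting on state within year is an assumption made only to get a unique, reproducible order before the random positions are looked up.

```stata
* rebuild the question's dataset
sysuse census, clear
keep state region pop
order state pop region
decode region, gen(reg)
replace reg = "NCntrl" if reg == "N Cntrl"
drop region
gen year = 20
replace year = 30 if _n > 15
replace year = 40 if _n > 35

set seed 234
gen double u = runiform()

* give each observation a random position within its year
bysort year (u) : gen long which = _n

* put each year in a fixed order, then pull reg from the random position;
* under by:, explicit subscripts are relative to the current by-group
bysort year (state) : gen reg_new = reg[which]

* informal check: within each year the region counts should be identical
tabulate year reg
tabulate year reg_new
```

The two tables should agree row by row, since reg_new is only a reordering of reg within each year.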
All that said, I think sample does this so long as you specify the same sample size as the number of observations in the dataset. It's overkill because you get all the variables.