
Re: Re: st: random samples within each of 1,152 categories

From: Olga Gorbachev
To: statalist@hsphsun2.harvard.edu
Subject: Re: Re: st: random samples within each of 1,152 categories
Date: Mon, 13 May 2013 13:35:31 -0400

```
Dear Steve and Maarten,

Thank you for your replies, and sorry I did not write sooner. I receive
Statalist as a digest, so I did not notice that replies had arrived.

In the example you list, I'd like to sample 34% from the "other"
category, so 33*0.34 = 11 observations after rounding. I'd like the
mean in this new sample to match the mean of the original "non working"
sample.

We were able to solve the original issue that I wrote to the list
about, but now we are "fighting" with the issue of convergence. We
are trying to match the means in the subsample and the main sample,
but have not been successful.

Right now, this is the program we are running:

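local res = 0
local N0 = 1000
di "i = " _c
forv i = 1/`N0' {
    di " `i'" _c
    * wtn2 holds the subsample weights for this replication
    cap: drop wtn2
    qui: gen wtn2 = .
    qui: levelsof year, local(years)
    foreach yr of local years {
        * weighted share not working in this year (assuming work is 0/1)
        su work [aw = wt] if year == `yr', meanonly
        local pct = 1 - r(mean)
        * target size: that share of the workers with positive weight
        qui: count if year == `yr' & work & wt > 0
        local n = r(N) * `pct'
        * draw from the workers with gsample (user-written command)
        gsample `n' if year == `yr' & work [aw = wtn], gen(smpl) replace
        // qui: gen smpl`yr' = smpl
        qui: replace wtn2 = wtn * smpl if year == `yr'
    }

    * 2009 mean of nokid: subsample versus original non-workers
    su nokid if year == 2009 [aw = wtn2], meanonly
    local meannew = r(mean)
    su nokid if year == 2009 & !work [aw = wt], meanonly
    local meanold = r(mean)
    local res = `res' + `meannew' - `meanold'
}

* average difference over all replications
di `res' / `N0'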

After 5751 iterations, the mean differences are persistent:

year        white          ed       nokid        wife        agee        RMSE
1968    .07075803   .02760528  -.07051057   .10025028  -1.9695697   .64914917
1969     .0685191    .0043999   .00714388   .07798387  -1.0421818   .44748337
1970    .06611464   .05476483  -.02358097     .077666  -1.8464169   .76425403
1971    .06971375   .02524083  -.04641226   .07669907  -1.9877308   .84812203
1972    .06842085   .00459005  -.01252929   .07209953  -1.4438688   .58143546
1973    .07875147   -.0065409  -.00719551    .0762982  -.84213075   .76031927
1974    .07394796   .01265153   .03503028   .06037437   .47233679   .45948809
1975     .0754228  -.02080965   .04125415   .06711441   1.3878045   .44676919
1976    .07922582   -.0270845    .0703499   .08149621   3.0252375   1.2009947
1977    .07757246  -.06248362   .13932287   .05320747    4.381814   1.9495654
1978     .0712201  -.10770348   .09020478   .07284452   3.3190634   1.0499406
1979     .0867201  -.11178738   .11253834   .07264209    2.972378   1.5287306
1980    .07419313  -.03035967   .13589319   .06733552   4.0215276   1.7365936
1981    .07878431  -.01949136   .17420796   .04241048   5.3660359   2.8373346
1982     .0829203  -.11727873   .17645938   .05927291   5.4543346   2.3774178
1983    .07845573  -.10130641   .09725345    .0687734   2.3112865   1.4648557
1984    .09015502  -.07159415   .09821572    .0418674   4.0170757   2.2326577
1985    .07475118  -.17578234   .15213582   .06892136   4.8550365   2.4831803
1986    .09253893  -.20126191   .16269138   .06200215   4.8770742   1.7727842
1987    .08100237  -.17625041   .14548996   .05864067   3.6014147   1.6212222
1988    .10134555  -.08595601   .20253243   .08289725   6.7326522   2.8191897
1989    .08155963  -.10436591   .15625212   .02222005   3.9071876   1.0379599
1990    .09724568   .03089819    .1811577   .08476095   4.8926164   2.3862564
1991    .08948172  -.03575608    .2627551   .08514362   8.0915346   3.8331833
1992    .08865055   -.1055572   .25235049   .10462895   7.4632178   3.3645744
1993     .0951815  -.07997661   .17046405   .06604482   4.2573245   2.0752016
1994    .04873715  -.16646878   .07550069   .03458139   .98317415   .39188131
1995    .06876277  -.13850863   .13320029   .02768267   2.4135662   .66990443
1996    .00856876  -.21758791   .08564262  -.00698818   1.4966446   .57737936
1997    .03627838  -.15611398   .15043455   .05398246   1.6478452    1.009571
1999    .11814375  -.00525869   .02250082   .08790646   .80302795   .41872088
2001    .08085215   .03209268   .00536218   .03539566   .28864816   .08032318
2003    .01760212   -.0463809   .07889079   .03968931   3.0012058   2.1627047
2005    .01684409  -.07215183   .09026966    .0235811   2.3966124   .67002878
2007    .03959067   .03748774   .09446534   .06242606   2.9837086   .87488072
2009    .02200718  -.04037616   .05716718   .05698124   2.5813597   1.7555616
Total   .08024411  -.08345948    .0901088   .06647665    2.531596    1.108369

```
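
One way to check whether a persistent difference like this is Monte Carlo noise or a systematic gap is to track the spread of the per-replication differences as well as their sum. The sketch below is only an illustration under stated assumptions, not the poster's method. The data are simulated, the variable names `work` and `nokid` are borrowed from the post, the macros `target`, `sum`, `sum2`, `mdiff`, and `mcse` are hypothetical, and Stata's built-in `sample` (simple random sampling without replacement, unweighted) stands in for the user-written `gsample`. The 34% fraction from the message is applied with `round()`, as in 33*0.34 = 11.22, which rounds to 11.

```
* Simulated toy data (hypothetical): workers and non-workers differ in nokid
clear
set seed 12345
set obs 2000
gen byte work  = runiform() < 0.66
gen byte nokid = runiform() < cond(work, 0.30, 0.50)

* Target: mean of nokid among non-workers
su nokid if !work, meanonly
local target = r(mean)

* Number of workers to draw in each replication (34% of workers, rounded)
qui count if work
local n = round(r(N) * 0.34)

local N0   = 200
local sum  = 0
local sum2 = 0
forvalues i = 1/`N0' {
    preserve
    qui keep if work
    qui sample `n', count          // simple random sample without replacement
    su nokid, meanonly
    local d = r(mean) - `target'   // difference for this replication
    restore
    local sum  = `sum'  + `d'
    local sum2 = `sum2' + (`d')^2
}

* Mean difference and its Monte Carlo standard error
local mdiff = `sum' / `N0'
local sd    = sqrt((`sum2' - `N0' * (`mdiff')^2) / (`N0' - 1))
local mcse  = `sd' / sqrt(`N0')
di "mean difference = " %9.5f `mdiff' "   Monte Carlo s.e. = " %9.5f `mcse'
```

In this toy setup the workers and non-workers are built to have different means of `nokid`, so the mean difference should settle well away from zero while its standard error shrinks as `N0` grows. If real data show the same pattern, adding replications will not close the gap. The sampling probabilities would have to change instead.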