



Re: Re: st: random samples within each of 1,152 categories


From   Olga Gorbachev <olga.gorbachev@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: Re: st: random samples within each of 1,152 categories
Date   Mon, 13 May 2013 13:35:31 -0400

Dear Steve and Maarten,

Thank you for your replies, and sorry I did not write sooner. I receive
Statalist as a digest, so I did not see the replies and did not realize
they had come in.

To answer Steve's question first:

In the example you list, I'd like to sample 34% from the "other" category,
so 33*0.34 = 11.22, i.e. about 11 observations. I'd like this new sample
to match the sample mean of the original "non working" sample.
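
The arithmetic above can be checked directly in Stata. This is only a minimal sketch: the local names n_other and share are illustrative and do not come from the thread, and rounding to the nearest integer is an assumption about how the fractional count should be handled.

```stata
* Illustrative check of the sample-size arithmetic (names are hypothetical)
local n_other = 33      // observations in the "other" category
local share   = 0.34    // fraction to be sampled
display `n_other' * `share'          // fractional target size
display round(`n_other' * `share')   // nearest whole number of observations
```

Because the product is not an integer, whatever sampling command receives it has to round or truncate, so it may be worth rounding explicitly before passing the value on.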

We were able to solve the original issue about which I wrote to the
listserv, but now we are "fighting" with the issue of convergence. We
are trying to match the means in the subsample to those in the main
sample, but so far without success.

Right now, this is the program we are running:

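// res accumulates the difference in means over N0 resampling iterations
local res = 0
local N0 = 1000
di "i = " _c
forv i = 1/`N0' {
    di " `i'" _c
    // start each iteration with an empty weight variable
    cap: drop wtn2
    qui: gen wtn2 = .
    qui: levelsof year, local(years)
    foreach yr of local years {
        // weighted share not working in this year (work used as a 0/1 indicator)
        su work [aw = wt] if year == `yr', meanonly
        local pct = 1 - r(mean)
        // target size: that share of the workers with positive weight
        qui: count if year == `yr' & work & wt > 0
        local n = r(N) * `pct'
        // draw the sample of workers; the result of the draw is stored in smpl
        gsample `n' if year == `yr' & work [aw = wtn], gen(smpl) replace
        // qui: gen smpl`yr' = smpl
        qui: replace wtn2 = wtn * smpl if year == `yr'
    }

    // compare the resampled workers with the non-workers (2009 only)
    su nokid if year == 2009 [aw = wtn2], meanonly
    local meannew = r(mean)
    su nokid if year == 2009 & !work [aw = wt], meanonly
    local meanold = r(mean)
    local res = `res' + `meannew' - `meanold'
}

// average difference over all iterations
di `res' / `N0'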


After 5751 iterations, the mean differences are persistent:


 year       white          ed       nokid        wife        agee        RMSE
 1968   .07075803   .02760528  -.07051057   .10025028  -1.9695697   .64914917
 1969    .0685191    .0043999   .00714388   .07798387  -1.0421818   .44748337
 1970   .06611464   .05476483  -.02358097     .077666  -1.8464169   .76425403
 1971   .06971375   .02524083  -.04641226   .07669907  -1.9877308   .84812203
 1972   .06842085   .00459005  -.01252929   .07209953  -1.4438688   .58143546
 1973   .07875147   -.0065409  -.00719551    .0762982  -.84213075   .76031927
 1974   .07394796   .01265153   .03503028   .06037437   .47233679   .45948809
 1975    .0754228  -.02080965   .04125415   .06711441   1.3878045   .44676919
 1976   .07922582   -.0270845    .0703499   .08149621   3.0252375   1.2009947
 1977   .07757246  -.06248362   .13932287   .05320747    4.381814   1.9495654
 1978    .0712201  -.10770348   .09020478   .07284452   3.3190634   1.0499406
 1979    .0867201  -.11178738   .11253834   .07264209    2.972378   1.5287306
 1980   .07419313  -.03035967   .13589319   .06733552   4.0215276   1.7365936
 1981   .07878431  -.01949136   .17420796   .04241048   5.3660359   2.8373346
 1982    .0829203  -.11727873   .17645938   .05927291   5.4543346   2.3774178
 1983   .07845573  -.10130641   .09725345    .0687734   2.3112865   1.4648557
 1984   .09015502  -.07159415   .09821572    .0418674   4.0170757   2.2326577
 1985   .07475118  -.17578234   .15213582   .06892136   4.8550365   2.4831803
 1986   .09253893  -.20126191   .16269138   .06200215   4.8770742   1.7727842
 1987   .08100237  -.17625041   .14548996   .05864067   3.6014147   1.6212222
 1988   .10134555  -.08595601   .20253243   .08289725   6.7326522   2.8191897
 1989   .08155963  -.10436591   .15625212   .02222005   3.9071876   1.0379599
 1990   .09724568   .03089819    .1811577   .08476095   4.8926164   2.3862564
 1991   .08948172  -.03575608    .2627551   .08514362   8.0915346   3.8331833
 1992   .08865055   -.1055572   .25235049   .10462895   7.4632178   3.3645744
 1993    .0951815  -.07997661   .17046405   .06604482   4.2573245   2.0752016
 1994   .04873715  -.16646878   .07550069   .03458139   .98317415   .39188131
 1995   .06876277  -.13850863   .13320029   .02768267   2.4135662   .66990443
 1996   .00856876  -.21758791   .08564262  -.00698818   1.4966446   .57737936
 1997   .03627838  -.15611398   .15043455   .05398246   1.6478452    1.009571
 1999   .11814375  -.00525869   .02250082   .08790646   .80302795   .41872088
 2001   .08085215   .03209268   .00536218   .03539566   .28864816   .08032318
 2003   .01760212   -.0463809   .07889079   .03968931   3.0012058   2.1627047
 2005   .01684409  -.07215183   .09026966    .0235811   2.3966124   .67002878
 2007   .03959067   .03748774   .09446534   .06242606   2.9837086   .87488072
 2009   .02200718  -.04037616   .05716718   .05698124   2.5813597   1.7555616
Total   .08024411  -.08345948    .0901088   .06647665    2.531596    1.108369
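
One way to check whether differences like these are simulation noise or a systematic gap is to keep every iteration's difference instead of a running sum, and then look at its spread. Below is a minimal sketch on made-up data, not on the data discussed here. The toy variables, the 0.7/0.5/0.3 probabilities, the 30% sampling rate and the seed are all assumptions chosen for illustration, and a simple unweighted random subsample stands in for the weighted gsample draw.

```stata
* Toy illustration only: simulated data, unweighted subsampling
clear
set seed 12345
set obs 1000
gen byte work  = runiform() < 0.7                    // hypothetical work indicator
gen byte nokid = runiform() < cond(work, 0.5, 0.3)   // groups differ by construction

* benchmark mean among non-workers
su nokid if !work, meanonly
local meanold = r(mean)

* store one difference per iteration in a temporary results file
tempname memhold
tempfile diffs
postfile `memhold' iter diff using `diffs'
forvalues i = 1/200 {
    gen double u = runiform()
    su nokid if work & u < 0.3, meanonly   // random 30% subsample of workers
    post `memhold' (`i') (r(mean) - `meanold')
    drop u
}
postclose `memhold'

* mean and spread of the per-iteration differences
use `diffs', clear
su diff
```

In this toy setup the two groups differ by roughly 0.2 by construction, so the per-iteration differences cluster around that value rather than around zero. More iterations only tighten the average around the expected gap; they do not remove it. A gap that stays put while its standard deviation across iterations is small would suggest that the sampling weights, not the number of iterations, are what needs changing.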


