Statalist



Re: st: RE: problem with getting a 50% bootstrapped sample stratified by treatment group and clustered by patient zip code


From   Woolton Lee <[email protected]>
To   [email protected]
Subject   Re: st: RE: problem with getting a 50% bootstrapped sample stratified by treatment group and clustered by patient zip code
Date   Thu, 7 Jan 2010 16:11:18 -0500

Actually, no: there is nothing wrong with bsample.ado.  The problem
with the code I pasted below is that nsamp tells bsample to draw a
resample larger than the number of clusters in the data.  I am trying
to figure out how to modify the code, using the count command, so that
nsamp reflects the number of clusters in the data rather than the
number of observations.
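
A minimal sketch of that change, assuming the bsampletest data from the
original post below and that bsample accepts a variable as the resample
size, as the original call already does.  It counts clusters with egen
rather than count.  The names onepercluster and nclust are hypothetical
helpers, and the max(1, ...) guard is an added assumption to avoid
requesting zero clusters in a stratum that has only one.

```stata
use bsampletest, clear
scalar prnt = 0.5                        // sampling fraction, as in the original code

* Flag exactly one observation per (treatment group, zip code) pair
egen byte onepercluster = tag(trmtgp pat_zip)

* Number of distinct zip-code clusters within each treatment group
bysort trmtgp: egen nclust = total(onepercluster)

* Resample size per stratum, counted in clusters rather than observations
gen nsamp = max(1, int(nclust*prnt))

* Draw whole zip-code clusters with replacement within each treatment group
bsample nsamp, strata(trmtgp) cluster(pat_zip)
```

With the example data each treatment group has two zip codes, so nsamp
is 1 and every replication keeps one zip code per group.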

On Thu, Jan 7, 2010 at 2:31 PM, Martin Weiss <[email protected]> wrote:
>
> <>
>
> This looks like a forgotten linejoin indicator in line 452 of bsample.ado to
> me...
>
> Currently reads:
>
> ***
>                if _rc {
>                        di as err
>                "resample size must not be greater than number of clusters"
>                        exit 498
>                }
> ***
>
> Should be:
>
> ***
>                if _rc {
>                        di as err ///
>                "resample size must not be greater than number of clusters"
>                        exit 498
>                }
> ***
>
>
> HTH
> Martin
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Woolton Lee
> Sent: Thursday, January 7, 2010 20:15
> To: statalist
> Subject: st: problem with getting a 50% bootstrapped sample stratified by
> treatment group and clustered by patient zip code
>
> Hello,
>
> I have a dataset structured in the following manner,
>
> . input unique_id newgp str2 given_ahaid pat_zip trmtgp dis
>
>      unique_id   newgp   given_a~d   pat_zip   trmtgp   dis
>   1.     1       18001      "A"       1800       1       10
>   2.     2       18001      "B"       1800       1       15
>   3.     3       18001      "C"       1800       1       18
>   4.     4       18002      "A"       1800       1        5
>   5.     5       18002      "B"       1800       1        3
>   6.     6       18011      "A"       1801       1        0
>   7.     7       18011      "C"       1801       1        8
>   8.     8       18011      "D"       1801       1        9
>   9.     9       18011      "E"       1801       1        5
>  10.    10       18012      "B"       1801       1        7
>  11.    11       18012      "C"       1801       1       10
>  12.    12       18012      "D"       1801       1        4
>  13.    13       18012      "E"       1801       1        6
>  14.    14       18013      "D"       1801       1        9
>  15.    15       18013      "E"       1801       1        5
>  16.    16       17001      "A"       1700       0        8
>  17.    17       17001      "B"       1700       0        9
>  18.    18       17001      "C"       1700       0        7
>  19.    19       17002      "A"       1700       0       12
>  20.    20       17002      "B"       1700       0        8
>  21.    21       17011      "A"       1701       0        2
>  22.    22       17011      "C"       1701       0        1
>  23.    23       17011      "D"       1701       0        6
>  24.    24       17011      "E"       1701       0        4
>  25.    25       17012      "B"       1701       0        0
>  26.    26       17012      "C"       1701       0       17
>  27.    27       17012      "D"       1701       0        5
>  28.    28       17012      "E"       1701       0        4
>  29.    29       17013      "D"       1701       0        6
>  30.    30       17013      "E"       1701       0        7
>  31. end
>
> I would like to draw a 50% random sample from this dataset, stratified
> by trmtgp (treatment group) and clustered on patient zip code
> (pat_zip).  I have some simple code to do this, pasted below.
> However, when the code is run I get the error shown after it.
>
> forvalue i=1/5 {
>         clear
>         use bsampletest, clear
>         qui tab trmtgp
>         scalar prnt=0.5                // 50 percent sampling; change the number here
>         qui count if trmtgp==0         // comparison market in beginning year
>         scalar tabgp1=int(r(N)*prnt)
>         qui count if trmtgp==1         // treatment market in beginning year
>         scalar tabgp3=int(r(N)*prnt)
>         *display "tabgp1=" tabgp1, "tabgp3=" tabgp3
>         gen nsamp=cond(trmtgp==0, tabgp1, 0) + cond(trmtgp==1, tabgp3, 0)
>
>         bsample nsamp, strata(trmtgp) cluster(pat_zip)
>         egen mean_dist=mean(dis)
>         display "_N=" _N, "mean_dis=" mean_dis
> }
>
> unrecognized command:  "resample size must not be great invalid command name
> r(199);
>
> Is there any way to get bsample to draw a 50% random sample with
> replacement where it is stratified by treatment group and clustered on
> patient zip code?
>
> Thank you for your help,
>
> Wool
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



