Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: sample selection (-gsample) in stata

From: Steven Samuels
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: sample selection (-gsample) in stata
Date: Wed, 6 Jul 2011 22:05:18 -0500

```
Try this:

****CODE BEGINS****
set seed 95504066
sysuse auto, clear
tab foreign  // stratum variable
sample 12.5, by(foreign) //12.5% (1 in 8) sample in each stratum
tab foreign
gen fwt = 1/.125 //inverse of nominal sampling probability (approximate: stratum sample sizes are whole numbers)
list fwt in 1
svyset _n [pweight=fwt], strata(foreign)
****CODE ENDS****

Steve
sjsamuels@gmail.com

On Jul 6, 2011, at 6:44 PM, Shikha Sinha wrote:

Steve,

Thanks a lot for your valuable comments. You are right, I was
confused. Does Stata have a command to draw a sample with
proportional allocation?

Thanks,
Shikha

On Wed, Jul 6, 2011 at 4:18 PM, Steven Samuels <sjsamuels@gmail.com> wrote:
>
> Shikha, in Excel you have drawn a stratified sample with proportional allocation. -gsample- has drawn a sample of 30 observations per stratum (as should be clear from the -help-).
>
> You are using the term "PPS" without understanding what it means. I've already given you my best advice, so I don't think that I will add anything more.
>
> Steve
> sjsamuels@gmail.com
>
> On Jul 6, 2011, at 5:32 PM, Shikha Sinha wrote:
>
> Thanks, everyone, for the responses. I think two-stage PPS is complex.
> However, to understand one-stage PPS in Stata, I still need your
> input. I did it in Excel, and the results are below:
>
> City            No. of companies   Prob. (no. of companies/1397)   Number to be selected (300*prob.)
> Central                      135                            0.10                                  29
> Copperbelt                   184                            0.13                                  40
> Eastern                      173                            0.12                                  37
> Luapula                      136                            0.10                                  29
> Lusaka                        87                            0.06                                  19
> North Western                130                            0.09                                  28
> Northern                     173                            0.12                                  37
> Southern                     231                            0.17                                  50
> Western                      148                            0.11                                  32
> Total                       1397                            1                                    300
>
> This is what I meant by PPS. From the sampling frame of 1397 companies
> in 9 cities, I want to draw a random sample of 300 companies based on
> PPS. Do you think I am doing it right in Excel?
>
> Next, I tried to generate the same sample in Stata using -gsample-.
>
> bys  City: gen freq= _N
>
> . g pps=freq/1397
>
> . gsample 30 [aw=pps], wor strata( pid)
> (1127 observations deleted)
>
> . tab  Province
>
>           City |      Freq.     Percent        Cum.
> ---------------+-----------------------------------
>        Central |         30       11.11       11.11
>     Copperbelt |         30       11.11       22.22
>        Eastern |         30       11.11       33.33
>        Luapula |         30       11.11       44.44
>         Lusaka |         30       11.11       55.56
>  North Western |         30       11.11       66.67
>       Northern |         30       11.11       77.78
>       Southern |         30       11.11       88.89
>        Western |         30       11.11      100.00
> ---------------+-----------------------------------
>          Total |        270      100.00
>
> The Stata output is different from the Excel output. -gsample- draws 30
> obs from each City, so how can it be based on PPS? Could you suggest
> the right code using -gsample- to reproduce the Excel output? Or can
> I use -samplepps-, and what would the code for that be?
>
> Thanks,
>
> Shikha
>
> On Wed, Jul 6, 2011 at 12:30 PM, Stas Kolenikov <skolenik@gmail.com> wrote:
>> On Tue, Jul 5, 2011 at 4:48 PM, Shikha Sinha <shikha.sinha414@gmail.com> wrote:
>>> -gsample- looks good, but I am still struggling. How do I calculate the
>>> size for -gsample-? I want to select companies from each city and of
>>> each type in each city.
>>
>> -gsample- will only produce appropriate PPS samples if you specify
>> sampling with replacement (which is the approximation you would have
>> to make at the analysis stage, anyway). PPS sampling without
>> replacement is far more complicated, and if the phrase "Rao-Sampford
>> algorithm" does not ring a bell, you will end up with wrong sampling
>> weights.
>>
>> --
>> Stas Kolenikov, also found at http://stas.kolenikov.name
>> Small print: I use this email account for mailing lists only.
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>
```
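
The proportional allocation asked about in the thread can also be drawn with built-in commands only. What follows is a minimal sketch, not code from the thread: it assumes a frame with one row per unit and a stratum variable, and it uses the auto data with `foreign` as a stand-in for the company frame and City. The total sample size `n`, the helper variables `Nh`, `nh`, `u`, and `pw`, and the `fpc()` option are illustrative choices. Unlike `1/.125`, the weight here is the exact inverse selection probability in each stratum, because it is computed from the stratum counts actually used.

```stata
* Sketch: stratified simple random sample with proportional allocation.
* Assumption: auto.dta and -foreign- stand in for the real frame and strata.
clear all
set seed 95504066
sysuse auto, clear

local n = 20                               // hypothetical total sample size

bysort foreign: gen long Nh = _N           // units in each stratum of the frame
gen long nh = round(`n' * Nh / _N)         // proportional allocation, rounded
                                           // (rounded counts need not sum to n)
gen double u = runiform()                  // random sort key
bysort foreign (u): keep if _n <= nh       // SRS without replacement per stratum

gen double pw = Nh / nh                    // inverse of selection probability
svyset _n [pweight = pw], strata(foreign) fpc(Nh)
tab foreign
```

With the auto data this keeps 14 domestic and 6 foreign cars, and the weights sum back to the 74 cars in the frame. For the company frame in the thread, the same steps would use the city variable in place of `foreign` and `local n = 300`; because of rounding, the stratum counts may sum to slightly more or less than 300, as they do in the Excel table.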