
st: BSQREG and WEIGHTS


From   "Cruces,GA (pgr)" <G.A.Cruces@lse.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: BSQREG and WEIGHTS
Date   Mon, 16 Dec 2002 21:20:11 -0000

Dear All,

I would like to run a quantile regression with bootstrapped standard
errors with a household survey dataset that has weights. However, the
BSQREG command does not allow weights. 

The weights are integers (say, between 40 and 2600) such that adding
them up gives the total population of each city (though I'm working with
one city only right now - no PSUs or clusters). 

The only reference I could find on Statalist was the email from W. Gould
copied below - however, I can't figure out the correct way to implement
his suggestions in the quantile regression case. I emailed Stata support
and they only replied that implementing weights in BSQREG was not in
their plans, and that the email below could be useful...

Has anyone encountered similar problems? I went into the BSQREG code but
it was too obscure for me. Any help will be greatly appreciated! 

Thank you very much

Best regards,

Guillermo Cruces
STICERD-LSE



-----Original Message-----
January 2002
Carlo Fiorio <c.fiorio@lse.ac.uk> writes, 

> I need to extract a set of bootstrap samples out of a survey database.
> Observations do not have the same probability of inclusion in the 
> sample since each observation comes with a probability weight attached 
> (inverse of probability of inclusion, normalized to sum to N). There 
> is no way to obtain information regarding clusters or strata.
>
> Does anyone have any suggestion on how to perform such a bootstrap
> resampling?

One way to sample with weights is to expand the dataset to include more
than _N observations and then sample _N observations from that.  For
instance, if all the weights were 1 or 2 (probability of inclusion p or
p/2), then one could make an unweighted dataset in which each observation
with weight 2 was included twice.
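
As a toy illustration of the expansion idea, here is a minimal sketch. The three observations and the variable names id and w are made up for this example and are not from Carlo's data. It shows -expand- duplicating only the observation with weight 2:

```stata
* Hypothetical three-observation dataset with weights of 1 or 2
clear
input id w
1 1
2 2
3 1
end

* Each observation is kept w times, so id 2 now appears twice
expand w
sort id
list id w
count          // 4 observations: 1 + 2 + 1
```

Sampling without weights from the expanded data then picks id 2 with twice the probability of id 1 or id 3.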

In Carlo's case, let's assume his dataset has weight variable W obtained
from 

        . gen W = 1/P 

where P is the probability of inclusion.  Carlo mentions that he has
further normalized W to sum to _N (which is fine and generally a good
idea), but for what I write below, I do not care whether or not the
variable is normalized.

Let's find the smallest value of W:

        . summarize W

Having that, I am now going to renormalize W so that the smallest value
is 1:

        . replace W = W/r(min)

and I am going to make a rounded-to-integer version of W:

        . gen Wint = round(W, 1)

Now I have to ask myself whether Wint is an adequate approximation to W.
If it is, then I can expand my dataset using Wint:

        . expand Wint 
        . save tosamplefrom

I can now use dataset tosamplefrom to draw bootstrap samples.  When I
draw bootstrap samples, I do not type -bsample- but instead type 

        . bsample #

where # is the original number of observations in the dataset.
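
For the quantile regression case raised at the top of this thread, the draw step could be wrapped in a loop along the following lines. This is only a sketch under stated assumptions: y, x, and W are simulated stand-ins for the survey variables, the median and the 100 replications are arbitrary choices, and the loop is a hand-rolled bootstrap, not a description of what -bsqreg- does internally.

```stata
* Sketch only: y, x and W are simulated stand-ins (assumptions, not real data)
clear
set seed 12345
set obs 200
gen x = rnormal()
gen y = 1 + 2*x + rnormal()
gen W = 1 + 3*runiform()              // hypothetical probability weights

* Renormalize so the smallest weight is 1, then round to an integer
summarize W, meanonly
replace W = W/r(min)
gen Wint = round(W, 1)

* Remember the original sample size before expanding
local N0 = _N
expand Wint
tempfile tosamplefrom
save `tosamplefrom'

* Draw N0 observations from the expanded data, refit, and store coefficients
tempname sim
tempfile results
postfile `sim' b_x b_cons using `results'
forvalues r = 1/100 {
    use `tosamplefrom', clear
    bsample `N0'
    quietly qreg y x, quantile(0.5)
    post `sim' (_b[x]) (_b[_cons])
}
postclose `sim'

* The standard deviation of each column estimates the bootstrap standard error
use `results', clear
summarize b_x b_cons
```

Because the block relies on local macros and temporary files, it should be run as a whole from a do-file rather than typed line by line.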

If Wint is not an adequate approximation to W, drop Wint and rescale W 
by multiplication by a constant.  For instance, I might type 

        . drop Wint 
        . replace W = 2*W
        . gen Wint = round(W, 1)

Of course, I could just make Wint a perfect approximation to W by
multiplying by a large enough number, but then the tosamplefrom dataset
might be very large.
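
One possible way to weigh that tradeoff before expanding is to tabulate, for a few candidate multipliers, how far the rounded weights stray from W and how large the expanded dataset would become. The sketch below uses simulated inclusion probabilities P, and the multipliers 1, 2, 5, and 10 and the names Wk and relerr are arbitrary choices made for illustration:

```stata
* Sketch only: P is simulated, standing in for real inclusion probabilities
clear
set seed 2002
set obs 200
gen P = 0.1 + 0.4*runiform()
gen W = 1/P

* Renormalize so the smallest weight is 1
summarize W, meanonly
replace W = W/r(min)

* For each candidate multiplier k, compare round(k*W)/k with W
foreach k of numlist 1 2 5 10 {
    gen Wk = round(`k'*W, 1)
    gen double relerr = abs(Wk/`k' - W)/W
    quietly summarize relerr
    local maxerr = r(max)
    quietly summarize Wk, meanonly    // r(sum) = size of the expanded dataset
    display "multiplier `k': max relative error = " %6.4f `maxerr' ///
        "   expanded N = " r(sum)
    drop Wk relerr
}
```

A multiplier would then be picked at the point where the error is tolerable and the expanded N still fits comfortably in memory.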

-- Bill
wgould@stata.com


