
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.



Re: st: How to set calibrated weights


From   Veronica Galassi <veronicagalassi@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: How to set calibrated weights
Date   Sat, 27 Oct 2012 09:09:05 +0100

Sorry Steve, forgive my ignorance!

I managed to send the log file with the problem to the survey organisation.
I will let you know as soon as they get back to me.

Thank you again for your help!

2012/10/26 Steve Samuels <sjsamuels@gmail.com>:
>
> You can't use "//", "/*", "*/" or "///" on the command line.
> Put the code in a do file and log the results.  See the section
> on "log your results" in http://folkesundhed.au.dk/fileadmin/www.folkesundhed.au.dk/uddannelse/stata/introduction/takecare.pdf
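>
> A minimal sketch of such a do-file, assuming the data are already in
> memory and reusing the variable names from the output quoted below
> (the file name "dupcheck" is only a placeholder):
>
> ```stata
> * dupcheck.do: run with "do dupcheck" instead of typing each line
> capture log close
> log using dupcheck, text replace
>
> preserve
> bysort cluster w2_gc_dc: keep if _n == 1   // one row per cluster-stratum pair
> bysort cluster: gen nstrat = _n            // running count of strata within each cluster
> tab nstrat                                 // any value above 1 flags a cluster in several strata
> list cluster w2_gc_dc if nstrat > 1        // show the problem rows
> restore
>
> log close
> ```
>
> Inside a do-file the "//" comments are accepted, and dupcheck.log keeps
> the output in a form that can be sent on.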
>
>
> There won't be much harm in omitting the stratum variable, except slightly larger
> standard errors, as you say.
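>
> If "cluster" really is the PSU, a design statement without strata might
> look like the sketch below. It assumes that w2_wgt is the final
> calibrated weight, which the study documentation should confirm:
>
> ```stata
> * Strata omitted: point estimates are unchanged, standard errors may be somewhat larger
> svyset cluster [pw = w2_wgt]
> svydescribe    // summarise the declared PSUs and observations per PSU
> ```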
>
> Steve
>
>
> On Oct 26, 2012, at 12:27 PM, Veronica Galassi wrote:
>
> I see...thank you for warning me!
>
> I tried to run the code you gave me but I got back a couple of errors.
> Below the output:
>
> preserve
>
> . bys cluster w2_gc_dc: keep if _n==1
> (20701 observations deleted)
>
> . bys cluster: gen nstrat= _n
>
> . tab nstrat  //duplicates are nstrat >1
> / invalid name
> r(198);
>
> . sort cluster
>
> . list cluster w2_gc_dc if nstrat>1 //show problem
> 1/ invalid name
> r(198);
>
> . save problem, replace
> (note: file problem.dta not found)
> file problem.dta saved
>
> . restore
>
>
> I already told the organisation about this oddity when setting
> the PSUs, but they paid no attention to it, claiming that "cluster"
> is the right PSU variable, end of story.
>
> I will try to raise the issue again. I know that not specifying the
> strata gives me standard errors bigger than usual, but maybe this is a
> good compromise.
> What do you think?
>
>
>
> 2012/10/20 Steve Samuels <sjsamuels@gmail.com>:
>> Veronica,
>>
>> The PSU variable is not missing. It is the sampling unit at the first
>> stage of sampling and it's one of your cluster variables, probably
>> "cluster 1" (check). Your statement that one must know the PSU variable
>> to use probability weights is also incorrect. One can get proper
>> weighted estimates, though not standard errors, without knowing the PSU.
>>
>> I'm not sure what is wrong with your -concat- statement. I would have
>> used "egen combination = group()". For it to have worked, the value of
>> the "post-stratification weight" would have to be the population count
>> for each combination of the three variables.
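>>
>> As a sketch only, using the variable names from the original post,
>> -group()- builds one numeric cell identifier, and a quick check shows
>> whether the weight is constant within each cell, as a population count
>> would have to be (the variable name "varies" is a placeholder):
>>
>> ```stata
>> * One numeric id per age-by-race-by-gender cell
>> * (observations with a missing value in any of the three get a missing id)
>> egen combination = group(w2_age_intervals w2_best_race w2_best_gen)
>> tab combination, missing
>>
>> * Flag cells in which w2_wgt takes more than one value
>> bysort combination (w2_wgt): gen byte varies = w2_wgt[1] != w2_wgt[_N]
>> tab varies
>> ```
>>
>> If "varies" is 1 for most cells, the weight is not a per-cell population
>> count and -poststrata()- with -postweight()- would not be appropriate.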
>>
>> If the "post-stratification" weights are not integers, they are probably
>> "calibration" weights that have already adjusted the probability
>> weights. In that case, further post-stratification is likely to be
>> superfluous. You would then use the "post-stratification weight" in place of
>> the probability weights. All weights should be
>> described in the study documents (though usually not the "codebook"). If
>> they are not, then contact the organization that did the study for
>> details.
>>
>> If sampling was without replacement at one or more stages,
>> you could use the fpc() option for those stages. In practice,
>> it makes a difference only for the first stage.
>>
>> In any case, one guess at a -svyset- statement (assuming the
>> "post-stratification weight" is a "calibration" weight) is:
>> *************************************************************
>> svyset w2_gc_prov [pw = w2_wgt], strata(w2_gc_dc) || w2_hhgeo
>> **************************************************************
>>
>> But I could be wrong, depending on how w2_wgt was calculated.
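>>
>> Once a design is declared, it can be inspected and used for descriptive
>> statistics along these lines. This is a sketch under the same guess
>> about w2_wgt, with variable names taken from the original post:
>>
>> ```stata
>> * Declare the guessed two-stage design, then inspect it
>> svyset w2_gc_prov [pw = w2_wgt], strata(w2_gc_dc) || w2_hhgeo
>> svydescribe
>>
>> * Design-based descriptive statistics
>> svy: proportion w2_best_gen
>> svy: tabulate w2_age_intervals w2_best_gen, column
>> ```
>>
>> -svydescribe- reports the number of PSUs per stratum, which is a quick
>> way to spot strata with a single PSU before estimating anything.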
>>
>> Before proceeding, I suggest that you learn more about sampling or take
>> a survey course. I gave some references in:
>> http://www.stata.com/statalist/archive/2012-09/msg01058.html.
>> The Stata survey manual is also a very good resource, though the section on
>> post-stratification is skimpy.
>>
>> Steve
>>
>>
>> On Oct 19, 2012, at 1:57 PM, Veronica Galassi wrote:
>>
>> Dear Statalisters,
>>
>> I am writing to you concerning the application of calibrated weights to
>> my dataset for the computation of descriptive statistics only.
>>
>> The dataset I am working on collects information at household and
>> individual level and comes from a stratified, two-stage clustered
>> sample. The following are the variables I have:
>> - probability weights: w2_dwgt
>> - strata: w2_gc_dc
>> - cluster 1: w2_gc_prov
>> - cluster 2: w2_hhgeo
>> - post-stratified weights: w2_wgt
>> - age intervals:  w2_age_intervals
>> - gender: w2_best_gen
>> - population group: w2_best_race
>>
>> In order to set the probability weights using the command svyset, I
>> need the psu variable. As you may have noticed, this variable is
>> missing, which makes it impossible for me to set pweights.
>> In addition, from a couple of previous statalist conversations (see
>> in particular: http://www.ats.ucla.edu/stat/stata/faq/svy_stata_post.htm
>> and http://www.stata.com/statalist/archive/2012-02/msg00584.html), I
>> understood that:
>> - when using calibrated weights I still have to set pweights and
>> specify the original strata and clusters
>> - in order to apply calibrated weights I need to know the characteristics
>> on the basis of which the sample has been post-stratified (in my case
>> age intervals, gender and population groups).
>>
>> Therefore, I tried to set my post-stratified weights using the
>> following command:
>> "svyset [pw=w2_dwgt], strata (w2_gc_dc) poststrata (w2_age_intervals
>> w2_best_gen w2_best_race) postweight(w2_wgt)"
>> which did not work because in Stata the poststrata must be mutually
>> exclusive and thus only one variable can be specified.
>>
>> In order to overcome this problem, I tried to generate a variable
>> which is a combination of the three characteristics by using the
>> command
>> "egen combination=concat( w2_age_intervals w2_best_race w2_best_gen),
>> format (float)".
>> However, this command generated a variable containing only missing
>> values and for this reason Stata gave me back the error:
>> "option postweight() requires option poststrata()".
>> The only way to make Stata set the post-calibrated weight was by using
>> the command
>> "svyset, poststrata (combination) postweight(w2_wgt)" with combination
>> being a string variable. However, I am afraid that this command is not
>> complete.
>>
>> At this point, I would really appreciate any hint on what I am doing
>> wrong and how to proceed to set my post-stratified weights.
>>
>> Many thanks for your help!
>>
>> Kind regards,
>>
>> Veronica Galassi
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index