
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.



[no subject]



To conclude, thank you very much for all your suggestions!!!

Cheers,

Veronica


2012/10/20 Steve Samuels <sjsamuels@gmail.com>:
> Veronica,
>
> The PSU variable is not missing. It is the sampling unit at the first
> stage of sampling and it's one of your cluster variables, probably
> "cluster 1" (check). Your statement that one must know the PSU variable
> to use probability weights is also incorrect. One can get proper
> weighted estimates, though not standard errors, without knowing the PSU.
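>
> As a rough sketch of that point (the variable names are those listed in
> the original post, and w2_best_gen is assumed to be numeric), point
> estimates need only the weights:
>
> ```stata
> * Weights only: no strata or clusters are declared here.
> * The proportions are properly weighted, but the standard errors
> * ignore the stratification and clustering, so do not rely on them.
> proportion w2_best_gen [pweight = w2_dwgt]
> ```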
>
> I'm not sure what's wrong with your -concat- statement. I would have
> used "egen combination = group()". For the postweight() option to have
> worked, the value of the "post-stratification weight" would have to be
> the population count for each combination of the three variables.
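>
> A minimal sketch of the -egen group()- approach, assuming the three
> variables from the original post are in memory. The name "varies" is
> hypothetical, used only to check whether w2_wgt is constant within each
> combination, as a population count would have to be:
>
> ```stata
> * One numeric ID per observed age x gender x race combination.
> * Observations missing any of the three variables get a missing ID.
> egen combination = group(w2_age_intervals w2_best_gen w2_best_race), label
> tabulate combination, missing
>
> * Flag combinations in which the weight takes more than one value.
> bysort combination (w2_wgt): generate byte varies = w2_wgt[1] != w2_wgt[_N]
> tabulate varies
> ```
>
> If "varies" equals 1 anywhere, w2_wgt cannot be a per-cell population
> count, which points toward the calibration-weight reading below.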
>
> If the "post-stratification" weights are not integers, they are probably
> "calibration" weights that have already adjusted the probability
> weights. In that case, further post-stratification is likely to be
> superfluous. You would then use the "post-stratification weight" in place
> of the probability weights. All weights should be
> described in the study documents (though usually not the "codebook"). If
> they are not, then contact the organization that did the study for
> details.
>
> If sampling was without replacement at one or more stages,
> you could use the fpc() option for those stages. In practice,
> it makes a difference only for the first stage.
>
> In any case, one guess at a -svyset- statement (assuming the
> "post-stratification weight" is a "calibration" weight) is:
> *************************************************************
> svyset w2_gc_prov [pw = w2_wgt], strata(w2_gc_dc) || w2_hhgeo
> **************************************************************
>
> But I could be wrong, depending on how w2_wgt was calculated.
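>
> If that guess holds, descriptive statistics could then be run along these
> lines. This is only a sketch: it assumes w2_best_gen and w2_best_race are
> numeric categorical variables, and the design itself remains a guess
> until it is checked against the study documents:
>
> ```stata
> * Declare the design guessed above, then inspect what Stata recorded.
> svyset w2_gc_prov [pw = w2_wgt], strata(w2_gc_dc) || w2_hhgeo
> svydescribe
>
> * Design-based descriptive statistics.
> svy: proportion w2_best_gen
> svy: tabulate w2_best_race w2_best_gen, row
> ```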
>
> Before proceeding, I suggest that you learn more about sampling or take
> a survey course. I gave some references in:
> http://www.stata.com/statalist/archive/2012-09/msg01058.html.
> The Stata survey manual is also a very good resource, though the section on
> post-stratification is skimpy.
>
> Steve
>
>
> On Oct 19, 2012, at 1:57 PM, Veronica Galassi wrote:
>
> Dear Statalisters,
>
> I am writing to you concerning the application of calibrated weights to
> my dataset for the computation of descriptive statistics only.
>
> The dataset I am working on collects information at the household and
> individual levels and comes from a stratified, two-stage clustered
> sample. The following are the variables I have:
> - probability weights: w2_dwgt
> - strata: w2_gc_dc
> - cluster 1: w2_gc_prov
> - cluster 2: w2_hhgeo
> - post-stratified weights: w2_wgt
> - age intervals:  w2_age_intervals
> - gender: w2_best_gen
> - population group: w2_best_race
>
> In order to set the probability weights using the command svyset, I
> need the psu variable. As you may have noticed, this variable is
> missing, which makes it impossible for me to set pweights.
> In addition, from a couple of previous Statalist conversations (see
> in particular: http://www.ats.ucla.edu/stat/stata/faq/svy_stata_post.htm
> and http://www.stata.com/statalist/archive/2012-02/msg00584.html), I
> understood that:
> - when using calibrated weights I still have to set pweights and
> specify the original strata and clusters
> - in order to apply calibrated weights I need to know the characteristics
> on the basis of which the sample has been post-stratified (in my case
> age intervals, gender and population groups).
>
> Therefore, I tried to set my post-stratified weights using the
> following command:
> "svyset [pw=w2_dwgt], strata (w2_gc_dc) poststrata (w2_age_intervals
> w2_best_gen w2_best_race) postweight(w2_wgt)"
> which did not work because in Stata the poststrata must be mutually
> exclusive and thus only one variable can be specified.
>
> In order to overcome this problem, I tried to generate a variable
> which is a combination of the three characteristics by using the
> command
> "egen combination=concat( w2_age_intervals w2_best_race w2_best_gen),
> format (float)".
> However, this command generated a variable containing only missing
> values and for this reason Stata gave me back the error:
> "option postweight() requires option poststrata()".
> The only way to make Stata set the post-calibrated weight was by using
> the command
> "svyset, poststrata (combination) postweight(w2_wgt)" with combination
> being a string variable. However, I am worried that this command is
> incomplete.
>
> At this point, I would really appreciate any hint on what I am doing
> wrong and how to proceed to set my post-stratified weights.
>
> Many thanks for your help!
>
> Kind regards,
>
> Veronica Galassi
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

