
Re: st: --svy & --pweights: problems for median, graphs & regression

From   "Austin Nichols"
Subject   Re: st: --svy & --pweights: problems for median, graphs & regression
Date   Thu, 11 Sep 2008 10:19:33 -0400

On (b): Note also that -eqprhistogram- on SSC allows aweights, as does
-tabulate-, and for proportion in each category, aweights are the same
as pweights. You can make your own histogram with pweights/aweights
with -collapse- as well:

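sysuse auto, clear
ren rep78 category
* weighted proportions from -tabulate-, for comparison
* (run before -collapse-, which drops the weight variable)
tab category [aw=weight]
g one=1
* weighted count in each category
collapse (sum) one [pw=weight], by(category)
keep if category<.
* running sum; its maximum is the total weight
g s=sum(one)
egen Fraction=max(s)
replace Fraction=one/Fraction
sc Fraction category, recast(bar)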
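
As a rough check of the point that aweights and pweights give the same
proportion in each category, something like the following sketch should
work. It is only an illustration on the auto data, with weight standing
in for a sampling weight; the point estimates are expected to match, not
the standard errors:

```stata
sysuse auto, clear
* proportion in each category using pweights
proportion rep78 [pw=weight]
* the same categories from -tabulate- with aweights (shown as percentages)
tab rep78 [aw=weight]
```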

In fact, I can't see why Stata does not allow aweights (or even
pweights) in -histogram- (I believe that was a Wishes/Grumbles item).

On Thu, Sep 11, 2008 at 3:07 AM, Steven Samuels wrote:
> Hafida
> a. Look at -help- for -pctile- and -_pctile-. These take pweights.
> b. -histogram- will take fweights, but not pweights: See
> and preceding
> messages in the thread.
> c. -svy: reg- automatically computes standard errors that are robust to
> heteroskedasticity. Homogeneity of variance is not an assumption for survey
> tests of means or regression coefficients.
> d.  See: to
> compute an adjusted R-square with survey data.  You should consider other
> measures of fit, such as -linktest-.
> e.  No--not if you want to believe the standard errors and tests.  However
> you might be able to test hypotheses about single-outcome survey regressions
> with -suest-.
> To use the survey-enabled programs like -svy: reg- for inference
> (hypothesis tests, confidence intervals, standard errors) you must first
> -svyset- your data. -svyset- will allow you to account for the entire
> sampling design, not just weighting.
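>
> A minimal sketch of point (a) and of the -svyset- step, assuming
> hypothetical variable names that do not appear in this thread: gh (an
> outcome), age (a covariate), wt (the sampling weight), and psu and stratum
> (design variables). Substitute whatever your dataset actually provides:
>
> ```stata
> * weighted quartiles and median of a skewed outcome (results in r(r1)-r(r3))
> _pctile gh [pw=wt], p(25 50 75)
> return list
> * declare the sampling design once, then fit the survey regression
> svyset psu [pw=wt], strata(stratum)
> svy: regress gh age
> ```
>
> If the design has no clusters or strata, the -svyset- line would need to
> be adapted to declare only the weights.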
> For future reference, the appropriate way to refer to Stata commands in the
> list is with hyphens around them: "-manova-", not "--manova".
> -Steve
> On Sep 11, 2008, at 12:59 AM,
> wrote:
>> Hi all,
>> I have 4 continuous DVs (quality of life domains of SF-36: GH, PF, MH, SF)
>> which are moderately inter-correlated (0.5) and two of them have skewed
>> distributions. I originally intended to use --manova & --mvreg until lately
>> when I realised that I'm using a dataset from a survey with over-sampling
>> for participants living in remote areas. The dataset already has a weight
>> variable, so I have to take this into account. To some extent, this has
>> affected the statistical method I had intended to use. So far, I've
>> performed a separate analysis for each DV as I have no idea on how to apply
>> --svy nor --pw to --manova and --mvreg.
>> Some of my concerns are:
>> a. I need descriptives other than mean & proportion, particularly for
>> skewed DVs, for which I think the median or percentiles are more appropriate.
>> While --svy does not support this, is there a way to get the estimates?
>> b. How can I create a histogram for a weighted mean DV? This would help me
>> get a sense of whether the normality assumption for an OLS regression is met.
>> c. When using --regress with --svy to get ANOVA, how is homogeneity of
>> variance assessed?
>> d. I noticed that there is no adjusted R-squared when using --svy, so is
>> it appropriate to build a model using R-squared instead?
>> e. Lastly, if it's not impractical at all, is it still possible to run
>> --mvreg and take weights into account?