AW: st: pweight + aweight, double weights

From	Jochen Späth <[email protected]>
To	<[email protected]>
Subject	AW: st: pweight + aweight, double weights
Date	Thu, 5 Aug 2010 10:40:59 +0200

Hi Steve,

Thanks for your little program. What I do not understand is your statement that with a "probability weighted mean of the individual growth rates" I "would wind up with the rate based on the probability-weighted aggregated sums". Consider this example:

**************************CODE BEGINS**************************
sysuse auto, clear
gen length_2 = displacement
rename length length_1
rename trunk pw

* Look up the pweighted sums of length_1 and length_2 for foreign and domestic cars:

table foreign [pw= pw], c(sum length_1 sum length_2)

* Compute the growth rates based on the aggregate sums of length_1 and length_2:

di "domestic:" (311319 - 270137 ) / 270137 
di "foreign:"  (155268 - 235051) / 235051

* Do a pweighted mean of the individual growth rates with pw = initial value x pweight:
cap drop rate
gen rate = (length_2 - length_1) / length_1
table foreign [pweight = length_1 * pw], c(mean rate)
***************************CODE ENDS***************************

Jochen
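
For reference, the algebra behind the quoted statement can be written out as a short derivation. This is only a sketch: the symbols p_i, x_{1i}, x_{2i}, r_i and w_i are notation introduced here for illustration and do not appear in the thread, and it assumes every initial value is nonzero.

```latex
% Notation (introduced here for illustration, not taken from the thread):
%   p_i     probability weight of unit i          (pw in the code above)
%   x_{1i}  initial value of unit i               (length_1)
%   x_{2i}  final value of unit i                 (length_2)
%   r_i     individual growth rate of unit i      (rate)
%   w_i     combined weight, initial value x probability weight
% Assumes x_{1i} \neq 0 for every i and \sum_i p_i x_{1i} \neq 0.
\begin{align*}
r_i &= \frac{x_{2i} - x_{1i}}{x_{1i}}, \qquad w_i = p_i\, x_{1i} \\[4pt]
\frac{\sum_i w_i r_i}{\sum_i w_i}
  &= \frac{\sum_i p_i x_{1i} \cdot \dfrac{x_{2i} - x_{1i}}{x_{1i}}}{\sum_i p_i x_{1i}}
   = \frac{\sum_i p_i x_{2i} - \sum_i p_i x_{1i}}{\sum_i p_i x_{1i}}
   = R - 1,
\qquad R = \frac{\sum_i p_i x_{2i}}{\sum_i p_i x_{1i}}
\end{align*}
```

Under these assumptions, the weighted mean of rate with weight length_1 * pw and the growth rate computed from the pweighted sums are the same quantity within each category of foreign. Any constant rescaling of the weights cancels from numerator and denominator, so the two results in the code above would be expected to differ only by rounding in the displayed sums.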

> -----Original Message-----
> From: [email protected] [mailto:owner-
> [email protected]] On Behalf Of Steve Samuels
> Sent: Wednesday, 4 August 2010 23:14
> To: [email protected]
> Subject: Re: st: pweight + aweight, double weights
> 
> I can see that the program is a little cryptic.  To clarify:
> 
> I applied -svy: ratio- to R = length_2/length_1 and got asymmetric
> confidence intervals for R by computing them on the log scale and
> transforming back.
> 
> The rate that Jochen asked for is rate = (length_2 -
> length_1)/length_1 = R - 1, and that is what the -antilog- program
> reports.  "relc" meant "relative change", which seemed clear to me at
> the time.
> 
> Steve
> 
> On Wed, Aug 4, 2010 at 1:37 PM, Steve Samuels <[email protected]> wrote:
> > Jochen--
> > If you do a probability weighted mean of the  individual growth rates
> > for a time period (single year, first year to last year) and weight by
> > w =  (initial value) x (probability weight), you would wind up with
> > the rate based on the probability-weighted aggregated sums. So Stas's
> > solution is exactly the solution you seek. Moreover,  Stas's version
> > will provide the correct standard error, one appropriate for a ratio
> > estimate.
> >
> > You could also calculate the ratio estimate directly and get
> > asymmetric CIs, which are likely to be more accurate than the
> > symmetric intervals:
> >
> > **************************CODE BEGINS**************************
> > capture program drop _all
> > program antilog
> > local lparm  el(r(b),1,1)
> > local se    sqrt(el(r(V),1,1))
> > local bound  invttail(e(df_r),.025)*`se'
> > local parm  exp(`lparm')
> >
> > local ll  exp(`lparm'  - `bound')
> > local ul  exp( `lparm' + `bound')
> > di  "relc = "  100*( `parm'-1)  "    ll = "  100*(`ll'-1)  "   ul = " ///
> >     100*(`ul'-1)
> > end
> >
> > sysuse auto, clear
> > gen length_2 = displacement
> > rename length length_1
> > svyset _n
> > svy: ratio length_2/length_1
> > nlcom log(_b[_ratio_1])
> > antilog
> >
> > ***************************CODE ENDS***************************
> >
> >
> > Steve
> > Steven Samuels
> > [email protected]
> > 18 Cantine's Island
> > Saugerties NY 12477
> > USA
> > Voice: 845-246-0774
> > Fax:    206-202-4783
> >
> >
> >
> > On Wed, Aug 4, 2010 at 11:43 AM, Stas Kolenikov <[email protected]> wrote:
> >> Who knows. You might be able to get identical answers, but you'll
> >> spend more time trying to figure out the appropriate composition of
> >> weights trying to reproduce the answer from those -total- commands.
> >>
> >> On Wed, Aug 4, 2010 at 2:58 AM, Jochen Späth <[email protected]> wrote:
> >>> Hello Stas,
> >>>
> >>> thank you very much for your advice. I'm aware of the possibility of
> >>> calculating the aggregate sums of investment for different
> >>> subpopulations using the pweight and calculating the aggregate
> >>> (= aweighted) growth rates from the newly generated data. I was just
> >>> wondering whether there is a more "flexible" approach, such as, say,
> >>> multiplying the two weight variables and using the result in a single
> >>> -tabstat- or something like that.
> >>
> >> -
> >
> > On Tue, Aug 3, 2010 at 12:30 PM, Stas Kolenikov <[email protected]> wrote:
> >> You would probably want to
> >>
> >> svyset PSU [pw=your weight], strata(strata)
> >> svy : total investment, over( year sector )
> >> nlcom ([investment]_subpop_2 - [investment]_subpop_1)/[investment]_subpop_1
> >>
> >> or whatever labels the -total- command is going to give to individual
> >> coefficients.
> >>
> >> On Tue, Aug 3, 2010 at 8:29 AM, Jochen Späth <[email protected]> wrote:
> >>> Dear Statalisters,
> >>>
> >>> I have a question about weights, especially about "double weights".
> >>>
> >>> I have micro-data on firms containing information about their
> >>> investment behaviour (amounts) for several years. I then went on to
> >>> calculate the firms' individual (discrete) growth rates of
> >>> investment, i.e.
> >>>
> >>> rate_t = (inv_t - inv_{t-1}) / inv_{t-1}
> >>>
> >>> and wish to use these individual growth rates to calculate average
> >>> growth rates for, say, economic sectors. In doing so, I'd like to
> >>> attach an aweight to -tabstat-, -table- or another suitable command,
> >>> such that firms with higher investments in t-1 contribute a higher
> >>> share to the average growth rate. This is, of course, straightforward
> >>> in Stata.
> >>>
> >>> However, since I have sampled data, I also need to attach a pweight
> >>> to this operation to get information for the population instead of
> >>> the sample.
> >>>
> >>> Can I calculate the average growth rates from the individual ones, or
> >>> do I need to -collapse- or -table, replace- my data? It seems that
> >>> -svyset- could be what I am looking for, but it seems rather
> >>> complicated. Is there a way to avoid the -svyset- command and to go
> >>> on with a simple -tabstat- or the like instead?
> >>>
> >>> Best,
> >>> Jochen
> >>>
> >
> 
> 
> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
