AW: st: pweight + aweight, double weights


From   Jochen Späth <jochen.spaeth@iaw.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   AW: st: pweight + aweight, double weights
Date   Fri, 6 Aug 2010 11:20:23 +0200

Thanks Steve,

You are perfectly right. I could have sworn that the first -table- statement in my code had produced the results I posted.

This is exactly the solution I was looking for, many thanks!
Jochen
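
Since the mismatch came from totals typed in by hand, the equivalence can also be checked without copying any numbers. What follows is only a sketch on the same -auto- data used in the thread; the helper variables (w1, w2, wr, s1, s2, num, den, agg_rate, mean_rate) are names made up for this illustration and do not appear in the original posts. It computes both quantities within each group and asserts that they agree up to rounding error.

```stata
sysuse auto, clear
gen double length_2 = displacement
rename length length_1
rename trunk pwt

* individual growth rate and the combined weight discussed in the thread
gen double rate = (length_2 - length_1)/length_1
gen double pwt2 = length_1*pwt

* (a) growth rate from the pwt-weighted sums, by group
gen double w1 = pwt*length_1
gen double w2 = pwt*length_2
bysort foreign: egen double s1 = total(w1)
bysort foreign: egen double s2 = total(w2)
gen double agg_rate = (s2 - s1)/s1

* (b) pwt2-weighted mean of the individual rates, by group
gen double wr = pwt2*rate
bysort foreign: egen double num = total(wr)
bysort foreign: egen double den = total(pwt2)
gen double mean_rate = num/den

* the two should coincide up to floating-point rounding
assert reldif(agg_rate, mean_rate) < 1e-12
tabdisp foreign, cell(agg_rate mean_rate)
```

If the -assert- passes silently, the two columns shown by -tabdisp- are the same numbers. Note that this only confirms the point estimates; it says nothing about standard errors, for which the -svy- route suggested in the thread is the appropriate tool.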


> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Steve Samuels
> Sent: Thursday, 5 August 2010 19:08
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: pweight + aweight, double weights
> 
> The two methods give identical results because they are algebraically
> equivalent:
> 
> For the pweighted mean with pwt2 = length_1 x pwt
> and rate = (length_2 - length_1)/length_1:
>
> pwt2-weighted mean of rate = (sum of pwt2 x rate)/(sum of pwt2)
>
> The numerator is:
>    sum of pwt x length_1 x (length_2 - length_1)/length_1
> = sum of pwt x (length_2 - length_1)
> = (sum of pwt x length_2) minus (sum of pwt x length_1)
> = (pwt-weighted sum of length_2) minus (pwt-weighted sum of length_1)
>
> The denominator is:
>    sum of pwt2
> = sum of pwt x length_1
> = pwt-weighted sum of length_1
>
> The quotient is therefore the growth rate computed from the pwt-weighted sums.
> 
> 
> Steve
> On Thu, Aug 5, 2010 at 9:23 AM, Steve Samuels <sjsamuels@gmail.com> wrote:
> > Jochen, the totals you used in the -display- lines are different from
> > those produced by the first -table- statement.  When I use the
> > latter, the results of the two methods are identical.
> >
> > Steve
> >
> > **************************CODE BEGINS**************************
> > sysuse auto, clear
> > gen double length_2 = displacement
> > rename length length_1
> > rename trunk pwt
> > * Look up the pweighted sums of length_1 and length_2 for foreign and domestic cars:
> > table foreign [pw= pwt], c(sum length_1 sum length_2)
> >
> > * Compute the growth rates based on the aggregate sums of length_1 and length_2:
> > di  "Domestic: "  (190108 - 153917)/153917
> > di " Foreign:  "  (28194  -  42450)/42450
> >
> > * Do a pweighted mean of the individual growth rates with pw = initial value x pweight:
> > gen double pwt2 = length_1*pwt
> > cap drop rate
> > gen double rate = (length_2 - length_1) / length_1
> > table foreign [pweight = pwt2], c(mean rate)
> > ***************************CODE ENDS***************************
> >
> >
> > On Thu, Aug 5, 2010 at 4:40 AM, Jochen Späth <jochen.spaeth@iaw.edu> wrote:
> >> Hi Steve,
> >>
> >> Thanks for your little program. What I do not understand is your statement that with a "probability weighted mean of the individual growth rates" I "would wind up with the rate based on the probability-weighted aggregated sums". Check this out:
> >>
> >> **************************CODE BEGINS**************************
> >> sysuse auto, clear
> >> gen length_2 = displacement
> >> rename length length_1
> >> rename trunk pw
> >>
> >> * Look up the pweighted sums of length_1 and length_2 for foreign and domestic cars:
> >>
> >> table foreign [pw= pw], c(sum length_1 sum length_2)
> >>
> >> * Look up the growth rates based on the aggregate sums of length_1 and length_2:
> >>
> >> di "domestic:" (311319 - 270137 ) / 270137
> >> di "foreign:"  (155268 - 235051) / 235051
> >>
> >> * Do a pweighted mean of the individual growth rates with pw = initial value x pweight:
> >> cap drop rate
> >> gen rate = (length_2 - length_1) / length_1
> >> table foreign [pweight = length_1 * pw], c(mean rate)
> >> ***************************CODE ENDS***************************
> >>
> >> Jochen
> >>
> >>> -----Original Message-----
> >>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Steve Samuels
> >>> Sent: Wednesday, 4 August 2010 23:14
> >>> To: statalist@hsphsun2.harvard.edu
> >>> Subject: Re: st: pweight + aweight, double weights
> >>>
> >>> I can see that the program is a little cryptic.  To clarify:
> >>>
> >>> I applied -svy: ratio- to R = length_2/length_1 and got asymmetric
> >>> confidence intervals for R by computing them on the log scale and
> >>> transforming back.
> >>>
> >>> The rate that Jochen asked for is rate = (length_2 -
> >>> length_1)/length_1 = R - 1, and that is what the -antilog- program
> >>> reports.  "relc" meant "relative change", which seemed clear to me at
> >>> the time.
> >>>
> >>> Steve
> >>>
> >>> On Wed, Aug 4, 2010 at 1:37 PM, Steve Samuels <sjsamuels@gmail.com> wrote:
> >>> > Jochen--
> >>> > If you do a probability weighted mean of the individual growth rates
> >>> > for a time period (single year, first year to last year) and weight by
> >>> > w = (initial value) x (probability weight), you would wind up with
> >>> > the rate based on the probability-weighted aggregated sums. So Stas's
> >>> > solution is exactly the solution you seek. Moreover, Stas's version
> >>> > will provide the correct standard error, one appropriate for a ratio
> >>> > estimate.
> >>> >
> >>> > You could also calculate the ratio estimate directly and get
> >>> > asymmetric CIs, which are likely to be more accurate than the
> >>> > symmetric intervals.
> >>> >
> >>> > **************************CODE BEGINS**************************
> >>> > capture program drop _all
> >>> > program antilog
> >>> > local lparm  el(r(b),1,1)
> >>> > local se    sqrt(el(r(V),1,1))
> >>> > local bound  invttail(e(df_r),.025)*`se'
> >>> > local parm  exp(`lparm')
> >>> >
> >>> > local ll  exp(`lparm'  - `bound')
> >>> > local ul  exp( `lparm' + `bound')
> >>> > di  "relc = "  100*( `parm'-1)  "    ll = "  100*(`ll'-1)  "   ul = "  100*(`ul'-1)
> >>> > end
> >>> >
> >>> > sysuse auto, clear
> >>> > gen length_2 = displacement
> >>> > rename length length_1
> >>> > svyset _n
> >>> > svy: ratio length_2/length_1
> >>> > nlcom log(_b[_ratio_1])
> >>> > antilog
> >>> >
> >>> > ***************************CODE ENDS***************************
> >>> >
> >>> >
> >>> > Steve
> >>> > Steven Samuels
> >>> > sjsamuels@gmail.com
> >>> > 18 Cantine's Island
> >>> > Saugerties NY 12477
> >>> > USA
> >>> > Voice: 845-246-0774
> >>> > Fax:    206-202-4783
> >>> >
> >>> >
> >>> >
> >>> > On Wed, Aug 4, 2010 at 11:43 AM, Stas Kolenikov <skolenik@gmail.com> wrote:
> >>> >> Who knows. You might be able to get identical answers, but you'll
> >>> >> spend more time trying to figure out the appropriate composition of
> >>> >> weights trying to reproduce the answer from those -total- commands.
> >>> >>
> >>> >> On Wed, Aug 4, 2010 at 2:58 AM, Jochen Späth <jochen.spaeth@iaw.edu> wrote:
> >>> >>> Hello Stas,
> >>> >>>
> >>> >>> Thank you very much for your advice. I'm aware of the possibility of calculating the aggregate sums of investment for different subpopulations using the pweight and calculating the aggregate (=aweighted) growth rates from the newly generated data. I was just wondering whether there was a more "flexible" approach, such as, say, multiplying the two weight variables and using the result in a single -tabstat- or something like that.
> >>> >>
> >>> >> -
> >>> >
> >>> > On Tue, Aug 3, 2010 at 12:30 PM, Stas Kolenikov <skolenik@gmail.com> wrote:
> >>> >> You would probably want to
> >>> >>
> >>> >> svyset PSU [pw=your weight], strata(strata)
> >>> >> svy : total investment, over( year sector )
> >>> >> nlcom ([investment]_subpop_2 - [investment]_subpop_1)/[investment]_subpop_1
> >>> >>
> >>> >> or whatever labels the -total- command is going to give to individual
> >>> >> coefficients.
> >>> >>
> >>> >> On Tue, Aug 3, 2010 at 8:29 AM, Jochen Späth <jochen.spaeth@iaw.edu> wrote:
> >>> >>> Dear Statalisters,
> >>> >>>
> >>> >>> I have a question about weights, especially about "double weights".
> >>> >>>
> >>> >>> I have micro-data on firms containing information about their investment behaviour (amounts) for several years. I then went on to calculate the firms' individual (discrete) growth rates of investment, i.e.
> >>> >>>
> >>> >>> rate_t = (inv_t - inv_{t-1}) / inv_{t-1}
> >>> >>>
> >>> >>> and wish to use these individual growth rates to calculate average growth rates for, say, economic sectors. In doing so, I'd like to attach an aweight to the -tabstat-, -table- or other suitable command, such that firms with higher investments in t-1 contribute a higher share to the average growth rate. This is, of course, straightforward in Stata.
> >>> >>>
> >>> >>> However, since I have sampled data, I also need to attach a pweight to this operation to get information for the population instead of the sample.
> >>> >>>
> >>> >>> Can I calculate the average growth rates from the individual ones, or do I need to -collapse- or -table, replace- my data? It seems that -svyset- could be what I am looking for, but it seems rather complicated. Is there a way to avoid the -svyset- command and to go on with a simple -tabstat- or the like instead?
> >>> >>>
> >>> >>> Best,
> >>> >>> Jochen
> >>> >>>
> >>> >
> >>>
> >>>
> >>>
> >>>
> >>> *
> >>> *   For searches and help try:
> >>> *   http://www.stata.com/help.cgi?search
> >>> *   http://www.stata.com/support/statalist/faq
> >>> *   http://www.ats.ucla.edu/stat/stata/
> >>
> >>
> >
> >
> >
> >
> 
> 
> 
> 


