

From: Jochen Späth <jochen.spaeth@iaw.edu>

To: <statalist@hsphsun2.harvard.edu>

Subject: AW: st: pweight + aweight, double weights

Date: Fri, 6 Aug 2010 11:20:23 +0200

Thanks Steve, you are perfectly right. I could have sworn that the first -table- statement in my code had produced the results I posted. This is exactly the solution I was looking for, many thanks!

Jochen

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Steve Samuels
> Sent: Thursday, 5 August 2010 19:08
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: pweight + aweight, double weights
>
> The two methods give identical results because they are algebraically
> equivalent:
>
> For the pweighted mean with pwt2 = length_1 x pwt
> and rate = (length_2 - length_1)/length_1:
>
>   pwt2-weighted mean of rate = (sum of pwt2 x rate)/(sum of pwt2)
>
> The numerator is:
>   sum of pwt x length_1 x (length_2 - length_1)/length_1
>   = sum of pwt x (length_2 - length_1)
>   = sum of pwt x length_2 minus sum of pwt x length_1
>   = (pwt-weighted sum of length_2) minus (pwt-weighted sum of length_1)
>
> The denominator is:
>   sum of pwt x length_1
>   = pwt-weighted sum of length_1
>
> Steve
>
> On Thu, Aug 5, 2010 at 9:23 AM, Steve Samuels <sjsamuels@gmail.com> wrote:
> > Jochen, the totals you used in the -display- lines are different from
> > those produced by the first -table- statement. When I use the
> > latter, the results of the two methods are identical.
> >
> > Steve
> >
> > **************************CODE BEGINS**************************
> > sysuse auto, clear
> > gen double length_2 = displacement
> > rename length length_1
> > rename trunk pwt
> > * Look up the pweighted sums of length_1 and length_2 for foreign and domestic cars:
> > table foreign [pw= pwt], c(sum length_1 sum length_2)
> >
> > * Look up the growth rates based on the aggregate sums of length_1 and length_2:
> > di "Domestic: " (190108 - 153917)/153917
> > di " Foreign: " (28194 - 42450)/42450
> >
> > * Do a pweighted mean of the individual growth rates with pw = initial value x pweight:
> > gen double pwt2 = length_1*pwt
> > cap drop rate
> > gen double rate = (length_2 - length_1) / length_1
> > table foreign [pweight = pwt2], c(mean rate)
> > ***************************CODE ENDS***************************
> >
> > On Thu, Aug 5, 2010 at 4:40 AM, Jochen Späth <jochen.spaeth@iaw.edu> wrote:
> >> Hi Steve,
> >>
> >> thanks for your little program. What I do not understand is your
> >> statement that with a "probability weighted mean of the individual
> >> growth rates" I "would wind up with the rate based on the
> >> probability-weighted aggregated sums". Check this out:
> >>
> >> **************************CODE BEGINS**************************
> >> sysuse auto, clear
> >> gen length_2 = displacement
> >> rename length length_1
> >> rename trunk pw
> >>
> >> * Look up the pweighted sums of length_1 and length_2 for foreign and domestic cars:
> >>
> >> table foreign [pw= pw], c(sum length_1 sum length_2)
> >>
> >> * Look up the growth rates based on the aggregate sums of length_1 and length_2:
> >>
> >> di "domestic:" (311319 - 270137 ) / 270137
> >> di "foreign:" (155268 - 235051) / 235051
> >>
> >> * Do a pweighted mean of the individual growth rates with pw = initial value x pweight:
> >> cap drop rate
> >> gen rate = (length_2 - length_1) / length_1
> >> table foreign [pweight = length_1 * pw], c(mean rate)
> >> ***************************CODE ENDS***************************
> >>
> >> Jochen
> >>
> >>> -----Original Message-----
> >>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Steve Samuels
> >>> Sent: Wednesday, 4 August 2010 23:14
> >>> To: statalist@hsphsun2.harvard.edu
> >>> Subject: Re: st: pweight + aweight, double weights
> >>>
> >>> I can see that the program is a little cryptic. To clarify:
> >>>
> >>> I applied -svy: ratio- to R = length_2/length_1 and got asymmetric
> >>> confidence intervals for R by computing them on the log scale and
> >>> transforming back.
> >>>
> >>> The rate that Jochen asked for is rate = (length_2 -
> >>> length_1)/length_1 = R - 1, and that is what the -antilog- program
> >>> reports. "relc" meant "relative change", which seemed clear to me at
> >>> the time.
> >>>
> >>> Steve
> >>>
> >>> On Wed, Aug 4, 2010 at 1:37 PM, Steve Samuels <sjsamuels@gmail.com> wrote:
> >>> > Jochen,
> >>> > If you do a probability weighted mean of the individual growth rates
> >>> > for a time period (single year, first year to last year) and weight by
> >>> > w = (initial value) x (probability weight), you would wind up with
> >>> > the rate based on the probability-weighted aggregated sums. So Stas's
> >>> > solution is exactly the solution you seek. Moreover, Stas's version
> >>> > will provide the correct standard error, one appropriate for a ratio
> >>> > estimate.
> >>> >
> >>> > You could also calculate the ratio estimate directly and get
> >>> > asymmetric CIs, which are likely to be more accurate than the
> >>> > symmetric intervals.
> >>> >
> >>> > **************************CODE BEGINS**************************
> >>> > capture program drop _all
> >>> > program antilog
> >>> > local lparm el(r(b),1,1)
> >>> > local se sqrt(el(r(V),1,1))
> >>> > local bound invttail(e(df_r),.025)*`se'
> >>> > local parm exp(`lparm')
> >>> >
> >>> > local ll exp(`lparm' - `bound')
> >>> > local ul exp( `lparm' + `bound')
> >>> > di "relc = " 100*( `parm'-1) " ll = " 100*(`ll'-1) " ul = " 100*(`ul'-1)
> >>> > end
> >>> >
> >>> > sysuse auto, clear
> >>> > gen length_2 = displacement
> >>> > rename length length_1
> >>> > svyset _n
> >>> > svy: ratio length_2/length_1
> >>> > nlcom log(_b[_ratio_1])
> >>> > antilog
> >>> >
> >>> > ***************************CODE ENDS***************************
> >>> >
> >>> > Steve
> >>> >
> >>> > Steven Samuels
> >>> > sjsamuels@gmail.com
> >>> > 18 Cantine's Island
> >>> > Saugerties NY 12477
> >>> > USA
> >>> > Voice: 845-246-0774
> >>> > Fax: 206-202-4783
> >>> >
> >>> > On Wed, Aug 4, 2010 at 11:43 AM, Stas Kolenikov <skolenik@gmail.com> wrote:
> >>> >> Who knows. You might be able to get identical answers, but you'll
> >>> >> spend more time trying to figure out the appropriate composition of
> >>> >> weights trying to reproduce the answer from those -total- commands.
> >>> >>
> >>> >> On Wed, Aug 4, 2010 at 2:58 AM, Jochen Späth <jochen.spaeth@iaw.edu> wrote:
> >>> >>> Hello Stas,
> >>> >>>
> >>> >>> thank you very much for your advice. I'm aware of the possibility of
> >>> >>> calculating the aggregate sums of investment for different
> >>> >>> subpopulations using the pweight and calculating the aggregate
> >>> >>> (=aweighted) growth rates from the newly generated data. I was just
> >>> >>> wondering whether there was a more "flexible" approach, such as, say,
> >>> >>> multiplying the two weight variables and using the result in a single
> >>> >>> -tabstat- or something like that.
> >>> >
> >>> > On Tue, Aug 3, 2010 at 12:30 PM, Stas Kolenikov <skolenik@gmail.com> wrote:
> >>> >> You would probably want to
> >>> >>
> >>> >> svyset PSU [pw=your weight], strata(strata)
> >>> >> svy : total investment, over( year sector )
> >>> >> nlcom ([investment]_subpop_2 - [investment]_subpop_1)/[investment]_subpop_1
> >>> >>
> >>> >> or whatever labels the -total- command is going to give to individual
> >>> >> coefficients.
> >>> >>
> >>> >> On Tue, Aug 3, 2010 at 8:29 AM, Jochen Späth <jochen.spaeth@iaw.edu> wrote:
> >>> >>> Dear Statalisters,
> >>> >>>
> >>> >>> I have a question about weights, especially about "double weights".
> >>> >>>
> >>> >>> I have micro-data on firms containing information about their
> >>> >>> investment behaviour (amounts) for several years. I then went on to
> >>> >>> calculate the firms' individual (discrete) growth rates of investment,
> >>> >>> i.e.
> >>> >>>
> >>> >>> rate_t = (inv_t - inv_{t-1}) / inv_{t-1}
> >>> >>>
> >>> >>> and wish to use these individual growth rates to calculate average
> >>> >>> growth rates for, say, economic sectors. To do so, I'd like to attach
> >>> >>> an aweight to the -tabstat-, -table- or other suitable command, such
> >>> >>> that firms with higher investments in t-1 contribute a higher share to
> >>> >>> the average growth rate. This is, of course, straightforward in Stata.
> >>> >>>
> >>> >>> However, since I have sampled data, I also need to attach a pweight to
> >>> >>> this operation to get information for the population instead of the
> >>> >>> sample.
> >>> >>>
> >>> >>> Can I calculate the average growth rates from the individual ones, or
> >>> >>> do I need to -collapse- or -table, replace- my data? It seems that
> >>> >>> -svyset- could be what I am looking for, but it seems rather
> >>> >>> complicated. Is there a way to avoid the -svyset- command and to go on
> >>> >>> with simple -tabstat- or the like instead?
> >>> >>>
> >>> >>> Best,
> >>> >>> Jochen

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
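The mismatch discussed in this thread came from copying totals by hand into the -display- lines. The equivalence Steve derives can also be checked without hard-coded numbers. What follows is a minimal sketch under stated assumptions, not a definitive implementation: it reuses the auto-data stand-ins from the thread (length as the initial value, displacement as the later value, trunk as the probability weight), and the scalar names m1 and m2 are made up for illustration. Because -summarize- does not accept pweights, aweights are used; they give the same weighted mean as a point estimate, but no design-based standard errors.

```stata
sysuse auto, clear
gen double length_2 = displacement
rename length length_1
rename trunk pwt

* Combined weight (initial value x probability weight) and individual rates
gen double pwt2 = length_1*pwt
gen double rate = (length_2 - length_1)/length_1

forvalues f = 0/1 {
    * Method 1: growth rate from pwt-weighted means. The ratio of weighted
    * means equals the ratio of weighted sums, since the weight total cancels.
    quietly summarize length_1 [aweight = pwt] if foreign == `f'
    scalar m1 = r(mean)
    quietly summarize length_2 [aweight = pwt] if foreign == `f'
    scalar m2 = r(mean)

    * Method 2: mean of the individual rates, weighted by pwt2
    quietly summarize rate [aweight = pwt2] if foreign == `f'

    display "foreign = `f': aggregate rate = " (m2 - m1)/m1 ///
        "   weighted mean of rates = " r(mean)
}
```

The two numbers on each displayed line should agree up to rounding. For standard errors appropriate to a ratio estimate, the -svy- approaches suggested by Stas and Steve in the thread remain the route to take.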

