

From: Steve Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: pweight + aweight, double weights
Date: Sat, 7 Aug 2010 23:11:11 -0400

Jochen,

I can't tell you the number of times I've done something similar. To go back to your original question: you can do many other things with the individual rates: plot them, for example, or get their probability-weighted means, standard deviations, and percentiles.

Steve

On Fri, Aug 6, 2010 at 5:20 AM, Jochen Späth <jochen.spaeth@iaw.edu> wrote:
> Thanks Steve,
>
> you are perfectly right. I could swear the first -table- statement in my code had provided the results that I posted.
>
> Now, this is exactly the solution I was looking for, many thanks!
> Jochen
>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Steve Samuels
>> Sent: Thursday, 5 August 2010 19:08
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: pweight + aweight, double weights
>>
>> The two methods give identical results because they are algebraically
>> equivalent:
>>
>> For the pweighted mean with pwt2 = length_1 x pwt
>> and rate = (length_2 - length_1)/length_1:
>>
>> pwt2-weighted mean of rate = (sum of pwt2 x rate)/(sum of pwt2)
>>
>> The numerator is:
>>   sum of pwt x length_1 x (length_2 - length_1)/length_1
>>   = sum of pwt x (length_2 - length_1)
>>   = sum of pwt x length_2 minus sum of pwt x length_1
>>   = (pwt-weighted sum of length_2) minus (pwt-weighted sum of length_1)
>>
>> The denominator is:
>>   sum of pwt x length_1
>>   = pwt-weighted sum of length_1
>>
>> Steve
>>
>> On Thu, Aug 5, 2010 at 9:23 AM, Steve Samuels <sjsamuels@gmail.com> wrote:
>> > Jochen, the totals you used in the -display- lines are different from
>> > those produced by the first -table- statement. When I use the
>> > latter, the results of the two methods are identical.
>> >
>> > Steve
>> >
>> > **************************CODE BEGINS**************************
>> > sysuse auto, clear
>> > gen double length_2 = displacement
>> > rename length length_1
>> > rename trunk pwt
>> > * Look up the pweighted sums of length_1 and length_2 for foreign and domestic cars:
>> > table foreign [pw= pwt], c(sum length_1 sum length_2)
>> >
>> > * Look up the growth rates based on the aggregate sums of length_1 and length_2:
>> > di "Domestic: " (190108 - 153917)/153917
>> > di " Foreign: " (28194 - 42450)/42450
>> >
>> > * Do a pweighted mean of the individual growth rates with pw = initial value x pweight:
>> > gen double pwt2 = length_1*pwt
>> > cap drop rate
>> > gen double rate = (length_2 - length_1) / length_1
>> > table foreign [pweight = pwt2], c(mean rate)
>> > ***************************CODE ENDS***************************
>> >
>> > On Thu, Aug 5, 2010 at 4:40 AM, Jochen Späth <jochen.spaeth@iaw.edu> wrote:
>> >> Hi Steve,
>> >>
>> >> thanks for your little program. What I do not understand is your statement that with a "probability weighted mean of the individual growth rates" I "would wind up with the rate based on the probability-weighted aggregated sums".
>> >> Check out this:
>> >>
>> >> **************************CODE BEGINS**************************
>> >> sysuse auto, clear
>> >> gen length_2 = displacement
>> >> rename length length_1
>> >> rename trunk pw
>> >>
>> >> * Look up the pweighted sums of length_1 and length_2 for foreign and domestic cars:
>> >> table foreign [pw= pw], c(sum length_1 sum length_2)
>> >>
>> >> * Look up the growth rates based on the aggregate sums of length_1 and length_2:
>> >> di "domestic:" (311319 - 270137 ) / 270137
>> >> di "foreign:" (155268 - 235051) / 235051
>> >>
>> >> * Do a pweighted mean of the individual growth rates with pw = initial value x pweight:
>> >> cap drop rate
>> >> gen rate = (length_2 - length_1) / length_1
>> >> table foreign [pweight = length_1 * pw], c(mean rate)
>> >> ***************************CODE ENDS***************************
>> >>
>> >> Jochen
>> >>
>> >>> -----Original Message-----
>> >>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Steve Samuels
>> >>> Sent: Wednesday, 4 August 2010 23:14
>> >>> To: statalist@hsphsun2.harvard.edu
>> >>> Subject: Re: st: pweight + aweight, double weights
>> >>>
>> >>> I can see that the program is a little cryptic. To clarify:
>> >>>
>> >>> I applied -svy: ratio- to R = length_2/length_1 and got asymmetric
>> >>> confidence intervals for R by computing them on the log scale and
>> >>> transforming back.
>> >>>
>> >>> The rate that Jochen asked for is rate = (length_2 -
>> >>> length_1)/length_1 = R - 1, and that is what the -antilog- program
>> >>> reports. "relc" meant "relative change", which seemed clear to me at
>> >>> the time.
>> >>>
>> >>> Steve
>> >>>
>> >>> On Wed, Aug 4, 2010 at 1:37 PM, Steve Samuels <sjsamuels@gmail.com> wrote:
>> >>> > Jochen--
>> >>> > If you do a probability weighted mean of the individual growth rates
>> >>> > for a time period (single year, first year to last year) and weight by
>> >>> > w = (initial value) x (probability weight), you would wind up with
>> >>> > the rate based on the probability-weighted aggregated sums. So Stas's
>> >>> > solution is exactly the solution you seek. Moreover, Stas's version
>> >>> > will provide the correct standard error, one appropriate for a ratio
>> >>> > estimate.
>> >>> >
>> >>> > You could also calculate the ratio estimate directly and get
>> >>> > asymmetric CI's, which are likely to be more accurate than the
>> >>> > symmetric intervals.
>> >>> >
>> >>> > **************************CODE BEGINS**************************
>> >>> > capture program drop _all
>> >>> > program antilog
>> >>> > * Point estimate and standard error of log(ratio), taken from -nlcom-
>> >>> > local lparm el(r(b),1,1)
>> >>> > local se sqrt(el(r(V),1,1))
>> >>> > * Half-width of the 95% interval on the log scale
>> >>> > local bound invttail(e(df_r),.025)*`se'
>> >>> > local parm exp(`lparm')
>> >>> >
>> >>> > * Transform back and report the relative change in percent
>> >>> > local ll exp(`lparm' - `bound')
>> >>> > local ul exp( `lparm' + `bound')
>> >>> > di "relc = " 100*( `parm'-1) " ll = " 100*(`ll'-1) " ul = " 100*(`ul'-1)
>> >>> > end
>> >>> >
>> >>> > sysuse auto, clear
>> >>> > gen length_2 = displacement
>> >>> > rename length length_1
>> >>> > svyset _n
>> >>> > svy: ratio length_2/length_1
>> >>> > nlcom log(_b[_ratio_1])
>> >>> > antilog
>> >>> >
>> >>> > ***************************CODE ENDS***************************
>> >>> >
>> >>> > Steve
>> >>> >
>> >>> > Steven Samuels
>> >>> > sjsamuels@gmail.com
>> >>> > 18 Cantine's Island
>> >>> > Saugerties NY 12477
>> >>> > USA
>> >>> > Voice: 845-246-0774
>> >>> > Fax: 206-202-4783
>> >>> >
>> >>> > On Wed, Aug 4, 2010 at 11:43 AM, Stas Kolenikov <skolenik@gmail.com> wrote:
>> >>> >> Who knows.
>> >>> >> You might be able to get identical answers, but you'll
>> >>> >> spend more time trying to figure out the appropriate composition of
>> >>> >> weights trying to reproduce the answer from those -total- commands.
>> >>> >>
>> >>> >> On Wed, Aug 4, 2010 at 2:58 AM, Jochen Späth <jochen.spaeth@iaw.edu> wrote:
>> >>> >>> Hello Stas,
>> >>> >>>
>> >>> >>> thank you very much for your advice. I'm aware of the possibility of calculating the aggregate sums of investment for different subpopulations using the pweight and calculating the aggregate (=aweighted) growth rates from the newly generated data. I was just wondering whether there was a more "flexible" approach, such as, say, multiplying the two weight variables and using the result in a single -tabstat- or something like that.
>> >>> >
>> >>> > On Tue, Aug 3, 2010 at 12:30 PM, Stas Kolenikov <skolenik@gmail.com> wrote:
>> >>> >> You would probably want to
>> >>> >>
>> >>> >> svyset PSU [pw=your weight], strata(strata)
>> >>> >> svy : total investment, over( year sector )
>> >>> >> nlcom ([investment]_subpop_2 - [investment]_subpop_1)/[investment]_subpop_1
>> >>> >>
>> >>> >> or whatever labels the -total- command is going to give to individual
>> >>> >> coefficients.
>> >>> >>
>> >>> >> On Tue, Aug 3, 2010 at 8:29 AM, Jochen Späth <jochen.spaeth@iaw.edu> wrote:
>> >>> >>> Dear Statalisters,
>> >>> >>>
>> >>> >>> I have a question about weights, especially about "double weights".
>> >>> >>>
>> >>> >>> I have micro-data on firms containing information about their investment behaviour (amounts) for several years. I then went on to calculate the firms' individual (discrete) growth rates of investment, i.e.
>> >>> >>>
>> >>> >>> rate_t = (inv_t - inv_(t-1)) / inv_(t-1)
>> >>> >>>
>> >>> >>> and wish to use these individual growth rates to calculate average growth rates for, say, economic sectors. Thereby, I'd like to attach an aweight to the -tabstat-, -table- or other suitable command, such that firms with higher investments in t-1 contribute a higher share to the average growth rate. This is, of course, straightforward in Stata.
>> >>> >>>
>> >>> >>> However, since I have sampled data, I also need to attach a pweight to this operation to get information for the population instead of the sample.
>> >>> >>>
>> >>> >>> Can I calculate the average growth rates from the individual ones, or do I need to -collapse- or -table, replace- my data? It seems that -svyset- could be what I am looking for, but it seems rather complicated. Is there a way to avoid the -svyset- command and to go on with a simple -tabstat- or the like instead?
>> >>> >>>
>> >>> >>> Best,
>> >>> >>> Jochen

--
Steven Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax: 206-202-4783

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
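
The equivalence that Steve derives in words in this thread can also be written compactly. What follows is only a sketch in notation that does not appear in the original messages: w_i stands for the probability weight (pwt), x_i for the initial value (length_1), y_i for the later value (length_2), and r_i for the individual growth rate. It assumes every x_i is nonzero.

```latex
\[
r_i = \frac{y_i - x_i}{x_i}, \qquad
\frac{\sum_i (w_i x_i)\, r_i}{\sum_i w_i x_i}
  = \frac{\sum_i w_i (y_i - x_i)}{\sum_i w_i x_i}
  = \frac{\sum_i w_i y_i - \sum_i w_i x_i}{\sum_i w_i x_i}
  = \frac{\sum_i w_i y_i}{\sum_i w_i x_i} - 1 .
\]
```

Read this way, the mean of the individual rates weighted by w_i x_i equals the ratio of the pweighted totals minus one, which corresponds to the R - 1 that Steve describes for the -svy: ratio- approach.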

