Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: pweight + aweight, double weights

From	Steve Samuels <[email protected]>
To	[email protected]
Subject	Re: st: pweight + aweight, double weights
Date	Thu, 5 Aug 2010 09:23:15 -0400

Jochen, the totals you used in the -display- lines are different from
those produced by the first -table- statement.  When I use the
latter, the results of the two methods are identical.

Steve

**************************CODE BEGINS**************************
sysuse auto, clear
gen double length_2 = displacement
rename length length_1
rename trunk pwt
* Look up the pweighted sums of length_1 and length_2 for foreign and
* domestic cars:
table foreign [pw= pwt], c(sum length_1 sum length_2)

* Compute the growth rates from the aggregate sums of length_1 and length_2
* (totals copied from the -table- output above):
di  "Domestic: "  (190108 - 153917)/153917
di " Foreign:  "  (28194  -  42450)/42450

* Do a pweighted mean of the individual growth rates with pw = initial
* value x pweight:
gen double pwt2 = length_1*pwt
cap drop rate
gen double rate = (length_2 - length_1) / length_1
table foreign [pweight = pwt2], c(mean rate)
***************************CODE ENDS***************************
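
The comparison above depends on copying the totals by hand from the -table- output, which is where the discrepancy crept in. A minimal sketch that avoids the manual step, assuming the same -auto- setup, could compute both versions of the rate as variables and compare them directly. The variable names wx1, wx2, tot1, tot2, rate_agg, wrate, num and rate_mean are made up for this illustration and are not from the thread.

```stata
sysuse auto, clear
gen double length_2 = displacement
rename length length_1
rename trunk pwt

* Weighted totals by group, computed rather than typed in by hand
gen double wx1 = pwt * length_1
gen double wx2 = pwt * length_2
bysort foreign: egen double tot1 = total(wx1)
bysort foreign: egen double tot2 = total(wx2)

* Method 1: growth rate based on the pweighted aggregate sums
gen double rate_agg = (tot2 - tot1) / tot1

* Method 2: mean of the individual rates, weighted by
* (initial value) x (probability weight)
gen double rate  = (length_2 - length_1) / length_1
gen double wrate = wx1 * rate
bysort foreign: egen double num = total(wrate)
gen double rate_mean = num / tot1

* The two columns should agree up to rounding error
tabstat rate_agg rate_mean, by(foreign) statistics(mean) format(%9.6f)
assert reldif(rate_agg, rate_mean) < 1e-10
```

The two agree because each term of the weighted mean, pwt*length_1*rate, reduces to pwt*(length_2 - length_1), so the numerator is just the difference of the two weighted totals. This sketch reproduces point estimates only; it says nothing about standard errors, for which the -svy- approach discussed in the thread is still needed.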


On Thu, Aug 5, 2010 at 4:40 AM, Jochen Späth <[email protected]> wrote:
> Hi Steve,
>
> thanks for your little program. What I do not understand is your statement that with a "probability weighted mean of the  individual growth rates" I "would wind up with the rate based on the probability-weighted aggregated sums". Check out this:
>
> **************************CODE BEGINS**************************
> sysuse auto, clear
> gen length_2 = displacement
> rename length length_1
> rename trunk pw
>
> * Look up the pweighted sums of length_1 and length_2 for foreign and domestic cars:
>
> table foreign [pw= pw], c(sum length_1 sum length_2)
>
> * Look up the growth rates based on the aggregate sums of length_1 and length_2:
>
> di "domestic:" (311319 - 270137 ) / 270137
> di "foreign:"  (155268 - 235051) / 235051
>
> * Do a pweighted mean of the individual growth rates with pw = initial value x pweight:
> cap drop rate
> gen rate = (length_2 - length_1) / length_1
> table foreign [pweight = length_1 * pw], c(mean rate)
> ***************************CODE ENDS***************************
>
> Jochen
>
>> -----Original Message-----
>> From: [email protected] [mailto:owner-
>> [email protected]] On Behalf Of Steve Samuels
>> Sent: Wednesday, 4 August 2010 23:14
>> To: [email protected]
>> Subject: Re: st: pweight + aweight, double weights
>>
>> I can see that the program is a little cryptic.  To clarify:
>>
>> I applied -svy: ratio- to R = length_2/length_1 and got asymmetric
>> confidence intervals for R by computing them on the log scale and
>> transforming back.
>>
>> The rate that Jochen asked for is rate = (length_2 -
>> length_1)/length_1 = R - 1, and that is what the -antilog- program
>> reports.  "relc" meant "relative change", which seemed clear to me at
>> the time.
>>
>> Steve
>>
>> On Wed, Aug 4, 2010 at 1:37 PM, Steve Samuels <[email protected]> wrote:
>> > Jochen--
>> > If you do a probability weighted mean of the  individual growth rates
>> > for a time period (single year, first year to last year) and weight by
>> > w =  (initial value) x (probability weight), you would wind up with
>> > the rate based on the probability-weighted aggregated sums. So Stas's
>> > solution is exactly the solution you seek. Moreover,  Stas's version
>> > will provide the correct standard error, one appropriate for a ratio
>> > estimate.
>> >
>> > You could also calculate the ratio estimate directly and get
>> > asymmetric CI's, which are likely to be more accurate than the
>> > symmetric intervals:
>> >
>> > **************************CODE BEGINS**************************
>> > capture program drop _all
>> > program antilog
>> > local lparm  el(r(b),1,1)
>> > local se    sqrt(el(r(V),1,1))
>> > local bound  invttail(e(df_r),.025)*`se'
>> > local parm  exp(`lparm')
>> >
>> > local ll  exp(`lparm'  - `bound')
>> > local ul  exp( `lparm' + `bound')
>> > di  "relc = "  100*( `parm'-1)  "    ll = "  100*(`ll'-1)  "   ul = " ///
>> >     100*(`ul'-1)
>> > end
>> >
>> > sysuse auto, clear
>> > gen length_2 = displacement
>> > rename length length_1
>> > svyset _n
>> > svy: ratio length_2/length_1
>> > nlcom log(_b[_ratio_1])
>> > antilog
>> >
>> > ***************************CODE ENDS***************************
>> >
>> >
>> > Steve
>> >
>> > Steven Samuels
>> > [email protected]
>> > 18 Cantine's Island
>> > Saugerties NY 12477
>> > USA
>> > Voice: 845-246-0774
>> > Fax:    206-202-4783
>> >
>> >
>> >
>> > On Wed, Aug 4, 2010 at 11:43 AM, Stas Kolenikov <[email protected]>
>> wrote:
>> >> Who knows. You might be able to get identical answers, but you'll
>> >> spend more time trying to figure out the appropriate composition of
>> >> weights trying to reproduce the answer from those -total- commands.
>> >>
>> >> On Wed, Aug 4, 2010 at 2:58 AM, Jochen Späth <[email protected]>
>> wrote:
>> >>> Hello Stas,
>> >>>
>> >>> thank you very much for your advice. I'm aware of the possibility of
>> >>> calculating the aggregate sums of investment for different subpopulations
>> >>> using the pweight and calculating the aggregate (=aweighted) growth rates
>> >>> from the newly generated data. I was just wondering whether there was a
>> >>> more "flexible" approach, such as, say, multiplying the two weight
>> >>> variables and using the result in a single -tabstat- or something like that.
>> >>
>> >
>> > On Tue, Aug 3, 2010 at 12:30 PM, Stas Kolenikov <[email protected]>
>> wrote:
>> >> You would probably want to
>> >>
>> >> svyset PSU [pw=your weight], strata(strata)
>> >> svy : total investment, over( year sector )
>> >> nlcom ([investment]_subpop_2 - [investment]_subpop_1)/[investment]_subpop_1
>> >>
>> >> or whatever labels the -total- command is going to give to individual
>> >> coefficients.
>> >>
>> >> On Tue, Aug 3, 2010 at 8:29 AM, Jochen Späth <[email protected]>
>> wrote:
>> >>> Dear Statalisters,
>> >>>
>> >>> I have a question about weights, especially about "double weights".
>> >>>
>> >>> I have micro-data on firms containing information about their
>> >>> investment behaviour (amounts) for several years. I then went on to
>> >>> calculate the firms' individual (discrete) growth rates of investment,
>> >>> i.e.
>> >>>
>> >>> rate_t = (inv_t - inv_{t-1}) / inv_{t-1}
>> >>>
>> >>> and wish to use these individual growth rates to calculate average
>> >>> growth rates for, say, economic sectors. Thereby, I'd like to attach an
>> >>> aweight to the -tabstat-, -table- or other suitable command, such that
>> >>> firms with higher investments in t-1 contribute a higher share to the
>> >>> average growth rate. This is, of course, straightforward in Stata.
>> >>>
>> >>> However, since I have sampled data I also need to attach a pweight to
>> >>> this operation to get information for the population instead of the
>> >>> sample.
>> >>>
>> >>> Can I calculate the average growth rates from the individual ones or
>> >>> do I need to -collapse- or -table, replace- my data? It seems that
>> >>> -svyset- could be what I am looking for, but it seems rather complicated.
>> >>> Is there a way to avoid the -svyset- command and to go on with simple
>> >>> -tabstat- or the like instead?
>> >>>
>> >>> Best,
>> >>> Jochen
>> >>>
>> >
>>
>>
>>
>> --
>> Steven Samuels
>> [email protected]
>> 18 Cantine's Island
>> Saugerties NY 12477
>> USA
>> Voice: 845-246-0774
>> Fax:    206-202-4783
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>



-- 
Steven Samuels
[email protected]
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax:    206-202-4783

Follow-Ups:
- Re: st: pweight + aweight, double weights
  - From: Steve Samuels <[email protected]>

References:
- st: pweight + aweight, double weights
  - From: Jochen Späth <[email protected]>
- Re: st: pweight + aweight, double weights
  - From: Stas Kolenikov <[email protected]>
- AW: st: pweight + aweight, double weights
  - From: Jochen Späth <[email protected]>
- Re: st: pweight + aweight, double weights
  - From: Stas Kolenikov <[email protected]>
- Re: st: pweight + aweight, double weights
  - From: Steve Samuels <[email protected]>
- Re: st: pweight + aweight, double weights
  - From: Steve Samuels <[email protected]>
- AW: st: pweight + aweight, double weights
  - From: Jochen Späth <[email protected]>
