
From: jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Statsby and weights
Date: Thu, 25 Mar 2004 13:08:16 -0600

Dale Plummer <dale.plummer@vanderbilt.edu> asks about using weights with
-statsby-:

> I may have overlooked something obvious, but I cannot see why the
> statsby command will not allow weights in the commands it is executing.
> Would someone please explain this?

There really isn't a good reason for this.  From a development point of view,
-statsby- uses the same parsing engine as -bootstrap-, -jknife-, -simulate-,
and -permute-, some of which require careful consideration (and new code) to
handle weights.

There are ways around this.

The long way is to set up -postfile- and use -post- within a -forvalues- loop.
This requires a decent amount of coding to reproduce some of the features of
-statsby-.

The short way involves tricking -statsby-.  I generally would warn users
against trying to "trick" a command into doing something that a developer
purposely tried to prevent, but this is one of those special cases.

Suppose we want to use fweights with -summarize- for each category of a
variable.  The unweighted version would be

. sysuse auto
(1978 Automobile Data)

. statsby "sum mpg" r(mean), by(rep)

      command:  sum mpg
    statistic:  _stat_1 = r(mean)
           by:  rep78

. list

     +------------------+
     | rep78    _stat_1 |
     |------------------|
  1. |     1         21 |
  2. |     2     19.125 |
  3. |     3   19.43333 |
  4. |     4   21.66667 |
  5. |     5   27.36364 |
     +------------------+

As already noted, -statsby- does not like weights to be specified:

. capture noisily statsby "sum mpg [fw=1]" r(mean), by(rep)
weights not allowed

We could write a wrapper command for -summarize- that takes weights in a
different way, through an option instead of the usual [weight] syntax:

program mysum
        syntax varlist [if] [in] [, weight(string) * ]
        * build the bracketed weight expression only when weight() is given
        if "`weight'" != "" local wgt "[`weight']"
        * summarize the variables passed in, not a hard-coded mpg
        sum `varlist' `if' `in' `wgt', `options'
end

Now we can pass weights to -summarize- using -mysum-'s -weight()- option.
Here we'll specify an -fweight- of one to check the result against the
unweighted version:

. sysuse auto
(1978 Automobile Data)

. statsby "mysum mpg, weight(fw=1)" r(mean), by(rep)

      command:  mysum mpg , weight(fw=1)
    statistic:  _stat_1 = r(mean)
           by:  rep78

. list

     +------------------+
     | rep78    _stat_1 |
     |------------------|
  1. |     1         21 |
  2. |     2     19.125 |
  3. |     3   19.43333 |
  4. |     4   21.66667 |
  5. |     5   27.36364 |
     +------------------+

Now let's really specify some weights:

. statsby "mysum mpg, weight(fw=turn)" r(mean), by(rep)

      command:  mysum mpg , weight(fw=turn)
    statistic:  _stat_1 = r(mean)
           by:  rep78

. list

     +------------------+
     | rep78    _stat_1 |
     |------------------|
  1. |     1   20.92683 |
  2. |     2   18.97983 |
  3. |     3   19.11445 |
  4. |     4    21.1342 |
  5. |     5   27.19898 |
     +------------------+

We can verify that the weights were applied by looking at the results on a
group-by-group basis:

. sysuse auto
(1978 Automobile Data)

. sum mpg [fw=turn] if rep==1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         mpg |        82    20.92683    3.017564         18         24

. sum mpg [fw=turn] if rep==2

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         mpg |       347    18.97983    3.466128         14         24

. sum mpg [fw=turn] if rep==3

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         mpg |      1232    19.11445    4.018323         12         29

. sum mpg [fw=turn] if rep==4

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         mpg |       693     21.1342    4.836715         14         30

. sum mpg [fw=turn] if rep==5

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         mpg |       392    27.19898    8.349844         17         41

As a final note, let me just warn against using this trick with -bootstrap-,
-permute-, and -jknife-.  The result will most definitely not be what you
would expect.

--Jeff
jpitblado@stata.com
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
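
For comparison, the "long way" mentioned in the message (-postfile- with
-post- inside a loop) might look roughly like the sketch below. It is not
code from the original post: the names memhold and results are arbitrary
temporary names, and the loop uses -levelsof- with -foreach- rather than
-forvalues- so that only the observed, nonmissing categories of rep78 are
visited. It assumes a Stata release in which -levelsof- is available.

```stata
sysuse auto, clear

* temporary handle and file for the posted results (names are arbitrary)
tempname memhold
tempfile results
postfile `memhold' rep78 mean using `results'

* one weighted -summarize- per observed category of rep78
levelsof rep78, local(levels)
foreach r of local levels {
    quietly summarize mpg [fw=turn] if rep78 == `r'
    post `memhold' (`r') (r(mean))
}
postclose `memhold'

* load the collected results, one row per category
use `results', clear
list
```

The resulting dataset should contain one row per category of rep78, with the
weighted means matching the group-by-group -summarize- results shown in the
message. Unlike -statsby-, this sketch does nothing about -if-/-in-
restrictions or missing by-groups; those features would need extra code.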

