
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: RE: Re: How to speed up loop
Date: Fri, 17 Sep 2004 12:51:54 +0100

I'll try.

The prefix

. by hhid:

is, as you know, an instruction that operations are done
separately for distinct values of -hhid-. For this to work
in the example given, observations need to be in sort order
of -hhid-. Also, we need them to be sorted by -lineno- within
-hhid-, as is implied by the example dataset given. A more
careful prefix is thus

. bysort hhid (lineno):

which does the sorting if need be.

Now the key wrinkle is that under the aegis of
-by <byvarlist>:-, subscripts are interpreted as being
within groups defined by the <byvarlist>, so if you go

. bysort panel (time) : gen first = value[1]

the [1] always refers to the first observation within
each -panel- (_not_ the first observation in the dataset),
and similarly

. bysort panel (time) : gen last = value[_N]

is always the last observation within each -panel-
(_not_ in the dataset).

These two examples already give an important hint:
what is within the subscript can be an expression,
and need not be a constant. (The expression need
not even evaluate to an integer.

. di mpg[exp(1)]

is legal Stata, although I can't think of a use for it.
exp(1) gets truncated to 2, by the way.)

So also is this legal Stata:

. by hhid : gen mage = age[mlineno]

Take

hhid  lineno  age  mlineno  mage
   1       1   32        .     .
   1       2   30        .     .
   1       3    5        2    30

Each expression within [ ] is evaluated separately
for each observation. For the first and second,
age[mlineno] becomes age[.], which is taken as missing.
For the third, age[mlineno] becomes age[2], which by the
wrinkle rule above is 30. It is the age in the second
observation _within that group_.

As far as -by:- is concerned, see also

SJ-2-1  pr0004  . . . . . . . . . .  Speaking Stata: How to move step by: step
        Q1/02   SJ 2(1):86-102                                 (no commands)
        explains the use of the by varlist : construct to tackle
        a variety of problems with group structure, ranging from
        simple calculations for each of several groups to more
        advanced manipulations that use the built-in _n and _N

For another example of cute subscript use, look inside
the code for -qqplot- (or of -qplot- from SSC, borrowing
the same trick).

The same idea in general is what I call "cosorting".

Cosorting sorts each variable in a varlist and replaces
variables so that all are in sorted order, aligned so that
the first of each is in the first observation, the second of
each is in the second observation, and so on. Variables may
be numeric or string.

Suppose we have

a  b   c
3  7  13
1  8  12
2  9  11

After cosorting we have

a  b   c
1  7  11
2  8  12
3  9  13

Warning: this is rarely needed and destroys information in
your data set in so far as values in each observation are
typically not kept together.

Anyway, here is one way to do it:

program define cosort
*! 1.0.0 NJC 3 November 1999
        version 6
        syntax varlist(min=2) [if] [in]
        tokenize `varlist'
        tempvar touse order
        mark `touse' `if' `in'
        * flip the marker so that selected observations sort first
        qui replace `touse' = 1 - `touse'
        * sort on the first variable and record each observation's position
        sort `touse' `1'
        gen long `order' = _n
        mac shift
        qui while "`1'" != "" {
                tempvar copy
                local type : type `1'
                gen `type' `copy' = `1'
                * sort on the current variable, then use the recorded
                * position as a subscript into the sorted copy
                sort `touse' `1'
                replace `1' = `copy'[`order']
                drop `copy'
                mac shift
        }
        sort `order'
end

Nick
n.j.cox@durham.ac.uk

Scott Merryman

> Nick,
>
> Could you please explain how this -gen mage = age[mlineno]-
> works or where I
> could find it.  I realize that square brackets are used for explicit
> subscripting, but is not clear to me how this working.

Nick Cox

> > Looks like
> >
> > by hhid : gen mage = age[mlineno]
> >
> > > <snip>
>
> > > hhid  lineno  age  mlineno  mage
> > >    1       1   32        .     .
> > >    1       2   30        .     .
> > >    1       3    5        2    30
> > >    2       1   68        .     .
> > >    2       2   41        1    68
> > >    2       3   40        .     .
> > >    2       4   17        3    40
> > >    2       5   14        3    40
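
The by-group subscripting explained in this message can be tried directly. What follows is only a minimal sketch, assuming the household data are exactly the eight observations in the quoted listing. The variables -first- and -last- are added here for illustration and were not part of the original question.

```stata
* Rebuild the example dataset from the quoted listing
clear
input hhid lineno age mlineno
1 1 32 .
1 2 30 .
1 3  5 2
2 1 68 .
2 2 41 1
2 3 40 .
2 4 17 3
2 5 14 3
end

* Under -by:-, age[mlineno] is the age in the observation whose
* position within the household equals mlineno; a missing
* subscript yields a missing result
bysort hhid (lineno) : gen mage = age[mlineno]

* Illustrative extras: first and last age within each household
bysort hhid (lineno) : gen first = age[1]
bysort hhid (lineno) : gen last = age[_N]

list, sepby(hhid) noobs
```

The -mage- values produced this way should agree with the -mage- column in the quoted listing, since each subscript is resolved within -hhid- rather than across the whole dataset.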
