

From: "William Gould, StataCorp LP" <wgould@stata.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: xt: unit-specific trends
Date: Thu, 19 Apr 2012 13:36:59 -0500

Laszlo <sandorl@gmail.com> wrote,

> I used "if `touse'" because that is the official way to make a program
> byable (http://www.stata.com/help.cgi?byable). If there is any case
> where the -if- condition need not be checked for the entire dataset, a
> -by: - run is that, isn't it?

Laszlo is wrong to assume that the data are necessarily sorted when the program is called, and that is why -if `touse'- is the official way to program this case.

The problem for -by- is that it is turning control over to a user-written program, and it is not uncommon for user-written programs to re-sort the data and then not put them back into the original order. So -by- was written to accommodate that.

If you as a programmer know that the data will still be sorted, you can convert the -if `touse'- into an -in- range by coding,

        tempvar x
        * _n for the selected observations, missing otherwise, so that
        * r(min) and r(max) bracket only the selected observations
        quietly gen long `x' = _n if `touse'
        quietly sum `x', meanonly
        local first = r(min)
        local last = r(max)
        drop `x'

In the rest of your code you can then code -in `first'/`last'- instead of -if `touse'-.

There may be a quicker way to convert an -if `touse'- into an -in- range. This is just the first way that occurred to me.

I would still be hesitant to use an -in- range instead of -if `touse'- because I would need to be certain that every command I used in my ado-file did not change the sort order.

Here's a demonstration of a by-able program that re-sorts the data and yet still produces the expected results because it is coded using -if `touse'-:

        . program tryit, byable(recall)
          1.         di "hi"
          2.         syntax
          3.         marksample touse
          4.         list rep78 if `touse'
          5.         sort mpg
          6. end

        . sysuse auto, clear
        (1978 Automobile Data)

        . sort rep78

        . by rep78: tryit

        --------------------------------------
        -> rep78 = 1
        hi

             +-------+
             | rep78 |
             |-------|
          1. |     1 |
          2. |     1 |
             +-------+

        --------------------------------------
        -> rep78 = 2
        hi

             +-------+
             | rep78 |
             |-------|
          3. |     2 |
         14. |     2 |
         15. |     2 |
         22. |     2 |
         24. |     2 |
             |-------|
         45. |     2 |
         52. |     2 |
         53. |     2 |
             +-------+

        <remaining output omitted>

When -tryit- was called the first time to process rep78==1, the data were in order, and we see that, as expected, the observations for which rep78 is 1 are at the top of the dataset, namely in observations 1 and 2.

Now look at the -tryit- code. -tryit-, just before exiting, re-sorts the data! So, the second time -tryit- is called, to process the rep78==2 data, the observations will not be in order. And we can see that in the listing. The listing was produced by coding -list rep78 if `touse'- and, just as one would hope, all the observations for which `touse' contains 1 are rep78==2 observations. This time, however, the data are no longer in order. The observations for which `touse' is 1 are observations 3, 14, 15, 22, 24, 45, 52, and 53.

It didn't matter, however, because we coded -if `touse'-. -by- plus -tryit- still produced correct results.

Our thinking when we coded -by- and made the recommendation of using -if `touse'- was that sometimes it is better to produce correct results than to produce incorrect results more quickly.

-- Bill
wgould@stata.com

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
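
The worry about relying on an -in- range can also be checked at run time rather than assumed. The following is a minimal sketch, not part of the original message: the program name -tryit2- is hypothetical, and the check simply compares the number of selected observations with the width of the range they span. If the two agree, the selected observations are contiguous and the -in- range is safe; otherwise the program falls back to -if `touse'-.

```stata
program tryit2, byable(recall)
        syntax
        marksample touse

        * record _n for the selected observations only (missing elsewhere)
        tempvar x
        quietly gen long `x' = _n if `touse'
        quietly summarize `x', meanonly
        local first = r(min)
        local last  = r(max)

        * r(N) selected observations spanning exactly r(N) rows means
        * the selection is one contiguous block
        if r(N) == `last' - `first' + 1 {
                list rep78 in `first'/`last'
        }
        else {
                list rep78 if `touse'
        }
end
```

When no observations are selected, r(min) and r(max) are missing, the comparison is false, and the -if `touse'- branch is used. Note that the check itself passes over the whole dataset once, so any speed gain comes only from the commands that follow it.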
