Re: st: xt: unit-specific trends
From: "William Gould, StataCorp LP" <[email protected]>
To: [email protected]
Subject: Re: st: xt: unit-specific trends
Date: Thu, 19 Apr 2012 13:36:59 -0500
Laszlo <[email protected]> wrote, 
> I used "if `touse'" because that is the official way to make a program
> byable (http://www.stata.com/help.cgi?byable). If there is any case
> where the -if- condition need not be checked for the entire dataset, a
> -by: - run is that, isn't it? 
Laszlo is wrong in assuming that the data are necessarily sorted.  They
need not be, and thus -if `touse'- is the official way to program this case.
The problem for -by- is that it is turning control over to a
user-written program, and it is not uncommon for user-written programs
to re-sort the data and then not put them back into the original
order.  So -by- was written to accommodate that.
If you as a programmer know that the data will still be sorted,
you can convert the -if `touse'- into an -in- range by coding,
        tempvar x
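        * index only the marked observations; `touse'*_n would put 0 in
        * the unmarked ones and r(min) would then be 0
        quietly gen long `x' = _n if `touse'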
        quietly sum `x', meanonly 
        local first = r(min)
        local last  = r(max)
        drop `x'
In the rest of your code you can then code -in `first'/`last'- instead 
of -if `touse'-.
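The conversion only makes sense when the marked observations form one contiguous block.  What follows is a minimal sketch of how that could be checked, not a tested recipe.  The program name -tryrange- is hypothetical, and -list rep78- assumes the auto dataset is in memory.  The index is generated only for marked observations so that r(min) is not 0, and the -in- range is used only when the length of the block equals the number of marked observations; otherwise the code falls back to -if `touse'-.

```stata
* Sketch only: -tryrange- is a hypothetical name, not an official command.
program tryrange, byable(recall)
        syntax [if] [in]
        marksample touse

        * observation number for marked observations, missing otherwise
        tempvar x
        quietly gen long `x' = _n if `touse'
        quietly summarize `x', meanonly
        local n     = r(N)
        local first = r(min)
        local last  = r(max)
        drop `x'

        * nothing marked: nothing to do
        if `n' == 0 {
                exit
        }

        * the marked observations are contiguous only if the block
        * length equals the number of marked observations
        if `last' - `first' + 1 == `n' {
                list rep78 in `first'/`last'
        }
        else {
                list rep78 if `touse'
        }
end
```

The fallback is what keeps the sketch safe: if some earlier call re-sorted the data, the -in- range is simply not used.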
There may be a quicker way to convert an -if `touse'- into an -in- range.
This is just the first way that occurred to me.  
I would still be hesitant to use -in- range instead of -if `touse'-
because I would need to be certain that every command I used in my
ado-file did not change the sort order.
Here's a demonstration of a by-able program that re-sorts the data 
and yet still produces the expected results because it is coded using 
-if `touse'-:
        . program tryit, byable(recall)
          1.         di "hi"
          2.         syntax
          3.         marksample touse
          4.         list rep78 if `touse'
          5.         sort mpg
          6. end
        . sysuse auto, clear 
        (1978 Automobile Data)
        . sort rep78 
        . by rep78: tryit
        --------------------------------------
        -> rep78 = 1
        hi
             +-------+
             | rep78 |
             |-------|
          1. |     1 |
          2. |     1 |
             +-------+
        --------------------------------------
        -> rep78 = 2
        hi
             +-------+
             | rep78 |
             |-------|
          3. |     2 |
         14. |     2 |
         15. |     2 |
         22. |     2 |
         24. |     2 |
             |-------|
         45. |     2 |
         52. |     2 |
         53. |     2 |
             +-------+
        <remaining output omitted>
        . _
When -tryit- was called the first time to process rep78==1, the data
were in order, and we see that, as expected, the observations for
which rep78 is 1 are at the top of the dataset, namely in observations
1 and 2.  Now look at the -tryit- code.  -tryit-, just before exiting,
re-sorts the data!
So, the second time -tryit- is called, when -tryit- is called to
process the rep78 = 2 data, the observations will not be in order.
And we can see that in the listing.  The listing was produced by
coding -list rep78 if `touse'- and, just as one would hope, all the
observations for which `touse' contains 1 are rep78==2 observations.
This time, however, the data are no longer in order.  The observations
for which `touse' is 1 are observations 3, 14, 15, 22, 24, 45, 52, and
53.  It didn't matter, however, because we coded -if `touse'-.
-by- plus -tryit- still produced correct results. 
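For comparison, here is a sketch of the same program declared with the -sortpreserve- option of -program-, which asks Stata to restore the sort order on entry when the program exits.  The name -tryit2- is hypothetical.  Under that assumption the marked observations should stay contiguous from one call to the next, but the program is still coded with -if `touse'-, so it does not depend on that.

```stata
* Sketch only: -tryit2- is a hypothetical name.
* -sortpreserve- restores the sort order on entry when the program exits.
program tryit2, byable(recall) sortpreserve
        syntax
        marksample touse
        list rep78 if `touse'
        * re-sorting is undone on exit because of -sortpreserve-
        sort mpg
end

sysuse auto, clear
sort rep78
by rep78: tryit2
```

Restoring the order has a cost of its own, so this trades some speed for not having to check each command in the ado-file by hand.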
Our thinking when we coded -by- and made the recommendation of using 
-if `touse'- was that sometimes it is better to produce correct
results than to produce incorrect results more quickly.
-- Bill
[email protected]