Re: st: xt: unit-specific trends

From   "William Gould, StataCorp LP"
Subject   Re: st: xt: unit-specific trends
Date   Thu, 19 Apr 2012 13:36:59 -0500

Laszlo wrote,

> I used "if `touse'" because that is the official way to make a program
> byable ( If there is any case
> where the -if- condition need not be checked for the entire dataset, a
> -by: - run is that, isn't it? 

Laszlo is wrong in assuming that the data are necessarily sorted.
They might not be, and thus -if `touse'- is the official way to
program this case.

The problem for -by- is that it is turning control over to a
user-written program, and it is not uncommon for user-written programs
to re-sort the data and then not put them back into the original
order.  So -by- was written to accommodate that.

If you as a programmer know that the data will still be sorted
you can convert the -if `touse'- into an -in- range by coding,

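        tempvar x
        * `x' holds _n where `touse' is 1 and missing elsewhere, so
        * unmarked observations do not pull r(min) down to 0
        quietly gen long `x' = _n if `touse'
        quietly sum `x', meanonly 
        local first = r(min)
        local last  = r(max)
        drop `x'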

In the rest of your code you can then code -in `first'/`last'- instead 
of -if `touse'-.

There may be a quicker way to convert an -if `touse'- into an -in- range.
This is just the first way that occurred to me.  

I would still be hesitant to use -in- range instead of -if `touse'-
because I would need to be certain that every command I used in my
ado-file did not change the sort order.
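
One way to act on that hesitation is to check, before using the -in-
range, that the marked observations really do form one contiguous block.
What follows is only a sketch under stated assumptions: -tryin- is a
hypothetical program name, the choice of return code 459 is mine, and
the temporary variable is built with -gen ... = _n if `touse'- so that
unmarked observations are missing and are ignored by -summarize-.

```stata
* tryin: hypothetical illustration, not an official program
program tryin, byable(recall)
        syntax [if] [in]
        marksample touse

        quietly count if `touse'
        local n = r(N)
        if `n' == 0 error 2000          // no observations

        tempvar x
        quietly gen long `x' = _n if `touse'
        quietly summarize `x', meanonly
        local first = r(min)
        local last  = r(max)
        drop `x'

        * the -in- range is valid only if it contains exactly the
        * marked observations, that is, if they are contiguous
        if `n' != `last' - `first' + 1 {
                display as error "marked observations are not contiguous"
                exit 459
        }

        list rep78 in `first'/`last'
end

sysuse auto, clear
sort rep78
by rep78: tryin
```

The guard costs one extra pass over the data, so it gives back part of
what the -in- range saves, but it turns a silently wrong result into an
error if something has re-sorted the data.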

Here's a demonstration of a by-able program that re-sorts the data 
and yet still produces the expected results because it is coded using 
-if `touse'-:

        . program tryit, byable(recall)
          1.         di "hi"
          2.         syntax
          3.         marksample touse
          4.         list rep78 if `touse'
          5.         sort mpg
          6. end

        . sysuse auto, clear 
        (1978 Automobile Data)

        . sort rep78 

        . by rep78: tryit

        -> rep78 = 1

             | rep78 |
          1. |     1 |
          2. |     1 |

        -> rep78 = 2

             | rep78 |
          3. |     2 |
         14. |     2 |
         15. |     2 |
         22. |     2 |
         24. |     2 |
         45. |     2 |
         52. |     2 |
         53. |     2 |

        <remaining output omitted>

        . _

When -tryit- was called the first time to process rep78==1, the data
were in order, and we see that, as expected, the observations for
which rep78 is 1 are at the top of the dataset, namely in observations
1 and 2.  Now look at the -tryit- code.  -tryit-, just before exiting,
re-sorts the data!

So, the second time -tryit- is called, when -tryit- is called to
process the rep78 = 2 data, the observations will not be in order.
And we can see that in the listing.  The listing was produced by
coding -list rep78 if `touse'- and, just as one would hope, all the
observations for which `touse' contains 1 are rep78==2 observations.
This time, however, the data are no longer in order.  The observations
for which `touse' is 1 are observations 3, 14, 15, 22, 24, 45, 52, and
53.  It didn't matter, however, because we coded -if `touse'-.

-by- plus -tryit- still produced correct results. 
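
For the author of such a program, -program- also accepts a
-sortpreserve- option, which puts the data back in their original order
when the program exits.  The sketch below is a hypothetical variant,
-tryit2-, that differs from -tryit- only in that option and in omitting
the -di-.  With it, each call should find its group in one contiguous
block again.  -by- itself cannot know whether a program it calls was
written this way, so the program still uses -if `touse'-.

```stata
* tryit2: hypothetical variant of tryit, for illustration only
program tryit2, byable(recall) sortpreserve
        syntax
        marksample touse
        list rep78 if `touse'
        sort mpg                // re-sorts, as tryit does; undone on exit
end

sysuse auto, clear
sort rep78
by rep78: tryit2
```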

Our thinking when we coded -by- and made the recommendation of using 
-if `touse'- was that sometimes it is better to produce correct
results than to produce incorrect results more quickly.

-- Bill