# RE: st: differencing

From: "Nick Cox"
Subject: RE: st: differencing
Date: Mon, 14 Nov 2005 16:44:01 -0000

```
Indeed. Your example is well behaved, but
as we all know, real data need not be. Also,
it is easy to get confused on the details
of the -by:-, so that for example

bysort state county :

does not _guarantee_ that -year- is in the
right order within -state- and -county-.
Once -tsset-, neither pitfall should catch you.

Nick
n.j.cox@durham.ac.uk

Eric G. Wruck

> Yeah, I goofed.  For one thing, I entered the data
> incorrectly.  I was trying to follow what Gregor said he
> wanted, which I'm not sure I understood or that he wrote down
> clearly.  I fully acknowledge that using the D. operator
> --which you & Kit suggested-- is probably the way to go.
>
> Nevertheless, I want to try to correct what I did earlier.  I
> added a third observation for one of the state county
> combinations.  I am assuming that Gregor wants a difference
> in employment from one year to the next within state &
> county.  So here goes:
>
> . sort state county year
>
> . l
>
>      +----------------------------------+
>      | year   state   county   employ~t |
>      |----------------------------------|
>   1. |    1       1        1         10 |
>   2. |    2       1        1         20 |
>   3. |    3       1        1         22 |
>   4. |    1       2        1         15 |
>   5. |    2       2        1         30 |
>      +----------------------------------+
>
> . bysort state county: gen diff = employment - employment[_n - 1]
> (2 missing values generated)
>
> . l
>
>      +-----------------------------------------+
>      | year   state   county   employ~t   diff |
>      |-----------------------------------------|
>   1. |    1       1        1         10      . |
>   2. |    2       1        1         20     10 |
>   3. |    3       1        1         22      2 |
>   4. |    1       2        1         15      . |
>   5. |    2       2        1         30     15 |
>      +-----------------------------------------+
>
>
>
> If I understand the tsset stuff at all, that approach would
> force Gregor to come to terms with any date gaps & duplicate
> years which my approach glosses over.  Is that right?
>
>
> Eric
>
>
>
> >There are two issues here: what to calculate and
> >how to do it. Eric's example presumes two
> >estimates for each combination of state, county, year
> >and wanting to find the difference between them.
> >Evidently this could arise, but on the face of it
> >I would guess rather at
> >
> >bysort state county (year) : gen diff = emp - emp[_n-1]
> >
> >i.e. the difference between each year and the previous.
> >
> >A more robust approach would be to -tsset-
> >
> >egen countyid = group(state county), label
> >tsset countyid year
> >gen diff = D.emp
> >
> >Nick
> >n.j.cox@durham.ac.uk
> >
> >Eric G. Wruck
> >
> >> You were close but your generate (gen) statement wasn't quite right.
> >>
> >>
> >> . bysort year state county: gen employdiff = employment - employment[_n - 1]
> >> (2 missing values generated)
> >>
> >> . l, noobs
> >>
> >>   +---------------------------------------------+
> >>   | year   state   county   employ~    employ~f |
> >>   |---------------------------------------------|
> >>   |    1       1        1         10          . |
> >>   |    1       1        1         15          5 |
> >>   |    2       2        1         20          . |
> >>   |    2       2        1         30         10 |
> >>   +---------------------------------------------+
> >
> >> >My data is structured as follows
> >> >
> >> >year    state   county   employment
> >> >1            1         1            10
> >> >2            1         1            20
> >> >1            2         1            15
> >> >2            2         1            30
> >> >...
> >> >for 6 years, 50 states, and some counties in each state. I have 1.5 million observations.
> >> >
> >> >I want to construct a variable that is the difference in employment by year in each state and county.
> >> >
> >> >I tried
> >> >
> >> >by year state county, sort: gen newvar = employment-employment[_n-1]
> >> >
> >> >but that didn't work.

```
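
The two approaches discussed in the thread can be put side by side in a minimal sketch. The toy data below follow Eric's example, except that the third observation for state 1, county 1 is moved from year 3 to year 4; that gap is an assumption introduced here purely for illustration, as are the variable names `diff_by` and `diff_ts`. With the gap in place, the `bysort` version should still return 22 - 20 = 2 for year 4, while `D.employment` should return missing there because year 3 is absent.

```stata
clear
input year state county employment
1 1 1 10
2 1 1 20
4 1 1 22
1 2 1 15
2 2 1 30
end
* Note: year 4 (not 3) in the third row is a deliberate, hypothetical gap

* Stop with an error if any state-county-year combination is duplicated
isid state county year

* year in parentheses: sort on it within groups, but do not group by it
bysort state county (year): gen diff_by = employment - employment[_n-1]

* One panel identifier per state-county combination, then declare the panel
egen countyid = group(state county)
tsset countyid year

* D. differences against the previous time value, not the previous observation
gen diff_ts = D.employment

list year state county employment diff_by diff_ts, sepby(countyid) noobs
```

If the data held duplicate years within a county, `tsset` would refuse to declare the panel, so the `isid` check is mainly a way to surface that problem earlier and with the offending variables named explicitly.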