# RE: st: differencing

From: "Eric G. Wruck"
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: differencing
Date: Mon, 14 Nov 2005 11:35:40 -0500

```
Yeah, I goofed.  For one thing, I entered the data incorrectly.  I was trying to follow what Gregor said he wanted, which I'm not sure I understood or that he stated clearly.  I fully acknowledge that using the D. operator, which you & Kit suggested, is probably the way to go.

Nevertheless, I want to try to correct what I did earlier.  I added a third observation for one of the state county combinations.  I am assuming that Gregor wants a difference in employment from one year to the next within state & county.  So here goes:

. sort state county year

. l

     +----------------------------------+
     | year   state   county   employ~t |
     |----------------------------------|
  1. |    1       1        1         10 |
  2. |    2       1        1         20 |
  3. |    3       1        1         22 |
  4. |    1       2        1         15 |
  5. |    2       2        1         30 |
     +----------------------------------+

. bysort state county: gen diff = employment - employment[_n - 1]
(2 missing values generated)

. l

     +-----------------------------------------+
     | year   state   county   employ~t   diff |
     |-----------------------------------------|
  1. |    1       1        1         10      . |
  2. |    2       1        1         20     10 |
  3. |    3       1        1         22      2 |
  4. |    1       2        1         15      . |
  5. |    2       2        1         30     15 |
     +-----------------------------------------+

If I understand the tsset stuff at all, that approach would force Gregor to come to terms with any date gaps & duplicate years which my approach glosses over.  Is that right?

Eric

>There are two issues here: what to calculate and
>how to do it. Eric's example presumes two
>estimates for each combination of state, county, year
>and wanting to find the difference between them.
>Evidently this could arise, but on the face of it
>I would guess rather at
>
>bysort state county (year) : gen diff = emp - emp[_n-1]
>
>i.e. the difference between each year and the previous.
>
>A more robust approach would be to -tsset-
>
>egen countyid = group(state county), label
>tsset countyid year
>gen diff = D.emp
>
>Nick
>n.j.cox@durham.ac.uk
>
>Eric G. Wruck
>
>> You were close but your generate (gen) statement wasn't quite right.
>>
>>
>> . bysort year state county: gen employdiff = employment - employment[_n - 1]
>> (2 missing values generated)
>>
>> . l, noobs
>>
>>   +---------------------------------------------+
>>   | year   state   county   employ~    employ~f |
>>   |---------------------------------------------|
>>   |    1       1        1         10          . |
>>   |    1       1        1         15          5 |
>>   |    2       2        1         20          . |
>>   |    2       2        1         30         10 |
>>   +---------------------------------------------+
>
>> >My data is structured as follows
>> >
>> >year    state   county   employment
>> >1            1         1            10
>> >2            1         1            20
>> >1            2         1            15
>> >2            2         1            30
>> >...
>> >for 6 years, 50 states, and some counties in each state. I have 1.5 million observations.
>> >
>> >I want to construct a variable that is the difference in employment by year in each state and county.
>> >
>> >I tried
>> >
>> >by year state county, sort: gen newvar = employment-employment[_n-1]
>> >
>> >but that didn't work.

```
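
Eric's closing question, whether -tsset- would surface date gaps and duplicate years that the `[_n-1]` approach glosses over, can be checked with a small sketch. The toy data below are an assumption for illustration only: they are the thread's example with year 3 replaced by year 4, so that one county has a gap. The variable names `diff_n`, `diff_d`, and `diff_wrong` are hypothetical and do not appear in the thread.

```stata
* Sketch only: toy data with a gap (no year 3 for state 1, county 1)
clear
input year state county employment
1 1 1 10
2 1 1 20
4 1 1 22
1 2 1 15
2 2 1 30
end

* Subscript approach: subtracts the previous row within state and county,
* so the year 4 row is differenced against year 2 without any warning
bysort state county (year): gen diff_n = employment - employment[_n-1]

* Time-series approach: D. looks for the observation exactly one year earlier.
* If a panel had repeated years, tsset would stop here with an error
* ("repeated time values within panel"); -duplicates report state county year-
* is one way to inspect the data beforehand.
egen countyid = group(state county)
tsset countyid year
gen diff_d = D.employment

* The original attempt: every year-state-county group holds one observation
* in these data, so there is no previous row and every value is missing
bysort year state county: gen diff_wrong = employment - employment[_n-1]

sort countyid year
list year state county employment diff_n diff_d diff_wrong, sepby(countyid)
```

On these data `diff_n` is 2 for the year 4 row, while `diff_d` is missing there because no year 3 observation exists. That is the sense in which the -tsset- route makes gaps and duplicates visible rather than silently differencing across them.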