Statalist The Stata Listserver



RE: st: RE: help expanding the dataset by n observations


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: help expanding the dataset by n observations
Date   Fri, 24 Nov 2006 16:58:19 -0000

OK, joking aside:

We have two approaches to doing it. The 
comparison is interesting. 

(1) no change of data shape. 

(a) add extra pseudo-observations
(b) -fillin-
(c) drop extra pseudo-observations 

For what I recommended, you don't have to go into -edit-. 
Let's look at the all-command version. 

set obs `= _N + 12' 
tokenize "1983 1986 1989 1990 1993 1994 1997 1998 2001 2002 2003 2004"
forval i = 1/12 { 
	replace year = ``i'' in -`i' 
} 
fillin fip county year 
drop if county == . | fip == . 
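
To try approach (1) without the real data, here is a minimal sketch on toy
data. The toy data are an assumption: they only mimic the two counties and
13 observed years in the listing quoted below, with the same variable names.
The final -assert- checks that each county ends up with every year from
1980 to 2004.

```stata
* Toy data (assumed): two counties, each observed in the same 13 years
clear
input county fip
1 1
3 1
end
expand 13
bysort county fip: gen year = real(word("1980 1981 1982 1984 1985 1987 1988 1991 1992 1995 1996 1999 2000", _n))
gen vbl = cond(county == 1, ., year <= 1982)

* (a) add 12 pseudo-observations holding only the absent years
set obs `= _N + 12'
tokenize "1983 1986 1989 1990 1993 1994 1997 1998 2001 2002 2003 2004"
forval i = 1/12 {
	replace year = ``i'' in -`i'
}

* (b) fill in all combinations, (c) drop the pseudo-observations
fillin fip county year
drop if county == . | fip == .

* Check: 25 consecutive years, 1980 to 2004, within each county
bysort county fip (year): assert _N == 25 & year == 1979 + _n
```

Note that -fillin- also leaves behind an indicator variable _fillin marking
the observations it created, which may be handy when coding the new values.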

(2) change of data shape. 

(a) -reshape-
(b) new variables
(c) -reshape- back 

egen id=group(county fip)
reshape wide vbl, i(id) j(year)
foreach y of num 1983 1986 1989 1990 1993 1994 1997 1998 2001/2004 { 
	gen vbl`y' = .
}
reshape long vbl, i(id) j(year)
drop id 
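
Approach (2) can be tried the same way. Again a sketch only: the toy data
are assumed, built to resemble the listing quoted below, and the last range
in the numlist is written 2001/2004, the range form that the numlist syntax
documents.

```stata
* Toy data (assumed): two counties, each observed in the same 13 years
clear
input county fip
1 1
3 1
end
expand 13
bysort county fip: gen year = real(word("1980 1981 1982 1984 1985 1987 1988 1991 1992 1995 1996 1999 2000", _n))
gen vbl = cond(county == 1, ., year <= 1982)

* (a) reshape to wide: one row per county, one vbl variable per year
egen id = group(county fip)
reshape wide vbl, i(id) j(year)

* (b) add an all-missing variable for each absent year
foreach y of numlist 1983 1986 1989 1990 1993 1994 1997 1998 2001/2004 {
	gen vbl`y' = .
}

* (c) reshape back to long; the new years arrive as new observations
reshape long vbl, i(id) j(year)
drop id

* Check: 25 consecutive years, 1980 to 2004, within each county
bysort county fip (year): assert _N == 25 & year == 1979 + _n
```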

So it looks like 7 lines of code in each case. I eat some words.... 

Nick 
n.j.cox@durham.ac.uk 

Scott Cunningham
> Sent: 24 November 2006 16:07
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: RE: help expanding the dataset by n observations
> 
> 
> On Nov 24, 2006, at 10:57 AM, Nick Cox wrote:
> 
> > No, no, no. You want extra observations.
> > Try my advice. It should take about two minutes.
> > -reshape- is wonderful, but not the answer here.
> 
> Really?  It is making extra observations.  I have to reshape first
> wide, generate the variables with the missing data, then reshape
> back, but it worked.  Here's what I did.  Here's a sample slice of
> my dataset:
> 
>      +-----------------------------+
>      | county   fip   year     vbl |
>      |-----------------------------|
>   1. |      1     1   1980       . |
>   2. |      1     1   1981       . |
>   3. |      1     1   1982       . |
>   4. |      1     1   1984       . |
>   5. |      1     1   1985       . |
>      |-----------------------------|
>   6. |      1     1   1987       . |
>   7. |      1     1   1988       . |
>   8. |      1     1   1991       . |
>   9. |      1     1   1992       . |
>  10. |      1     1   1995       . |
>      |-----------------------------|
>  11. |      1     1   1996       . |
>  12. |      1     1   1999       . |
>  13. |      1     1   2000       . |
>  14. |      3     1   1980       1 |
>  15. |      3     1   1981       1 |
>      |-----------------------------|
>  16. |      3     1   1982       1 |
>  17. |      3     1   1984       0 |
>  18. |      3     1   1985       0 |
>  19. |      3     1   1987       0 |
>  20. |      3     1   1988       0 |
>      |-----------------------------|
>  21. |      3     1   1991       0 |
>  22. |      3     1   1992       0 |
>  23. |      3     1   1995       0 |
>  24. |      3     1   1996       0 |
>  25. |      3     1   1999       0 |
>      |-----------------------------|
>  26. |      3     1   2000       0 |
>      +-----------------------------+
> 
> 
> . egen id=group(county fip)
> . reshape wide vbl, i(id) j(year)
> . gen vbl1983=.
> . gen vbl1986=.
> ...
> . gen vbl2004=.
> . reshape long vbl, i(id) j(year)
> . list
> 
>      +----------------------------------+
>      | id   year   county   fip     vbl |
>      |----------------------------------|
>   1. |  1   1980        1     1       . |
>   2. |  1   1981        1     1       . |
>   3. |  1   1982        1     1       . |
>   4. |  1   1983        1     1       . |
>   5. |  1   1984        1     1       . |
>      |----------------------------------|
>   6. |  1   1985        1     1       . |
>   7. |  1   1986        1     1       . |
>   8. |  1   1987        1     1       . |
>   9. |  1   1988        1     1       . |
>  10. |  1   1989        1     1       . |
>      |----------------------------------|
>  11. |  1   1990        1     1       . |
>  12. |  1   1991        1     1       . |
>  13. |  1   1992        1     1       . |
>  14. |  1   1993        1     1       . |
>  15. |  1   1994        1     1       . |
>      |----------------------------------|
>  16. |  1   1995        1     1       . |
>  17. |  1   1996        1     1       . |
>  18. |  1   1997        1     1       . |
>  19. |  1   1998        1     1       . |
>  20. |  1   1999        1     1       . |
>      |----------------------------------|
>  21. |  1   2000        1     1       . |
>  22. |  1   2001        1     1       . |
>  23. |  1   2002        1     1       . |
>  24. |  1   2003        1     1       . |
>  25. |  1   2004        1     1       . |
>      |----------------------------------|
>  26. |  2   1980        3     1       1 |
>  27. |  2   1981        3     1       1 |
>  28. |  2   1982        3     1       1 |
>  29. |  2   1983        3     1       . |
>  30. |  2   1984        3     1       0 |
>      |----------------------------------|
>  31. |  2   1985        3     1       0 |
>  32. |  2   1986        3     1       . |
>  33. |  2   1987        3     1       0 |
>  34. |  2   1988        3     1       0 |
>  35. |  2   1989        3     1       . |
>      |----------------------------------|
>  36. |  2   1990        3     1       . |
>  37. |  2   1991        3     1       0 |
>  38. |  2   1992        3     1       0 |
>  39. |  2   1993        3     1       . |
>  40. |  2   1994        3     1       . |
>      |----------------------------------|
>  41. |  2   1995        3     1       0 |
>  42. |  2   1996        3     1       0 |
>  43. |  2   1997        3     1       . |
>  44. |  2   1998        3     1       . |
>  45. |  2   1999        3     1       0 |
>      |----------------------------------|
>  46. |  2   2000        3     1       0 |
>  47. |  2   2001        3     1       . |
>  48. |  2   2002        3     1       . |
>  49. |  2   2003        3     1       . |
>  50. |  2   2004        3     1       . |
>      +----------------------------------+
> 
> Then I need to code the missing values (omitted here).  This
> solution works with many different counties, and so seemed more
> efficient than going through -edit-.



