Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <njcoxstata@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Creating long, filledin dataset from two, year variables |
Date | Sun, 6 Mar 2011 09:03:47 +0000 |
I think you guessed wrong. This is just a wide structure. I don't know how you are going to fill in populations after the first, but this is a start. You could -reshape long-.

    clear
    input id pop startyear endyear
    1 11000 1818 1822
    2 1500 1820 1824
    3 15000 1820 1823
    4 2200 1821 1836
    5 2000 1821 1840
    6 125000 1821 1828
    end

    gen nyears = endyear - startyear + 1
    rename startyear year1
    rename endyear year2
    reshape long year, i(id)
    replace pop = . if _j == 2
    * one _j == 1 row plus (nyears - 1) copies of the _j == 2 row gives nyears rows per id
    expand nyears - 1 if _j == 2
    bysort id (_j) : replace year = year[_n-1] + 1 if _j > 1
    drop nyears _j

Here is some reading: help for -reshape-, -expand-.

FAQ: Problems with reshape (N. J. Cox, 12/03)
    I am having problems with the reshape command. Can you give further guidance?
    http://www.stata.com/support/faqs/data/reshape3.html

FAQ: Replacing missing values (N. J. Cox, 2/03)
    How can I replace missing values with previous or following nonmissing values?
    http://www.stata.com/support/faqs/data/missing.html

I gave the -reshape- solution because it is always worth knowing about -reshape-. But there is a more direct solution too:

    clear
    input id pop startyear endyear
    1 11000 1818 1822
    2 1500 1820 1824
    3 15000 1820 1823
    4 2200 1821 1836
    5 2000 1821 1840
    6 125000 1821 1828
    end

    gen nyears = endyear - startyear + 1
    expand nyears
    gen year = startyear
    bysort id : replace year = year[_n-1] + 1 if _n > 1
    replace pop = . if year != startyear
    drop nyears startyear endyear

On Sun, Mar 6, 2011 at 4:33 AM, Kevin O'Connell <kevocon@gmail.com> wrote:
> I am trying to move a dataset that was built with a start year, start
> year population and end year to having a long format. I guessed this
> was a double-wide dataset, but I couldn't get my variables to match up,
> or to fillin.
>
> id  pop     startyear  endyear
> 1   11000   1818       1822
> 2   1500    1820       1824
> 3   15000   1820       1823
> 4   2200    1821       1836
> 5   2000    1821       1840
> 6   125000  1821       1828
>
> I am trying to get the dataset into this format so I can fill in
> missing variables for population over the time span between start and
> end:
>
> id  year  pop
> 1   1818  11000
> 1   1819
> 1   1820
> 1   1821
> 1   1822
> 2   1820  1500
> 2   1821
> 2   1822
> 2   1823
> 2   1824
>
> and so on.
> There are about 5000 years within 500 id, so I am hoping to find a
> better solution than data entry, but I don't know the right
> term/command for what I am trying to do.
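
A quick way to confirm that the direct -expand- approach in the reply above produces one row per id and year is to run a few -assert- checks before dropping the helper variables. The following is only a sketch on the six example rows from the thread. The checks and the expected total of 58 rows (the sum of endyear - startyear + 1 over those six ids) are additions for illustration, not part of the original reply.

```stata
* Sketch: rebuild the example data, expand it, then check the result.
clear
input id pop startyear endyear
1 11000 1818 1822
2 1500 1820 1824
3 15000 1820 1823
4 2200 1821 1836
5 2000 1821 1840
6 125000 1821 1828
end

* One row per year in each id's span, population kept only in the start year
gen nyears = endyear - startyear + 1
expand nyears
gen year = startyear
bysort id : replace year = year[_n-1] + 1 if _n > 1
replace pop = . if year != startyear

* Checks (illustrative additions): each stops with an error if it fails
isid id year
assert _N == 58
bysort id (year) : assert _N == nyears
bysort id (year) : assert year[1] == startyear & year[_N] == endyear
bysort id (year) : assert year == year[_n-1] + 1 if _n > 1
assert missing(pop) == (year != startyear)

drop nyears startyear endyear
list if id == 1, noobs
```

If every -assert- passes silently, each id runs over consecutive years from its start year to its end year, with population present only in the first year.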