Joerg Luedicke <joerg.luedicke@gmail.com>

Re: st: rectangulizing data

Thu, 26 May 2011 12:10:13 -0400

On Thu, May 26, 2011 at 11:19 AM, Dmitriy Krichevskiy <krichevskyd@gmail.com> wrote:
> Dear Listers,
> I am trying to figure out the simplest way to convert a large panel
> dataset from monthly to annual income. The income is only reported
> monthly and I would want to clean the data of anyone missing a month
> in a particular year. I would like to drop observations for that
> person-year only and keep that person if they are fully present in
> some other year. Here is an equivalent data structure. As always,
> thanks a lot for your help.
> Dmitriy
>
> ID  Month  Income
> 1   1      1000
> 1   2      500
> 1   3      1000
> 1   13     0
> 1   14     0
> 1   15     0
> 1   16     0
> 1   17     600
> 1   18     1000
> 1   19     1000
> 1   20     1000
> 1   21     1000
> 1   22     1000
> 1   23     660
> 1   24     800
> 1   25     1200
> 2   1      2400
> 2   2      2400
> 2   5      2600

As Oliver already mentioned, your example data is confusing. Anyway, if I understand you correctly, you want to drop all the months that fall into a given year (a calendar year, I assume) whenever valid income data is available for fewer than 12 months of that year. So, for example, if ID X has 10 months of income data in year Y, you want to drop year Y entirely for that ID?

I find it hard to imagine circumstances in which that would make a lot of sense, as you would discard a ton of information for no apparent reason. I would think even a simple imputation approach would fare better here (for example, replacing the missing data with income information from a previous month).

In any case, what is the reason for the missingness of income information here?

J.
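The two options discussed above can be compared on the example data with a minimal sketch. It is written in Python rather than Stata, and it rests on assumptions that are not stated in the thread: months are numbered consecutively from 1, so that months 1-12 form year 1 and months 13-24 form year 2, and the "simple imputation" is taken to mean carrying the last observed income forward across gaps between a person's first and last observed month. The function names `annual_income` and `carry_forward` are made up for illustration.

```python
from collections import defaultdict

# (ID, Month, Income) rows from the example in the question.
rows = [
    (1, 1, 1000), (1, 2, 500), (1, 3, 1000),
    (1, 13, 0), (1, 14, 0), (1, 15, 0), (1, 16, 0),
    (1, 17, 600), (1, 18, 1000), (1, 19, 1000), (1, 20, 1000),
    (1, 21, 1000), (1, 22, 1000), (1, 23, 660), (1, 24, 800),
    (1, 25, 1200),
    (2, 1, 2400), (2, 2, 2400), (2, 5, 2600),
]


def annual_income(rows):
    """Sum income per (ID, year), keeping only years with all 12 months."""
    months = defaultdict(dict)
    for pid, month, income in rows:
        year = (month - 1) // 12 + 1  # assumption: months 1-12 are year 1
        months[(pid, year)][month] = income
    return {key: sum(m.values()) for key, m in months.items() if len(m) == 12}


def carry_forward(rows):
    """Fill gaps between observed months with the last observed income."""
    by_id = defaultdict(dict)
    for pid, month, income in rows:
        by_id[pid][month] = income
    filled = []
    for pid, obs in sorted(by_id.items()):
        last = None
        # Only gaps between the first and last observed month are filled;
        # nothing is extrapolated beyond a person's last observation.
        for month in range(min(obs), max(obs) + 1):
            last = obs.get(month, last)
            filled.append((pid, month, last))
    return filled


strict = annual_income(rows)                   # drop incomplete person-years
imputed = annual_income(carry_forward(rows))   # impute first, then aggregate

print(strict)
print(imputed)
```

Under these assumptions, the strict rule keeps only year 2 for ID 1, while carrying values forward also recovers year 1 for that person. ID 2 is dropped either way, because its last observed month is month 5 and nothing is extrapolated past it. Whether carry-forward is defensible depends on why the months are missing, which is the open question in the reply.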

