Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org** is already up and running.


From: Dmitriy Krichevskiy <krichevskyd@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: rectangulizing data

Date: Thu, 26 May 2011 12:52:06 -0400

Thank you for your responses; I apologize for the confusion. To clarify: the data come from the Survey of Income and Program Participation (SIPP), and my particular dataset combines 7 years of data. The data are collected quarterly and recorded monthly (via phone interviews). Hence time=14 is the second month of the second year. Many people in this sample miss interviews often, and income also exhibits a lot of volatility (I still do not know why).

My goal is to analyze income transitions from quintile to quintile (via -xttrans-), and for annual income I need to aggregate monthly income while distinguishing zero income from missing income. Hence, I am trying to drop people who have only a few months of income on record for those years where their information is incomplete, while keeping the same people for other years in which they have all the income information recorded. Given the very large volatility and the many missing interviews, I am not sure imputing income is harmless.

On 5/26/11, Nick Cox <njcoxstata@gmail.com> wrote:
> I think this might need to be
>
> bysort ID year: egen obs = count(month)
>
> -- perhaps after some work --
>
> but as is agreed the example is unclear.
>
> On 26 May 2011, at 16:52, Oliver Jones <ojones@wiwi.uni-bielefeld.de> wrote:
>
>> Hi,
>> your example data structure is a bit confusing since you have month
>> greater than 12... I'll assume you have at most 12 months per person
>> per year.
>>
>> Maybe this can help to drop people who have fewer than 12 observations
>> for one particular year. Let's assume this year is 2006.
>>
>> bysort ID: egen obs = count(Month)
>> drop if year == 2006 & obs < 12
>>
>> Does it work?
>>
>> Best
>> Oliver
>>
>> On 26.05.2011 17:19, Dmitriy Krichevskiy wrote:
>>> Dear Listers,
>>> I am trying to figure out the simplest way to convert a large panel
>>> dataset from monthly to annual income. The income is only reported
>>> monthly and I would want to clean the data of anyone missing a month
>>> in a particular year. I would like to drop observations for that
>>> person-year only and keep that person if they are fully present in
>>> some other year. Here is an equivalent data structure. As always,
>>> thanks a lot for your help.
>>> Dmitriy
>>>
>>> ID  Month  Income
>>> 1   1      1000
>>> 1   2      500
>>> 1   3      1000
>>> 1   13     0
>>> 1   14     0
>>> 1   15     0
>>> 1   16     0
>>> 1   17     600
>>> 1   18     1000
>>> 1   19     1000
>>> 1   20     1000
>>> 1   21     1000
>>> 1   22     1000
>>> 1   23     660
>>> 1   24     800
>>> 1   25     1200
>>> 2   1      2400
>>> 2   2      2400
>>> 2   5      2600

--
Dmitriy Krichevskiy
Ph.D. Candidate
Economics Department
Florida International University
www.fiu.edu/~dkrichev
Research Associate, College of Education
Lumina Foundation Project

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
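The person-year cleaning discussed in this thread might be sketched in Stata roughly as follows. This is only an illustration built on the example data from the original post, not the approach any poster settled on. The variable names (`id`, `time`, `income`, `year`, `nmonths`) are assumptions, as is the rule that a year is the block of 12 consecutive values of the running month counter (so that time=14 falls in year 2, as described above).

```stata
* Illustrative sketch only: variable names and the year rule are assumptions.
clear
input id time income
1  1 1000
1  2  500
1  3 1000
1 13    0
1 14    0
1 15    0
1 16    0
1 17  600
1 18 1000
1 19 1000
1 20 1000
1 21 1000
1 22 1000
1 23  660
1 24  800
1 25 1200
2  1 2400
2  2 2400
2  5 2600
end

* The month counter runs on across years: 1-12 is year 1, 13-24 is year 2, ...
generate year = ceil(time/12)

* count() ignores missing values, so a recorded zero counts as an observed
* month while a missing income (or an absent row) does not.
bysort id year: egen nmonths = count(income)

* Keep only person-years with all 12 months of income on record.
keep if nmonths == 12

* Sum monthly income within each complete person-year.
collapse (sum) income, by(id year)
list, clean
```

Counting within `id year` rather than within `id` alone follows Nick Cox's suggestion, so an incomplete year removes only that person-year. Filtering before `collapse` matters because `collapse (sum)` treats missing values as zero, which would otherwise blur the distinction between zero and missing income. The resulting annual file could then be declared with `xtset id year` before income quintiles are formed for -xttrans-.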

**Follow-Ups**:
- **Re: st: rectangulizing data** *From:* Joerg Luedicke <joerg.luedicke@gmail.com>
- **Re: st: rectangulizing data** *From:* Austin Nichols <austinnichols@gmail.com>

**References**:
- **st: rectangulizing data** *From:* Dmitriy Krichevskiy <krichevskyd@gmail.com>
- **Re: st: rectangulizing data** *From:* Oliver Jones <ojones@wiwi.uni-bielefeld.de>
- **Re: st: rectangulizing data** *From:* Nick Cox <njcoxstata@gmail.com>
