



Re: st: rectangulizing data


From   Dmitriy Krichevskiy <krichevskyd@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: rectangulizing data
Date   Thu, 26 May 2011 12:52:06 -0400

Thank you for your responses; I apologize for the confusion.

Clarification, then:

The data come from the Survey of Income and Program Participation (SIPP),
and my particular dataset combines 7 years of data. The data are
collected quarterly and recorded monthly (via phone interviews). Hence
time=14 is the second month of the second year. Many people in this
sample miss interviews often, and income exhibits a lot of volatility
(I still do not know why). My goal is to analyze income transitions
from quintile to quintile (via -xttrans-), and for annual income I need
to aggregate monthly income while distinguishing zero income from
missing income. Hence, I am trying to drop people who have only a
few months of income on record for those years where their information
is incomplete, while keeping the same people for other years in which
they have all the income information recorded. Given the very large
volatility and the many missing interviews, I am not sure imputing
income is harmless.
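
To make the goal above concrete, here is a minimal sketch of the person-year filter and the annual aggregation, run on the example data from my first post. The variable names (ID, time, income) and the rule year = ceil(time/12) are assumptions for illustration, not the actual SIPP names, and the sketch assumes at most one record per ID-month. It relies on the -count()- function of -egen-, which counts nonmissing values, so a recorded zero counts as a month on record while a missing income does not.

```stata
* Sketch only: ID, time and income are assumed names, not SIPP variables.
* Assumes time runs 1, 2, ... with 1-12 = year 1, 13-24 = year 2, and so on,
* and that there is at most one record per ID-month.
clear
input ID time income
1  1 1000
1  2  500
1  3 1000
1 13    0
1 14    0
1 15    0
1 16    0
1 17  600
1 18 1000
1 19 1000
1 20 1000
1 21 1000
1 22 1000
1 23  660
1 24  800
1 25 1200
2  1 2400
2  2 2400
2  5 2600
end

* Year index derived from the running month counter
gen year = ceil(time/12)

* Months with nonmissing income in each person-year
* (zero income counts, missing income does not)
bysort ID year: egen nmonths = count(income)

* Keep only person-years with all 12 months on record
keep if nmonths == 12

* Aggregate monthly income to annual income
collapse (sum) income, by(ID year)
list
```

On the example data only person 1 in year 2 survives, with an annual income of 7060. The filter comes before -collapse- because -collapse (sum)- treats missing values as zero, so an incomplete person-year would otherwise look like a low-income year.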

On 5/26/11, Nick Cox <njcoxstata@gmail.com> wrote:
> I think this might need to be
>
> bysort ID year: egen obs = count(month)
>
> -- perhaps after some work --
>
> but as is agreed the example is unclear.
>
> On 26 May 2011, at 16:52, Oliver Jones <ojones@wiwi.uni-bielefeld.de>
> wrote:
>
>> Hi,
>> your example data structure is a bit confusing since you have month
>> greater than 12... I'll assume you have at most 12 months per person
>> per year.
>>
>> Maybe this can help to drop people who have fewer than 12 observations
>> for one particular year. Let's assume this year is 2006.
>>
>> bysort ID: egen obs = count(Month)
>> drop if year == 2006 & obs < 12
>>
>> Does it work?
>>
>> Best
>> Oliver
>>
>> Am 26.05.2011 17:19, schrieb Dmitriy Krichevskiy:
>>> Dear Listers,
>>> I am trying to figure out the simplest way to convert a large panel
>>> dataset from monthly to annual income. The income is only reported
>>> monthly and I would want to clean the data of anyone missing a month
>>> in a particular year. I would like to drop observations for that
>>> person-year only and keep that person if they are fully present in
>>> some other year. Here is an equivalent data structure. As always,
>>> thanks a lot for your help.
>>> Dmitriy
>>>
>>> ID     Month   Income
>>> 1       1          1000
>>> 1       2           500
>>> 1       3          1000
>>> 1       13         0
>>> 1       14         0
>>> 1       15         0
>>> 1       16         0
>>> 1       17         600
>>> 1       18        1000
>>> 1       19        1000
>>> 1       20        1000
>>> 1       21        1000
>>> 1       22        1000
>>> 1       23        660
>>> 1       24        800
>>> 1       25        1200
>>> 2        1         2400
>>> 2        2         2400
>>> 2        5         2600
>>> *
>>>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


-- 
Dmitriy Krichevskiy Ph.D. Candidate
Economics Department
Florida International University
www.fiu.edu/~dkrichev

Research Associate, College of Education
Lumina Foundation Project

