



Re: st: rectangulizing data


From   Austin Nichols <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: rectangulizing data
Date   Thu, 26 May 2011 12:58:22 -0400

Dmitriy Krichevskiy <krichevskyd@gmail.com>:
Nor is dropping cases harmless; there is some discussion at
http://www.urban.org/publications/411971.html
and slides 12-14 of
http://www-personal.umich.edu/~nicholsa/an_dds.pdf

On Thu, May 26, 2011 at 12:52 PM, Dmitriy Krichevskiy
<krichevskyd@gmail.com> wrote:
> Thank you for your responses; I apologize for the confusion.
>
> Clarification then,
>
> The data comes from the Survey of Income and Program Participation (SIPP),
> and my particular dataset combines 7 years of data. The data is
> collected quarterly and recorded monthly (via phone interviews). Hence
> time=14 is the second month of the second year. Many people in this
> sample miss interviews often, and income exhibits a lot of volatility
> (I still do not know why). My goal is to analyze income transitions
> from quintile to quintile (via -xttrans-), and for annual income I need
> to aggregate monthly income while distinguishing zero income
> from missing income. Hence, I am trying to drop people who have only
> a few months of income on record for those years where their information
> is incomplete, while keeping the same people for other years in which
> they have all the income information recorded. Given the very large
> volatility and the many missed interviews, I am not sure imputing
> income is harmless.
>
> On 5/26/11, Nick Cox <njcoxstata@gmail.com> wrote:
>> I think this might need to be
>>
>> bysort ID year: egen obs = count(month)
>>
>> -- perhaps after some work --
>>
>> but as is agreed the example is unclear.
>>
>> On 26 May 2011, at 16:52, Oliver Jones <ojones@wiwi.uni-bielefeld.de>
>> wrote:
>>
>>> Hi,
>>> your example data structure is a bit confusing since you have months
>>> greater than 12... I'll assume you have at most 12 months per person
>>> per year.
>>>
>>> Maybe this can help to drop people who have fewer than 12 observations
>>> for one particular year. Let's assume this year is 2006.
>>>
>>> bysort ID: egen obs = count(Month)
>>> drop if year == 2006 & obs < 12
>>>
>>> Does it work?
>>>
>>> Best
>>> Oliver
>>>
>>> On 26.05.2011 17:19, Dmitriy Krichevskiy wrote:
>>>> Dear Listers,
>>>> I am trying to figure out the simplest way to convert a large panel
>>>> dataset from monthly to annual income. The income is only reported
>>>> monthly and I would want to clean the data of anyone missing a month
>>>> in a particular year. I would like to drop observations for that
>>>> person-year only and keep that person if they are fully present in
>>>> some other year. Here is an equivalent data structure. As always,
>>>> thanks a lot for your help.
>>>> Dmitriy
>>>>
>>>> ID   Month   Income
>>>> 1    1       1000
>>>> 1    2       500
>>>> 1    3       1000
>>>> 1    13      0
>>>> 1    14      0
>>>> 1    15      0
>>>> 1    16      0
>>>> 1    17      600
>>>> 1    18      1000
>>>> 1    19      1000
>>>> 1    20      1000
>>>> 1    21      1000
>>>> 1    22      1000
>>>> 1    23      660
>>>> 1    24      800
>>>> 1    25      1200
>>>> 2    1       2400
>>>> 2    2       2400
>>>> 2    5       2600
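
For readers following the thread, here is a minimal sketch of the per-person-year approach discussed above, run on the example data. It is an illustration under stated assumptions, not a method any poster confirmed: it assumes Month is a running counter in which 1 is the first month of the first year (so 14 is the second month of the second year, as in the clarification), and the names year, nmonths and annual_income are made up for this example.

```stata
* Sketch only: rebuild the example data from the original post
clear
input ID Month Income
1  1 1000
1  2  500
1  3 1000
1 13    0
1 14    0
1 15    0
1 16    0
1 17  600
1 18 1000
1 19 1000
1 20 1000
1 21 1000
1 22 1000
1 23  660
1 24  800
1 25 1200
2  1 2400
2  2 2400
2  5 2600
end

* Assumption: months 1-12 belong to year 1, 13-24 to year 2, and so on
generate year = ceil(Month/12)

* Count the months with non-missing income within each person-year
bysort ID year: egen nmonths = count(Income)

* Keep only person-years with all 12 months on record
keep if nmonths == 12

* Sum to annual income; recorded zeros are added as zeros, not treated as missing
collapse (sum) annual_income = Income, by(ID year)
list
```

On the example data only one person-year is complete (ID 1, year 2), with an annual income of 7060. Counting Income rather than Month means a row that is present but has missing income also marks the person-year as incomplete, which keeps zero income distinct from missing income.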