
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Compressing a panel dataset

From	Sergiy Radyakin <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: Compressing a panel dataset
Date	Tue, 23 Jul 2013 15:41:26 -0400

Dear Lukas,

you need to formalize this: "Sometimes I am interested to keep only
the 0 in such a case, sometimes the 1 and certainly need to keep the
info if data is missing at all."

Under what conditions should the resulting value be 0, under what
conditions 1, and under what conditions missing? Once we know the
answers to these questions, we can modify the -collapse- statement or
write something else more suitable.

Also, what is the universe of values of vi? Is it only 0/1/., or are
other values possible?
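
One way to check the universe of values from the data itself is sketched below. It assumes the dummy is called v1 and uses the six example rows from the question as toy data; -9999 is only a candidate placeholder value, not something the data require.

```stata
* Toy data copied from the example in the question
clear
input id v1 v2 v3
1 1 . 2
1 0 7 .
1 . . .
2 1 . 1
2 1 7 .
2 . . .
end

* Show every distinct value of v1, including missing
tabulate v1, missing

* Stop with an error if v1 holds anything other than 0, 1 or missing
assert inlist(v1, 0, 1) | missing(v1)

* Confirm that the candidate placeholder -9999 does not occur in v1
count if v1 == -9999
assert r(N) == 0
```

If either -assert- fails on the real data, the variable is not a clean 0/1 dummy (or the placeholder is already in use), and the rule for combining rows needs more thought.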

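Suppose you want to keep the 1s, and the value -9999 is not in the universe:
replace v1 = -9999 if missing(v1)
collapse (max) v1, by(id)
replace v1 = . if v1 == -9999

Do not write -collapse (max) v1 if !missing(v1), by(id)- instead.
The -if- qualifier removes the observations with missing v1 before
collapsing, so a household whose v1 is missing in every row would
disappear from the result.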
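
The difference between the placeholder approach and the -if- qualifier can be seen in a small sketch. The data are made up for illustration: households 1 and 2 follow the v1 column from the question, and household 3, where v1 is missing in every row, is a hypothetical addition.

```stata
* Toy data: households 1 and 2 as in the question, household 3 is hypothetical
clear
input id v1
1 1
1 0
1 .
2 1
2 1
2 .
3 .
3 .
end

* Variant to avoid: the if qualifier drops rows before collapsing,
* so household 3 has no rows left and is absent from the result
preserve
collapse (max) v1 if !missing(v1), by(id)
list, noobs
restore

* Placeholder variant: every household keeps one row,
* and household 3 ends up with v1 missing
replace v1 = -9999 if missing(v1)
collapse (max) v1, by(id)
replace v1 = . if v1 == -9999
list, noobs
```

The first listing has two rows (households 1 and 2), the second has three, with v1 missing for household 3.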

Best, Sergiy

On Tue, Jul 23, 2013 at 3:09 PM, Lukas Borkowski <[email protected]> wrote:
> Sergiy,
>
> thank you for your help. However, I encountered a problem. My dataset is unfortunately not as easy as I described in the earlier email. Initially, I didn't think it would make a big difference, but it does. There are a few cases where one dummy variable, say v1, has different values within the household. Sometimes I am interested to keep only the 0 in such a case, sometimes the 1 and certainly need to keep the info if data is missing at all. So it looks like:
>
> id      v1      v2      v3
> 1       1      .       2
> 1       0      7       .
> 1       .      .       .
> 2       1      .       1
> 2       1      7       .
> 2       .       .       .
> ...
>
> If I run -collapse (min)v1, by(id)- I can get rid of the missing values and keep the 0 for household 1. But say I was interested in the 1, what could I do? Running -collapse (max)v1, by(id)- takes on the missing value.
>
> Do you have an idea?
>
> Best,
>
> Lukas
>
> #
> Lukas Borkowski
> University of London, School of Oriental and African Studies (SOAS)
>
>
>
>
> On 23.07.2013, at 16:43, Sergiy Radyakin <[email protected]> wrote:
>
>> collapse (min) v1 (min) v2 (min) v3, by(id)
>>
>>    id   v1   v2   v3
>>     1    9    7    2
>>     2    7    7    1
>>
>>
>> Best, Sergiy
>>
>> On Tue, Jul 23, 2013 at 10:25 AM, Lukas Borkowski <[email protected]> wrote:
>>> Dear list,
>>>
>>> I am using Stata 12 and am currently cleaning up a dataset that will become a panel dataset. The quality of the dataset is quite poor (it originates from a survey) and I face many situations where answers to the same question are recorded in different variables. I now want to eliminate duplicates and retain only one row for each household. My dataset looks somewhat like this:
>>>
>>> id      v1      v2      v3
>>> 1       .       .       2
>>> 1       .       7       .
>>> 1       9       .       .
>>> 2       .       .       1
>>> 2       .       7       .
>>> 2       7       .       .
>>> ...
>>>
>>> I would like to retain only one row for each household. Is there a command for this? I have tried different things but have not found any solution.
>>>
>>> Do you have any suggestion what I could do?
>>>
>>> Thank you very much for your help!
>>>
>>> Best,
>>>
>>> Lukas
>>>
>>> #
>>> Lukas Borkowski
>>> University of London, School of Oriental and African Studies (SOAS)
>>>
>>> M: [email protected]
>>>
>>>
>>>
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/


