
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Compressing a panel dataset


From   Lukas Borkowski <570722@soas.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Compressing a panel dataset
Date   Tue, 23 Jul 2013 21:09:56 +0200

Sergiy,

Thank you for your help. However, I have run into a problem. My dataset is unfortunately not as simple as I described in my earlier email. Initially I didn't think it would make a big difference, but it does. There are a few cases where one dummy variable, say v1, takes different values within a household. In such cases I sometimes want to keep only the 0 and sometimes only the 1, and I certainly need to keep the information when the data are missing altogether. So it looks like:

id      v1      v2      v3
1       1       .       2
1       0       7       .
1       .       .       .
2       1       .       1
2       1       7       .
2       .       .       .
...

If I run -collapse (min) v1, by(id)- I can get rid of the missing values and keep the 0 for household 1. But say I were interested in the 1: what could I do? Running -collapse (max) v1, by(id)- takes on the missing value.

Do you have an idea?

Best,

Lukas

#
Lukas Borkowski
University of London, School of Oriental and African Studies (SOAS)
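
A minimal sketch of the situation described above, using the example rows from the message. As far as I know, -collapse- skips missing values for both (min) and (max), and returns missing only when every value in the group is missing, so it may be worth re-checking the (max) result on the real data. The names v1_miss, v1_min, v1_max and v1_anymiss are hypothetical, introduced here only for illustration; (firstnm) for v2 and v3 is likewise one possible choice, not something stated in the thread.

```stata
* Example rows taken from the message above
clear
input id v1 v2 v3
1 1 . 2
1 0 7 .
1 . . .
2 1 . 1
2 1 7 .
2 . . .
end

* Hypothetical helper: flag rows where v1 is missing, so that this
* information is still available after the collapse
generate byte v1_miss = missing(v1)

* (min) and (max) ignore missing values within each id;
* (firstnm) keeps the first nonmissing value of v2 and v3
collapse (min) v1_min=v1 (max) v1_max=v1 v1_anymiss=v1_miss (firstnm) v2 v3, by(id)

list, noobs
```

With these rows, household 1 should end up with v1_min = 0 and v1_max = 1, so either value can be kept afterwards, and v1_anymiss = 1 records that at least one row had v1 missing.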




On 23.07.2013, at 16:43, Sergiy Radyakin <serjradyakin@gmail.com> wrote:

> collapse (min) v1 (min) v2 (min) v3, by(id)
> 
>    id   v1   v2   v3
>     1    9    7    2
>     2    7    7    1
> 
> 
> Best, Sergiy
> 
> On Tue, Jul 23, 2013 at 10:25 AM, Lukas Borkowski <570722@soas.ac.uk> wrote:
>> Dear list,
>> 
>> I am using Stata 12 and am currently cleaning up a dataset that will become a panel dataset. The quality of the dataset is quite poor (it originates from a survey), and I face many situations where answers to the same question are recorded in different variables. I would now like to eliminate duplicates and retain only one row for each household. My dataset looks somewhat like this:
>> 
>> id      v1      v2      v3
>> 1       .       .       2
>> 1       .       7       .
>> 1       9       .       .
>> 2       .       .       1
>> 2       .       7       .
>> 2       7       .       .
>> ...
>> 
>> I would like to retain only one row for each household. Is there a command for this? I have tried different things but have not found any solution.
>> 
>> Do you have any suggestion what I could do?
>> 
>> Thank you very much for your help!
>> 
>> Best,
>> 
>> Lukas
>> 
>> #
>> Lukas Borkowski
>> University of London, School of Oriental and African Studies (SOAS)
>> 
>> M: 570722@soas.ac.uk
>> 
>> 
>> 
>> 
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
> 



