
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Labeling different kinds of missing observations


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Labeling different kinds of missing observations
Date   Fri, 20 Apr 2012 09:08:40 +0100

You are correct. When you asked for code to generate a variable, I did
not understand that you wanted to replace the existing variables.

Also, there is an error in the code I posted (which is _not_ a macro in
Stata terms). The references to -`first'- were spurious: they should
have been just -first-. Sorry about that.

But the -first- variable it creates is still relevant.

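clear

input ID  Y1  Y2  Y3  Y4  Y5  Y6  Y7
       1   .   .   2   3   4   5   6
       2   .   7   .   8   9  10  11
       3   .   .  12  13   .  14  15
       4   .  16   .  17  18   .  19
       5  20  21   .  22  23  24   .
end

* indicators from the earlier post, plus the position of the first non-missing value
gen missbefore = 0
gen missafter = 0
gen first = .

* until a value is seen, first is missing and so compares greater than any `J'
qui forval J = 1/7 {
      replace missbefore = 1 if missing(Y`J') & `J' < first
      replace first = `J' if missing(first) & !missing(Y`J')
      replace missafter = 1 if missing(Y`J') & `J' > first
}

* the indicators are not needed for replacing in place; only first is used below
drop miss*

* overwrite missings: 0 before the first non-missing value, 1 after it
qui forval J = 1/7 {
      replace Y`J' = cond(`J' < first, 0, 1) if missing(Y`J')
}

list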

That said, this sounds like a bad idea.

1. If 1 and 0 are in principle possible non-missing values it is a
very bad idea.

2. Even if not, you need to remember to exclude the 0s and 1s from
many, if not most, calculations with these variables.

3. Extended missing values (.a, .b, etc.) sound like what you really need here.
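
A minimal sketch of point 3, assuming the same example data as above. The choice of .a for "missing before the first non-missing value" and .b for "missing after it" is an assumption made here for illustration, as are the value label name -misskind- and its label text.

```stata
* Sketch only: .a = missing before first non-missing value, .b = missing after
clear
input ID Y1 Y2 Y3 Y4 Y5 Y6 Y7
1  .  .  2  3  4  5  6
2  .  7  .  8  9 10 11
3  .  . 12 13  . 14 15
4  . 16  . 17 18  . 19
5 20 21  . 22 23 24  .
end

gen first = .

quietly forval J = 1/7 {
    * record the position of the first non-missing value
    replace first = `J' if missing(first) & !missing(Y`J')
    * first is still missing (so greater than `J') until a value is seen
    replace Y`J' = cond(`J' < first, .a, .b) if Y`J' == .
}

* hypothetical value label so that listings show the kind of missing
label define misskind .a "before first obs" .b "after first obs"
label values Y1-Y7 misskind

list
```

Because .a and .b are still missing values, -missing(Y3)- remains true for them and commands such as -summarize- continue to exclude them, while a test such as -count if Y3 == .b- can still tell the two kinds apart.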

My question "Why not -reshape long-?" still stands.
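
For comparison, here is a sketch of how the same classification might look after -reshape long-, again using the example data. The names -year-, -first- and -misskind- are assumptions for illustration, and the result goes into a new variable rather than overwriting Y.

```stata
* Sketch only: classify missings in long layout
clear
input ID Y1 Y2 Y3 Y4 Y5 Y6 Y7
1  .  .  2  3  4  5  6
2  .  7  .  8  9 10 11
3  .  . 12 13  . 14 15
4  . 16  . 17 18  . 19
5 20 21  . 22 23 24  .
end

* one observation per ID and year; year holds the suffix 1 to 7
reshape long Y, i(ID) j(year)

* earliest year with a non-missing value, within each ID
egen first = min(cond(!missing(Y), year, .)), by(ID)

* 0 = missing before first value, 1 = missing after; missing where Y is observed
gen byte misskind = cond(year < first, 0, 1) if missing(Y)

sort ID year
list, sepby(ID)
```

In this layout no loop over variables is needed, and Y itself is left untouched.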

Nick

On Fri, Apr 20, 2012 at 7:52 AM, Rituparna Basu <[email protected]> wrote:
> Hi Nick,
>
> Thank you so much for the resources and the code.
> I did run the macro but it said 'invalid syntax'.
>
> I think I did not mention my question properly. I would like to transform the following data :
>  ID  Y1  Y2  Y3  Y4  Y5  Y6  Y7
>  1    .      .      x    x     x     x   x
>  2    .     x      .    x     x     x   x
>  3    .      .      x    x    .     x   x
>  4    .      x     .    x     x     .   x
>  5    x     x     .   x      x     x  .
>
> Transform to:
>
> ID  Y1  Y2  Y3  Y4  Y5  Y6  Y7
>  1   0      0      x    x     x     x   x
>  2    0     x      1    x     x     x   x
>  3    0      0      x    x    1     x   x
>  4    0      x     1    x     x     1   x
>  5    x     x     1   x      x     x  1
>
> Basically, I want to replace the missing values of Y* (0 for missing observations before the first non-missing observation and 1 for those after it, as you can see) rather than create a new variable.
> I apologize for the confusion, but any help is greatly appreciated!
>
> Thank you!
>
> Regards,
> RB
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: Thursday, April 19, 2012 12:15 PM
> To: [email protected]
> Subject: Re: st: Labeling different kinds of missing observations
>
> This sounds as if you want indicators
>
> missbefore  1 if any missing before first non-missing and 0 otherwise
>
> missafter     1 if any missing after etc.
>
> Here's a sketch. Code not tested.
>
> gen missbefore = 0
> gen missafter = 0
> gen first = .
>
> qui forval J = 1/7 {
>       replace missbefore = 1 if missing(Y`J') & `J' < `first'
>       replace first = `J' if missing(first) & !missing(Y`J')
>       replace missafter = 1 if missing(Y`J') & `J' > `first'
> }
>
> I think of these problems in this way.
>
> 1. I need to initialise an indicator. Sometimes the initial value does not matter; sometimes it does. You have to think it through for each problem.
>
> 2. I need to loop over the variables.
>
> 3. The first key then is "when do I change my mind?"
>
> 4. The second key is "if I change my mind, is the indicator then fixed, or may I need to update it?"
>
> But why not -reshape long-?
>
> See also
>
> SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
>        (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
>        Q1/09   SJ 9(1):137--157
>        shows how to exploit functions, egen functions, and Mata
>        for working rowwise; rowsort and rowranks are introduced
>
> Nick
>
> On Thu, Apr 19, 2012 at 7:10 PM, Rituparna Basu <[email protected]> wrote:
>
>> I am trying to generate a variable that will indicate missing BEFORE FIRST YEAR of OBSERVATION and missing AFTER FIRST YEAR of OBSERVATION.
>> Here is a sample of the data:
>>
>> ID  Y1  Y2  Y3  Y4  Y5  Y6  Y7
>> 1    .      .      x    x     x     x   x
>> 2    .     x      .    x     x     x   x
>> 3    .      .      x    x    .     x   x
>> 4    .      x     .    x     x     .   x
>> 5    x     x     .   x      x     x  .
