
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Mark observations with missing data


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Mark observations with missing data
Date   Fri, 6 Jan 2012 18:51:45 +0000

-dropmiss- is mine. As Alan explains, there isn't a -markmiss-, as it
were, because Stata's built-in function -missing()- already does the
job.

With large numbers of variables,

egen ismissing = rowmiss(*)

is an alternative. -ismissing- counts the missing values in each
observation, so it is not restricted to 0 or 1, evidently, but

... if ismissing

catches observations with missing values regardless, because -if-
treats any nonzero value as true.
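
A minimal sketch of that approach, assuming the auto dataset used elsewhere in this thread; setting -price- and -mpg- to missing in the first observation is an assumption made only so that one count exceeds 1:

```stata
sysuse auto, clear

* illustrative only: make two values missing in the first observation
replace price = . in 1
replace mpg = . in 1

* number of missing values in each observation, across all variables
egen ismissing = rowmiss(*)
tabulate ismissing

* any nonzero count is treated as true by -if-
list make price mpg rep78 ismissing if ismissing

* summaries restricted to complete observations
summarize price mpg if !ismissing
```

If a strict 0/1 indicator is wanted instead, -generate byte anymissing = ismissing > 0- would give one (the name -anymissing- is hypothetical).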

But if it is only numeric variables you are concerned about,

regress <varlist>
gen ismissing = !e(sample)

is a statistical way of doing this.
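
As a sketch under stated assumptions, here is the same idea with the auto dataset and the three variables mentioned elsewhere in this thread standing in for <varlist>. Note that -e(sample)- also excludes observations left out by any -if- or -in- on the estimation command, so this matches "has missing values" only when the model is fitted to all observations:

```stata
sysuse auto, clear

* fit any model that uses the variables of interest
regress price mpg rep78

* e(sample) is 1 for observations used in estimation, 0 otherwise
generate byte ismissing = !e(sample)

* check against the -missing()- approach for the same variables
assert ismissing == missing(price, mpg, rep78)

* summaries for exactly the estimation sample
summarize price mpg rep78 if !ismissing
```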

Nick

On Fri, Jan 6, 2012 at 6:16 PM, Richard Herron
<richard.c.herron@gmail.com> wrote:

> Thanks, Alan! I knew that I had to be missing something (here, largely
> by failing to read the manual :) ).
>
> I guess I can really do this in one line -generate byte missing =
> missing(mpg, price, rep78)-.

On Fri, Jan 6, 2012 at 13:07, Alan Neustadtl <alan.neustadtl@gmail.com> wrote:

>> The missing function might be useful here.  See -help missing()-.  For example:
>>
>> /* Begin example */
>>
>> sysuse auto, clear
>> gen missvar=0
>> replace missvar=1 if missing(rep78, mpg, price)
>> list rep78 mpg price missvar
>>
>> /* End example */

On Fri, Jan 6, 2012 at 1:00 PM, Richard Herron

>>> I learned about -dropmiss- (Stata Journal) from this Stata List post:
>>> http://www.stata.com/statalist/archive/2010-05/msg00513.html
>>>
>>> But I would like to keep all observations and just mark the ones with
>>> missing values in a given -varlist-. This way with an -if-
>>> statement I can find summary statistics for the same data that I will
>>> later use in regressions. I provide a hackish solution below, but is
>>> there a better way? Is there anything flawed with my approach? I am a
>>> little wary of loops and my own coding. Thanks!
>>>
>>> Here is my hackish solution if I want to mark observations with any
>>> missing values.
>>>
>>> * begin code
>>> sysuse auto, clear
>>> replace price = . in 1
>>> replace mpg = . in 2
>>> list in 1/5
>>>
>>> * hackish solution
>>> generate byte complete = 1
>>> foreach x of varlist * {
>>>    replace complete = 0 if missing(`x')
>>> }
>>> list in 1/5
>>> * end code
