
st: Re: Dealing with missing values


From   "Joseph Coveney" <stajc2@gmail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Re: Dealing with missing values
Date   Tue, 2 Apr 2013 09:21:00 +0900

Ching Wong wrote:

I am currently using Stata 12.

I need to convert the variable 'year' to a numeric variable by using
the command -gen yr = date(year, "Y")-. However, there is 1 missing
value generated. I have no idea why it would happen. As I have checked
every single obs for 'year', there is no missing value. But when I
checked the obs for 'yr', there is one "." after converting from
'year' to 'yr'.

Why does it happen only for the year 1990 particularly, which is the
maximum for the variable 'year'? And how should I deal with this
missing value? I have tried -replace yr="" if yr=="."-, but it can't
be eliminated, i.e., type mismatch r(109).

Year  Yr
1960   0
1940   -7305
1990   . <--??
1986   9497
1915   -16436

--------------------------------------------------------------------------------

Just a couple of suggestions.  Because Year is a string variable, it can hold
nonnumeric characters, some of which are hard to tell apart from digits in some
typefaces.  So, make sure that the "1" in the "1990" observation is the numeral
one and not a lower-case el, and that the "0" is a zero and not an upper-case oh.
Data-entry errors do happen.  You can scan for observations that aren't all
numeric (that is, they contain at least one nonnumeric, nonwhite-space
character) using the -real()- function, as in the line below.

    list Year if missing(real(Year))
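
As a rough sketch of the same idea, the check can also be written with -regexm()-, which flags any value that is not made up entirely of digits.  Note that this is stricter than the -real()- check, because it also flags values containing blanks.  The data and the variable name suspect are made up for illustration; the second value is typed with an upper-case oh in place of the zero.

```stata
* Hypothetical example data: the second value ends in a capital O, not a zero
clear
input str4 Year
"1960"
"199O"
"1986"
end

* Flag values that contain anything other than the digits 0-9
generate byte suspect = !regexm(Year, "^[0-9]+$")

* Show only the flagged observations
list Year if suspect
```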

If there's a single missing value in Yr and it's only for the year 1990, then
you can replace it with the following.

    replace Yr = date("1990", "Y") if missing(Yr)

taking care when typing in "1990".  It's a bit of a kludge, but you can think of
it as a temporary workaround that will let you go on with your work while you
figure out in parallel what's going wrong.
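
A minimal sketch of that workaround with a guard added, assuming made-up data in which the one bad value is "199O".  Because Yr is numeric, the check uses -missing(Yr)- rather than a comparison with the string ".", and the -assert- stops the run if more than one value turns out to be missing, so that nothing else gets overwritten by mistake.

```stata
* Hypothetical data: Yr is missing where Year was mistyped with a capital O
clear
input str4 Year float Yr
"1960" 0
"199O" .
"1986" 9497
end

* Confirm that exactly one value of Yr is missing before changing anything
count if missing(Yr)
assert r(N) == 1

* Fill in only that observation with the daily date for 1990
replace Yr = date("1990", "Y") if missing(Yr)

list, noobs
```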

Joseph Coveney

P.S. It's always helpful to show exactly what you typed and exactly what you
got, for example, a copy-and-paste from the Results window.  Sometimes you
believe that you typed something that you didn't, and it's easier for others to
see the problem when you copy-and-paste the Results screen verbatim rather than
paraphrase as you did.

. input str4 Year

          Year
  1. 1960
  2. 1940
  3. 1990
  4. 1986
  5. 1915
  6. end

. 
. generate float Yr = date(Year, "Y")

. list, noobs

  +---------------+
  | Year       Yr |
  |---------------|
  | 1960        0 |
  | 1940    -7305 |
  | 1990    10958 |
  | 1986     9497 |
  | 1915   -16436 |
  +---------------+
