The Stata listserver

Re: st: latest update of cut


From   Marcello Pagano <pagano@hsph.harvard.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: latest update of cut
Date   Wed, 07 Aug 2002 13:25:54 -0400

In my opinion this is terrible.  A missing value
is a missing value is a missing value, and should
remain so.

m.p.


Jean Marie Linhart, StataCorp. wrote:

Michael Hills <mhills@blueyonder.co.uk> wrote:

I was disturbed to find that in the latest ado update the cut
function of egen has changed how it deals with missing values.

In the version before the update, when you cut a variable with missing
values, these were coded as missing in the new variable. Now they are
coded with the upper limit specified in cut.

I can see no logic in this and I presume it is an error which has been
introduced during the latest update of cut. There is no change in the
help file for egen.

The change to -egen, cut- was intentional; it was pointed out by a
user that its behavior did not match the documentation. I concurred
and modified the function to better match the documentation.

Previously, when -egen, cut- was used with the -at- option, the
(ascending) list of numbers was intended as left-hand endpoints for
subdividing the data. However, any member of the input data greater
than or equal to the largest value in -at- got mapped to missing.
This contradicted the notion that the list of numbers was intended
as left-hand endpoints.

I modified -egen, cut- so that the values of the input data that are
greater than or equal to the largest value in -at- are mapped to the
largest value in the -at- list.

Since missing values are greater than numerical values, this resulted
in the behavior that now missing values are mapped to the largest
value specified in -at-. This makes sense to me given that the values
in -at- are intended as left hand endpoints.
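
The mapping described above can be sketched outside Stata. The following Python snippet is only an emulation of the behavior as described in this message, not the actual -egen, cut- code. The names cut_at, cap_top, and MISSING are hypothetical, and treating values below the first break as missing is an assumption, since the message does not discuss them.

```python
import bisect
import math

# Assumption: stand-in for Stata's missing value, which sorts above every number.
MISSING = math.inf

def cut_at(value, breaks, cap_top=True):
    """Map value to the left-hand endpoint of its interval.

    breaks must be in ascending order. With cap_top=True (the updated
    behavior described above), anything >= breaks[-1] maps to breaks[-1].
    With cap_top=False (the earlier behavior), it maps to None (missing).
    Values below breaks[0] map to None in both modes (an assumption).
    """
    if value < breaks[0]:
        return None
    if value >= breaks[-1]:
        return breaks[-1] if cap_top else None
    # bisect_right gives the index of the first break strictly above value,
    # so the break just before it is the left-hand endpoint.
    return breaks[bisect.bisect_right(breaks, value) - 1]

breaks = [2, 4, 6]
data = [1, 2, 3, 5, 6, 9, MISSING]

new = [cut_at(v, breaks) for v in data]                 # updated behavior
old = [cut_at(v, breaks, cap_top=False) for v in data]  # earlier behavior

print(new)
print(old)
```

Because the missing stand-in compares greater than every break, it falls into the same branch as 6 and 9. That is why the updated rule sends missing values to the largest value in -at-, while the earlier rule left them missing.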

If the user wishes to exclude missing values, it is easy to accomplish
this using the -if- or -in- qualifiers. For example,

. egen newx = cut(x) if x != ., at(2, 4, 6)

The documentation for -egen, cut- says nothing about what happens
to missing values, but perhaps it should. I'll get that done.

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/

--Jean Marie
jlinhart@stata.com


--
______________________________________________________________________

Marcello Pagano
Biostatistics Department			Tel: 1-617-432-4911
Harvard School of Public Health		        Fax: 1-617-739-1781
655 Huntington Avenue            		email:pagano@biostat.harvard.edu
Boston, MA  02115                 		http://biosun1.harvard.edu/~bio200
USA

eppur si muove




