Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

RE: st: A bug in egen and gen?

From   "Liao, Junlin" <>
To   "" <>
Subject   RE: st: A bug in egen and gen?
Date   Fri, 18 Feb 2011 16:30:22 +0000


The -doubletofloat- procedure is a nice addition to -compress-. It essentially makes up for what should have been included in -compress- itself. People concerned with dataset size should get this one. Thanks.


-----Original Message-----
From: [] On Behalf Of David Kantor
Sent: Friday, February 18, 2011 9:56 AM
Subject: RE: st: A bug in egen and gen?

At 09:43 PM 2/17/2011, Junlin wrote:
>Unless someone changed the command, my version of Stata 11 does not
>compress double to float when it can do so. This is also indicated in
>the documentation. However, I can recast a variable to float if there
>is no loss of precision; otherwise I have to specify the force option
>to force the conversion with loss of precision.

You may want to check
-ssc desc doubletofloat-

While -compress- does recast double to long, int or byte, it does not go double to float. Typing -recast float ...- is easy, but
-doubletofloat- provides some additional convenience.
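
To make the distinction concrete, here is a minimal sketch of the manual route, meant to be run as a do-file. The variable names x and y are made up for illustration, and this is not a description of what -doubletofloat- does internally.

```stata
* Hypothetical variables: x is exactly representable as float, y is not.
clear
set obs 3
gen double x = _n / 2        // 0.5, 1, 1.5
gen double y = _n / 10       // 0.1, 0.2, 0.3

compress                     // as noted above, does not demote double to float
count if x != float(x)       // number of x values a float would alter
count if y != float(y)       // number of y values a float would alter

recast float x               // goes through: nothing is lost
recast float y               // leaves y alone and reports the values that would change
recast float y, force        // converts anyway, rounding to float precision
describe x y
```

The -count- lines are just a quick way to see in advance whether -recast float- will go through without -force-.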

You may also be interested in
-ssc desc floattolong-
That does not save space, but recasts to a possibly more appropriate type.
I use long, int, or byte (depending on the range) whenever the values are sure to be integers.
What I'd like to see is an 8-byte integer type. Could be useful in some circumstances, for several reasons.
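
As a small illustration of why an integer type can be the more appropriate one, the sketch below stores the same integer as float and as long. The variable names f and l are made up; the value is 2^24 + 1, the first integer a float cannot hold exactly.

```stata
* Hypothetical variables f and l holding the same integer literal.
clear
set obs 1
gen float f = 16777217       // 2^24 + 1: rounded when stored as float
gen long  l = 16777217       // stored exactly
format f l %12.0f
list, noobs
assert l == 16777217
assert f != 16777217
```

A double holds integers exactly up to 2^53, which is the usual workaround when a long is too small.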

Finally, in response to Nick's comment,...

> You always have to tell -generate-, etc. what variable type you
> want created. On the whole, I don't think that would be a popular
> change.

This is my choice. Maybe it's a result of my programming background, but when I create a variable, the first thing I want to know is what data type it should be. What kinds of values -- integer or fractional? What range? Based on that, I choose the appropriate type.
A command such as,
gen a = ...
looks risky to me, and I rarely do it. (I would do it only in manually typed experiments, never in "live" work.) It could have different results in different circumstances, depending on the default type. This habit is so ingrained that I sometimes write gen float a = ...

In summary, I almost never depend on the default; I work as if the data type were a required feature of -gen- and -egen-.
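
A minimal sketch of that dependence on the default, run as a do-file. The variable names are made up, and -set type- is used here only to simulate two sessions with different defaults.

```stata
* Hypothetical variables a, b, c, d.
clear
set obs 1
set type float               // the default Stata ships with
gen a = 0.1                  // a is stored as float
set type double
gen b = 0.1                  // same command, but b is stored as double
describe a b
assert a != b                // the stored values differ

* With the type stated, the result no longer depends on the setting:
gen float  c = 0.1
gen double d = 0.1
set type float               // restore the shipped default
```

-egen- accepts a type in the same position, so the same habit carries over to it.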
