
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


RE: st: A bug in egen and gen?

From   "Liao, Junlin" <>
To   "" <>
Subject   RE: st: A bug in egen and gen?
Date   Thu, 17 Feb 2011 18:54:38 +0000

I appreciate the quick reply; it is quite informative. I thought it was a bug but was not sure. Now I know that when I use the -generate- or -egen- command, I have to set the type of numeric variables myself.

I have no desire for a perfect Stata; however, I do see where Stata could be made better. A warning about the default data type for numeric variables in the help files of the respective commands could be helpful. Better yet, with increasing computing power, those commands could perform their calculations in the highest-accuracy type and then perform a -compress- type operation to finalize the new variables. Just a thought.



-----Original Message-----
From: [] On Behalf Of Nick Cox
Sent: Thursday, February 17, 2011 12:23 PM
Subject: Re: st: A bug in egen and gen?

This isn't a bug.

It may well bite you, but a better description is that you get what you ask for.

The default default [intended] type for new numeric variables is -float-. As well documented, -float- can not hold sufficiently large integers accurately, which is precisely why -long- and -double- are available as alternatives. As well documented, both commands allow you to depart from the default.
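To make the precision limit concrete, here is a minimal sketch in Python (not Stata), assuming that Stata's -float- is a 4-byte IEEE 754 single-precision number. The helper name to_float32 is hypothetical and exists only for this illustration. It rounds a value to the nearest single-precision number, which is roughly what happens when a value is stored in a -float- variable.

```python
import struct

def to_float32(x):
    """Round x to the nearest IEEE 754 single-precision value (hypothetical helper)."""
    # Pack as a 4-byte float, then unpack again; the round trip discards
    # any digits that single precision cannot hold.
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

value = 83085733                 # the value used in the original post
stored = to_float32(value)       # what a 4-byte float can actually hold

print(int(stored))               # 83085736, not 83085733
print(to_float32(2**24) == 2**24)          # True: 16,777,216 is exact
print(to_float32(2**24 + 1) == 2**24 + 1)  # False: the next integer is not
```

Between 2^26 and 2^27 single-precision values are spaced 8 apart, so 83085733 lands on the nearest multiple of 8. A -long- or -double- holds the value exactly.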

So Junlin already documented what is better practice, that you spell out that you want a -long-.

The alternatives include

1. You always have to tell -generate-, etc. what variable type you want created. On the whole, I don't think that would be a popular change.

2. You write your own wrappers for -generate-, etc. that insist on variable type being specified. That is programmable.

3. Your attitude is that Stata should always be smart enough to work out what you want. Good luck on that one.


On Thu, Feb 17, 2011 at 6:09 PM, Liao, Junlin <> wrote:

> I happen to notice a problem with the egen and gen commands. I'm using Stata 2011 SE (64 bit). I do not know if this problem exists in other versions. Please run the following commands to reproduce the problem:
> clear
> set obs 5
> generate y=83085733
> generate long z=83085733
> egen y_mean=mean(y)
> egen z_mean=mean(z)
> egen long y_mean_long=mean(y)
> egen long z_mean_long=mean(z)
> format %10.0g y z y_mean z_mean y_mean_long z_mean_long
> list
> By default, both the egen and gen commands use float as the storage type for the new variable, and the values generated are not correct. However, if we declare the variables as long integers, we get correct results.
> Has anybody else noticed the bug? Is there an explanation for what
> happens? Thanks,

*   For searches and help try:

