
From: wgould@stata.com (William Gould, Stata)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: How do you drop the variable -(e)- from the data?
Date: Wed, 05 Nov 2003 09:37:00 -0600

Roger Newson <roger.newson@kcl.ac.uk> noticed that if, after estimation, he types -list *-, then in addition to all the expected variables, a variable named "(e)" also appears in the output. He writes,

> I am having a problem with the variable whose name is (e), which appears to
> be generated whenever an estimation command is executed, and which contains
> the results of the function -e(sample)-.

It is a bug that Roger ever saw the variable "(e)", so let me explain:

1. Roger is right: Variable "(e)" has to do with e(sample) and, in fact, is e(sample).

2. The existence of variable "(e)" was supposed to be completely hidden. Had we done that right, I would not now be writing this email.

3. There is no bug except that Roger saw the variable "(e)" (and found some other ways to access it). So we will fix that bug but, until we do, it is not a bug that should bother anybody.

For those who are curious, here is what "(e)" is about:

T1. When you run an estimation command, Stata needs to store e(sample), the function that identifies which observations were used. That information is stored in the dataset in the secret variable named "(e)".

T2. The name "(e)" (note the parens) was chosen carefully to be an invalid name. It should not surprise you that inside Stata, we have the ability to create variables named anything we want. We chose an invalid name so that it would never conflict with a valid name a user might want to create. In addition, an invalid name would be rejected by the parser, making it less likely that any user would ever discover the secret variable.

T3. When you -save- a dataset, variable "(e)" is *NOT* stored in the dataset. Stata knows to skip that variable. More correctly, variable "(e)" is not stored unless you specify -save-'s -all- option. As it says in the on-line help, "-all- is for use by programmers. If specified, e(sample) will be saved with the dataset. You could run a regression, -save mydata, all-, -use mydata-, and -predict yhat if e(sample)-."

T4. The variable "(e)" is dropped (1) whenever a new estimation command is run (in which case a new "(e)" is created), (2) whenever you type -discard- (which eliminates previous estimation results), and (3) whenever a -drop- command results in a dataset that contains only "(e)".

So what happened? Where did we go wrong? In fact, "(e)" has been in Stata for some time without anyone knowing, but when we added fancier pattern matching for varlists (so that you can type things like "*e*", something that used not to be allowed), we forgot to exclude "(e)", and that opened the door to Roger's discovery.

It was just as Nick Cox <n.j.cox@durham.ac.uk> suspected: "This raises the question of whether it's been there for ages, or it's only recently become visible as a result of some other change in Stata."

-- Bill
wgould@stata.com
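
For readers who want to try the -save, all- workflow described in T3, here is a minimal sketch. It assumes the auto dataset shipped with Stata and an arbitrary regression chosen only for illustration; the file name mydata is the one used in the help text, and -replace- is added so the sketch can be rerun.

```stata
* Illustrative only: dataset and model are assumptions, not from the post.
sysuse auto, clear

* Fit on a subsample so that e(sample) is not trivially 1 everywhere.
regress mpg weight if foreign == 0

* e(sample) marks the observations used in estimation.
count if e(sample)

* -all- asks -save- to keep e(sample) with the data.
save mydata, all replace

* Reload the data; the estimation results are still in memory.
use mydata, clear

* Fitted values for the estimation sample only.
predict yhat if e(sample)
```

Without the -all- option, the reloaded dataset would not carry e(sample), so the last step would no longer pick out the estimation sample.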

**Follow-Ups**: **st: Re: How do you drop the variable -(e)- from the data?** *From:* Roger Newson <roger.newson@kcl.ac.uk>


© Copyright 1996–2017 StataCorp LLC