RE: st: RE: Knowing how a variable was generated
From
Nick Cox <[email protected]>
To
"'[email protected]'" <[email protected]>
Subject
RE: st: RE: Knowing how a variable was generated
Date
Mon, 1 Nov 2010 16:26:42 +0000
In reply to Rich Goldstein:
-defv- (John R. Gleason, STB) was one of the older programs referenced in the help for -labgen- (SSC), referred to in one of my replies. John jumped to R without looking back around 2001, so no update is likely there.
But fair's fair: -defv- explains itself as a wrapper for -generate- and -replace- and does not purport to work with -egen-. Still less will it, or any other program of this kind, work for variables generated as side-effects of many commands.
In reply to Uli Kohler:
Just to spell out for anyone confused: your approach using -notes- is the easiest way to define characteristics, as mentioned earlier in the thread. That is, "using -notes-" and "using characteristics" are particular and general versions of the same strategy.
Nick
[email protected]
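A minimal sketch of the relationship between -notes- and characteristics, using the auto data from the example quoted later in this message. The characteristic name "definition" is an arbitrary choice for illustration and not a Stata convention; the point is only that -char list- shows both the explicit characteristic and the entries that -notes- creates.

```stata
sysuse auto, clear
generate x = weight - 1

* record the definition directly as a characteristic
* ("definition" is a made-up name chosen for this sketch)
char x[definition] "weight - 1"

* record the same information with -notes-
note x: generate x = weight - 1

* list every characteristic attached to x; the note appears
* alongside x[definition] because notes are stored as characteristics
char list x[]

* a characteristic can be read back as a macro
display "`x[definition]'"
```

Either route only helps a later user if whoever created the variable recorded the definition at the time.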
Richard Goldstein
note that there is a program for this, -defv-; use -findit-
however, this program does not work with -egen- and does not work with
-by- (and does not always work with Stata 11 either)
Ulrich Kohler
> In principle it is also possible to store this information as note:
>
> . sysuse auto
> . gen x = weight - 1
> . note x: gen x = weight - 1
>
> . replace x = weight +1
> . note x : replace x = weight + 1
>
> . note x
>
> Clearly it is possible to write programs (i.e. -gennote- and
> -replacenote-) that do this automatically. The question, however, arises
> why someone who is not willing to give away his do-files should use
> these programs when creating a data set ...
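A wrapper of the kind Ulrich describes might look roughly like the sketch below. The name -gennote- is his hypothetical name, not an existing command, and this version makes simplifying assumptions: the new variable name must be the first token (so no storage type such as "byte" before it), and no attempt is made to support -by:- or -egen-.

```stata
capture program drop gennote
program define gennote
    version 11
    * take the first token of what was typed as the new variable's name;
    * local macro 0 itself is left unchanged
    gettoken newvar : 0, parse(" =")
    * run -generate- with exactly what the user typed
    generate `0'
    * store the command text as a note attached to the new variable
    note `newvar': generate `0'
end

* usage
sysuse auto, clear
gennote x = weight - 1
notes x
```

A matching -replacenote- would follow the same pattern with -replace- in place of -generate-.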
Louis Boakye-Yiadom
That's correct. I'm looking at a situation where the do-file is not available. Indeed, often you may have to work with a dataset for which you played no role in the generation of the variables. Thanks.
Nick Cox
>>> Indeed. But Louis' question, and my
>>> answers, presuppose that was not done.
Michael McCulloch
>>> Wouldn't it be sufficient to simply record the work in a
>>> do-file that documents the command:
>>> gen B = (A*C) + D, or
>>> gen B = A*(C + D)?
>>>
>>> On Oct 31, 2010, at 9:46 AM, Nick Cox wrote:
>>>
>>>> There are programs that enable users to record
>>>> definitions of variables as they generate or replace them.
>>>> See e.g. -labgen- from SSC and especially its references.
>>>>
>>>> More generally, if users employed variable labels or
>>>> characteristics to record the definition of variables --
>>>> then your problem is indeed soluble.
>>>>
>>>> I didn't imagine that's what you had in mind, as if
>>>> you knew that definitions were stored that way it's hard to
>>>> see why your question arises.
Louis Boakye-Yiadom
>>> Nick, thanks for the reply. I was thinking that if it's
>>> possible for Stata to store information on the generation of
>>> the variable (at least in simple cases), it might be
>>> possible to have this feature in Stata.
Nick Cox
>>>>> In general, no. How could there be?
>>>>>
>>>>> However, in simple cases for Y calculated somehow from X,
>>>>> looking at graphs of Y vs X might give a clue.
Louis Boakye-Yiadom
>>>>> If some of the variables in a dataset were generated by a
>>>>> transformation or combination of some other variable(s) in
>>>>> the data, is it possible to know this without seeing the
>>>>> relevant log or do file? For example, consider a situation
>>>>> where the variables in the data include A, B, C, and D, and
>>>>> B was generated as follows:
>>>>> B = A*C + D
>>>>> Is there a command for determining how B was generated?
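Although no command can recover the definition in general, a guessed definition can be checked against the data. The sketch below is only an illustration under stated assumptions: the data are simulated, B is constructed as A*C + D so that one guess is known to be right, and the names guess1 and guess2 and the tolerance 1e-8 are arbitrary choices. The two candidates are the ones Michael McCulloch mentions in the thread.

```stata
clear
set obs 100
set seed 12345

* simulated stand-ins for the variables in the question
generate double A = runiform()
generate double C = runiform()
generate double D = runiform()
generate double B = A*C + D

* two candidate definitions
generate double guess1 = A*C + D
generate double guess2 = A*(C + D)

* count observations where B differs from each guess
* by more than a small relative tolerance
count if reldif(B, guess1) > 1e-8
count if reldif(B, guess2) > 1e-8

* a correct guess plots as a straight line against B
scatter B guess1
```

A count of zero is consistent with the guess but does not prove it, since different formulas can agree on the observed values; with real data, storage types and rounding may also call for a looser tolerance.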