Louis Boakye-Yiadom <lby_kw@yahoo.co.uk>

statalist@hsphsun2.harvard.edu

Re: st: RE: Knowing how a variable was generated

Mon, 1 Nov 2010 15:46:23 +0000 (GMT)

Thanks for this and all the previous comments.

Louis

--- On Mon, 1/11/10, Richard Goldstein <richgold@ix.netcom.com> wrote:

> From: Richard Goldstein <richgold@ix.netcom.com>
> Subject: Re: st: RE: Knowing how a variable was generated
> To: statalist@hsphsun2.harvard.edu
> Date: Monday, 1 November, 2010, 13:46
>
> note that there is a program for this, -defv-; use -findit-
>
> however, this program does not work with -egen- and does not work with
> -by- (and does not always work with Stata 11 either)
>
> Rich
>
> On 11/1/10 5:43 AM, Ulrich Kohler wrote:
> > In principle it is also possible to store this information as a note:
> >
> > . sysuse auto
> > . gen x = weight - 1
> > . note x: gen x = weight - 1
> >
> > . replace x = weight + 1
> > . note x: replace x = weight + 1
> >
> > . note x
> >
> > Clearly it is possible to write programs (i.e. -gennote- and
> > -replacenote-) that do this automatically. The question, however, arises
> > why someone who is not willing to give away his do-files should use
> > these programs when creating a data set ...
> >
> > Uli
> >
> >
> > On Sunday, 31.10.2010, at 17:41 +0000, Louis Boakye-Yiadom wrote:
> >> That's correct. I'm looking at a situation where the do-file is not
> >> available. Indeed, often you may have to work with a dataset for which
> >> you played no role in the generation of the variables. Thanks.
> >>
> >> Louis
> >>
> >>
> >> --- On Sun, 31/10/10, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> >>
> >>> Indeed. But Louis' question, and my
> >>> answers, presuppose that was not done.
> >>>
> >>> Nick
> >>> n.j.cox@durham.ac.uk
> >>>
> >>>
> >>> Michael McCulloch
> >>>
> >>> Wouldn't it be sufficient to simply record the work in a
> >>> do-file that documents the command:
> >>> gen B = (A*C) + D, or
> >>> gen B = A*(C + D)?
> >>>
> >>> On Oct 31, 2010, at 9:46 AM, Nick Cox wrote:
> >>>
> >>>> There are programs that enable users to record definitions of
> >>>> variables as they generate or replace them.
> >>>> See e.g. -labgen- from SSC and especially its references.
> >>>>
> >>>> More generally, if users employed variable labels or
> >>>> characteristics to record the definition of variables --
> >>>> then your problem is indeed soluble.
> >>>>
> >>>> I didn't imagine that's what you had in mind, as if
> >>>> you knew that definitions were stored that way it's hard to
> >>>> see why your question arises.
> >>>
> >>> Louis Boakye-Yiadom
> >>>
> >>> Nick, thanks for the reply. I was thinking that if it's
> >>> possible for Stata to store information on the generation of
> >>> the variable (at least in simple cases), it might be
> >>> possible to have this feature in Stata.
> >>>
> >>> Nick Cox
> >>>
> >>>>> In general, no. How could there be?
> >>>>>
> >>>>> However, in simple cases for Y calculated somehow from X,
> >>>>> looking at graphs of Y vs X might give a clue.
> >>>
> >>> Louis Boakye-Yiadom
> >>>
> >>>>> If some of the variables in a dataset were generated by a
> >>>>> transformation or combination of some other variable(s) in
> >>>>> the data, is it possible to know this without seeing the
> >>>>> relevant log or do file? For example, consider a situation
> >>>>> where the variables in the data include A, B, C, and D, and
> >>>>> B was generated as follows:
> >>>>> B = A*C + D
> >>>>> Is there a command for determining how B was generated?
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
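
For what it's worth, the -gennote- idea mentioned in the thread might be sketched roughly as follows. This is only an illustration under stated assumptions: -gennote- is a hypothetical program name, not an existing command, and the sketch assumes the simplest call form, gennote newvar = exp, with no storage type before the variable name and no -by- prefix.

```stata
* gennote: hypothetical wrapper around -generate- that also stores the
* defining command as a note on the new variable.
* Assumes calls of the form:  gennote newvar = exp
capture program drop gennote
program define gennote
    version 11
    * Take the first token, split on blanks or "=", as the new variable name
    gettoken newvar : 0, parse(" =")
    * Create the variable exactly as -generate- would
    generate `0'
    * Attach the command text to the variable as a note
    notes `newvar': generate `0'
end

sysuse auto, clear
gennote x = weight - 1
notes x
```

A matching -replacenote- would presumably do the same with -replace- in place of -generate-. As noted in the thread, this only helps if whoever builds the dataset chooses to use such a wrapper in the first place.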

