Statalist The Stata Listserver



RE: st: RE: basic programming tips


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: basic programming tips
Date   Wed, 11 Oct 2006 21:19:15 +0100

Thanks for the clarification, which looks like 
a correction to me, e.g. for "from 0 to 1" 
read "from 1 to 0". 

It wastes your time -- and for what it's worth 
other people's -- if you say what you don't mean
and mean what you don't say. 

In your previous posting you discussed the
variable -sa-, sexually active, in terms such as "is sexually
active" and "no longer sexually active", which prompted
my comments. 

Now it appears that -sa- means "ever sexually active?", 
which manifestly differs. 

Could you please check these postings more carefully
before sending them off and confusing readers? 

Nick 
[email protected] 
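
Under the "ever sexually active?" reading, the contradiction to look for
is a reported 0 after an earlier 1 for the same person. What follows is
a minimal sketch of such a check, not a tested recipe for the NLSY97. It
assumes long-format data with the variables -id-, -year- and -sa- as
named in this thread; the toy data and the names -contra- and
-anycontra- are made up for illustration.

```stata
* Sketch: flag 1 -> 0 reversals in -sa- within person.
* Assumes long-format data with id, year, sa; the data below are invented.
clear
input id year sa
1 1997 0
1 1998 1
1 1999 1
2 1997 1
2 1998 .
2 1999 0
end

* running maximum of sa so far; max() ignores missing arguments
gen max_sa_sofar = .
bysort id (year) : replace max_sa_sofar = max(sa, max_sa_sofar[_n-1])

* a contradiction is a reported 0 once a 1 has already been seen
gen byte contra = sa == 0 & max_sa_sofar == 1

* mark every observation of any person with at least one contradiction
egen byte anycontra = max(contra), by(id)
list id year sa if anycontra, sepby(id)
```

Listing all observations for the flagged people, rather than only the
offending one, makes it easier to judge which report is the likelier
mistake.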

Scott Cunningham
 
> Thanks for the response.  I'm going over it carefully, but I wanted
> to quickly clarify something.  The contradictions that I'm worried
> about are not going from 0 to 1, but rather going from 1 to 0, which
> is impossible, given the nature of the event I'm describing
> (e.g., did the person ever have vaginal intercourse with a member of
> the opposite sex).  This would be straightforward if there was just
> one question to appeal to, but unfortunately, the way the NLSY97 is
> set up, that simple question is asked in a variety of different ways
> to the 9000 different respondents, depending on their answers to many
> other questions.
> 
> I'm reading more closely your recommendations now.  Just wanted to  
> clarify that point about the contradiction.
 
> On Oct 11, 2006, at 2:57 PM, Nick Cox wrote:

> >> 1.  I am occasionally worried that I am replacing variables with
> >> values that are incorrect.  In this example, it is easy to find
> >> contradictions, though.  If someone is sexually active in an earlier
> >> wave (say 1997) but then later reports that they are no longer
> >> sexually active (say 2002), then it would mean the person reported he
> >> was not a virgin in 1997 but is a virgin in 2002.  How do others of
> >> you check to make sure you do not have mistakes like this - once you
> >> have already reshaped the data into a panel, for instance?  I think I
> >> do not possess enough of these checks in my programming, in fact, and
> >> am making many mistakes along the way that I'm not catching.
> >
> > I don't want to start a discussion on Statalist on quite what
> > is virginity, but unfortunately you seem to need to define exactly
> > what _you_ understand by it. I don't regard your example here
> > as contradictory at all as long as virgin means here "not
> > sexually active". Alternatively, if a person was ever previously
> > sexually active, I do not see how they can revert to being
> > a virgin (barring some legalistic redefinition).
> >
> > More generally, you can check for correctness if you independently
> > have correct answers or have some rule that guesses correct
> > answers for you (e.g. a majority vote). I don't see either here.
> >
> >> 3.  Finally, sexual activity has holes, as I said, which if there are
> >> no contradictions (like going from 0 to 1 over time), can be
> >> corrected by filling all missing observations with a 0 or 1, assuming
> >> the first time a 1 appears is truly the first year the person made
> >> their sexual debut.  What is the best way to fill in a missing value
> >> in the context of this type of duration modeling?  I need to tell
> >> Stata to make all missing observations a 0, unless a 1 had appeared
> >> at some point earlier, in which case replace with a 1.
> >
> > Again, going from 0 to 1 over time does not seem contradictory to me.
> >
> > The maximum of -sa- seen so far is just
> >
> > gen max_sa_sofar = .
> > bysort id (year) : replace max_sa_sofar = max(sa, max_sa_sofar[_n-1])
> >
> > The way that the -max()- function works is that -max(0,.)- is 0,
> > -max(1,.)- is 1, etc., so that the usual rule that . is arbitrarily
> > large is set aside. (This is a feature not a bug.)
> >
> > This principle is implemented in the -egen- function -record()- from
> > -egenmore- on SSC, attributable to Kit Baum and S.B. Else.
> >
> > Thus you just need to copy across from this -max_sa_sofar- variable
> > whenever -sa- is missing. That still leaves open for discussion
> > whether this method of imputation is socially or sexually valid,
> > which I doubt.
> >
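
The copying step described in the quoted text can be sketched as
follows. This is an illustration under assumptions, not a statement of
how the NLSY97 should be handled: the toy data are invented, and
-sa_filled- is a hypothetical name chosen here so that the original
-sa- is left untouched.

```stata
* Sketch: fill missing -sa- from the running maximum, within person.
* Assumes long-format data with id, year, sa; sa_filled is a made-up name.
clear
input id year sa
1 1997 .
1 1998 0
1 1999 .
1 2000 1
1 2001 .
end

* max() ignores missing arguments unless all of them are missing
assert max(0, .) == 0
assert max(1, .) == 1
assert missing(max(., .))

* running maximum of sa so far
gen max_sa_sofar = .
bysort id (year) : replace max_sa_sofar = max(sa, max_sa_sofar[_n-1])

* copy across only where sa is missing; observed values are left alone
gen sa_filled = cond(missing(sa), max_sa_sofar, sa)
list, sepby(id)
```

Note that a missing value before any observed report (1997 here) stays
missing, because nothing has been seen so far. Setting such values to 0
would be a further assumption and a separate -replace-.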



