I just said be aware of -fillin-. Like you,
I don't think it helps in this case. But
if the problem turned out to be a panel
problem, -fillin- would look more like an
answer.
Nick
n.j.cox@durham.ac.uk
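
For anyone wondering what the panel case would look like, here is a minimal sketch of -fillin-. The variables id, year and y are hypothetical and not from this thread. -fillin- only forms combinations of values that already occur in the data, which is presumably why it does not help with a single id variable: it cannot invent id values that appear nowhere.

```stata
* Hypothetical unbalanced panel: id 2 lacks 2002, id 3 lacks 2001
clear
input id year y
1 2001 10
1 2002 11
2 2001 20
3 2002 31
end

* Add every omitted id-year combination; other variables are born missing
fillin id year
sort id year

* _fillin is 1 for observations that -fillin- created, 0 otherwise
list id year y _fillin, sepby(id)
```

The two added observations (id 2 in 2002, id 3 in 2001) carry _fillin == 1 and a missing y.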
Le Wang
> When I saw the question, my first reaction was to use -fillin-, but I
> couldn't figure out how. Would you please give an example? Thanks.
Nick Cox
> > There are small terminology problems here
> > we all share, new or not so new.
> >
> > In particular, I'd like to urge that we
> > refer to observations that should be
> > in the dataset, but are not, as _omitted_,
> > not _missing_. Values in the dataset can
> > naturally be _missing_ with respect to any number of
> > variables. In Stata, "missing" has a very specific
> > meaning. It is not equivalent to, in British
> > idiom, "gone missing", meaning "nowhere to be seen".
> >
> > That's picky, as the question was very clear.
> >
> > I would do it in place, as I am not a -merge- maven
> > skilled in choreographing a pas de deux between
> > two files.
> >
> > clear
> > set obs 30
> > gen id = _n + 11
> > gen frog = uniform()
> > local N = _N
> > qui forval i = 1/41 {
> >     count if id == `i'
> >     if r(N) == 0 {
> >         set obs `=_N + 1'
> >         replace id = `i' in l
> >     }
> > }
> > gen extra = _n > `N'
> > l id frog extra
> >
> > This hinges on the fact that if the observation
> > we would like included has indeed been omitted, then
> > -count- will return 0. In that case, we bump
> > up the number of observations. The extra observation
> > is always added at the end.
> >
> > The -frog- example here underlines that
> > extra values are born missing.
> >
> > Also, be aware of -fillin- and -tsfill-.
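
A rough sketch of how -tsfill- could be pressed into service here, assuming the id variable is integer-valued and can be treated as a time variable. The data mimic the values in the original question; frog, original and extra are illustrative names. Note that -tsfill- only fills gaps between the smallest and largest observed values, so in this example 10 would not be added.

```stata
* Hypothetical data with ids 1, 3, 5, 6, 7, 9 (2, 4 and 8 omitted)
clear
input var1 frog
1 0.3
3 0.8
5 0.1
6 0.6
7 0.9
9 0.4
end

* Mark the observations that exist before filling
gen byte original = 1

* Treat the id as a time variable with unit spacing, then fill the gaps
tsset var1
tsfill

* Added observations have original missing
gen byte extra = missing(original)
drop original
list var1 frog extra
```

If the upper end of the range is known and not observed, one extra observation would have to be added by hand before -tsfill-.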
Maarten Buis
> > > I would do this as follows: If you know
> > > the lowest and highest number your id variable can take, then
> > > it is pretty simple to create a new file that will contain
> > > all integers between these numbers. Then you can merge that
> > > file with your dataset, which will create the new cases, and
> > > the _merge variable that is created by -merge- will tell you
> > > which cases are added. See the example below.
> > >
> > > *------------- begin example -----------
> > > clear
> > > set obs 30
> > > gen mpg = _n + 11 /*I want to fill in all missing integers of mpg*/
> > > list in 1/10
> > > sort mpg
> > > tempfile numbers /*this way the file `numbers' will only be
> > > available*/
> > > save `numbers' /*during this do session, see: -help tempfile-*/
> > >
> > > sysuse auto, clear
> > > sort mpg
> > > list mpg foreign in 1/10
> > > merge mpg using `numbers'
> > > tab _merge /*a case is added if _merge == 2, see: -help merge-*/
> > > sort mpg
> > > gen var1skippedvalue = _merge==2 /*this uses a logical expression:
> > > var1skippedvalue equals 1 if it is added and zero if it is not*/
> > > list mpg foreign var1skippedvalue in 1/10
> > > *----------- end example ---------------
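
The example above uses the -merge- syntax current when it was written. In Stata 11 and later, -merge- takes an explicit match type and no longer needs the data sorted first. A minimal sketch of the same idea in that syntax, using the values from the original question; the variable x and the names master and extra are made up for illustration:

```stata
* Hypothetical dataset with ids 1, 3, 5, 6, 7, 9
clear
input var1 x
1 0.5
3 0.2
5 0.9
6 0.1
7 0.4
9 0.7
end
tempfile master
save `master'

* Build the full list of ids 1 to 10 and merge the real data onto it
clear
set obs 10
gen var1 = _n
merge 1:1 var1 using `master'
sort var1

* _merge == 1: id is in the full list only, so the observation was added
gen byte extra = _merge == 1
drop _merge
list var1 x extra
```

Here the list of integers is the master dataset, so added observations are those with _merge == 1 rather than _merge == 2.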
> >
> > Patrick Woodburn
> >
> > > If I have an id variable called "var1" with a selection of unique
> > > values in a given range of integers (eg the values 1, 3, 5, 6, 7,
> > > and 9), and I want to create new observations which contain each
> > > missing value in that range and are blank for all other variables
> > > (eg new observations containing 2, 4, 8 and 10) and a new variable
> > > to flag that they have been artificially generated, what do I do?
> > > Currently, all I can think of is the rather roundabout way of doing
> > > it below, but I can't help but think that surely there must be a
> > > more efficient method.
> >
> > >
> > > *Code begins (dataset already open)
> > >
> > > preserve
> > > keep var1
> > > drop if var1==.
> > > bysort var1: assert _n==1
> > > gen flag=0
> > > gen id=1
> > > reshape wide flag, i(id) j(var1)
> > > forvalues i=1/10 {
> > >     cap gen flag`i'=1
> > > }
> > > reshape long flag, i(id) j(var1)
> > > drop id
> > > keep if flag==1
> > > save var1skippedvalues
> > > restore
> > > append using var1skippedvalues
> >