st: Re: RE: Re: counting and eliminating data
From: "Michael Blasnik" <michael.blasnik@verizon.net>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: Re: RE: Re: counting and eliminating data
Date: Wed, 25 Jan 2006 13:39:47 -0500
Whoops... I forgot about that. When I looked at how many lines of code it
required, I started to think that going back to first principles might be
better than using nvals: it should need fewer lines of code and run
faster:
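* flag the first observation of each subarea-year combination
bysort subarea year: gen byte persubyear=_n==1
* running counts of distinct years within subarea (year assumed coded 1-26)
by subarea (year): gen nyears=sum(persubyear)
by subarea (year): gen firstfive=sum(persubyear*(year<=5))
by subarea (year): gen lastfive=sum(persubyear*(year>21))
* the last value of each running sum is the subarea total
by subarea (year): gen tokeep= nyears[_N]>=13 & lastfive[_N]>=2 & firstfive[_N]>=2
keep if tokeep
drop persubyear nyears firstfive lastfive tokeep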
Michael Blasnik
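For readers who prefer not to manage running sums by hand, the same filter can be sketched with official -egen- functions only (-tag()- and -total()-), so no SSC package is needed. This is an illustrative variant, not code posted in the thread: the variable name tag is hypothetical, and year is assumed to be coded 1 to 26 as in the code above.

```stata
* Sketch only: tag marks one observation per distinct subarea-year pair
egen byte tag = tag(subarea year)

* Sum the tags within subarea to count distinct years
egen nyears    = total(tag), by(subarea)
egen firstfive = total(tag * (year <= 5)), by(subarea)
egen lastfive  = total(tag * (year > 21)), by(subarea)

* total() returns 0, not missing, when nothing qualifies
keep if nyears >= 13 & firstfive >= 2 & lastfive >= 2
drop tag nyears firstfive lastfive
```

Because -total()- gives 0 for a subarea with no years in a window, the comparison in the keep statement cannot be satisfied by a missing value.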
----- Original Message -----
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Sent: Wednesday, January 25, 2006 12:09 PM
Subject: st: RE: Re: counting and eliminating data
Note that use of the -egen- function -nvals()-
depends on prior installation of the -egenmore-
package from SSC.
Nick
n.j.cox@durham.ac.uk
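As a minimal sketch of the installation step Nick describes, assuming the machine has internet access to the SSC archive:

```stata
* Install the egenmore package (which provides nvals()) from SSC
ssc install egenmore

* Confirm the package is available and read about nvals()
help egenmore
```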
Michael Blasnik
There are a couple of approaches you could take, but I think using egen
nvals is the best bet.
sort subarea
by subarea: egen nyears=nvals(year)
keep if nyears>=13
sort subarea
by subarea: egen firstfive=nvals(year) if year<=5
by subarea: egen lastfive=nvals(year) if year>21
* fill out missing values within subarea
bysort subarea (firstfive): replace firstfive=firstfive[1]
bysort subarea (lastfive): replace lastfive=lastfive[1]
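* missing counts as larger than any number, so exclude it explicitly
keep if firstfive>=2 & lastfive>=2 & !missing(firstfive, lastfive)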
drop nyears firstfive lastfive
Jennifer Devine
> Can someone please set me in the right direction for coding a program to
> count and eliminate data if they don't meet certain criteria?
>
> I have survey data taken over 26 years and the survey area is divided
> into subareas. I want to include a subarea only if data were collected in
> at least 13 of the 26 years, and data must have been collected in at least
> 2 of the first 5 years and 2 of the last 5 years. If the subarea does not
> meet those criteria, I want Stata to drop that subarea from the analysis.
> At the moment, I'm having to look at everything individually and it takes
> several days to eliminate subareas.
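Before dropping anything, it can help to see which subareas fail the criteria in the question. The following is a sketch under stated assumptions: year is coded 1 to 26, and the names early and late are hypothetical helpers introduced here, not variables from the thread.

```stata
* Sketch only: summarise coverage per subarea without changing the data
preserve

* One row per observed subarea-year combination
contract subarea year

* Indicators for the first five and last five survey years
gen byte early = year <= 5
gen byte late  = year > 21 & !missing(year)

* Distinct years per subarea, overall and within each window
collapse (count) nyears = year (sum) firstfive = early lastfive = late, by(subarea)

* Subareas that would be dropped under the stated criteria
list if nyears < 13 | firstfive < 2 | lastfive < 2

restore
```

The -preserve- and -restore- pair leaves the original observations in memory untouched, so the listing can be checked by eye before running the filter.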
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/