The Stata listserver

Re:st: Looping the Loop

From   "Nick Cox" <>
To   <>
Subject   Re:st: Looping the Loop
Date   Mon, 31 Mar 2003 14:27:05 +0100

Joel Clovis

> A couple of weeks ago (more like 5 or 6) Nick Cox helped me with a
> dropping schema for my large dataset when observations fell below a
> predefined number (30).  This prog (below) works quite well when there is
> only one country in the dataset.  That is, I have been using drop to
> eliminate the other 130 countries from my dataset then deleted the
> saving and repeating the process for another country. This seems to be
> crying out for a simpler method.

> I have been following the recent exchange between Chris Rohlfs, Nick
> Winter and Edwin Leuven and their solution requires that you call
> macro by name, (in Nick's solution `C' and `V' and in Edwin's `v1' and `v2' ),
> and this calling is not going to help me.    I need to say something like:  by
> County:  drop var if obs <=30 but I don't have the skill to do it, can anyone
> help?

> 2. How to -drop- according to your criterion
> ============================================
> I'd do it this way:
> . foreach v of var cba2tfina-region {
> .  	qui count if !missing(`v')
> .	if r(N) < 30 {
> .		drop `v'
> .	}
> . }

Kit Baum

> It is not clear, in a panel context, how you want to handle this. If
> you drop a variable, it is dropped for all countries. Do you mean that
> a variable should be dropped if ANY country fails the test of having 30
> obs, or that it should only be dropped if ALL countries fail? The
> latter will probably never happen, and the former will probably cause
> all variables to be thrown away. Perhaps you would like to set a
> particular variable to missing if it fails the test for each country
> which it does? Here is a fragment of code in which I do something
> similar to a panel of firm data:

> g byte enn = 1
> bys gvkey : egen byte nobsf=sum(enn)
> bys gvkey : drop if nobsf<4

> here 'gvkey' is the firm identifier, and I want only those firms who
> have four or more years of data. I could, instead, set some variable to
> missing if nobsf<4 (or <30, in your case). Of course, you want to test
> whether that variable is missing, not merely whether that number of
> obs. exist.

The limit is not Joel's skill. It is that Stata lets you
drop observations or variables at any one time, but not
both.

Kit's code can be condensed to

bysort gvkey : drop if _N < 4
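
The condensation works because, under -by:-, _N is the number of
observations in the current group, which is exactly what summing a
column of ones produces. A small check of that claim, as a sketch only:
it uses the auto data shipped with Stata, with rep78 standing in for
gvkey, and nobs2 is a name made up for the illustration. Note that
egen's sum() in Kit's fragment is called total() in later releases of
Stata.

```stata
* Toy check: rep78 is only a stand-in for the panel identifier gvkey.
sysuse auto, clear
drop if missing(rep78)

* Long form: a column of ones, summed within each group.
generate byte enn = 1
bysort rep78 : egen nobsf = total(enn)

* Short form: under by:, _N is the number of observations in the group.
bysort rep78 : generate nobs2 = _N

* The two counts should agree on every observation.
assert nobsf == nobs2
```

If the assertion passes silently, -drop if nobsf < 4- and
-drop if _N < 4- under the same -bysort- remove the same observations.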

Perhaps what Joel wants is something like

bysort country : drop if _N < 30

That is, his problem sounds more like one of
dropping observations than of dropping variables.
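
If what is wanted is closer to Kit's alternative, namely keeping all
observations but setting a variable to missing within any country that
has fewer than 30 nonmissing values of it, then something along these
lines might serve. This is a sketch under assumptions, not tested on
Joel's data: the auto data stand in for his dataset, foreign stands in
for country, price-gear_ratio stands in for cba2tfina-region, and
nonmiss is a scratch variable invented here. It also assumes that every
variable in the list is numeric.

```stata
* Stand-ins: foreign plays the role of country,
* price-gear_ratio the role of cba2tfina-region.
sysuse auto, clear

foreach v of varlist price-gear_ratio {
    * egen's count() counts nonmissing values of `v' within each group,
    * so this tests the variable itself, not merely the group size.
    bysort foreign : egen nonmiss = count(`v')
    * Blank the variable only in groups that fail the test.
    quietly replace `v' = . if nonmiss < 30
    drop nonmiss
}
```

In the auto data there are 52 domestic and 22 foreign cars, so every
variable in the list ends up missing for the foreign cars, while the
domestic values are left alone wherever at least 30 are nonmissing.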

