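* assumes the data are already -tsset- on id and your quarterly time variable
egen nummiss = rowmiss(<regress variables>)
gen run = .
* running length of each spell of consecutive complete observations
by id: replace run = cond(L.run == ., 1, L.run + 1) if nummiss == 0
* longest such spell within each company
by id: egen maxrun = max(run)
gen byte OK = (maxrun > 4) & (nummiss == 0)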
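This (consciously) is only part of what you
specify: OK marks every complete observation in
any company whose longest run exceeds 4, including
complete observations that lie outside such a run.
The FAQ specifies more of what you need to know.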
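To restrict the flag to observations that themselves sit inside a long enough run, one possibility is to give each run its own identifier and measure its length. What follows is only a sketch under assumptions: the toy data, the time variable `time`, the single regressor `x`, and the helper variables `spell` and `spelllen` are all made up for illustration, and your own variables would replace them.

```stata
* Hypothetical toy panel: one company, 8 quarters, x missing in quarter 3
clear
input id time x
1 1 10
1 2 11
1 3 .
1 4 12
1 5 13
1 6 14
1 7 15
1 8 16
end
tsset id time

* number of missing values among the regression variables (here just x)
egen nummiss = rowmiss(x)

* running length of the current run of complete observations
gen run = .
by id: replace run = cond(L.run == ., 1, L.run + 1) if nummiss == 0

* spell identifier: goes up by one at the start of each run
by id: gen spell = sum(run == 1) if run < .

* length of the run that each observation belongs to
bysort id spell: egen spelllen = max(run)

* flag only complete observations inside a run of at least 5
gen byte OK = (spelllen >= 5) & (spelllen < .) & (nummiss == 0)

sort id time
list id time x run spell spelllen OK, sepby(id)
```

In this toy panel quarters 1 and 2 form a run of only 2, so they are not flagged, while quarters 4 to 8 form a run of 5 and are. The explicit check `spelllen < .` is there because Stata treats missing as larger than any number.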
> I have an unbalanced panel, with companies across time
> (quarters). I would like to restrict my sample to
> those companies and periods with a minimum of, say, 5
> quarters of *continuous* non-missing data on all 7
> variables that I want to use in my regression. In
> other words, I would leave out a company with only 4
> continuous observations of "complete" data; and if I
> had a company with some irregular observations at some
> times and then a group of 5 continuous observations
> later, I would keep the company but omit all the
> irregular observations and missing variables.
> The reason why I thought that might be a good idea is
> to omit companies with just a few scattered
> observations. Also, I want to use some lags and if I
> have many missing observations, then the sample that
> actually goes into a regression will depend on how
> many lags I specify. (I know that cleaning the data in
> the way I want to will not completely solve that
> problem, but it should at least help.)
> If possible I would prefer not to drop observations
> completely, but perhaps to have a dummy if an
> observation is "OK", in other words part of a series
> of at least 5 continuous observations with valid data.