Stata The Stata listserver

Re: st: data management help

From   mnitkin <>
Subject   Re: st: data management help
Date   Mon, 04 Oct 2004 09:24:29 -0400

Thanks for all the help. The programming below worked great and got me the output that I needed.

Ulrich Kohler wrote

However, as far as I understand, this can be done with

. by firmid, sort: gen nyears = _N
. keep if nyears == 14

where "firmid" is the identifier variable for the firms and "years" is the variable holding the year of observation.
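To see why counting observations per firm works, here is a minimal sketch on made-up data. The variable names "firmid" and "year" and the three-year toy panel are assumptions for illustration only; the real data set spans 14 years. The sketch first checks that no firm has more than one record in a given year, since otherwise _N would overcount the years observed.

```stata
* Hypothetical toy panel: firm 1 is observed in all 3 years, firm 2 in only 2
clear
input firmid year
1 2001
1 2002
1 2003
2 2001
2 2003
end

* Guard: stop with an error if any firm has more than one record in a year
by firmid year, sort: assert _N == 1

* _N is the number of observations in each by-group, i.e. records per firm
by firmid, sort: gen nyears = _N

* Keep only firms observed in every year (3 here; 14 in the original question)
keep if nyears == 3
list
```

The same count can be written in two steps, "sort firmid" followed by "by firmid: gen nyears = _N", which does not rely on the sort option of by.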

many regards

mnitkin wrote:

Your assumptions about the data are correct.

I tried your suggestion, but "duplicates" is not a valid command on my
version of Stata. Is there a similar command that might work with
pre-7.0 versions of Stata?
Thanks for the help,

Richard Williams wrote:

At 12:09 AM 10/4/2004 -0400, mnitkin wrote:

I have a data set with 131,000 firm observations over 14 years.
Individual firms may be in the data set between 1 and 14 times. I want
to keep only those firms that have observations for the entire 14-year
period.

I've tried all the tricks I know as well as a number of suggestions on
the stata website, but I haven't had any luck.
Something like this might do it.  Let's suppose you want to keep those
cases where the same id number occurs 14 times (i.e. there is a first
occurrence and then 13 "duplicates").  Let's further assume each firm has
a maximum of 1 record per year. Then,

duplicates tag id, gen(nyears)
keep if nyears == 13
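The cutoff is 13 rather than 14 because duplicates tag counts the other observations that share the same id, not the group size itself. A small sketch on made-up data (the toy values and the names "ndup" and "nobs" are assumptions for illustration) shows the relationship to the per-group count _N:

```stata
* Hypothetical toy data: id 1 appears 3 times, id 2 appears once
clear
input id year
1 2001
1 2002
1 2003
2 2001
end

* Number of OTHER observations with the same id (2 for id 1, 0 for id 2)
duplicates tag id, gen(ndup)

* Group size per id (3 for id 1, 1 for id 2)
by id, sort: gen nobs = _N

* The tag is always the group size minus one
assert ndup == nobs - 1
```

So "keep if nyears == 13" after duplicates tag selects the same firms as "keep if nyears == 14" after counting with _N, provided each firm has at most one record per year.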

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
FAX:    (574)288-4373
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
