Stata The Stata listserver

st: RE: matched data


From   "Nick Winter" <nwinter@policystudies.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: matched data
Date   Mon, 24 Jun 2002 12:42:03 -0400

> -----Original Message-----
> From: Theodoropoulos, N. [mailto:nt18@leicester.ac.uk] 
> Sent: Monday, June 24, 2002 12:22 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: matched data 
> 
> 
> Dear Statalisters,
> 
> I have a matched employer-employee dataset. Each firm is 
> recognised by a unique identifier, and in each firm I have up 
> to 25 employees. From the existing data I want to generate 
> some new variables which capture only firms that employ a 
> positive proportion of old and/or young workers. In other 
> words I want to generate variables by selecting only the 
> firms with some sampled old and/or young employees.
> Also, I have the variables that capture the proportion of old 
> and young workers. 
> Is there a quick way of doing this instead of going through 
> the firms one by one.
> 
> Any hints will be highly appreciated,

I'm not entirely clear what you want to do, but the -by varlist:- prefix
and/or the -egen ..., by()- construct will probably be your friends here.
For example, to generate a variable tagging records based on the
proportion of old and young employees in a firm, you might do this:

	. generate old = (age>55) if age!=.		/* or whatever age */
	. egen tot_old = sum(old), by(firm)
	. egen tot_yng = sum(1-old), by(firm)
	. gen prop_old = tot_old / (tot_old+tot_yng)
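
To go from the proportion to "only firms with some sampled old and/or young employees", one possible follow-up is sketched below. It is only an illustration under assumptions: the toy data, the age cutoffs (55 and 30), and the names young, has_old, has_young and onefirm are made up for the example and are not from the original question. It uses -egen, mean()- under -bysort-, which gives the same firm-level proportion as the totals above because records with missing age are excluded from both.

```stata
* Toy data: three firms with hypothetical ages (assumption, for illustration)
clear
input firm age
1 60
1 30
1 45
2 25
2 35
3 58
3 62
3 .
end

* Indicator variables; the cutoffs 55 and 30 are arbitrary assumptions.
* The -if age != .- guard keeps missing ages from being counted as old.
generate old   = (age > 55) if age != .
generate young = (age < 30) if age != .

* Firm-level proportions among employees with non-missing age
bysort firm: egen prop_old   = mean(old)
bysort firm: egen prop_young = mean(young)

* Flag firms with at least one sampled old / young employee
generate has_old   = (prop_old   > 0) if prop_old   != .
generate has_young = (prop_young > 0) if prop_young != .

* Tag one record per firm so firm-level results are listed once
egen onefirm = tag(firm)
list firm prop_old prop_young has_old has_young if onefirm
```

Later commands can then be restricted with something like -if has_old- or -if has_old | has_young-, so there is no need to go through the firms one by one.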

--Nick W



