
st: RE: how to generate groups based on some characteristics and obtain the mean/median value for each group


From   "Nick Winter" <nwinter@policystudies.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: how to generate groups based on some characteristics and obtain the mean/median value for each group
Date   Wed, 19 Jun 2002 09:38:47 -0400

> -----Original Message-----
> From: Yi, Bingsheng [mailto:byi@coba.usf.edu] 
> Sent: Tuesday, June 18, 2002 7:11 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: how to generate groups based on some 
> characteristics and obtain the mean/median value for each group
> 
> 
> Dear Statalisters,
> 
> I wonder whether you will help me figure out the codes to solve the
> following problem:
> 
> I have 12 years of panel data containing these four variables: Tobin's q,
> size, 4-digit industry code (ind4), and id. For each year, I want to make
> some adjustments to one variable (Tobin's q) based on the other two
> variables (industry and size). First I need to ensure that there are at
> least 10 firms within each industry. If the number of firms within a
> 4-digit industry code is less than 10, I use the 3-digit industry code
> generated by gen str4 ind3=substr(ind4,1,3) and check whether the number
> of firms with the same 3-digit industry code is greater than or equal to
> 10; if not, I generate and use the 2-digit industry code. So in the end
> there are at least 10 firms within an industry (classified by 4-digit,
> 3-digit, 2-digit, or 1-digit industry code). The problem is how to get
> and record the number of firms in each industry.

For this piece, try something like this.  First, generate four variables
holding the 1-, 2-, 3- and 4-digit industry codes for **ALL** records,
named ind1, ind2, ind3 and ind4 respectively.  Then:

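	* count the records sharing each 1- to 4-digit code
	forval i=1/4 {
		sort ind`i'
		by ind`i': gen num`i'=_N
	}

	* start from the 1-digit code, then overwrite it with each finer
	* code that has at least 10 records, so the finest such code wins
	gen finalgrp = ind1
	forval i=2/4 {
		replace finalgrp = ind`i' if num`i'>=10
	}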

This should create the variable "finalgrp", which will contain the
grouping you desire.  Then you can calculate whatever statistics you
want in those groups.
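
As a rough end-to-end sketch of the steps above, the following runs the same logic on made-up data and then computes the group mean and median.  The toy codes, the variable name q for Tobin's q, and the names grpsize, q_mean and q_median are all assumptions for illustration; ind4 is assumed to be a string, as in the question.

```stata
* toy data: 24 firms, one record each (all values are made up)
clear
set obs 24
generate id = _n
* 12 firms in "2834", 3 in "2835", 9 in "3571"
generate str4 ind4 = cond(_n <= 12, "2834", cond(_n <= 15, "2835", "3571"))
generate q = _n / 10

* build the 1-, 2- and 3-digit codes from the 4-digit string code
forvalues i = 1/3 {
    generate str`i' ind`i' = substr(ind4, 1, `i')
}

* count the records sharing each code
forvalues i = 1/4 {
    bysort ind`i': generate num`i' = _N
}

* keep the finest code that has at least 10 records
generate str4 finalgrp = ind1
forvalues i = 2/4 {
    replace finalgrp = ind`i' if num`i' >= 10
}

* statistics within the final groups
bysort finalgrp: generate grpsize = _N
bysort finalgrp: egen q_mean = mean(q)
bysort finalgrp: egen q_median = median(q)

tabulate finalgrp
```

Two things are worth checking on real data.  With a panel, _N counts records rather than firms, so if the counts are meant to be per year, the year variable (whatever it is called in your data) would need to be added to each bysort list.  Also, in the toy data the three "2835" firms end up in finalgrp "283" on their own, because the twelve "2834" firms keep their 4-digit code; statistics by finalgrp therefore use only those three records, not the whole 3-digit industry.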

Nick Winter