Statalist



st: RE: Re RE: multiple sample split (year-by-year)


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Re RE: multiple sample split (year-by-year)
Date   Tue, 13 Nov 2007 17:31:36 -0000

Question 2: I can't say what makes sense for your project. 

Question 1: Here is one recipe. 

. egen present = total(response < .), by(year) 
. gen n1 = present/3 
. gen n2 = 2 * present/3 

. bysort year (response) : gen response_class = ///
	cond(_n <= n1, 1, ///
	cond(_n <= n2, 2, ///
	cond(_n <= present, 3, .)))

This maps missings to missings. 
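
To see the recipe at work, here is a minimal do-file sketch on made-up data. Everything about the toy data set is an assumption for illustration only: the variable name firm, the years 2006 and 2007, the seed, and the random values of response. Only the classification commands come from the recipe above.

```stata
* Toy panel (hypothetical): 7 firms observed in 2 years
clear
set seed 12345
set obs 14
gen firm = ceil(_n/2)              // firms 1..7, two rows each
gen year = 2006 + mod(_n, 2)       // one row per firm in 2006 and in 2007
gen response = runiform()          // stand-in for the variable to be split
replace response = . if firm == 7  // one missing value in each year

* Count nonmissing values per year and set the two cut points
egen present = total(response < .), by(year)
gen n1 = present/3
gen n2 = 2 * present/3

* Sorting on response within year puts missings last, so ranks
* 1..present cover exactly the nonmissing observations
bysort year (response) : gen response_class = ///
    cond(_n <= n1, 1, ///
    cond(_n <= n2, 2, ///
    cond(_n <= present, 3, .)))

* Inspect group sizes year by year, including the missing class
tabulate year response_class, missing
```

With six nonmissing values in each year, the table should show two firms in each class per year and one firm with missing response_class. When the number of nonmissing values in a year is not a multiple of 3, the groups cannot all be the same size, and tied values of response may fall on either side of a cut point.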

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of
west--@libero.it
Sent: 13 November 2007 16:53
To: statalist
Cc: baum
Subject: st: Re RE: multiple sample split (year-by-year)

I'm sorry, my question was unclear. Let me restate it, dropping the
first part.

I start with a panel data set consisting of "x" firm-year observations.

Question 1: I need to divide firms evenly into three groups according to
their value of var1 in each year, giving low-, middle- and high-var1
groups. I stress that the calculation should be done separately for each
year, so that each year I have "x/3" observations in each group. I really
don't know how to achieve this. Any illustrative example would be of
help. Thank you.

Question 2: If the variable I want to split is incomplete in my initial
data set (i.e. "." for some firm-year observations), should I balance my
data set before splitting the sample? Thanks for any suggestion!

D.

