st: RE: Re RE: multiple sample split (year-by-year)

From   "Nick Cox" <>
To   <>
Subject   st: RE: Re RE: multiple sample split (year-by-year)
Date   Tue, 13 Nov 2007 17:31:36 -0000

Question 2: I can't say what makes sense for your project. 

Question 1: Here is one recipe. 

. egen present = total(response < .), by(year) 
. gen n1 = present/3 
. gen n2 = 2 * present/3 

. bysort year (response) : gen response_class = ///
	cond(_n <= n1, 1, ///
	cond(_n <= n2, 2, ///
	cond(_n <= present, 3, .)))

This maps missings to missings. 
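
To check the recipe on something small, here is a minimal sketch on made-up data. The toy data set (two years, seven firms per year, one missing response per year) and the seed are assumptions for illustration only; the variable names follow the recipe above. It should be run from a do-file, since /// continues a command across lines only there.

```stata
* Hypothetical toy data: 2 years, 7 observations each, 1 missing response per year
clear
set seed 12345
set obs 14
gen year = 2000 + mod(_n, 2)
gen response = runiform()
replace response = . in 1/2        // observation 1 is in 2001, observation 2 in 2000

* Number of nonmissing responses per year, and the two cut points
egen present = total(response < .), by(year)
gen n1 = present/3
gen n2 = 2 * present/3

* Within each year, sort on response (missings sort last) and classify by rank
bysort year (response) : gen response_class = ///
	cond(_n <= n1, 1, ///
	cond(_n <= n2, 2, ///
	cond(_n <= present, 3, .)))

* Group sizes by year, with missings shown in their own column
tabulate year response_class, missing
```

With 6 nonmissing values in each year, every class gets 2 observations and the one missing response stays missing. When the count is not a multiple of 3 the groups cannot be exactly equal: with 7 nonmissing values the cut points are 2.33 and 4.67, so the classes have 2, 2 and 3 observations. Tied values of response are split in whatever order the sort leaves them.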

-----Original Message-----
Sent: 13 November 2007 16:53
To: statalist
Cc: baum
Subject: st: Re RE: multiple sample split (year-by-year)

I'm sorry, I was unclear with my question. I restate it, dropping the
first part.

I start with a panel data set consisting of "x" firm-year observations.

Question 1: I need to divide firms evenly into three groups according to
their value of var1 in each year, giving a low-, a middle- and a
high-var1 group. I stress that the calculation should be done separately
for each year, so that each year has "x/3" observations in each group. I
really don't know how to achieve this result. Any illustrative example
would be of help. Thank you.

Question 2: If the variable I want to split is incomplete in my initial
data set (i.e. "." for some firm-year data), should I balance my data
set before the sample splitting? Thanks for any suggestion!

