# st: RE: Re RE: multiple sample split (year-by-year)

 From "Nick Cox" To Subject st: RE: Re RE: multiple sample split (year-by-year) Date Tue, 13 Nov 2007 17:31:36 -0000

```Question 2: I can't say what makes sense for your project.

Question 1: Here is one recipe, with response standing for the variable
to be split.

. egen present = total(response < .), by(year)
. gen n1 = present/3
. gen n2 = 2 * present/3

. bysort year (response) : gen response_class = cond(_n <= n1, 1, cond(_n <= n2, 2, cond(_n <= present, 3, .)))

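This maps missings to missings: missing values of response sort last
within each year, so they fall beyond present and get a missing
response_class.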
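
* A quick check on the recipe above (a sketch, assuming the variables
* year and response_class exist exactly as created above): within each
* year the counts in classes 1 to 3 should be equal, or differ by at
* most one when present is not a multiple of 3.

. tabulate year response_class, missing

* Note that observations tied on response are sorted in arbitrary order,
* so tied firms can fall on either side of a class boundary.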

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of
west--@libero.it
Sent: 13 November 2007 16:53
To: statalist
Cc: baum
Subject: st: Re RE: multiple sample split (year-by-year)

I'm sorry, I was unclear with my question. I restate it, dropping the
first part.

I start with a panel data set consisting of "x" firm-year observations.

Question 1: I need to divide firms evenly into three groups according to
their value of var1 in each year, giving a low-, a middle- and a
high-var1 group. I stress that the calculation should be done separately
for each year, so that each year I have "x/3" observations in each
group. I really don't know how to achieve this result. Any illustrative
example would be of help. Thank you.

Question 2: If the variable I want to split is incomplete in my initial
data set (i.e. "." for some firm-years), should I balance my data set
before splitting the sample? Thanks for any suggestions!

D.