Just a matter of style, but note first that this
can be telescoped into two lines:
bysort year : egen median_by_year = median(ta)
gen large = ta > median_by_year
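However, a more serious point is that missings should usually be trapped.
Stata treats a missing value as larger than any number, so ta > median_by_year
evaluates to 1 whenever ta is missing: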
gen large = cond(missing(ta), ., (ta > median_by_year))
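A minimal sketch of the difference, using made-up data: the three companies and their ta values below are invented for illustration and are not from the original dataset, and large_naive is a hypothetical name for the untrapped version.

```stata
clear
* Toy data: company 3 has a missing ta in 1996 (values are invented)
input company year ta
1 1996 51
2 1996 49
3 1996 .
1 1997 53
2 1997 67
3 1997 60
end

* Yearly median of ta: 50 in 1996 (from 51 and 49), 60 in 1997
bysort year : egen median_by_year = median(ta)

* Untrapped: the missing ta compares as larger than the median, so it gets 1
gen large_naive = ta > median_by_year

* Trapped: the missing ta yields a missing dummy instead
gen large = cond(missing(ta), ., ta > median_by_year)

list, sepby(year)
```

Note that egen's median() ignores missing values, so the yearly medians themselves are unaffected; it is only the comparison that needs the trap.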
Nick
n.j.cox@durham.ac.uk
Zou Hong
> sort year
> by year: egen median_by_year=median(ta)
> gen large=0
> replace large=1 if ta>median_by_year
Christian Andres
> > I have a problem generating a dummy variable in a panel dataset.
> >
> > My dataset contains 300 companies over 10 years. For each company, I
> > have a size measure, total assets (ta). What I want to do now, is to
> > generate a dummy variable which is 1 if the company's size is larger
> > than the sample median in each year. This means that I need ONE
> > variable, say "large" which first lists this measure for ten years for
> > company 1, then 10 years for company 2...
> >
> > To make things easier to understand:
> > In year 1996 the median is 50, in 1997 60, in 1998 55:
> >
> > company  year  ta  DUMMY_LARGE
> > 1        1996  51  1
> > 1        1997  53  0
> > 1        1998  56  1
> > 1....