Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Aw: Re: st: Controlling for median via industry

From   "Peter Miller" <>
Subject   Aw: Re: st: Controlling for median via industry
Date   Wed, 12 Feb 2014 21:48:15 +0100

Hi Jorge,
Many, many thanks for the great and detailed email! I tried the code in Stata and it worked perfectly!
I have one last question and was wondering if you could lend your expertise one more time. In particular, I need to implement the following constraint: the median should be calculated only if the group has 15 or more firms within the first 5 years; if it does not, the 1-digit SIC code (industry) should be used instead of the 2-digit code to form the group. The constraint is there to ensure a sufficient group size for calculating the median.
Hopefully you are able and willing to provide additional input, since you are the one who came up with the code in the first place!
Again, many many thanks!
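
A minimal sketch of one way the fallback described above might be implemented, reusing the structure of the loop quoted below. It is an illustration under assumptions, not a tested solution: abdata contains no SIC codes, so sic2 is a made-up 2-digit code built from ind and id, and sic1 is its first digit. The 5-year window is written with inrange(), which is one reading of "within the first 5 years".

```stata
* Sketch only: sic2 is a hypothetical 2-digit code, invented for illustration
webuse abdata, clear
bys id: keep if _n==1
keep id ind year wage
gen sic2 = ind*10 + mod(id, 3)
gen sic1 = floor(sic2/10)      // 1-digit code = first digit of sic2
tempfile all
save `all'

gen median = .
levelsof id, local(ids)
foreach x of local ids {
    local med = .
    * Try the 2-digit group first, then fall back to the 1-digit group
    foreach g in sic2 sic1 {
        preserve
        keep if id == `x'
        ren year yearo
        keep id yearo `g'
        merge 1:m `g' using `all', keepusing(year wage) keep(match)
        * Firms in the same group dated within 5 years after the focal firm
        keep if inrange(year, yearo, yearo + 5)
        quietly count
        if r(N) >= 15 {
            _pctile wage, p(50)
            local med = r(r1)
        }
        restore
        * Stop at the first level that yields a large enough group
        if !missing(`med') continue, break
    }
    replace median = `med' if id == `x'
}
```

With real data, sic2 would be the actual 2-digit SIC variable. If neither level reaches 15 firms, median stays missing for that firm.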

Sent: Wednesday, 12 February 2014 at 16:45
From: "Jorge Eduardo Pérez Pérez" <>
To: "" <>
Subject: Re: st: Controlling for median via industry
* Get a dataset of firms
webuse abdata, clear
bys id: keep if _n==1
keep id ind year wage
* Give some variation to the years by adding a random integer, 1 to 10
set seed 100
replace year=year+1+int((10-1+1)*runiform())
sort ind id
* Save this original dataset
tempfile all
save `all'

* Create median variable to be filled
gen median=.
* Loop over firms
levelsof id, local(ids)
foreach x in `ids' {
    * Work on a copy so the full dataset can be restored afterwards
    preserve
    * Keep the firm
    keep if id==`x'
    ren year yearo
    keep id yearo ind
    * Merge with firms of same industry
    merge 1:m ind using `all', keepusing(year wage) keep(match)
    * Keep only firms dated in the focal firm's year or later
    * (to also impose the 5-year window from the original question,
    *  keep if inrange(year, yearo, yearo+5) could be used instead)
    keep if year>=yearo
    * Calculate median only if group has 15 firms or more
    quietly count
    if r(N)>=15 {
        _pctile wage, p(50)
        local med=r(r1)
    }
    else local med=.
    restore
    * Replace empty median variable with calculated median
    replace median=`med' if id==`x'
}

Jorge Eduardo Pérez Pérez
Graduate Student
Department of Economics
Brown University

On Wed, Feb 12, 2014 at 4:09 AM, Peter Miller <> wrote:
> Dear Stata-Community,
> I have a problem that I have not been able to figure out for quite a while. Maybe one of you has an idea how to solve it.
> Here is the setting: I have a large sample of firms, each identified by an ID, date, industry and earnings, for example. For each firm, I have to adjust the earnings by the median earnings of the industry the firm belongs to.
> For example: if firm X belongs to industry Y, then it should be something like this:
> Adjusted earnings = Earnings firm X - Median of Industry Y
> In addition, the groups used to compute the median should have at least 15 firms, are not allowed to be older than 5 years compared to firm X, and they are not allowed to be younger, i.e. they should not exist before firm X.
> I have been struggling with this for quite a while. Any help would be much appreciated!
> Best regards,
> Peter

