
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


RE: st: add up variable / quantile

From   "Scharnigg, Stan (Stud. SBE)" <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: st: add up variable / quantile
Date   Wed, 13 Apr 2011 21:42:06 +0200

I still have a problem with this.

My goal is to identify the top quantile for each year (6 years in total) for several different variables.

What I have done so far:
gen year=year(newdate) // create years instead of normal dates
egen gross_performance_years=total(gross_performance), by(accountID year) // create gross_performance per year
egen tag_year=tag(accountID year) // tag one observation (1) for each accountID-year combination

This all works fine. However, I now want to create the top quantile variable for each year, so I did the following:

gen topq_2000=gross_performance_years >=r(p75) & year==y(2000)

However, this doesn't work: I only get "0" as the value.

I also tried this:

generate topq_2000 = 0
replace topq_2000 = 1 if gross_performance_year >=r(p75) & tag_year==1 & year==2002

but without success.
Does anybody have any tips on how I can do this?
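
A note on the zeros: r(p75) is only filled in by an r-class command such as -summarize, detail-. If no such command has just been run, r(p75) evaluates to missing, and no nonmissing value is >= missing, so the comparison is always 0. One possible way around this, sketched under the assumption that the variables are named as above (p75_year and topq are hypothetical names, not from the original post), is to let -egen- compute the percentile within each year:

```stata
* Sketch only: p75_year and topq are illustrative names.
* 75th percentile of the yearly totals within each year,
* using one tagged observation per account-year
egen p75_year = pctile(gross_performance_years) if tag_year, p(75) by(year)

* 1 if the account-year total is at or above that year's 75th percentile,
* 0 otherwise; missing for untagged observations
gen byte topq = gross_performance_years >= p75_year if tag_year

* Alternative for a single year: r(p75) is defined right after -summarize, detail-
summarize gross_performance_years if tag_year & year == 2000, detail
gen byte topq_2000 = gross_performance_years >= r(p75) if tag_year & year == 2000
```

The first approach gives one indicator covering all six years at once; the second needs one -summarize- and one -gen- per year.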

From: [email protected] [[email protected]] on behalf of Nick Cox [[email protected]]
Sent: Tuesday, 29 March 2011 12:44
To: [email protected]
Subject: Re: st: add up variable / quantile

On Tue, Mar 29, 2011 at 11:30 AM, Scharnigg, Stan (Stud. SBE)
<[email protected]> wrote:
> Look at the help for -egen-. You want
> egen total_gross = total(gross), by(accountID)
> -----------
> Thank you, but I have some additional questions:
> A. I have data for 6 years (72 months). What if I want to add up the gross_performance for, e.g., the first 12 months? I guess I need to
> create different variables for different time periods, but I am not sure how to do that. One laborious option might be to create a different dataset
> for every period, but I guess there might also be another solution.
> accountID    gross_performance    date
> 1            -.1                  jan_00
> 1             .2                  febr_00
> 3             .1                  jan_00
> 3             .1                  febr_00

You can specify -if- on an -egen- command. Different summaries will in
general require new variables (but not new datasets).
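
As a minimal sketch of the -if- approach, assuming a numeric variable year has already been derived from the date (the variable names total_gross_2000 and total_gross_2001 are hypothetical):

```stata
* Sketch only: assumes a numeric variable -year- exists;
* total_gross_2000 and total_gross_2001 are illustrative names.
* Sum of gross_performance per account over observations from 2000 only;
* observations excluded by -if- receive missing
egen total_gross_2000 = total(gross_performance) if year == 2000, by(accountID)

* A second period needs its own variable, but not its own dataset
egen total_gross_2001 = total(gross_performance) if year == 2001, by(accountID)
```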

> B. If I use egen total_gross = total(gross_performance), by(accountID) I get many duplicate values. In some cases I have
> 72 duplicate values. What is the best way to delete the duplicate values, so that they won't show up if I do some tests. I don't
> think that renaming them to "0" is an option then.

You can use

egen tag = tag(accountID)

and then add

... if tag

to commands to ensure that each summary is used once only. You cannot
delete (in Stata -drop-) without losing other information.
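
For instance, assuming total_gross was created with the -egen- command quoted above, the tag could be used like this (a sketch, not the only way to do it):

```stata
* Tag exactly one observation per account with 1, all others with 0
egen tag = tag(accountID)

* Each account's total now enters the summary once,
* not once per monthly observation
summarize total_gross if tag
```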

Alternatively, -collapse- will yield a reduced dataset with one
observation for each account.
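
A minimal sketch of the -collapse- route, assuming the variable names from the thread; -preserve- and -restore- are added here so that the original monthly data come back afterwards:

```stata
* Keep a copy of the full dataset, since -collapse- replaces the data in memory
preserve

* One observation per account, holding the sum of gross_performance
collapse (sum) total_gross = gross_performance, by(accountID)
summarize total_gross

* Bring back the original monthly observations
restore
```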

*   For searches and help try:

