Re: st: add up variable / quantile


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: add up variable / quantile
Date   Thu, 14 Apr 2011 07:17:29 +0100

Good. I had in my mind "Why not use -xtile- anyway?", but sorry, that
didn't make it to my previous post.

Nick
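
For readers following the thread, here is a minimal sketch of the two approaches discussed; it is not the poster's actual code. It assumes the variables gross_performance_years, tag_year and year from the quoted message further down. The names p75_2000, topq_2000, quart_2000 and topq_2000_x are hypothetical, and the `///` line continuation assumes the code is run from a do-file.

```stata
* Approach 1: recompute the percentile immediately before using it, and
* copy r(p75) into a local macro before another command overwrites it.
summarize gross_performance_years if tag_year & year == 2000, detail
local p75_2000 = r(p75)

* Missing values count as larger than any number in Stata, so exclude
* them explicitly from the comparison.
generate byte topq_2000 = gross_performance_years >= `p75_2000' ///
    & !missing(gross_performance_years) if tag_year & year == 2000

* Approach 2: let -xtile- form four quantile groups for one year.
xtile quart_2000 = gross_performance_years if tag_year & year == 2000, nquantiles(4)
generate byte topq_2000_x = quart_2000 == 4 if !missing(quart_2000)
```

The two indicators may disagree for values exactly equal to the cutoff, because the first uses >= whereas -xtile- assigns such ties to the lower group.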

On Wed, Apr 13, 2011 at 10:26 PM, Scharnigg, Stan (Stud. SBE)
<s.scharnigg@student.maastrichtuniversity.nl> wrote:
> Thank you, that was indeed the problem. I solved it with the xtile command.
> ________________________________________
> From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Nick Cox [n.j.cox@durham.ac.uk]
> Sent: Wednesday 13 April 2011 22:07
> To: 'statalist@hsphsun2.harvard.edu'
> Subject: RE: st: add up variable / quantile
>
> I don't follow this well, but I see that you are including a comparison with r(p75), which is presumably thought to be left over from some previous command.
>
> However, r-class results are ephemeral and don't stick around forever. In particular, they get overwritten by -egen-, which does its own internal -count- at some point.
>
> If, however, r(p75) is undefined, then it's treated as missing and your comparison would be whether values were missing or greater than that, which is evidently not true for your data.
>
> I rarely understand why people who have quantitative data want to degrade it to indicator variables. That sounds most unstatistical to me.
>
> That aside, I think you need to reissue whatever command produced the r(p75) before you try to use it.
>
> Nick
> n.j.cox@durham.ac.uk
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Scharnigg, Stan (Stud. SBE)
> Sent: 13 April 2011 20:42
> To: statalist@hsphsun2.harvard.edu
> Subject: RE: st: add up variable / quantile
>
> I still have a problem with this.
>
> My goal is to identify the top quantile for each year (6 years in total) in different variables.
>
> What I did so far:
> -----------------------------------------------------------
> gen year=year(newdate) // create years instead of normal dates
> egen gross_performance_years=total(gross_performance), by(accountID year) // create gross_performance per year
> egen tag_year=tag(RekeningID year) // tag a 1 for each year per accountID
> -----------------------------------------------------------
>
> This all works fine. However, I now want to create the top quantile variable for each year, so I did the following:
>
> gen topq_2000=gross_performance_years >=r(p75) & year==y(2000)
>
> However, this doesn't work: I only get "0" as the value.
>
> I also tried this:
>
> generate topq_2000 = 0
> replace topq_2000 = 1 if gross_performance_year >=r(p75) & tag_year==1 & year==2002
>
> but without success.
> Does anybody have any tips on how I can do this?
>
> ________________________________________
> From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Nick Cox [njcoxstata@gmail.com]
> Sent: Tuesday 29 March 2011 12:44
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: add up variable / quantile
>
> On Tue, Mar 29, 2011 at 11:30 AM, Scharnigg, Stan (Stud. SBE)
> <s.scharnigg@student.maastrichtuniversity.nl> wrote:
>> Look at the help for -egen-. You want
>>
>> egen total_gross = total(gross), by(accountID)
>> -----------
>>
>> Thank you, but I have some additional questions:
>>
>> A. I have data for 6 years (72 months). What if I want to add up the gross_performance for, e.g., the first 12 months? I guess I need to
>> create different variables for different time periods, but I am not sure how to do that. One extensive possibility might be to create a different dataset
>> for every period, but I guess there might also be another solution.
>>
>> accountID   gross_performance   date
>> 1           -.1                 jan_00
>> 1            .2                 febr_00
>> 3            .1                 jan_00
>> 3            .1                 febr_00
>
> You can specify -if- on an -egen- command. Different summaries will in
> general require new variables (but not new datasets).
>
>> B. If I use egen total_gross = total(gross_performance), by(accountID), I get many duplicate values. In some cases I have
>> 72 duplicate values. What is the best way to delete the duplicate values, so that they won't show up if I do some tests? I don't
>> think that recoding them to "0" is an option then.
>
> You can use
>
> egen tag = tag(accountID)
>
> and then add
>
> ... if tag
>
> to commands to ensure that each summary is used once only. You cannot
> delete (in Stata -drop-) without losing other information.
>
> Alternatively, -collapse- will yield a reduced dataset with one
> observation for each account.
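
The three suggestions above (-if- on -egen-, tagging, and -collapse-) might be combined roughly as follows. This is a sketch under assumptions: it presumes a numeric year variable such as the one created with year(newdate) later in the thread, and total_gross_2000, tag_2000 and total_gross are hypothetical names.

```stata
* Total per account for one period only; observations outside the
* period get a missing value in the new variable.
egen total_gross_2000 = total(gross_performance) if year == 2000, by(accountID)

* Tag exactly one observation per account within that period, so that
* each total enters a summary once.
egen tag_2000 = tag(accountID) if year == 2000
summarize total_gross_2000 if tag_2000

* Alternative: reduce to one observation per account and year.
* -preserve- and -restore- keep the original data intact.
preserve
collapse (sum) total_gross = gross_performance, by(accountID year)
summarize total_gross
restore
```

Tagging within the same -if- condition matters here: a tag computed over all years could land on an observation where total_gross_2000 is missing.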
