



RE: st: ranking variables on the basis of total values of observations


From   tashi lama <ltashi32@hotmail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: ranking variables on the basis of total values of observations
Date   Thu, 29 Mar 2012 14:02:41 +0000

Dear Nick, 



Yes, it does look like a rowranks problem, but I think it is a little deeper because I have time-series data and I want to be able to specify a time range. In the dataset below, rowranks will give me a rank for each row, but that is not exactly what I want. I want to rank not only for 05jan2010, 06jan2010 and so on, but also for a range such as 05jan2010-08jan2010. To do this, I have to sum the values from 05jan2010 to 08jan2010 for each of var2, var3 and var4, as shown below. That would leave only one row, in which each value is a sum, and I could then use rowranks. So,



date                   totalvar2  totalvar3  totalvar4
05jan2010-08jan2010           10         10         22

 



With that single row, I could use rowranks. Actually, an analogy might make things clearer. When you draw a pie chart, you can do something like



graph pie var1 var2 if tin(05jan2010, 08jan2010), plabel(_all percent)

Stata then sums the values of var1 and var2 over that period and computes the percentages. Judging from its syntax, rowranks doesn't allow an if qualifier.
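
The sum-over-a-window-then-rank idea described above could be sketched roughly as follows, using only built-in commands rather than rowranks. This is an illustration under stated assumptions, not a solution given in the thread: it assumes the toy data posted earlier (var2, var3, var4), and that ranking from the lowest total (rank 1), with tied totals sharing a rank, is what is wanted. The names sdate, date and rank are introduced only for this example; v1 and _varname are created by xpose.

```stata
* Toy data from earlier in the thread
clear
input str9 sdate var2 var3 var4
"05jan2010" 3 1 7
"06jan2010" 2 3 6
"07jan2010" 4 4 5
"08jan2010" 1 2 4
"09jan2010" 5 8 3
"10jan2010" 8 9 2
"11jan2010" 4 6 3
"12jan2010" 3 3 8
"13jan2010" 1 2 3
end

* Turn the string into a Stata daily date (assumed variable name: date)
gen date = date(sdate, "DMY")
format date %td

* Sum each variable over the chosen window; one observation remains
collapse (sum) var2 var3 var4 if inrange(date, td(05jan2010), td(08jan2010))
list

* Transpose so that each original variable becomes an observation
* (xpose stores the totals in v1 and the old variable names in _varname)
xpose, clear varname

* Rank the totals, lowest = 1; track leaves tied totals on the same rank
egen rank = rank(v1), track
list _varname v1 rank
```

Because collapse replaces the data in memory, the block would need to be wrapped in preserve and restore if the original observations are needed afterwards. Changing the two dates inside inrange() changes the window; for 05jan2010-08jan2010 the totals should match the 10, 10 and 22 shown above.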

 

Let me know if I can make this any clearer.



Thank you; I really appreciate your time and wisdom,

Tashi

> Date: Wed, 28 Mar 2012 23:22:15 +0100
> Subject: Re: st: ranking variables on the basis of total values of observations
> From: njcoxstata@gmail.com
> To: statalist@hsphsun2.harvard.edu
> 
> If I understand your problem correctly, it is that of -rowranks-.
> 
> See also for a review in that territory
> 
> SJ-9-1 pr0046 . . . . . . . . . . . . . . . . . . . Speaking Stata: Rowwise
> (help rowsort, rowranks if installed) . . . . . . . . . . . N. J. Cox
> Q1/09 SJ 9(1):137--157
> shows how to exploit functions, egen functions, and Mata
> for working rowwise; rowsort and rowranks are introduced
> 
> This is now accessible to all at
> http://www.stata-journal.com/sjpdf.html?articlenum=pr0046
> under the Stata Journal's three-year window.
> 
> Nick
> 
> On Wed, Mar 28, 2012 at 9:44 PM, tashi lama <ltashi32@hotmail.com> wrote:
> > Hello Nick,
> >
> > My apologies, I meant rowranks, not rowsum. So, let's see if I can simplify it...
> >
> >
> >
> > Essentially, I want to be able to rank var2 in the following dataset by the total of its values from 05jan2010 to 09jan2010. So I would use generate with sum() to get a running sum. That would look like
> >
> >
> >
> >
> >
> >     date       var2  var3  var4  totalv~2  totalv~3  totalv~4
> >  1. 05jan2010     3     1     7         3         1         7
> >  2. 06jan2010     2     3     6         5         4        13
> >  3. 07jan2010     4     4     5         9         8        18
> >  4. 08jan2010     1     2     4        10        10        22
> >  5. 09jan2010     5     8     3        15        18        25
> >  6. 10jan2010     8     9     2        23        27        27
> >  7. 11jan2010     4     6     3        27        33        30
> >  8. 12jan2010     3     3     8        30        36        38
> >  9. 13jan2010     1     2     3        31        38        41
> >
> >
> >
> >
> > So, I would look at the observation for 09jan2010. As you can see, var2 has 5, var3 has 8, and so on. Since 5 is the second lowest in that row after 3, I would say var2 is 2nd. Now, I have to be able to code this, and that is exactly where I think you guys can help me or give me a lead.
> >
> >
> >
> > Thank you, and let me know if I can make this clearer.
> >
> >
> >
> > Tashi
> >
> >> Date: Wed, 28 Mar 2012 21:17:19 +0100
> >> Subject: Re: st: ranking variables on the basis of total values of observations
> >> From: njcoxstata@gmail.com
> >> To: statalist@hsphsun2.harvard.edu
> >>
> >> I don't know what -rowsum- is here.
> >>
> >> Thanks, but this doesn't get me much closer. Of course, if other
> >> people can work out what you want, they should chip in.
> >>
> >> Please _show_ what you want it changed to, as I asked earlier. No
> >> word description, just what the dataset would look like after changes.
> >>
> >> Nick
> >>
> >> On Wed, Mar 28, 2012 at 8:49 PM, tashi lama <ltashi32@hotmail.com> wrote:
> >>
> >> >     date       var2  var3  var4
> >> >  1. 05jan2010     3     1     7
> >> >  2. 06jan2010     2     3     6
> >> >  3. 07jan2010     4     4     5
> >> >  4. 08jan2010     1     2     4
> >> >  5. 09jan2010     5     8     3
> >> >  6. 10jan2010     8     9     2
> >> >  7. 11jan2010     4     6     3
> >> >  8. 12jan2010     3     3     8
> >> >  9. 13jan2010     1     2     3
> >> >
> >>
> >> > Say I would like to rank var2 over a certain period, for example 05jan2010-09jan2010. I would have Stata add up the values from 05jan2010 to 09jan2010 for each variable and then rank var2 as 1st, 2nd or 3rd. I looked at rowsum, but I don't think it will help.
> >>
> >> >> Date: Wed, 28 Mar 2012 20:34:58 +0100
> >> >> Subject: Re: st: ranking variables on the basis of total values of observations
> >> >> From: njcoxstata@gmail.com
> >> >> To: statalist@hsphsun2.harvard.edu
> >> >>
> >> >> As far as I can see this could mean several things. I am not clear
> >> >> that when you say "rank" you don't mean "order" instead.
> >> >>
> >> >> I don't think you want -rowranks- (SSC).
> >> >>
> >> >> Why not just give us a toy dataset, with say 5 observations and 5
> >> >> variables, and what you want to change it to? Then it should be easier
> >> >> to see what you want.
> >> >>
> >> >> By the way, adding lots of blank lines just makes your posts more
> >> >> difficult to read. I've edited what came in.
> >> >>
> >> >> Nick
> >> >>
> >> >> On Wed, Mar 28, 2012 at 8:22 PM, tashi lama <ltashi32@hotmail.com> wrote:
> >> >> >
> >> >> > I have a list of variables, and I would like to rank the variables (or any given variable, for that matter) on the basis of the total values of their observations. I thought of approaching this problem in the following way.
> >> >> >
> >> >> > 1. Find the running sum of each variable using generate, or the total using egen, although I think I would prefer generate.
> >> >> >
> >> >> > say there are var1 var2 var3
> >> >> >
> >> >> > gen tvar1=sum(var1)
> >> >> > gen tvar2=sum(var2)
> >> >> > gen tvar3=sum(var3)
> >> >> >
> >> >> > 2. Then compare tvar1, tvar2 and tvar3 using if conditions.
> >> >> >
> >> >> > Once this is done, I would like to extend it so that I can rank those variables over a given period of time. This is essentially why I think generate with sum() is better for this problem: it gives a running sum.
> >> >>
> >> >> > say date var1 var2 var3
> >> >> >
> >> >> > I would like to be able to rank variables say for 01jan2011 to 01feb2011 and so on.
> >> >> >
> >> >> > Has anyone worked on this kind of problem before, or does anyone have any ideas or thoughts about it? Any help or lead would be highly appreciated. I saw that there is a Stata module, rowranks, that calculates ranks within a row, but I don't know how it could be useful to me. Most probably not....
> >>
> >> *
> >> * For searches and help try:
> >> * http://www.stata.com/help.cgi?search
> >> * http://www.stata.com/support/statalist/faq
> >> * http://www.ats.ucla.edu/stat/stata/

