Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Austin Nichols <austinnichols@gmail.com> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | Re: st: Log of the mean vs mean of the log |
Date | Wed, 23 Apr 2014 08:05:02 -0400 |
Estrella Gomez <estrellastata@gmail.com>:

Neither sounds right to me. You want to take the sum over many songs for one country with few downloads, ranking say 200th out of many countries on all those songs, and take the log of the sum? Or compute the sum of many ln(200) values? What interpretation would this variable have--movements up or down in percentage terms in rank of downloads is some kind of measure of changes in intrinsic propensity to engage in internet trade?

I would think you could get much more interesting information by preserving the data at the song level, because it could inform who are likely to be trading partners, if you have country of origin of the song (at least language can play a role, if not other cultural factors). Also, the number of downloads is no doubt more informative than rank.

If you are committed to using ranks instead of numbers, I would think computing ranks from 0 to 1, or 1/200 to 1-1/200 for 200 countries, as "percentile" scores, would be better than raw rank. For that kind of rank, logit is a more natural transformation than log, but I doubt any transformation is required here--just keep it on the scale from 0 to 1.

On Wed, Apr 23, 2014 at 4:51 AM, Estrella Gomez <estrellastata@gmail.com> wrote:
> Hi,
>
> I have a variable that is the number of downloads in a country at the
> song level, so each observation is song & artist & number of downloads
> & country & rank. I want to aggregate this at the country level and
> introduce the sum of the ranks as dependent variable in a gravity
> equation. I have aggregated taking the sum of the ranks and then the
> logarithm of this sum. My question is: is this correct or should I
> instead take first the logarithm of the ranks at the song level and
> then take the sum of this logarithms?
> I am not very clear on the
> difference between the sum of the log ranks and the log of the sum of
> the ranks
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
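
To make the difference between the two aggregates in this thread concrete, here is a minimal Python sketch (not Stata, and not anything posted by either author). The rank values are made up for illustration, and `percentile_score` is a hypothetical helper: it is one possible reading of "1/200 to 1-1/200 for 200 countries", namely a linear rescaling of ranks 1..n onto that interval, with rank 1 mapped to the low end by assumption.

```python
import math

# Hypothetical ranks of one country across five songs (illustrative values only).
ranks = [200, 150, 180, 3, 60]

# Log of the sum: ln(r1 + r2 + ... + rn).
log_of_sum = math.log(sum(ranks))

# Sum of the logs: ln(r1) + ... + ln(rn), which equals ln(r1 * r2 * ... * rn).
sum_of_logs = sum(math.log(r) for r in ranks)


def percentile_score(rank, n_countries):
    """Linearly rescale a rank in 1..n_countries onto [1/n, 1 - 1/n].

    Assumption: rank 1 maps to 1/n and rank n maps to 1 - 1/n; flip the
    orientation if a higher score should mean more downloads.
    """
    n = n_countries
    return 1 / n + (rank - 1) * (1 - 2 / n) / (n - 1)


def logit(p):
    """Log-odds of a score strictly between 0 and 1."""
    return math.log(p / (1 - p))


if __name__ == "__main__":
    print("log of sum :", log_of_sum)
    print("sum of logs:", sum_of_logs)
    for r in ranks:
        p = percentile_score(r, 200)
        print(r, p, logit(p))
```

The sum of the logs is the log of the product of the ranks, so it grows with the number of songs being summed, while the log of the sum is dominated by the largest ranks. The rescaled scores stay strictly inside (0, 1), which is what makes the logit defined at both ends of the ranking.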