Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Estrella Gomez <estrellastata@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Log of the mean vs mean of the log |
Date | Wed, 23 Apr 2014 14:40:19 +0200 |
ranks, that is, the top 300 songs per country, and I want to use this
(inverted) variable as a proxy for sales (downloads), because I don't
have real downloads. I have already done the analysis at the song
level, but I would also like to aggregate at the country level to see
the total cross-border sales per country. That's why I would like to
sum all the ranks, because I understand that the sum of all (inverted)
ranks would be a proxy for total sales from one country to another.
Then I use this as the dependent variable in a gravity equation, which
requires the use of logarithms, but I'm not clear whether I should
first take the logarithms of the ranks and then sum all the logs (by
country), or first sum all the ranks (by country) and then take the
logarithm of this sum.

Thank you very much,
Estrella

2014-04-23 14:05 GMT+02:00 Austin Nichols <austinnichols@gmail.com>:
> Estrella Gomez <estrellastata@gmail.com>:
>
> Neither sounds right to me. You want to take the sum over many songs
> for one country with few downloads, ranking say 200th out of many
> countries on all those songs, and take the log of the sum? Or compute
> the sum of many ln(200) values? What interpretation would this
> variable have--movements up or down in percentage terms in rank of
> downloads is some kind of measure of changes in intrinsic propensity
> to engage in internet trade? I would think you could get much more
> interesting information by preserving the data at the song level,
> because it could inform who are likely to be trading partners, if you
> have country of origin of the song (at least language can play a role,
> if not other cultural factors). Also, number of downloads is no doubt
> more informative than rank. If you are committed to using ranks
> instead of numbers, I would think computing ranks from 0 to 1, or
> 1/200 to 1-1/200 for 200 countries, as "percentile" scores, would be
> better than raw rank.
> For that kind of rank, logit is a more natural
> transformation than log, but I doubt any transformation is required
> here--just keep it on the scale from 0 to 1.
>
> On Wed, Apr 23, 2014 at 4:51 AM, Estrella Gomez <estrellastata@gmail.com> wrote:
>> Hi,
>>
>> I have a variable that is the number of downloads in a country at the
>> song level, so each observation is song & artist & number of downloads
>> & country & rank. I want to aggregate this at the country level and
>> introduce the sum of the ranks as dependent variable in a gravity
>> equation. I have aggregated taking the sum of the ranks and then the
>> logarithm of this sum. My question is: is this correct or should I
>> instead take first the logarithm of the ranks at the song level and
>> then take the sum of these logarithms? I am not very clear on the
>> difference between the sum of the log ranks and the log of the sum of
>> the ranks
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
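
For readers comparing the two aggregates asked about in this thread, the following is a minimal Python sketch, not code from either poster. It assumes made-up inverted ranks for two hypothetical countries, "A" and "B". All names (inverted_ranks, log_of_sum, sum_of_logs, percentile_score, logit) are illustrative. The rank/(n+1) scaling is one common way to keep percentile scores strictly between 0 and 1; it is an assumption, not necessarily the exact scaling suggested above.

```python
import math

# Made-up inverted ranks (higher = more popular), purely for illustration.
inverted_ranks = {
    "A": [300, 1, 1],      # one top hit plus two marginal songs
    "B": [100, 100, 100],  # three mid-chart songs
}


def log_of_sum(values):
    """Log of the country total: dominated by the largest entries."""
    return math.log(sum(values))


def sum_of_logs(values):
    """Sum of song-level logs: equals the log of the PRODUCT of the values."""
    return sum(math.log(v) for v in values)


def percentile_score(rank, n):
    """Map a rank in 1..n into the open interval (0, 1) (assumed scaling)."""
    return rank / (n + 1)


def logit(p):
    """Log-odds transform, defined only for 0 < p < 1."""
    return math.log(p / (1 - p))


if __name__ == "__main__":
    for country, values in inverted_ranks.items():
        print(country, log_of_sum(values), sum_of_logs(values))
    # A percentile score in the middle of the scale has logit zero.
    print(logit(percentile_score(100, 199)))
```

With these made-up numbers the two aggregates order the countries differently: the log of the sum puts A slightly above B, while the sum of the logs puts B well above A. The sum of logs also grows with the number of songs per country, since it is the log of a product rather than of a total.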