
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


Re: st: Log of the mean vs mean of the log

From   Estrella Gomez <[email protected]>
To   [email protected]
Subject   Re: st: Log of the mean vs mean of the log
Date   Wed, 23 Apr 2014 14:40:19 +0200

ranks, that is, the top 300 songs for each country, and I want to use
this (inverted) variable as a proxy for sales (downloads), because I
don't have real downloads. I have already done the analysis at the
song level, but I would also like to aggregate at the country level to
see the total cross-border sales per country. That's why I would like
to sum all the ranks, because I understand that the sum of all
(inverted) ranks would be a proxy for total sales from one country to
another. I then use this as the dependent variable in a gravity equation,
which requires the use of logarithms, but I'm not clear whether I should
first take the logarithms of the ranks and then sum all the logs (by
country), or first sum all the ranks (by country) and then take
the logarithm of this sum.

Thank you very much,

2014-04-23 14:05 GMT+02:00 Austin Nichols <[email protected]>:
> Estrella Gomez <[email protected]>:
> Neither sounds right to me.  You want to take the sum over many songs
> for one country with few downloads, ranking say 200th out of many
> countries on all those songs, and take the log of the sum?  Or compute
> the sum of many ln(200) values? What interpretation would this
> variable have--movements up or down in percentage terms in rank of
> downloads is some kind of measure of changes in intrinsic propensity
> to engage in internet trade? I would think you could get much more
> interesting information by preserving the data at the song level,
> because it could inform who are likely to be trading partners, if you
> have country of origin of the song (at least language can play a role,
> if not other cultural factors). Also, the number of downloads is no doubt
> more informative than rank. If you are committed to using ranks
> instead of numbers, I would think computing ranks from 0 to 1, or
> 1/200 to 1-1/200 for 200 countries, as "percentile" scores, would be
> better than raw rank.  For that kind of rank, logit is a more natural
> transformation than log, but I doubt any transformation is required
> here--just keep it on the scale from 0 to 1.
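
To illustrate the percentile-score suggestion above, here is a minimal
Python sketch. The function names and the (rank - 0.5)/n convention are
assumptions made for illustration, not something specified in the thread.
That convention keeps every score strictly between 0 and 1, so the logit
stays finite; its endpoints differ slightly from the 1/200 and 1-1/200
mentioned above.

```python
import math


def percentile_score(rank, n):
    """Map a rank in 1..n to a score strictly inside (0, 1).

    Assumed convention: (rank - 0.5) / n. Other conventions, such as
    rank / (n + 1), would serve the same purpose.
    """
    if not 1 <= rank <= n:
        raise ValueError("rank must lie between 1 and n")
    return (rank - 0.5) / n


def logit(p):
    """Log-odds transform, defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


if __name__ == "__main__":
    n_countries = 200  # the figure used as an example in the message above
    for rank in (1, 100, 200):
        score = percentile_score(rank, n_countries)
        print(rank, score, logit(score))
```

The scores preserve the ordering of the raw ranks; if a higher value
should mean more downloads, 1 - score reverses the direction.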
> On Wed, Apr 23, 2014 at 4:51 AM, Estrella Gomez <[email protected]> wrote:
>> Hi,
>> I have a variable that is the number of downloads in a country at the
>> song level, so each observation is song & artist & number of downloads
>> & country & rank. I want to aggregate this at the country level and
>> introduce the sum of the ranks as dependent variable in a gravity
>> equation. I have aggregated taking the sum of the ranks and then the
>> logarithm of this sum. My question is: is this correct or should I
>> instead take first the logarithm of the ranks at the song level and
>> then take the sum of these logarithms? I am not very clear on the
>> difference between the sum of the log ranks and the log of the sum of
>> the ranks.
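
The difference asked about here can be seen in a small numeric sketch in
Python. The values are made up for illustration and the function names
are hypothetical. The sum of the logs equals the log of the product, so
it is a different quantity from the log of the sum, and it grows with
the number of songs included.

```python
import math


def log_of_sum(values):
    """Aggregate first, then transform: ln(x1 + x2 + ... + xk)."""
    return math.log(sum(values))


def sum_of_logs(values):
    """Transform first, then aggregate: ln(x1) + ... + ln(xk) = ln(x1 * ... * xk)."""
    return sum(math.log(v) for v in values)


if __name__ == "__main__":
    # Hypothetical inverted ranks for three songs in one country pair.
    inverted_ranks = [2, 3, 4]
    print(log_of_sum(inverted_ranks))   # ln(9)
    print(sum_of_logs(inverted_ranks))  # ln(24)
```

The two agree only when there is a single value; with several values,
the log of the sum tracks the total, while the sum of the logs tracks
the product.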
