Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Log of the mean vs mean of the log
From: Austin Nichols <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: Log of the mean vs mean of the log
Date: Wed, 23 Apr 2014 08:05:02 -0400
Estrella Gomez <[email protected]>:
Neither sounds right to me. You want to take the sum over many songs
for one country with few downloads, ranking say 200th out of many
countries on all those songs, and take the log of the sum? Or compute
the sum of many ln(200) values? What interpretation would this
variable have? Are movements up or down, in percentage terms, in the
rank of downloads meant as some measure of changes in an intrinsic
propensity to engage in internet trade? I would think you could get much more
interesting information by preserving the data at the song level,
because it could inform who are likely to be trading partners, if you
have the country of origin of the song (at least language can play a role,
if not other cultural factors). Also, the number of downloads is no doubt
more informative than rank. If you are committed to using ranks
instead of numbers, I would think computing ranks from 0 to 1, or
1/200 to 1-1/200 for 200 countries, as "percentile" scores, would be
better than raw rank. For that kind of rank, logit is a more natural
transformation than log, but I doubt any transformation is required
here--just keep it on the scale from 0 to 1.
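
A minimal sketch of the percentile-score idea above, written in Python rather than Stata so that it is self-contained. The function names and the (n - rank + 0.5)/n convention are assumptions for illustration and do not come from the post; any mapping that keeps scores strictly inside (0, 1) would serve. The logit is shown only for comparison, since no transformation may be needed.

```python
import math

def percentile_scores(ranks, n_countries):
    # Hypothetical helper: map raw ranks 1..n_countries to scores strictly
    # between 0 and 1. Assumed convention: rank 1 (most downloads) gets the
    # highest score, and the 0.5 offset keeps scores away from 0 and 1.
    return [(n_countries - r + 0.5) / n_countries for r in ranks]

def logit(p):
    # Log-odds of a score in (0, 1); undefined at exactly 0 or 1.
    return math.log(p / (1.0 - p))

ranks = [1, 50, 100, 200]          # made-up ranks among 200 countries
scores = percentile_scores(ranks, 200)
for r, s in zip(ranks, scores):
    print(r, round(s, 4), round(logit(s), 3))
```

The scores stay on a bounded 0 to 1 scale regardless of how many countries are ranked, which is what makes them easier to compare than raw ranks.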
On Wed, Apr 23, 2014 at 4:51 AM, Estrella Gomez <[email protected]> wrote:
> Hi,
>
> I have a variable that is the number of downloads in a country at the
> song level, so each observation is song & artist & number of downloads
> & country & rank. I want to aggregate this at the country level and
> introduce the sum of the ranks as dependent variable in a gravity
> equation. I have aggregated taking the sum of the ranks and then the
> logarithm of this sum. My question is: is this correct or should I
> instead take first the logarithm of the ranks at the song level and
> then take the sum of these logarithms? I am not very clear on the
> difference between the sum of the log ranks and the log of the sum of
> the ranks.
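
The difference asked about in the quoted question can be checked numerically. The sketch below is in Python, with made-up ranks and hypothetical function names; it only illustrates that the sum of logs equals the log of the product of the ranks, while the log of the sum is a different quantity.

```python
import math

def log_of_sum(ranks):
    # ln(r1 + r2 + ... + rk)
    return math.log(sum(ranks))

def sum_of_logs(ranks):
    # ln(r1) + ... + ln(rk), which equals ln(r1 * r2 * ... * rk)
    return sum(math.log(r) for r in ranks)

few_songs = [200, 200]             # made-up: two songs, country ranked 200th
many_songs = [200] * 10            # made-up: ten songs, same rank each time

for label, ranks in [("2 songs", few_songs), ("10 songs", many_songs)]:
    print(label, round(log_of_sum(ranks), 3), round(sum_of_logs(ranks), 3))
```

With k songs all at rank r, the sum of logs is k*ln(r) and the log of the sum is ln(k) + ln(r), so both depend on how many songs a country has and not only on its ranks.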