Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Estrella Gomez <[email protected]>
To: [email protected]
Subject: Re: st: Log of the mean vs mean of the log
Date: Wed, 23 Apr 2014 16:23:51 +0200

Hi

We would like to include downloads as the dependent variable, but the
problem is that we don't have this information, so that's why we use
rank as a proxy for downloads (or sales). Distance is supposed to
capture not only the physical effect but mainly the cultural distance
between countries.

How should I implement this log link in Stata?

Thank you very much,
Estrella

2014-04-23 15:29 GMT+02:00 Austin Nichols <[email protected]>:
> Estrella Gomez <[email protected]>:
>
> I misread your first post; I thought you meant to include the
> downloads as an explanatory variable in a gravity model (which seemed
> an interesting idea, as that might be a proxy for levels of trade that
> would obtain without respect to distance between countries). The
> gravity model would then be estimated using -glm- with a log link, not
> by taking logs and then running a linear regression. See e.g. refs in
> http://www.stata.com/meeting/boston10/boston10_nichols.pdf
>
> If downloads are your depvar, then I can't see how a gravity model is
> appropriate, since distance in the traditional sense is irrelevant for
> song downloads.
>
> I cannot see why you want ranks at all, but perhaps there was more
> information at the start of this post that got cut off:
>
> On Wed, Apr 23, 2014 at 8:40 AM, Estrella Gomez <[email protected]> wrote:
>> ranks, that is, the top 300 songs per each country, and I want to use
>> this (inverted) variable as a proxy for sales (downloads), because I
>> don't have real downloads. I have already done the analysis at the
>> song level, but I would also like to aggregate at the country level to
>> see the total cross border sales per country. That's why I would like
>> to sum all the ranks, because I understand that the sum of all
>> (inverted) ranks would be a proxy for total sales from a country to
>> another. Then I use this as dependent variable in a gravity equation,
>> which requires the use of logarithms, but I'm not clear whether I
>> should first take the logarithms of the ranks and then sum all the
>> logs (by country), or first sum all the ranks (by country) and then
>> take the logarithm of this sum
>>
>> Thank you very much,
>> Estrella
>>
>> 2014-04-23 14:05 GMT+02:00 Austin Nichols <[email protected]>:
>>> Estrella Gomez <[email protected]>:
>>>
>>> Neither sounds right to me. You want to take the sum over many songs
>>> for one country with few downloads, ranking say 200th out of many
>>> countries on all those songs, and take the log of the sum? Or compute
>>> the sum of many ln(200) values? What interpretation would this
>>> variable have--movements up or down in percentage terms in rank of
>>> downloads is some kind of measure of changes in intrinsic propensity
>>> to engage in internet trade? I would think you could get much more
>>> interesting information by preserving the data at the song level,
>>> because it could inform who are likely to be trading partners, if you
>>> have country of origin of the song (at least language can play a role,
>>> if not other cultural factors). Also, numbers of downloads is no doubt
>>> more informative than rank. If you are committed to using ranks
>>> instead of numbers, I would think computing ranks from 0 to 1, or
>>> 1/200 to 1-1/200 for 200 countries, as "percentile" scores, would be
>>> better than raw rank. For that kind of rank, logit is a more natural
>>> transformation than log, but I doubt any transformation is required
>>> here--just keep it on the scale from 0 to 1.
>>>
>>> On Wed, Apr 23, 2014 at 4:51 AM, Estrella Gomez <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> I have a variable that is the number of downloads in a country at the
>>>> song level, so each observation is song & artist & number of downloads
>>>> & country & rank. I want to aggregate this at the country level and
>>>> introduce the sum of the ranks as dependent variable in a gravity
>>>> equation. I have aggregated taking the sum of the ranks and then the
>>>> logarithm of this sum. My question is: is this correct or should I
>>>> instead take first the logarithm of the ranks at the song level and
>>>> then take the sum of these logarithms? I am not very clear on the
>>>> difference between the sum of the log ranks and the log of the sum of
>>>> the ranks

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
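
On the arithmetic question raised in the thread (sum of the logs versus log of the sum), a small numeric sketch may help. It is written in Python only to show the numbers and is not Stata code. The rank values are made up, and the `rank / (n + 1)` percentile convention is an assumption chosen to keep scores strictly between 0 and 1; it is not necessarily the exact scaling meant in the quoted message.

```python
import math

# Hypothetical inverted ranks for one country pair (illustrative values only,
# e.g. from a top-300 chart where the best-ranked song gets 300).
inverted_ranks = [300, 250, 120, 40, 5]
n_songs = len(inverted_ranks)

# Log of the sum: the log of an aggregate "total" proxy.
log_of_sum = math.log(sum(inverted_ranks))

# Sum of the logs: identical to the log of the PRODUCT of the ranks,
# which is a different quantity from the log of the total.
sum_of_logs = sum(math.log(r) for r in inverted_ranks)
log_of_product = math.log(math.prod(inverted_ranks))

# The same contrast in terms of means (the subject line of the thread):
# the mean of the logs never exceeds the log of the mean (Jensen's inequality).
log_of_mean = math.log(sum(inverted_ranks) / n_songs)
mean_of_logs = sum_of_logs / n_songs


def percentile_score(rank, n):
    """Map a rank in 1..n to a score strictly inside (0, 1).

    Assumed convention: rank / (n + 1).
    """
    return rank / (n + 1)


def logit(p):
    """Log-odds transform for a score p with 0 < p < 1."""
    return math.log(p / (1 - p))


print("log of sum  :", log_of_sum)
print("sum of logs :", sum_of_logs)
print("log of mean :", log_of_mean)
print("mean of logs:", mean_of_logs)
print("logit of percentile score for rank 150 of 200:",
      logit(percentile_score(150, 200)))
```

The two aggregates answer different questions: the log of the sum tracks the size of the total, while the sum of the logs tracks the product (equivalently, the number of songs times the log of the geometric mean), so they are not interchangeable as a dependent variable.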

**References**:
- **Re: st: Log of the mean vs mean of the log** *From:* Austin Nichols <[email protected]>
