

From: Gabriel Nelson <lgabrielnelson@gmail.com>
To: statalist <statalist@hsphsun2.harvard.edu>
Subject: Re: st: ladder question for right-skewed variable
Date: Fri, 26 Apr 2013 15:05:22 -0700

David and Nick,

Thanks very much for your very helpful suggestions. I do have data on population, and am considering making the variable a ratio of number displaced/total population of the municipality.

1010 is the number of municipalities for which I have data (out of a total 1,119). It is surprising that there are no counts of 0 for people displaced by violence, but I have gone back and double-checked the data and there are no municipalities that report 0 people displaced.

I am avoiding a Poisson or negative binomial model for now because there is likely to be clustering geographically in the number of people displaced, since violent episodes were localized. I am collecting data on the number of attacks by guerilla forces and the number of attacks by paramilitary groups right now, which will hopefully help to account for structure in the data.

Thank you again for your insightful comments.

Gabriel

On Fri, Apr 26, 2013 at 2:35 PM, David Hoaglin <dchoaglin@gmail.com> wrote:
> Gabriel,
>
> I second Nick's advice to abandon -ladder-. Choosing a transformation
> involves a fair amount of judgment, and I would not delegate the
> choice to an automated process. I also have some other comments.
>
> The number of people who reported being displaced by violence is a
> count. Sometimes the square root is a reasonable transformation for
> counts, but large counts often need a logarithm.
>
> As Nick suggested, however, a Poisson model may be appropriate or
> perhaps a negative binomial model. Before I tried such models,
> however, I would want to know why your data did not include any zeros.
> Is 1010 the total number of municipalities in Colombia, or do your
> data include only municipalities in which at least 1 person reported
> being displaced? Either a Poisson distribution or a negative binomial
> distribution would have positive probability of producing some zeros.
> If zeros have been excluded, the model would have to handle that
> feature.
>
> Another consideration, perhaps important, is that the usual Poisson
> and negative binomial models assume that the occurrences are
> independent. The nature of your data suggests that some types of
> clustering are likely to be involved. An episode of violence is
> likely to cause a number of people to be displaced simultaneously, and
> it might affect nearby municipalities similarly.
>
> Yet another feature of the data is the size of the municipality. The
> number of people displaced might be related to the population of the
> municipality. Do you have data on the populations?
>
> You said that the data do not show bimodal structure, but I could
> easily imagine that they represent a mixture of distributions, maybe
> having several components. Do you have other variables that might
> help to account for structure in the data (geographic and otherwise)?
>
> I am probably making your analysis more complicated, but I hope I am
> making it more realistic.
>
> David Hoaglin
>
>
> On Fri, Apr 26, 2013 at 4:57 PM, Gabriel Nelson
> <lgabrielnelson@gmail.com> wrote:
>> Thanks very much for your suggestions Nick. It makes sense that the
>> problem might lie within -sktest-. I won't worry any more about this
>> problem and just proceed with the qnorm command, as you suggested.
>> Thanks again.
>>
>> Gabriel
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
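
Editorial illustration (not part of the original message): David's point about zeros can be made concrete with a small sketch. An ordinary Poisson distribution always gives some probability to a count of 0, so if zeros cannot occur in the data, one option is a zero-truncated Poisson, which renormalises the probabilities over counts of 1 or more. The function names and the mean of 2.0 below are hypothetical, chosen only for illustration. The sketch is written in Python rather than Stata so that it is self-contained, and it does not address the clustering concern raised in the thread.

```python
import math


def poisson_pmf(k, mu):
    """P(Y = k) for a Poisson distribution with mean mu (computed on the log scale)."""
    if k < 0:
        return 0.0
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def zero_truncated_poisson_pmf(k, mu):
    """P(Y = k | Y >= 1): the Poisson pmf renormalised after excluding zero."""
    if k < 1:
        return 0.0
    return poisson_pmf(k, mu) / (1.0 - poisson_pmf(0, mu))


# Hypothetical mean count, for illustration only.
mu = 2.0
p_zero = poisson_pmf(0, mu)  # equals exp(-mu), which is strictly positive
truncated_total = sum(zero_truncated_poisson_pmf(k, mu) for k in range(1, 200))

print(f"P(Y = 0) under Poisson(mean={mu}): {p_zero:.4f}")
print(f"Zero-truncated probabilities sum to: {truncated_total:.6f}")
```

With a large mean count per municipality, the probability of a zero under an untruncated Poisson becomes very small, so the absence of zeros in the data may matter less in practice than it would for small counts.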

