Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <n.j.cox@durham.ac.uk>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: testing for bimodality in survey data
Date: Thu, 10 Nov 2011 08:37:40 +0000

Joerg has in effect already answered your question. Bimodality implies some generating process that is bimodal, so should you want to investigate it formally, it is arguably best to think up a model with that kind of behaviour as one possibility and then estimate its parameters.

For the most part, I have found that bimodality is convincing if and only if (a) it shows up consistently on density estimates with a range of kernels and a range of kernel widths and (b) there is some substantive expectation of a mix of two kinds (males and females, whatever).

Nick
n.j.cox@durham.ac.uk

Dana Shills

Thank you Joerg. That was very helpful. If I understand this correctly, once you have the kdens plot you can visually see if there are two modes. So there is no statistical test that confirms the number of modes in the distribution?

> From: joerg.luedicke@gmail.com
>
> I am not sure what "testing" is supposed to mean in this context, but
> if you want to explore the possibility of a multimodal distribution
> you could indeed go for a non-parametric density estimation. I
> recommend using Ben Jann's -kdens- (available from SSC, -findit
> kdens-), which is a quite powerful package and supports probability
> weights. I would also recommend using an adaptive kernel estimate, as
> this is usually the best kernel estimate when dealing with multimodal
> data (at least in my experience).
>
> What you could do in addition is check whether the multimodality is
> due to distributional mixtures (which is often the case when you find
> more than one mode). For example, say you find your distribution to be
> bimodal: you could fit a 2-component mixture model to estimate the
> underlying parameters of the mixed distributions via maximum
> likelihood. To do this you could use -fmm-, which is also available
> from SSC; if the model does not converge, make sure you provide
> starting values; for Gaussian mixtures you could use the modes from
> the kernel estimate and guess the variance.
>
> You could also check how well the (in this case) 2 distributions can
> be separated, using an entropy measure which you could calculate with
> -fmmlc-, also available from SSC.
>
> On Wed, Nov 9, 2011 at 1:50 PM, Dana Shills <shills52@hotmail.com> wrote:
>
> > I am using survey data on firms in Ghana. The survey methodology
> > uses stratified random sampling and I have the probability weights.
> > I want to be able to plot a distribution of firm sizes
> > (incorporating the weights) and test for bimodality in the firm
> > size distribution. I looked at the "adgakern" program but I don't
> > think it allows for survey weights. Could someone please point me
> > to what commands I should be looking at?
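
Criterion (a) in Nick's reply, that a second mode should persist across a range of kernel widths, can be illustrated with a rough sketch. This is not Stata and not -kdens-: it is a plain Python illustration using a fixed-bandwidth Gaussian kernel (not the adaptive estimator Joerg recommends), with the probability weights entering as kernel weights. The function names, the simulated data, and the bandwidth values are all assumptions made for the example.

```python
import math
import random


def weighted_kde(xs, ws, grid, bandwidth):
    """Gaussian kernel density estimate with observation weights, on a grid."""
    norm = 1.0 / (sum(ws) * bandwidth * math.sqrt(2.0 * math.pi))
    dens = []
    for g in grid:
        s = 0.0
        for x, w in zip(xs, ws):
            z = (g - x) / bandwidth
            s += w * math.exp(-0.5 * z * z)
        dens.append(norm * s)
    return dens


def count_modes(dens):
    """Count strict local maxima in the interior of the grid."""
    return sum(
        1
        for i in range(1, len(dens) - 1)
        if dens[i - 1] < dens[i] > dens[i + 1]
    )


def modes_by_bandwidth(xs, ws, bandwidths, n_grid=201):
    """Number of modes of the weighted estimate at each bandwidth."""
    lo, hi = min(xs), max(xs)
    pad = 3.0 * max(bandwidths)  # extend the grid past the data range
    width = (hi - lo) + 2.0 * pad
    grid = [lo - pad + width * i / (n_grid - 1) for i in range(n_grid)]
    return {h: count_modes(weighted_kde(xs, ws, grid, h)) for h in bandwidths}


# Simulated stand-in for (log) firm size: two groups, made-up weights.
rng = random.Random(2011)
xs = [rng.gauss(2.0, 0.5) for _ in range(300)]
xs += [rng.gauss(6.0, 0.5) for _ in range(300)]
ws = [rng.uniform(0.5, 2.0) for _ in xs]  # hypothetical probability weights

for h, n_modes in modes_by_bandwidth(xs, ws, [0.3, 0.5, 0.8, 1.2]).items():
    print(f"bandwidth {h}: {n_modes} mode(s)")
```

If the mode count stays at two over a sensible range of bandwidths, that is the kind of consistency Nick describes; a count that changes with every bandwidth suggests the extra modes are artefacts of undersmoothing. Counting modes this way is descriptive only and is not a formal test.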

