
From: David Airey <david.airey@Vanderbilt.Edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Dependent var is a proportion, with large spike in .95+
Date: Thu, 4 Sep 2008 06:56:53 -0400


Here is an article I used for a spiked distribution. It is probably not the same situation as yours, however.

Genetics. 2003 Mar;163(3):1169-75.
Mapping quantitative trait loci in the case of a spike in the phenotype distribution.
Broman KW.
Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland 21205, USA. kbroman@jhsph.edu

A common departure from the usual normality assumption in QTL mapping concerns a spike in the phenotype distribution. For example, in measurements of tumor mass, some individuals may exhibit no tumors; in measurements of time to death after a bacterial infection, some individuals may recover from the infection and fail to die. If an appreciable portion of individuals share a common phenotype value (generally either the minimum or the maximum observed phenotype), the standard approach to QTL mapping can behave poorly. We describe several alternative approaches for QTL mapping in the case of such a spike in the phenotype distribution, including the use of a two-part parametric model and a nonparametric approach based on the Kruskal-Wallis test. The performance of the proposed procedures is assessed via computer simulation. The procedures are further illustrated with data from an intercross experiment to identify QTL contributing to variation in survival of mice following infection with Listeria monocytogenes.

PMCID: PMC1462498
PMID: 12663553 [PubMed - indexed for MEDLINE]
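
As a rough illustration of the nonparametric route mentioned in the abstract, the sketch below computes a tie-corrected Kruskal-Wallis H statistic in plain Python. It is not code from the paper: the function name `kruskal_wallis_h` and the idea of passing one list of values per group are assumptions made for this example. The tie correction matters here because a spike means many observations share one value.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    `groups` is a list of samples (one list of numbers per group).
    Hypothetical helper for illustration only.
    """
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    counts = Counter(pooled)

    # Give every distinct value the average (mid) rank of its block of ties.
    midrank = {}
    start = 1
    for value in sorted(counts):
        t = counts[value]
        midrank[value] = start + (t - 1) / 2.0
        start += t

    # Uncorrected H from the per-group rank sums.
    total = 0.0
    for g in groups:
        rank_sum = sum(midrank[v] for v in g)
        total += rank_sum * rank_sum / len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    correction = 1.0 - sum(t ** 3 - t for t in counts.values()) / float(n ** 3 - n)
    if correction == 0.0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction
```

Under the null hypothesis H is usually compared with a chi-squared distribution on (number of groups - 1) degrees of freedom; that step is left out here to keep the sketch free of dependencies.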

On Sep 3, 2008, at 3:22 PM, Dan Weitzenfeld wrote:

Hi Statalist,

I am trying to determine which testing factors drive a proportion dependent variable, PercentNoise. In searching the archives, I came across -betafit-, and a link to the FAQ: "How do you fit a model when the dependent variable is a proportion?"

In that response, Allen McDowell and Nic Cox write, "In practice, it is often helpful to look at the frequency distribution: a marked spike at zero or one may well raise doubt about a single model fitted to all data."

That describes my situation exactly: I have a marked spike in my histogram at the top bin, roughly .95 - 1. I am wondering how to account for this. Does -betafit- take such a possibility into account? Can someone briefly describe how I could use multiple models to fit all the data, as implied in the FAQ response?

My fallback is setting a pass/fail bar and converting my proportions to a binary, then using probit/logit. But the obvious drawback is that I am throwing away information by collapsing the continuous (albeit bounded) proportion variable to a binary.

Thanks in advance for any suggestions,
Dan

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
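
One possible reading of the "multiple models" idea raised in the question is a two-part setup: a binary model for whether an observation falls in the spike, and a separate model for the values below it. The Python sketch below only prepares the two pieces and is an assumption-laden illustration, not anything from the thread: the function name `two_part_split`, the 0.95 cutoff (taken from the top histogram bin described above), and the logit transform for the continuous part are all choices made for this example.

```python
import math


def two_part_split(y, spike_at=0.95):
    """Split proportions into a spike indicator and a logit-scale remainder.

    Hypothetical helper: `spike_at` is an assumed cutoff, and values below
    it must lie strictly between 0 and 1 for the logit to be defined.
    """
    # Part 1: binary outcome, 1 if the observation sits in the spike.
    in_spike = [1 if v >= spike_at else 0 for v in y]

    # Part 2: the remaining proportions, moved to the unbounded logit scale.
    below = [v for v in y if v < spike_at]
    logits = [math.log(v / (1.0 - v)) for v in below]

    return {
        "in_spike": in_spike,
        "share_in_spike": sum(in_spike) / len(y),
        "logits": logits,
        "mean_logit": sum(logits) / len(logits) if logits else float("nan"),
    }
```

The indicator could then go to a logit or probit model, and the non-spike observations to a regression on the logit scale or to a model for proportions, each with the same testing factors as predictors. Unlike the pass/fail fallback, this keeps the variation among the values below the cutoff.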


**References**: **st: Dependent var is a proportion, with large spike in .95+** *From:* "Dan Weitzenfeld" <dan.weitzenfeld@emsense.com>
