
From: "Jason Ferris" <JasonF@TURNINGPOINT.ORG.AU>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: highly skewed, highly zeroed data
Date: Thu, 26 Nov 2009 17:20:34 +1100

Thank you Carlo and Kieran for your guidance. I now find myself questioning the advantage of transforming skewed data for univariate analysis. Should I worry about the issues of central tendency in order to obtain means and CIs?

Jase

-----Original Message-----
From: Carlo Lazzaro [mailto:carlo.lazzaro@tin.it]
Sent: Wednesday, 25 November 2009 7:22 PM
To: statalist@hsphsun2.harvard.edu
Cc: Jason Ferris
Subject: R: highly skewed, highly zeroed data

As an alternative to Kieran's hint, and given the positive skewness of his data, Jason may find it useful to calculate the desired 95% CI by fitting a Gamma distribution and drawing 10,000 random values from it. For two interesting references, please see:

Briggs, A., Nixon, R., Dixon, S. and Thompson, S. (2005). Parametric modelling of cost data: some simulation evidence. Health Economics 14(4): 421-428; freely downloadable at http://eprints.gla.ac.uk/4151/

Briggs, A., Sculpher, M. and Claxton, K. (2006). Decision Modelling for Health Economic Evaluation. Oxford: Oxford University Press: 77-120.

............................begin example.................................
input time wt

mean time [fweight = wt]

Mean estimation                     Number of obs    =     647

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        time |   1.605873   .2343624      1.145669    2.066077
--------------------------------------------------------------

set obs 10000

g Gamma=(.2343624^2/1.605873)*invgammap((1.605873/.2343624)^2, uniform())

sum Gamma

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       Gamma |     10000    1.605746    .2343959   .8457972   2.601775

centile Gamma, centile (2.5 97.5)

                                                       -- Binom. Interp. --
    Variable |     Obs  Percentile      Centile        [95% Conf. Interval]
-------------+-------------------------------------------------------------
       Gamma |   10000         2.5     1.177285        1.170511    1.187588
             |                97.5      2.09881        2.083514    2.114182
............................end example....................................

HTH and Kind Regards,
Carlo

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Jason Ferris
Sent: Wednesday, 25 November 2009 03:07
To: statalist@hsphsun2.harvard.edu
Subject: st: highly skewed, highly zeroed data

Hi,

I have tried to find my answer in the Statalist repository, but nothing has quite hit the mark. I would like to calculate a mean and 95% CI for these data, which are highly skewed and mostly zeros. I am aware of adding a constant and then transforming to the log scale (with the antilog for interpretation). However, after adding a constant to overcome the zero issue and then transforming to the log scale, I am still left with a highly skewed distribution. This gets me no closer to a mean and CI.

PS. As this is survey data, I would be most keen for the 'right' answer to be addressed in svy: terms.

Jason

 time (hrs) |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        518       80.06       80.06
        .25 |          2        0.31       80.37
         .5 |          3        0.46       80.83
          1 |         15        2.32       83.15
        1.5 |          1        0.15       83.31
          2 |         23        3.55       86.86
          3 |         10        1.55       88.41
        3.5 |          1        0.15       88.56
          4 |         11        1.70       90.26
          5 |         13        2.01       92.27
          6 |          9        1.39       93.66
          7 |          3        0.46       94.13
          8 |         19        2.94       97.06
         20 |         10        1.55       98.61
         45 |          9        1.39      100.00
------------+-----------------------------------

------------------------------------------
DISCLAIMER: This message (including any attachments) is intended solely for the addressee(s) named and may contain confidential or privileged information. If you are not the intended recipient, please delete it and notify the sender. Views expressed in this message are those of the individual sender, and are not necessarily the views of the Turning Point Alcohol and Drug Centre (ABN: 68 223 819 017). Turning Point Alcohol and Drug Centre: http://www.turningpoint.org.au

Although this message and any attachments have been scanned for viruses by 'Trend Micro InterScan' at the time of sending, you are advised to rescan on receipt. The whole or parts of this email may be subject to copyright of Turning Point Alcohol and Drug Centre (ABN: 68 223 819 017), and/or third parties. You can only re-transmit, distribute or use the material if you are authorised to do so. Please consider the environment before printing this email or attachments.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
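Carlo's one-line `g Gamma=...` command matches a gamma distribution to the estimated mean and its standard error by the method of moments: shape = (mean/se)^2 and scale = se^2/mean. The sketch below spells out those two steps separately. It assumes a Stata release that provides `rgamma()`; the seed, the scalar names (`m`, `se`, `g_shape`, `g_scale`) and the variable name `gamma_draw` are illustrative choices, not part of the original post.

```stata
* Moment-matched gamma draws for the estimated mean
* (a sketch; seed and names are illustrative assumptions)
clear
set seed 12345
set obs 10000

* Estimates reported by -mean time [fweight = wt]- in Carlo's example
scalar m  = 1.605873
scalar se = .2343624

* Method of moments: mean = shape*scale, variance = shape*scale^2
scalar g_shape = (m/se)^2
scalar g_scale = se^2/m

* rgamma(a, b) draws gamma variates with shape a and scale b
generate double gamma_draw = rgamma(scalar(g_shape), scalar(g_scale))

* The simulated mean and SD should be close to m and se
summarize gamma_draw

* The 2.5th and 97.5th percentiles give the simulated 95% interval
centile gamma_draw, centile(2.5 97.5)
```

With these inputs the shape is roughly 47 and the scale roughly 0.034. The percentiles should land near the values Carlo reports, but they will differ slightly with the seed and the random-number generator.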

**References**:
- **st: highly skewed, highly zeroed data** *From:* "Jason Ferris" <JasonF@TURNINGPOINT.ORG.AU>
- **st: R: highly skewed, highly zeroed data** *From:* "Carlo Lazzaro" <carlo.lazzaro@tin.it>

