Statalist



st: R: highly skewed, highly zeroed data


From   "Carlo Lazzaro" <carlo.lazzaro@tin.it>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: R: highly skewed, highly zeroed data
Date   Wed, 25 Nov 2009 09:21:56 +0100

As an alternative to Kieran's hint, and given the positive skewness of his
data, Jason may find it useful to calculate the desired 95% CI by fitting a
Gamma distribution to the estimated mean and its standard error and drawing
10,000 random values from it. Two interesting references:
Briggs A, Nixon R, Dixon S, Thompson S. Parametric modelling of cost data:
some simulation evidence. Health Economics 2005; 14(4): 421-428 (freely
downloadable at http://eprints.gla.ac.uk/4151/);
Briggs A, Sculpher M, Claxton K. Decision Modelling for Health Economic
Evaluation. Oxford: Oxford University Press, 2006: 77-120.

............................begin example.................................
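* time (hrs) and frequency, taken from Jason's table below
clear
input time wt
0 518
.25 2
.5 3
1 15
1.5 1
2 23
3 10
3.5 1
4 11
5 13
6 9
7 3
8 19
20 10
45 9
end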
mean time [fweight = wt]
Mean estimation                     Number of obs    =     647

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        time |   1.605873   .2343624      1.145669    2.066077
--------------------------------------------------------------

* Gamma draws by the inverse-CDF method, matching the mean and its SE:
* scale = SE^2/mean, shape = (mean/SE)^2
set obs 10000
generate Gamma = (.2343624^2/1.605873)*invgammap((1.605873/.2343624)^2, uniform())
summarize Gamma
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       Gamma |     10000    1.605746    .2343959   .8457972   2.601775

centile Gamma, centile(2.5 97.5)
                                                   -- Binom. Interp. --
    Variable |     Obs  Percentile      Centile        [95% Conf. Interval]
-------------+-------------------------------------------------------------
       Gamma |   10000        2.5      1.177285        1.170511    1.187588
             |               97.5       2.09881        2.083514    2.114182
............................end example....................................
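
The centiles of the fitted Gamma can also be obtained without simulation,
straight from its inverse cumulative distribution function. The lines below
are a minimal sketch under the same moment-matching assumption as the example
above (shape = (mean/SE)^2, scale = SE^2/mean); the scalar names g_shape and
g_scale are only illustrative.

```stata
* Sketch: exact 2.5th and 97.5th centiles of the moment-matched Gamma.
* Mean and SE are copied from the -mean- output above.
scalar g_shape = (1.605873/.2343624)^2
scalar g_scale = .2343624^2/1.605873
* invgammap(a, p) returns x such that gammap(a, x) = p (unit scale),
* so multiplying by the scale gives the centile on the original scale.
display "2.5th centile:  " g_scale*invgammap(g_shape, .025)
display "97.5th centile: " g_scale*invgammap(g_shape, .975)
```

The two values should land close to the simulated centiles reported above,
differing only by Monte Carlo error.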

HTH and Kind Regards,
Carlo

-----Original message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Jason Ferris
Sent: Wednesday, 25 November 2009 3:07
To: statalist@hsphsun2.harvard.edu
Subject: st: highly skewed, highly zeroed data

Hi, 
I have tried to find my answer in the statalist repository but nothing
has quite hit the mark.

I would like to calculate a mean and 95% CI for these data, which are
highly skewed and mostly zeros.

I am aware of adding a constant and then transforming to the log scale
(with the antilog for interpretation).  However, after adding a constant to
overcome the zero issue and then transforming to the log scale, I am
still left with a highly skewed distribution, which gets me no closer to
a mean and CI.

PS. As this is survey data I would be most keen for the 'right' answer
to be addressed in svy: terms

Jason

 time (hrs) |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        518       80.06       80.06
        .25 |          2        0.31       80.37
         .5 |          3        0.46       80.83
          1 |         15        2.32       83.15
        1.5 |          1        0.15       83.31
          2 |         23        3.55       86.86
          3 |         10        1.55       88.41
        3.5 |          1        0.15       88.56
          4 |         11        1.70       90.26
          5 |         13        2.01       92.27
          6 |          9        1.39       93.66
          7 |          3        0.46       94.13
          8 |         19        2.94       97.06
         20 |         10        1.55       98.61
         45 |          9        1.39      100.00
------------+-----------------------------------


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/





