Statalist The Stata Listserver



Re: st: monthly means & CI


From   "Svend Juul" <SJ@SOCI.AU.DK>
To   <statalist@hsphsun2.harvard.edu>
Subject   Re: st: monthly means & CI
Date   Wed, 14 Mar 2007 10:35:59 +0100

Gaby Serdan wrote:

I have data on deaths. I need to calculate the mean &
CI of females in proportion to all population. I'm
trying first to create a variable for each month then
take the total number of female per month and then
divide by total number of deaths per month. 

- and Clive Nicholas gave suggestions.
---------------------------------------------------------------

I understand that you want to estimate the proportion
of females among the persons who died each month. The
data you provided are a bit surprising for the purpose,
with one female, three males, and 21 with unknown sex.
To create more illustrative data, I ran:

   clear
   set obs 200
   set seed 54321
   gen year = int(2004+2*uniform())
   gen month = int(1+12*uniform())
   drop if year==2004 & month<10
   drop if year==2005 & month>3
   gen x=uniform()
   gen female=1 if x<0.4
   gen male=1 if x>=0.4 & x<0.8
   gen sex_unknown=1 if x>=0.8
   recode female male sex_unknown (.=0)
   generate persons = female + male + sex_unknown

This dataset includes:

. table month female , by(year)
----------------------
year and  |   female  
month     |    0     1
----------+-----------
2004      |
       10 |    5     3
       11 |    4     4
       12 |    4     3
----------+-----------
2005      |
        1 |    5     4
        2 |    4     4
        3 |    4     3
----------------------

One possibility is the -proportion- command:

. proportion female , over(year month)

Proportion estimation               Number of obs    =      47

      _prop_1: female = 0
      _prop_2: female = 1

         Over: year month
    _subpop_1: 2004 10
    _subpop_2: 2004 11
    _subpop_3: 2004 12
    _subpop_4: 2005 1
    _subpop_5: 2005 2
    _subpop_6: 2005 3

--------------------------------------------------------------
             |                               Binomial Wald
        Over | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
_prop_1      |
   _subpop_1 |       .625   .1829813      .2566778    .9933222
   _subpop_2 |         .5   .1889822      .1195985    .8804015
   _subpop_3 |   .5714286   .2020305      .1647622    .9780949
   _subpop_4 |   .5555556   .1756821      .2019258    .9091853
   _subpop_5 |         .5   .1889822      .1195985    .8804015
   _subpop_6 |   .5714286   .2020305      .1647622    .9780949
-------------+------------------------------------------------
_prop_2      |
   _subpop_1 |       .375   .1829813      .0066778    .7433222
   _subpop_2 |         .5   .1889822      .1195985    .8804015
   _subpop_3 |   .4285714   .2020305      .0219051    .8352378
   _subpop_4 |   .4444444   .1756821      .0908147    .7980742
   _subpop_5 |         .5   .1889822      .1195985    .8804015
   _subpop_6 |   .4285714   .2020305      .0219051    .8352378
--------------------------------------------------------------
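
As a rough check on how the Wald limits above arise, the figures for
females in October 2004 (3 of 8 deaths) can be reproduced by hand. This
is a sketch inferred from the numbers in the table, not from the
-proportion- documentation: the output is consistent with a standard
error using n-1 in the denominator and a t quantile with N-1 = 46
degrees of freedom. The scalar names (phat, nsub, ntot, se_w, tcrit)
are my own.

```stata
* Females in 2004m10: 3 of 8 deaths; 47 observations in the whole sample
scalar phat  = 3/8
scalar nsub  = 8
scalar ntot  = 47
* Standard error with n-1 in the denominator (compare Std. Err. .1829813)
scalar se_w  = sqrt(scalar(phat)*(1 - scalar(phat))/(scalar(nsub) - 1))
* Upper 2.5% point of the t distribution with N-1 degrees of freedom
scalar tcrit = invttail(scalar(ntot) - 1, 0.025)
display "se    = " scalar(se_w)
display "lower = " scalar(phat) - scalar(tcrit)*scalar(se_w)
display "upper = " scalar(phat) + scalar(tcrit)*scalar(se_w)
```

With only 7 to 9 deaths per month the Wald limits come close to 0 and
1, which is one reason to prefer the exact intervals.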

To get exact binomial confidence intervals, use -ci- :

. bysort year month: ci female , binomial

-------------------------------------------------------------------------------
-> year = 2004, month = 10
                                                         -- Binomial Exact --
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
      female |          8        .375    .1711633        .0852334    .7551368

-------------------------------------------------------------------------------
-> year = 2004, month = 11
                                                         -- Binomial Exact --
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
      female |          8          .5    .1767767        .1570128    .8429872
.....
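
If you would rather have the monthly estimates in a dataset than in the
log, -statsby- can collect the results that -ci- leaves in r(). A
minimal sketch, assuming the simulated data above are in memory; the
new variable names (n, prop, lb, ub) are my own choice. The ///
continuation only works in a do-file, and recent Stata releases spell
the command -ci proportions female-.

```stata
* Keep the original data; statsby replaces the data in memory
preserve
* One row per year-month with the exact binomial results from -ci-
statsby n=r(N) prop=r(mean) lb=r(lb) ub=r(ub), by(year month) clear: ///
    ci female, binomial
list, sepby(year) noobs
* Bring back the individual-level data
restore
```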


To get a single time variable, use the time-series date functions:

. gen mdate = ym(year,month)
. format mdate %tm
. tab1 mdate

-> tabulation of mdate  
      mdate |      Freq.     Percent        Cum.
------------+-----------------------------------
    2004m10 |          8       17.02       17.02
    2004m11 |          8       17.02       34.04
    2004m12 |          7       14.89       48.94
     2005m1 |          9       19.15       68.09
     2005m2 |          8       17.02       85.11
     2005m3 |          7       14.89      100.00
------------+-----------------------------------
      Total |         47      100.00
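
With mdate in place, the calculation Gaby described (females per month
divided by all deaths per month) can be sketched with -collapse-. This
assumes the simulated data and the mdate variable created above;
prop_female is a name I made up.

```stata
* Keep the individual-level data; collapse replaces the data in memory
preserve
* Monthly totals of female deaths and of all deaths
collapse (sum) female persons, by(mdate)
* Proportion of females among all deaths in each month
generate prop_female = female / persons
list mdate female persons prop_female, noobs
restore
```

The proportions should agree with the means reported by -ci- above,
for example 3/8 = .375 for 2004m10.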

Hope this helps
Svend
__________________________________________

Svend Juul
Institut for Folkesundhed, Afdeling for Epidemiologi
(Institute of Public Health, Department of Epidemiology)
Vennelyst Boulevard 6
DK-8000  Aarhus C, Denmark
Phone: +45 8942 6090
Home:  +45 8693 7796
Email: sj@soci.au.dk
__________________________________________ 



