# RE: st: Analyzing a subpopulation in Stata 10.1

 From "Karadogan, Figen" <[email protected]> To "[email protected]" <[email protected]> Subject RE: st: Analyzing a subpopulation in Stata 10.1 Date Fri, 26 Jun 2009 18:03:00 +0000

```I am truly sorry for the attachment. Please find my question restated below, without any attachment.

I would like to find out the proportion of diabetics older than 25 years who go to the dentist, by education. The N of cases jointly non-missing on diabetes (dmrc_dum), dentist (dentistrc), age older than 25 (agerc25), and education (edurc) is 3772. The N of cases jointly non-missing on diabetes and dentist is 3783.

I know that Stata needs information from every observation in the sample to compute the variance, standard errors, and confidence intervals, even though only the information in the subsample is needed to compute the proportions. But I am confused about which of the two syntaxes below gives me the correct total number of observations: 3772 or 3783?

Output results:

svy, subpop(dmrc_dum if agerc25==1): proportion dentistrc if !missing(dmrc_dum, dentistrc, agerc25, edurc), over(edurc)

(running proportion on estimation sample)
Survey: Proportion estimation

Number of strata =    1        Number of obs   =    3772
Number of PSUs   = 3772        Population size =  365510
N. of poststrata =  108        Subpop. no. obs =     508
                               Subpop. size    = 43946.8
                               Design df       =    3771

_prop_1: dentistrc = No Dentist
_prop_2: dentistrc = Yes Dentist
_subpop_1: edurc = Less than HS
_subpop_3: edurc = At least some college

Over           Proportion   Std. Err.   [95% Conf. Interval]
_prop_1
  _subpop_1      .6325268    .0584768    .5178775     .747176
  _subpop_2      .4420064    .0312256    .3807856    .5032272
  _subpop_3       .322104    .0343318    .2547933    .3894146
_prop_2
  _subpop_1      .3674732    .0584768     .252824    .4821225
  _subpop_2      .5579936    .0312256    .4967728    .6192144

svy, subpop(dmrc_dum if agerc25==1): proportion dentistrc if !missing(dmrc_dum, dentistrc, agerc25), over(edurc)

(running proportion on estimation sample)
Survey: Proportion estimation

Number of strata =    1        Number of obs   =    3783
Number of PSUs   = 3783        Population size =  365510
N. of poststrata =  108        Subpop. no. obs =     508
                               Subpop. size    = 43724.1
                               Design df       =    3782

_prop_1: dentistrc = No Dentist
_prop_2: dentistrc = Yes Dentist
_subpop_1: edurc = Less than HS
_subpop_3: edurc = At least some college

Over           Proportion   Std. Err.   [95% Conf. Interval]
_prop_1
  _subpop_1       .633312    .0584333    .5187483    .7478758
  _subpop_2       .442029    .0312243    .3808109    .5032471
  _subpop_3       .322692    .0343688    .2553088    .3900752
_prop_2
  _subpop_1       .366688    .0584333    .2521242    .4812517
  _subpop_2       .557971    .0312243    .4967529    .6191891
  _subpop_3       .677308    .0343688    .6099248    .7446912

________________________________________
From: [email protected] [[email protected]] on behalf of Marcello Pagano [[email protected]]
Sent: Friday, June 26, 2009 1:38 PM
To: [email protected]
Subject: Re: st: Analyzing a subpopulation in Stata 10.1

No, don't, Figen.
This creates havoc for a lot of people.
Direct people to a website if need be.
The FAQ is there for a reason.

Thanks,

m.p.

>  Hello,
>
> I do apologize for the attachment, but given the complexity of the question I would like to provide you all the information I have.
>
> Please find the result of my output attached. There, I would like to find out the proportion of diabetics older than 25yrs old, who go to dentist by education.  The N of cases jointly non-missing on diabetes (dmrc_dum), dentist  (dentistrc), age older than 25 (agerc25), education  (edurc) is 3772. The N of cases jointly non-missing on diabetes and dentist is 3783.
>
>
> I know that Stata needs information from every observation in the sample to compute the variance, standard error and confidence intervals even though only the information in the subsamples is needed to compute proportions.  But, I am confused which one of those syntaxes gives me the correct total number of observations = 3772 or 3783??
>
> Thank you very much for your help.
>
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
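
One way to sanity-check the two sample sizes quoted in the question is sketched below in Stata. It assumes the data have already been `svyset` with the design used for the runs above; `diab25` is a hypothetical indicator introduced here only for illustration and does not appear in the post. The sketch counts the jointly non-missing cases and shows the subpopulation coded as a single 0/1 variable. It does not by itself settle which N is the right one.

```stata
* Assumes svyset has already been run with the design used above.

* Cases jointly non-missing on all four variables
* (the if condition of the first command; 3772 per the post)
count if !missing(dmrc_dum, dentistrc, agerc25, edurc)

* Cases jointly non-missing without requiring education
* (the if condition of the second command)
count if !missing(dmrc_dum, dentistrc, agerc25)

* Hypothetical 0/1 indicator for diabetics older than 25.
* Assumes dmrc_dum and agerc25 are coded 0/1; left missing if either is missing.
generate byte diab25 = (dmrc_dum == 1 & agerc25 == 1) if !missing(dmrc_dum, agerc25)

* Subpopulation estimate with the subpopulation expressed as one variable
* and no additional if restriction on the estimation sample
svy, subpop(diab25): proportion dentistrc, over(edurc)
```

Comparing the two `count` results with the "Number of obs" lines in the output shows which observations each `if` condition removes from the estimation sample before the subpopulation is applied.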