st: Svy commands for cluster sampling

From   Hajime SATO
To   Statalist
Subject   st: Svy commands for cluster sampling
Date   Thu, 09 Feb 2012 16:02:31 +0900

Dear Statalisters,

A colleague brought me some questions about the svy commands in Stata that I could not answer quickly. I would appreciate any insights you can offer.

Here are the questions.
Suppose there are N hospitals and M clinics in the study area. From them, we randomly sampled n hospitals and m clinics, and asked them how many flu patients visited them in the past month, and what characteristics they had.

Some hospitals reported that they had 5 patients, while some others had none. In the same way, some clinics had 3 patients, while some others had none.

In the situation described above, I think we can estimate (1) the number of patients in the study area in the past month, using the sampling fractions n/N and m/M (that is, sampling weights of N/n for hospitals and M/m for clinics). In this case, can we use svy commands (or other Stata commands), and if so, how?
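To make the calculation in (1) concrete, here is a minimal sketch in Python (not Stata) of the point estimate I have in mind, treating hospitals and clinics as two separately sampled groups. All numbers, and the names `estimate_total`, `N_HOSPITALS`, and `M_CLINICS`, are hypothetical and chosen only for illustration. It gives the point estimate only and says nothing about standard errors.

```python
# Hypothetical numbers for illustration only; none come from the actual study.

def estimate_total(counts, population_size):
    """Expansion estimate of the total for one group of facilities.

    Each sampled facility stands in for population_size / len(counts)
    facilities, i.e. the inverse of the sampling fraction n/N.
    """
    weight = population_size / len(counts)
    return weight * sum(counts)

hospital_counts = [5, 0, 2, 1]     # patients reported by n = 4 sampled hospitals
clinic_counts = [3, 0, 0, 1, 2]    # patients reported by m = 5 sampled clinics
N_HOSPITALS = 20                   # assumed N (hospitals in the study area)
M_CLINICS = 50                     # assumed M (clinics in the study area)

# Facilities reporting zero patients stay in the sample: they count toward n and m.
total = (estimate_total(hospital_counts, N_HOSPITALS)
         + estimate_total(clinic_counts, M_CLINICS))
print(total)
```

Note that the facilities with no patients are kept in the calculation, since dropping them would inflate the estimated total.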

Next, we wish to know the mean age and overall sex ratio of all the flu patients in the study area.

Can we (2) calculate the mean age and overall proportion of males among all patients, using the mean age and proportion of males reported by each hospital/clinic (together with the sampling weights for hospitals and clinics)? Note that some hospitals/clinics had no patients.

If we instead use the individual patient records, rather than the averaged patient characteristics reported by each hospital/clinic, can we use svy commands (or other Stata commands), and if so, how?
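As a sketch of what (2) might look like when worked from facility-level means, the Python snippet below (again not Stata) combines each facility's reported mean with its sampling weight and its number of patients. The data and the name `weighted_patient_mean` are hypothetical. The assumption here is that each facility's mean must be weighted by weight times patient count, so that the result matches what one would get from the individual patient records with each patient carrying the facility's weight. This covers the point estimate only.

```python
# Hypothetical facility-level reports, for illustration only:
# (sampling weight, number of patients, mean age, proportion male)
facilities = [
    (5.0, 5, 30.0, 0.6),     # a sampled hospital
    (5.0, 0, None, None),    # a sampled hospital with no patients
    (10.0, 4, 40.0, 0.5),    # a sampled clinic
    (10.0, 0, None, None),   # a sampled clinic with no patients
]

def weighted_patient_mean(rows, index):
    """Patient-level mean recovered from facility means.

    Each facility mean is weighted by (sampling weight x patient count).
    Facilities with no patients add nothing to numerator or denominator.
    """
    numerator = 0.0
    denominator = 0.0
    for row in rows:
        weight, n_patients = row[0], row[1]
        if n_patients == 0:
            continue
        numerator += weight * n_patients * row[index]
        denominator += weight * n_patients
    return numerator / denominator

mean_age = weighted_patient_mean(facilities, 2)
prop_male = weighted_patient_mean(facilities, 3)
print(round(mean_age, 2), round(prop_male, 3))
```

A simple weighted average of the facility means that ignores the patient counts would generally give a different answer, which is part of why I am unsure how to set this up.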

Then, we wish to (3) describe the characteristics of the flu patients and test a set of hypotheses, such as a difference in age by sex. Every hospital has an ID number Hi, and every clinic has an ID number Mi. Can we use svy commands (or other Stata commands) to run, for example, a t-test of the difference in mean age by sex, and if so, how?

Each patient's record looks like the following:
---------------------------------------------------------------------------
id    hosp/clin   h/c_no   age    sex   highest_fever (F)   duration (days)
1     h           1        16     m     100                 4
2     c           5        43     f      94                 6
---------------------------------------------------------------------------
The data cover flu cases only (there are no records for people without flu).

Looking forward to any suggestions.

----------------------------------------------------------
Hajime SATO, MD, MPH, DrPH, PhD
Director
Department of Health Policy and Technology Assessment
National Institute of Public Health
Japan
