Govind Bell Acharya <ga47@cornell.edu>:
I assume you saw the response from Nick Cox.
Assuming you want variables named a-f as below, and you've got an
interviewer id/name variable a on some data with a variable q which is
the response to an item, where missing is appropriately coded (. or .a
through .z), and surveyid indexing each survey, you can
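* flag the first observation of each survey within interviewer
bys a surveyid: gen fq=_n==1
* 1 if the item response is missing (. or .a-.z), else 0
g miss=mi(q)
* number of items asked by each interviewer
bys a: g d=_N
* d is constant within a, so take it with (first); summing it would give _N^2
collapse (sum) b=fq c=miss (first) d, by(a)
gen e=c/d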
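To check the steps on something small, here is a minimal sketch with made-up data (the interviewer names and values are invented for illustration, and one observation per item asked is assumed). It takes d with (first) because d is constant within interviewer.

```stata
* Toy data: one row per item asked; names and values are illustrative only
clear
input str5 a surveyid q
"ann" 1 3
"ann" 1 .
"ann" 2 .a
"ann" 2 5
"bob" 3 .
"bob" 3 2
"bob" 3 4
end

bysort a surveyid: gen fq = _n==1     // 1 on the first row of each survey
gen miss = mi(q)                      // 1 if q is . or .a-.z
bysort a: gen d = _N                  // items asked per interviewer
collapse (sum) b=fq c=miss (first) d, by(a)
gen e = c/d
list, noobs
* ann: b=2 c=2 d=4 e=.5
* bob: b=1 c=1 d=3 e=.3333
```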
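You can get the mean over some group of interviewers (provided groupvar
is kept through the collapse, e.g. by adding it to by() if it is
constant within interviewer) with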
egen mpcmi=mean(e), by(groupvar)
though if you want the mean over everyone, so that f is the same for
everyone, you should just
su e, meanonly
gen f=r(mean)
gen g=e-f
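Note that the mean of e weights every interviewer equally, whereas (f) as defined in the question is [sum of (c)]/[sum of (d)], a pooled rate; the two differ when interviewers asked different numbers of questions. A minimal sketch of the pooled version, assuming the collapsed data (one row per interviewer, with c, d and e) are in memory; totc, totd, fpooled and gpooled are names made up for illustration:

```stata
* Pooled missing rate: total missing over total questions asked.
* Assumes one row per interviewer with c, d and e already created.
* totc, totd, fpooled, gpooled are illustrative names, not from the post.
egen totc = total(c)
egen totd = total(d)
gen fpooled = totc/totd       // same value on every row
gen gpooled = e - fpooled     // each interviewer's deviation from the pooled rate
```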
See also -help collapse- and
http://www.stata.com/support/faqs/data/weighted.html among other
resources.
On 7/21/07, Govind Bell Acharya <ga47@cornell.edu> wrote:
For our research, we use telephone interviewers to conduct a number of
surveys. At the moment, it is a challenge to detect whether the number
of items where the interviewer coded the missing data (don't know or
refused options) is above or below the overall mean of missing values.
I did something like that using the proc sql command in SAS, but it is
(as SAS is in general) extremely unwieldy and creates major issues such
as (f) below. In any case, here is what I have in mind:
(a) Name
(b) # surveys complete
(c) # missing
(d) # questions asked
(e) (c)/(d)
(f) [sum of (c)]/[sum of (d)]
(g) [(e)-(f)]