Re: st: patterns of missing data by interviewers

From   "Austin Nichols"
Subject   Re: st: patterns of missing data by interviewers
Date   Mon, 23 Jul 2007 17:53:35 -0400

Govind Bell Acharya:
I assume you saw the response from Nick Cox.

Assuming you want variables named as in your list below, and you've got
an interviewer id/name variable a on data with one observation per item
asked, a variable q which is the response to an item, where missing is
appropriately coded (. or .a through .z), and surveyid indexing each
survey, you can

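* flag one observation per survey within interviewer
bys a surveyid: gen fq=_n==1
* 1 if q is . or .a through .z
g miss=mi(q)
* number of items asked by each interviewer
bys a: g d=_N
* d is constant within a, so take its mean rather than its sum
collapse (sum) b=fq c=miss (mean) d, by(a)
gen e=c/d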

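You can get the mean over some group of interviewers with

egen mpcmi=mean(e), by(groupvar)

(groupvar has to be listed in by() on the collapse, or it will not
survive it), though if you want the mean over everyone, so that f is
the same for everyone, you should just

su e, meanonly
gen f=r(mean)
gen g=e-f

Note that r(mean) here is the unweighted mean of the interviewer rates
e, which differs from [sum of (c)]/[sum of (d)] whenever interviewers
ask different numbers of questions.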
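
If what you are after is (f) exactly as you define it, i.e. the pooled
rate [sum of (c)]/[sum of (d)], something like the following may be
closer. This is only a sketch on made-up data: the interviewer names,
the toy responses, and the helper variables asked, totc and totd are
my own assumptions for illustration, not anything from your file.

```stata
* Toy data (hypothetical): one row per interviewer x survey x item
clear
input str3 a surveyid q
"Ann" 1 1
"Ann" 1 .
"Ann" 2 3
"Ann" 2 .a
"Bob" 3 2
"Bob" 3 5
"Bob" 3 .b
"Bob" 4 1
"Bob" 4 2
"Bob" 4 4
end

* flag one row per survey within interviewer
bysort a surveyid: gen byte fq = _n == 1
* 1 if q is . or .a through .z
gen byte miss = missing(q)
* each row is one item asked
gen byte asked = 1

* b = # surveys, c = # missing, d = # questions asked
collapse (sum) b=fq c=miss d=asked, by(a)
gen e = c/d

* pooled rate over all interviewers, the same on every row
egen totc = total(c)
egen totd = total(d)
gen f = totc/totd
gen g = e - f
drop totc totd

list, noobs
```

In the toy data Ann has 2 of 4 items missing and Bob 1 of 6, so the
pooled rate is 3/10, which is not the same as the simple average of
the two interviewer rates. Summing a constant 1 instead of carrying _N
through the collapse is just one way to count items; either should
work.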

See also -help collapse-.

On 7/21/07, Govind Bell Acharya wrote:
For our research, we use telephone interviewers to conduct a number of
surveys.   At the moment, it is a challenge to detect whether the number
of items where the interviewer coded the missing data (don't know or
refused options) is above or below the overall mean of missing values.
I did something like that using the proc sql command in SAS, but it is
(as SAS is in general), extremely unwieldy and creates major issues such
as (f) below.  In any case, here is what I have in mind

(a) Name
(b) # surveys complete
(c) # missing
(d) # questions asked
(e) (c)/(d)
(f) [sum of (c)]/[sum of (d)]
(g) [(e)-(f)]