Statalist



Re: st: patterns of missing data by interviewers


From   Govind Acharya <ga47@cornell.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: patterns of missing data by interviewers
Date   Tue, 24 Jul 2007 10:11:35 -0400

Wow, this is great! I can see why folks prefer Stata over SAS! With SAS, I would spend so much time on the code that I didn't have time for the stats. This worked like a charm. If anyone has other suggestions, feel free to bring them up :-).

Govind

-
Govind Acharya
Assistant Director/Senior Research Associate
Survey Research Institute, Cornell University
391 Pine Tree Rd.
Ithaca, NY 14850
phone: (607) 255-0375; fax: (607) 255-7118
http://www.sri.cornell.edu



Austin Nichols wrote:

Govind Bell Acharya <ga47@cornell.edu>:
I assume you saw the response from Nick Cox.

Assuming you want variables named a-f as below, and you have an
interviewer id/name variable a in data with one observation per item
asked, a variable q which is the response to an item, where missing is
appropriately coded (. or .a through .z), and surveyid indexing each
survey, you can

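* flag the first observation of each survey within interviewer
bys a surveyid: gen fq=_n==1
* flag missing responses (. or .a through .z)
g miss=mi(q)
* d = number of items asked by each interviewer
bys a: g d=_N
* d is constant within a, so keep it with (first); summing it would give _N^2
collapse (sum) b=fq c=miss (first) d, by(a)
gen e=c/d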

You can get the mean over some group of interviewers with
egen mpcmi=mean(e), by(groupvar)
though if you want the mean over everyone, so that the same mean is
subtracted for every interviewer, you should just

su e, meanonly
gen f=e-r(mean)

See also -help collapse- and
http://www.stata.com/support/faqs/data/weighted.html among other
resources.
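
Note that -su e, meanonly- gives the unweighted mean of e across
interviewers, which is not the same as column (f) in the original
question, [sum of (c)]/[sum of (d)]. A minimal sketch of one way to get
the pooled rate, assuming the collapsed data from above (one row per
interviewer, with d holding the number of questions asked and e=c/d):
weighting e by d gives sum(d*c/d)/sum(d) = sum(c)/sum(d). The names
fpooled and gpooled are hypothetical, chosen only to avoid clashing
with f above.

```stata
* assumes one row per interviewer with c, d (# questions asked), and e=c/d
* the d-weighted mean of e equals sum(c)/sum(d), the pooled missing rate
su e [aw=d], meanonly
gen fpooled=r(mean)
* deviation of each interviewer's rate from the pooled rate
gen gpooled=e-fpooled
```

The weighted and unweighted versions can differ noticeably when
interviewers ask very different numbers of questions.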

On 7/21/07, Govind Bell Acharya <ga47@cornell.edu> wrote:
For our research, we use telephone interviewers to conduct a number of
surveys. At the moment, it is a challenge to detect whether the number
of items that an interviewer codes as missing (don't know or refused
options) is above or below the overall mean of missing values.
I did something like that using the proc sql command in SAS, but it is
(as SAS is in general) extremely unwieldy and creates major issues such
as (f) below. In any case, here is what I have in mind:

(a) Name
(b) # surveys complete
(c) # missing
(d) # questions asked
(e) (c)/(d)
(f) [sum of (c)]/[sum of (d)]
(g) [(e)-(f)]