
Re: st: patterns of missing data by interviewers

From	Govind Acharya <[email protected]>
To	[email protected]
Subject	Re: st: patterns of missing data by interviewers
Date	Tue, 24 Jul 2007 10:11:35 -0400

Wow, this is great! I can see why folks prefer Stata over SAS! With SAS, I would spend so much time on the code that I didn't have time for the stats. This worked like a charm. If anyone has other suggestions, feel free to bring them up :-).

Govind

-
Govind Acharya
Assistant Director/Senior Research Associate
Survey Research Institute, Cornell University
391 Pine Tree Rd.
Ithaca, NY 14850
phone: (607) 255-0375; fax: (607) 255-7118
http://www.sri.cornell.edu

Austin Nichols wrote:

Govind Bell Acharya <[email protected]>:
I assume you saw the response from Nick Cox.

Assuming you want variables named a-f as below, and you have an
interviewer id/name variable a in data with a variable q holding the
response to an item, where missing is appropriately coded (. or .a
through .z), and surveyid indexing each survey, you can

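* flag the first record of each interviewer-survey pair
bys a surveyid: gen fq=_n==1
* 1 if the item response is missing (. or .a through .z), else 0
g miss=mi(q)
* number of item records (questions asked) per interviewer
bys a: g d=_N
* d is constant within a, so keep its first value; (sum) would give _N squared
collapse (sum) b=fq c=miss (first) d, by(a)
gen e=c/d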
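
To make the steps above concrete, here is a minimal sketch on made-up data. The interviewer names and responses are hypothetical, and it assumes one row per interviewer-survey-item. Note that d is kept with (first), since it is constant within an interviewer and summing it would return the squared record count.

```stata
* toy data: one row per interviewer-survey-item (hypothetical values)
clear
input str5 a surveyid q
"ann" 1 3
"ann" 1 .
"ann" 2 .a
"ann" 2 1
"bob" 1 2
"bob" 1 5
"bob" 2 .
"bob" 3 4
"bob" 3 1
"bob" 3 2
end

* flag the first record of each interviewer-survey pair
bysort a surveyid: gen byte fq = (_n == 1)
* 1 if the response is system or extended missing, else 0
gen byte miss = missing(q)
* number of item records per interviewer
bysort a: gen long d = _N
* one row per interviewer: surveys, missing items, items asked
collapse (sum) b=fq c=miss (first) d, by(a)
gen double e = c/d
list, noobs
```

On this toy data, ann ends up with b=2, c=2, d=4, e=0.5, and bob with b=3, c=1, d=6, e=1/6.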

You can get the mean over some group of interviewers with
egen mpcmi=mean(e), by(groupvar)
though if you want to compare each interviewer with the mean over
everyone, so that f is the deviation from a single overall mean, you
should just

su e, meanonly
gen f=e-r(mean)

See also -help collapse- and
http://www.stata.com/support/faqs/data/weighted.html among other
resources.
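
Since the original question asks for [sum of (c)]/[sum of (d)], which is a pooled rate rather than the unweighted mean of e across interviewers, here is a sketch of that variant. It starts from an already collapsed dataset with hypothetical values; the names f_pooled and g_pooled are illustrative and do not come from the thread.

```stata
* collapsed data: one row per interviewer (hypothetical values)
clear
input str5 a b c d
"ann" 2 2 4
"bob" 3 1 6
end

gen double e = c/d

* pooled missing rate: total missing items over total items asked
egen double totc = total(c)
egen double totd = total(d)
gen double f_pooled = totc/totd

* each interviewer's deviation from the pooled rate
gen double g_pooled = e - f_pooled
list, noobs
```

Here the pooled rate is 3/10 = 0.3, while the unweighted mean of e would be (0.5 + 1/6)/2, about 0.333; the two differ whenever interviewers ask different numbers of questions. Weighting by d, as in -su e [aw=d], meanonly-, should reproduce the pooled rate in r(mean).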

On 7/21/07, Govind Bell Acharya <[email protected]> wrote:

For our research, we use telephone interviewers to conduct a number of
surveys. At the moment, it is a challenge to detect whether the number
of items an interviewer coded as missing (the "don't know" or "refused"
options) is above or below the overall mean of missing values.
I did something like that using proc sql in SAS, but it is (as SAS is
in general) extremely unwieldy and creates major issues such as (f)
below. In any case, here is what I have in mind:

(a) Name
(b) # surveys complete
(c) # missing
(d) # questions asked
(e) (c)/(d)
(f) [sum of (c)]/[sum of (d)]
(g) [(e)-(f)]
