
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: using -mixed- with clustered data that includes probability weights

From	Stephen Henry <[email protected]>
To	[email protected]
Subject	Re: st: using -mixed- with clustered data that includes probability weights
Date	Fri, 28 Feb 2014 09:06:13 -0800

Hi,
Thanks for the thoughtful responses.
To clarify, in my data doctors were recruited first; patients were
recruited only if they were seeing a doctor enrolled in the study.
Patient recruitment and over-sampling, however, were done independently
of doctor assignment, as long as the patient was seeing an eligible
doctor. Each doctor has 1500-2000 patients and only a few (1-10)
patients were enrolled per doctor, so I don't think accounting for the
total number of patients within doctor (which I cannot do anyway) would
change the results.

All eligible doctors in the study area were recruited, but there was
no attempt to have doctors represent a national sample beyond the
study area. So I thought about 3 options to account for any
doctor-level effects.

a) designating doctors as individual strata using the svy commands.
b) using robust SEs (i.e., -reg- , cluster()).
c) using -mixed- to fit a 2-level hierarchical model.

For all 3 approaches, I would still account for oversampling among patients.
I don't have experience with option (a), but it doesn't seem to be the
best option because I'm not trying to generalize about doctors beyond
the study area.
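
For concreteness, here is a minimal sketch of how the three options might look in Stata 13 or later. Everything in it is an assumption made for illustration: the data are simulated, and the variable names (doctor_id, severity, pw, y) are hypothetical placeholders, not the study's variables. In all three cases the patient-level probability weight is applied, and in (c) no doctor-level weight is given.

```stata
* Simulated stand-in data (hypothetical; not the study data)
clear
set seed 12345
set obs 60                                  // 60 doctors
generate doctor_id = _n
generate u = rnormal()                      // doctor-level random intercept
generate n_pat = ceil(10*runiform())        // 1-10 enrolled patients per doctor
expand n_pat                                // one row per enrolled patient
generate severity = runiform()              // patient-level covariate
generate pw = cond(severity > 0.5, 1, 3)    // over-sampled (severe) patients get smaller weights
generate y = 1 + 0.5*severity + u + rnormal()

* (a) doctors as strata, patients as the sampling units
*     singleunit(centered) handles doctors with a single enrolled patient
svyset _n [pweight = pw], strata(doctor_id) singleunit(centered)
svy: regress y severity

* (b) weighted regression with SEs clustered on doctor
regress y severity [pweight = pw], vce(cluster doctor_id)

* (c) two-level model with a random intercept for doctor;
*     patient-level weights only, no pweight() at the doctor level
mixed y severity [pweight = pw] || doctor_id:
```

With -mixed-, the results can depend on how the patient-level weights are scaled within doctor; the pwscale() option is the place to look if that is a concern.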

Stephen


On Fri, Feb 28, 2014 at 5:58 AM, Alfonso Sánchez-Peñalver
<[email protected]> wrote:
> Hi Stas,
>
> I've been giving this some further thought and I think you're right. My thought experiment has been the following. Suppose we model this with doctor fixed effects instead of random effects (when thinking about weights, it helps me to think this way, without adding randomness to the intercepts). When dummy (binary) variables for the doctors are included, the coefficients would be calculated correctly because the estimation uses the weights of all the observations, and thus we would be capturing the relative weight of each doctor's patients in the rest of the population. I interpret, then, that what you're saying is that you would use weights at the doctor level only when, for some reason, the sampling of the doctors introduces bias beyond that from the sampling of the patients themselves, in order to correct for that additional bias. My first thought was that, well, if the patients of a specific doctor are over-sampled, then that doctor is over-sampled. But it seems reasonable that this is already captured by the weights applied to the patients, so at the top level you would only apply any additional correction needed because of a selection process that favored some doctors over others, and since we don't know that this was the case, we leave it be. Is this correct?
>
> Thanks again,
>
> Alfonso.
>
> On Feb 27, 2014, at 10:25 PM, Stas Kolenikov <[email protected]> wrote:
>
>> Alfonso, I would say that this would lead to double counting of who
>> was oversampled. Doctors are domains, in terms of survey statistics;
>> and, as I said, I would not touch them since they were not sampled
>> directly.
>>
>> On Thu, Feb 27, 2014 at 10:09 PM, Alfonso Sánchez-Peñalver
>> <[email protected]> wrote:
>>> If doctors were not sampled, then they are basically a consequence of the sampling of the patients. Since we know that some patients have been over-sampled and others under-sampled, the real question is what proportion of each type of patient each doctor has, because a doctor would then be over-sampled or under-sampled depending on the over-sampling and under-sampling of that doctor's patients. Thus it may be appropriate to estimate a weighted average of the patients' weights for each doctor, where the weights for this weighted average could be the illness severity score, since it's the basis for the over-sampling. Don't you agree?
>>>
>>> Alfonso Sanchez-Penalver, PhD
