Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: using -mixed- with clustered data that includes probability weights

From	Alfonso Sánchez-Peñalver <[email protected]>
To	Stata List <[email protected]>
Subject	Re: st: using -mixed- with clustered data that includes probability weights
Date	Fri, 28 Feb 2014 08:58:53 -0500

Hi Stas,

I've been giving this some further thought and I think you're right. My thought experiment has been the following. Suppose we model this with doctor fixed effects instead of random effects (for thinking about weights, it helps me to do it this way, without adding randomness to the intercepts). When we include dummy (binary) variables for the doctors, the coefficients would be calculated correctly because the estimation uses the weights of all the observations, and thus we would be capturing the relative weight of each doctor's patients in the population. I interpret, then, that you would only use weights at the doctor level when, for whatever reason, the sampling of doctors introduces bias beyond that from the sampling of the patients themselves, in order to correct for that additional bias. My first thought was that if the patients of a specific doctor are over-sampled, then that doctor is over-sampled. But it seems reasonable that this is already captured by the weights applied to the patients, so at the top level you would only apply any additional correction needed because of a selection process that favored some doctors over others, and since we don't know that was the case, we leave it be. Is this correct?

Thanks again,

Alfonso.
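
For readers following the thread, the specifications under discussion can be sketched in Stata as below. This is only an illustration under stated assumptions: v1, v2, v3, weight, and doctor_id are the variable names from Stephen's original post, while doc_wt is a hypothetical variable introduced here to stand in for the doctor-level weight of 1 that Steve suggests. Whether the last two commands give identical results has not been checked here.

```stata
* Sketch only. v1 v2 v3 weight doctor_id come from the original post;
* doc_wt is a hypothetical name introduced for this illustration.

* Fixed-effects thought experiment: one dummy per doctor,
* patient-level sampling weights only
regress v1 v2 v3 i.doctor_id [pweight=weight]

* Random-intercept model with patient-level weights only
* (this is the command that triggers the first-level-only warning)
mixed v1 v2 v3 [pweight=weight] || doctor_id:

* Same model with an explicit doctor-level weight of 1,
* reflecting that doctors were not sampled directly
generate byte doc_wt = 1
mixed v1 v2 v3 [pweight=weight] || doctor_id:, pweight(doc_wt)
```

Setting the doctor-level weight to a constant 1 states the design assumption explicitly: all selection happened at the patient level, so no further correction is applied for doctors.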

On Feb 27, 2014, at 10:25 PM, Stas Kolenikov <[email protected]> wrote:

> Alfonso, I would say that this would lead to double counting of who
> was oversampled. Doctors are domains, in terms of survey statistics;
> and, as I said, I would not touch them since they were not sampled
> directly.
> 
> 
> -- Stas Kolenikov, PhD, PStat (ASA, SSC)
> -- Principal Survey Scientist, Abt SRBI
> -- Opinions stated in this email are mine only, and do not reflect the
> position of my employer
> -- http://stas.kolenikov.name
> 
> 
> 
> On Thu, Feb 27, 2014 at 10:09 PM, Alfonso Sánchez-Peñalver
> <[email protected]> wrote:
>> If doctors were not sampled, then they basically are a consequence of the sampling of the patients. Since we know that some patients have been over-sampled and others under-sampled, the question really is what proportion of each type of patients does each doctor have. Because a doctor would then be over-sampled or under-sampled depending on the over-sampling and under-sampling of the patients. Wouldn't they? Thus it may be appropriate to estimate a weighted average of the patients' weights for a doctor, where the weights for this weighted average could be the illness severity score, since it's the basis for the over-sampling. Don't you agree?
>> 
>> Alfonso Sanchez-Penalver, PhD
>> 
>> On Feb 27, 2014, at 9:48 PM, Steve Samuels <[email protected]> wrote:
>> 
>>> To elaborate on Stas's post: if doctors were not sampled, then you can
>>> define the doctor-level weight with pweight(1).
>>> 
>>> Steve
>>> [email protected]
>>> 
>>> On Feb 27, 2014, at 8:21 PM, Stas Kolenikov <[email protected]> wrote:
>>> 
>>> Whether your results are biased really depends on your study design.
>>> -mixed- knows nothing about your design (Stata 14 request: a command
>>> to read the pdf file and extract the design information from the
>>> narrative -- may I suggest -pdf2svyset- as a prospective name? I am
>>> sure a lot of researchers would find this handy), and just warns you
>>> in case you had sampling at several levels. If there were no sampling
>>> at the doctor's level, then weighting only at the patient level that
>>> you have is appropriate.
>>> 
>>> -- Stas Kolenikov, PhD, PStat (ASA, SSC)
>>> -- Principal Survey Scientist, Abt SRBI
>>> -- Opinions stated in this email are mine only, and do not reflect the
>>> position of my employer
>>> -- http://stas.kolenikov.name
>>> 
>>> 
>>> 
>>> On Thu, Feb 27, 2014 at 7:55 PM, Stephen Henry <[email protected]> wrote:
>>>> Hi,
>>>> 
>>>> I want to know whether I can use -mixed- in Stata 13.1 to analyze
>>>> clustered data that include probability weights.
>>>> 
>>>> My data were collected to study patients during clinic visits.
>>>> Each patient is unique, and patients are clustered within doctors.
>>>> In addition, patients were sampled based on an illness severity score.
>>>> Patients with more severe symptoms were over-sampled.
>>>> Patient sampling was done independent of which doctor was seeing the
>>>> patient.
>>>> 
>>>> I have been analyzing data using the -reg- command and cluster option as
>>>> follows:
>>>> 
>>>> reg v1 v2 v3 [pweight=weight], cluster(doctor_id)
>>>> 
>>>> However, I'd like to use -mixed- instead to take advantage of the
>>>> additional postestimation commands.
>>>> 
>>>> Stata will run the following command:
>>>> 
>>>> mixed v1 v2 v3 [pweight=weight] || doctor_id:
>>>> 
>>>> but warns me that "Sampling weights were specified only at the first level
>>>> in a multilevel model."
>>>> Are my results with the -mixed- command potentially biased?  If so, is
>>>> there an easy way to fix this?
>>>> 
>>>> Thanks in advance,
>>>> 
>>>> Stephen Henry
>>>> University of California Davis
>>>> Sacramento, California
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> *   http://www.ats.ucla.edu/stat/stata/
