
Re: st: Question about svyset command

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: st: Question about svyset command
Date	Thu, 19 Feb 2009 08:17:07 -0500

Thomas,

1. The finite population corrections should affect only standard errors and confidence intervals, not the point estimates of means or proportions.

2. fpc's should be employed only for descriptive analyses (proportions, means). These analyses describe the specific finite population that you sampled: tort, contract, and real property trials in the 75 counties.

If the purpose of your model is analytic (to develop predictions, estimate odds ratios, compare proportions, or otherwise test hypotheses), you should *omit* the finite population corrections. The reasoning is interesting (Cochran, 1977, p. 39): It is seldom of scientific interest to ask if a null hypothesis (e.g. that two proportions are equal) is exactly true in a finite population. Except by a very rare chance, a null hypothesis will never be exactly true there, and you would discover this by enumerating the entire population. This leads to the adoption of a "superpopulation" viewpoint, which is taken by almost all statisticians these days. See also Deming (1966), pp. 247-261, "Distinction between enumerative and analytic studies"; Korn and Graubard (1999), p. 227.

In other words, you should use one -svyset- for describing the target population and another for the logistic regression.
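
The two declarations might look like the following sketch. It reuses the variable names from the original post (sitecode, bwgt0, strata, fpc1). The outcome ptmotion and the predictor jurytrial are hypothetical names used only for illustration, and the single-stage form of the design is an assumption, not a statement of the correct CJSSC specification.

```stata
* Descriptive analysis: keep the first-stage finite population correction
svyset sitecode [pweight=bwgt0], strata(strata) fpc(fpc1)
* ptmotion is a hypothetical 0/1 indicator for a post trial motion
svy: proportion ptmotion

* Analytic model: same PSUs, weights, and strata, but no fpc
svyset sitecode [pweight=bwgt0], strata(strata)
* jurytrial is a hypothetical predictor
svy: logistic ptmotion jurytrial
```

The point estimates do not depend on the fpc; only the standard errors and confidence intervals should differ between the two declarations.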


Two questions came to mind:

1. If a trial had >1 plaintiff or >1 defendant, would that not increase the probability of a post trial motion? How are you going to account for that?

2. For descriptive analyses, counties selected with certainty need special treatment. Look up the "singleunit" option for -svyset-.
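
As a rough sketch of the second point, again using the variable names from the original post: whether singleunit(certainty) is the right choice depends on how the certainty counties are coded in the strata variable, which is not stated here, so treat this as a starting point to check against the design documentation.

```stata
* Declare the design, then list any strata that contain a single sampled PSU
svyset sitecode [pweight=bwgt0], strata(strata) fpc(fpc1)
svydescribe, single

* If those strata are certainty selections, treat them as certainty units,
* so that they contribute nothing to the variance
svyset sitecode [pweight=bwgt0], strata(strata) fpc(fpc1) singleunit(certainty)
```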


Good luck!

-Steve

References
Cochran, W. G. (1977). Sampling techniques (3rd ed.). New York: Wiley.

Deming, W. E. (1966). Some theory of sampling. New York: Dover Publications.

Korn, E. L., & Graubard, B. I. (1999). Analysis of health surveys (Wiley series in probability and statistics). New York: Wiley.




On Feb 19, 2009, at 12:04 AM, [email protected] wrote:

I'm a beginner Stata user and have a question about the svyset command in Stata that I hope someone can help me with.

For some background, I'm working on a logistic regression model that examines the likelihood of either a plaintiff or defendant filing a post trial motion. The database I'm working with is the Civil Justice Survey of State Courts (CJSSC). The CJSSC provides case level data for all trials concluded in a sample of 46 of the nation's 75 most populous counties in 2005. Data are collected on about 8,000 trials in these 46 counties, which are weighted to represent about 10,500 trials concluded in the nation's 75 most populous counties. I understand that one of the nice features of Stata is that it allows you to take into account the sampling structure of a dataset when doing logistic regression modeling. Here is the Stata code that I used to take into account the sampling structure of these civil trial data:
svyset sitecode [pweight=bwgt0], strata(strata) fpc(fpc1) || su2, fpc(fpc2)
Where
sitecode = County where the civil trial took place
bwgt0 = Weights to weight the data from 46 to the 75 most populous counties
strata = Strata where the counties are located. The dataset has 5 strata
fpc1 = The probability of a county appearing in the sample. For example, a county with a weight of 2 would have a 50% probability of appearing in the sample
su2 = Unique identifier that identifies the trials that occurred in each of the 46 counties
fpc2 = 1 for all 8,000 trials disposed in the 46 counties. I gave fpc2 a value of 1 because I wanted to tell Stata that the trials had a 100% probability of showing up in these 46 counties.

I think that I got the part of this programming that deals with the first level of the sample design correct. It's the second level that I'm having some problems with. At the second level of the sample design, I'm trying to correct for the fact that I have data for every civil trial concluded in the 46 counties. Basically, I want to tell Stata that part of this sample is actually a census of all trials concluded in the 46 counties in 2005. I understand Stata has a finite population correction command that takes into account the census-like format of these data. The logistic regression results were the same irrespective of whether I used the 1st or 2nd stages in the sample design. I think this is telling me that Stata is not correcting for the census-like aspect of this sample. Can anyone give me some guidance as to whether I'm correctly taking into account the sampling structure of these data? In particular, I would like to know whether I'm using the fpc2 factor correctly. Any assistance you could give on this matter would be very much appreciated.
Thanks
Thomas Cohen


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




