Re: st: RE: clustered standard errors
Steve Samuels <email@example.com>
Thu, 29 Apr 2010 14:07:43 -0400
Are any of the "local forces" or "national forces" identical for all
voters in a region? For those forces that do not vary within a
region, your sample size is effectively n = 17.
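
As a rough illustration of why the effective sample size collapses to 17, here is a minimal Python sketch using Kish's design-effect approximation, n_eff = n / (1 + (m - 1) * icc). The pooled sample size of 1,000 respondents per region is a hypothetical figure, not taken from the surveys, and the formula is only a heuristic: a force that is identical for every voter in a region has an intraclass correlation of 1.

```python
def effective_sample_size(n_respondents, n_clusters, icc):
    """Kish approximation: n_eff = n / (1 + (m - 1) * icc),
    where m is the average number of respondents per cluster."""
    avg_cluster_size = n_respondents / n_clusters
    design_effect = 1 + (avg_cluster_size - 1) * icc
    return n_respondents / design_effect


# Hypothetical pooled sample: 17 regions with 1,000 respondents each.
n = 17 * 1000

# A force that is identical for all voters in a region (icc = 1).
print(effective_sample_size(n, 17, 1.0))  # 17.0

# A variable with no within-region clustering at all (icc = 0).
print(effective_sample_size(n, 17, 0.0))  # 17000.0
```

However many voters are interviewed, a region-constant variable takes only 17 distinct values.
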
It appears that regions are not "clusters", but are strata, which by
definition are units that together constitute the entire population of
interest. One approach to this analysis, and the one I recommend in
the absence of other information, is to treat regional differences as
fixed effects, and to use Stata's -svyset- command to specify the
design. The strata would be the original strata in the 17 surveys,
suitably coded so that there are no duplicate numbers. In the
analysis, you would have a fixed indicator of region, but also
regional variables that might explain the regional differences.
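
To make "suitably coded so that there are no duplicate numbers" concrete, here is a small Python sketch of the recoding step. It assumes each record carries a region name and the stratum number used within that region's survey; the region names and stratum numbers are hypothetical example data. In Stata, -egen- with the group() function produces the same kind of pooled code.

```python
def unique_stratum_ids(records):
    """Map each (region, original_stratum) pair to one pooled stratum code,
    so stratum 1 in one region never collides with stratum 1 in another."""
    codes = {}
    ids = []
    for region, stratum in records:
        key = (region, stratum)
        if key not in codes:
            codes[key] = len(codes) + 1  # next unused code
        ids.append(codes[key])
    return ids


# Hypothetical records: (region, stratum number within that region's survey).
rows = [
    ("Andalucia", 1),
    ("Andalucia", 2),
    ("Catalunya", 1),
    ("Catalunya", 2),
    ("Andalucia", 1),
]
print(unique_stratum_ids(rows))  # [1, 2, 3, 4, 1]
```
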
Although the samples in each region can be considered random, they
are not "simple random samples". Pooling the samples without
adjustment for the original sample designs will give biased estimates
(if the analysis is not weighted) and improper standard errors.
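
The bias from ignoring the design can be seen in a toy Python example. All numbers are made up for illustration: two strata of very different population size are sampled with equal sample sizes, so respondents in the larger stratum must carry larger weights.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical design (not from the surveys discussed here):
#   stratum A: 9,000 voters, 100 sampled (weight 90), 60 report voting
#   stratum B: 1,000 voters, 100 sampled (weight 10), 80 report voting
voted = [1] * 60 + [0] * 40 + [1] * 80 + [0] * 20
weights = [90] * 100 + [10] * 100

unweighted = sum(voted) / len(voted)
weighted = weighted_mean(voted, weights)

print(unweighted)  # 0.7
print(weighted)    # 0.62
```

The unweighted figure overstates turnout because the small stratum is over-represented in the pooled sample. Weighting corrects the point estimate only; the standard errors still need the strata and cluster information.
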
On Thu, Apr 29, 2010 at 11:46 AM, Robert Lineira <firstname.lastname@example.org> wrote:
> The populations are the 17 Spanish regions and the samples are post-election
> surveys in each region. The purpose of the analysis is to look for variation
> in the strength of local and national forces on voting and turnout.
> Although the multi-stage sampling procedure uses some strata and clusters
> to select the individuals, the samples may be considered as
> random samples of voters in each region. The pooled sample is the
> aggregation of these random samples.
> I hope this helps in having a better idea of the research.
> Thanks in advance!
> On 29/04/2010 14:06, Steve Samuels wrote:
>> I wonder what the purpose of the analysis is, what the sampled
>> populations are, and what the sample designs are. Survey samples can
>> be complex creations with their own strata and clusters. Until Robert
>> provides more detail, I'm not sure that 1 sample = 1 cluster.
>> On Thu, Apr 29, 2010 at 6:03 AM, Schaffer, Mark E <M.E.Schaffer@hw.ac.uk> wrote:
>>>> -----Original Message-----
>>>> From: email@example.com
>>>> [mailto:firstname.lastname@example.org] On Behalf Of
>>>> Robert Lineira
>>>> Sent: 29 April 2010 10:08
>>>> To: email@example.com
>>>> Subject: st: clustered standard errors
>>>> Dear all,
>>>> I found on the net a presentation by Austin Nichols and Mark
>>>> Schaffer on clustered standard errors. After reading it, I have
>>>> some questions about how to use them.
>>>> I want to run an analysis using a pool of 17 survey samples.
>>>> Presumably, errors will be correlated within the clusters, but
>>>> the presentation warns that using clustered standard errors
>>>> might be a very bad solution. They suggest running a test
>>>> before using the corrected errors, with the 'cltest' and
>>>> 'xtcltest' Stata commands.
>>>> Unfortunately, I found only a 'cltest' command, and I am not
>>>> sure it is the same one they use, given that it predates the
>>>> Kézdi (2007) paper they cite.
>>> No, that's a different test. The test code Austin and I referred to in
>>> our presentation is still languishing in alpha testing.
>>> But I'm not sure it or other tests can help you.
>>> The problem is that this test, like White's general heteroskedasticity
>>> test and related tests, works via a vector-of-contrasts. The contrast is
>>> between the elements of the robust and non-robust VCVs.
>>> Under the null, the robust VCV is consistent. If the non-robust VCV is
>>> also consistent, its elements will be similar to those of the robust VCV,
>>> and the vector of contrasts will be small. If the non-robust VCV is
>>> inconsistent, the contrast will be large.
>>> You can see the problem now. To do this or a related test in your
>>> application, you need a robust VCV that is consistent. Your cluster-robust
>>> VCV is indeed consistent, but with only 17 clusters, you are not very far
>>> along the way to infinity, and it's likely to be a poor estimator of the
>>> VCV. Contrasting it with the non-robust VCV is not going to give you a
>>> reliable test - the contrast could be big because the cluster-robust VCV is
>>> poor, for example.
>>> Hope this helps.
>>>> My question is whether anyone knows of a test I could use
>>>> before applying clustered standard errors and, if not, which
>>>> solution you would recommend in a case such as this.
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/statalist/faq
>>>> * http://www.ats.ucla.edu/stat/stata/
> Robert Liñeira
> Dpt. Ciència Política - UAB
> 08193 Bellaterra - Barcelona
> Tlf: +34 93 581 46 33
> Despatx B1-185
18 Cantine's Island
Saugerties NY 12477