
From: Mabel Andalon <mabel.andalon@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: gllamm & stratified sampling design
Date: Tue, 22 Apr 2008 18:40:34 -0400

Many thanks to Steven for your helpful comments. I have one person per household, so I think I'm OK.

Stas, thanks to you as well. I agree with you that there is no strict nesting. In fact, that is why I was so confused about how to define the hierarchical nested clusters. FYI, Stata actually reported an error. When I sent the initial email the program had been running for some hours. I got an error message this morning...

I guess there is no clear way to proceed, but I will try to use a combination of fixed and random effects, as you suggested.

Cheers,

Mabel

Stas Kolenikov wrote:

On Mon, Apr 21, 2008 at 9:51 AM, Mabel Andalon <mabel.andalon@gmail.com> wrote:

> Dear All,
>
> I am estimating a model of community participation (1-0) using individual-level data. These data are of immigrants in the US and come from a stratified simple random sampling survey. The strata are US states (usstate). I've always used the svy option when analyzing these data, setting:
>
> svyset [pweight=wt_natio], strata(usstate)

-gllamm- is not a survey command (that can easily go with -svy- prefix), so there won't be much use for this statement.
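For contrast, here is a minimal sketch of where the -svyset- declaration does get used: a single-level logit fitted with the -svy- prefix. It reuses the variable names from the thread (participation, $xvars, wt_natio, usstate) and ignores the contextual grouping entirely, so it illustrates the prefix only and is not a substitute for the multilevel model.

```stata
* Sketch only: design-based, single-level logit.
* Declare the design once; -svy:- commands then pick up the weights and strata.
svyset [pweight=wt_natio], strata(usstate)

* No random effects here: state- and year-level context enters only
* through whatever covariates are listed in $xvars.
svy: logit participation $xvars
```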

> I just merged these data with contextual data from people's state of origin in a foreign country based on year of arrival to the US. And I also merged US state-level data based on current state of residence. That is, any two people who arrived in the same year from the same state and country and who live in the same US state were assigned the same state-level data.
>
> My questions are two:
>
> 1. Is this considered multilevel data?

Yes, but of the ugliest cross-classified kind. If an individual is level 1, what is your level 2? US state? The country they came from? There is no strict nesting, and instead there is a web of links: people from all different countries come to all different states at all different points in time. It is difficult to analyze data of this kind in any of the existing software packages, because the likelihood for this kind of data can only be obtained by integration over the whole data set at once, rather than by contiguous units within the same cluster. In your shoes, I would probably consider two of the three to be fixed effects, and model the third one as a random effect. For instance, treat the states and countries as fixed effects (if there are really big systematic differences you are expecting between states), and year as a random effect (provided you have at least a dozen different values there, and the decision when to move is more reasonably assumed random than the state they wanted to move to; I am thinking this is the case since different states might have quite different immigration conditions, such as how easy it is to get a driver's license or SSN).
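The suggestion above (states and countries as fixed effects, year of arrival as the single random effect) could be sketched roughly as follows. This is an illustration under assumptions, not a tested specification: it reuses the variable names from the thread, relies on the -xi:- prefix to build the indicator variables, and leaves the sampling weights out to keep the example short. With many foreign states, the number of indicators can become large.

```stata
* Sketch only: run from a do-file (the /// line continuation needs one).
* -xi:- expands i.usstate and i.fostate into indicator variables, so those
* two groupings enter as fixed effects.
* i(year) leaves year of arrival as the only random-intercept level.
xi: gllamm participation $xvars i.usstate i.fostate, ///
    i(year) family(binom) link(logit) adapt
```

With only one grouping variable in i(), the nesting problem goes away, since every individual belongs to exactly one arrival year.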

> 2. If so, how can I conduct a true multilevel analysis using gllamm and still include the features of sampling design (i.e. stratification)?
>
> So far, I have estimated:
>
> gllamm participation $xvars , i(individual fostate year usstate) pweight(wt) f(binom) l(logit) adapt
>
> individual = individuals/immigrants
> fostate = foreign state of residence
> year = year of arrival to the US
> usstate = current state of residence
>
> I'm not even sure that I have correctly defined the hierarchical, nested clusters in the i() option. The weights are the individuals' sampling weights.

As I said above, you don't really have individuals nested in fostate nested in year nested in usstate. True, individuals are nested in any of those conglomerates, but there is no nesting structure of the remaining identifiers. -gllamm- should've given you an error saying that your identifiers are not nested.
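One quick way to see the lack of nesting before launching a long -gllamm- run is to check whether each lower-level identifier falls inside a single higher-level unit. The sketch below assumes the identifier names from the thread; the two flag variables (fo_spans_years, yr_spans_states) are made up for illustration.

```stata
* Sketch only: flag identifiers that cross higher-level units.
* fostate is nested in year only if each fostate value occurs in one year.
bysort fostate (year): generate byte fo_spans_years = year[1] != year[_N]

* year is nested in usstate only if each year value occurs in one US state.
bysort year (usstate): generate byte yr_spans_states = usstate[1] != usstate[_N]

* Any 1s in this table mean the i() list is not a strict hierarchy.
tabulate fo_spans_years yr_spans_states
```

For the data described in this thread, both flags would be expected to equal 1 for most observations, which is the "web of links" situation rather than a hierarchy.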

You would need to specify weights for all levels; the level 1 weights, stored in the variable -wt1-, will be your sampling weights, and the higher level weights -wt2-, -wt3-, ... will probably be 1, since you did not have any sampling on those levels. You would have to bid farewell to your stratification information: there is no way to accommodate that.
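As a rough sketch of the weight setup described above: the name given in pweight() is treated as a stub, and -gllamm- looks for one variable per level with the level number appended. The example assumes a two-level model with year as the only cluster identifier and assumes wt_natio (from the -svyset- line) is the individual sampling weight.

```stata
* Sketch only: one weight variable per level, sharing the stub "wt".
generate double wt1 = wt_natio   // level 1: individual sampling weights
generate byte   wt2 = 1          // level 2: no sampling at this level

* pweight(wt) makes -gllamm- use wt1 and wt2; the strata are not used.
gllamm participation $xvars, i(year) pweight(wt) family(binom) link(logit) adapt
```

A model with more levels would need wt3, wt4, and so on, defined the same way.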

If -xtmelogit- allowed for weights, that would be a notably faster alternative to -gllamm-, but it does not appear to support them.

That's an impressive list to sample from, BTW. I did not know such lists of addresses existed, let alone spanning relatively elusive Hispanic households.

Looks like you have some 6000 observations at least. I wouldn't have high expectations with -gllamm- for models like that in terms of computational time unless you have Stata/MP8 on an appropriate computational cluster. And if you want to do four levels of random effects, you will probably need to prepare yourself for a few hours per likelihood evaluation, which likely means about a week per iteration.

