Re: st: xtmixed with nonrtolerance. What happens?

From	Joerg Luedicke <[email protected]>
To	[email protected]
Subject	Re: st: xtmixed with nonrtolerance. What happens?
Date	Thu, 23 Jun 2011 22:56:54 -0400
A couple of points:

1) You are certainly assuming a hierarchical structure: you have
observations at Level 1, cross-classified across two Level-2 factors
(country and genus).

2) It is still not clear what your unit of analysis is. What are those
6192 observations? They cannot be yearly observations within country,
as 40 countries over 16 years would only result in 640 observations.

3) If your dependent variable is supposed to represent the number of
exports (in a given year?), why does it contain decimals and not only
integers?

4) Have you ever spent some time looking at the distribution of your
dependent variable? When you standardize it, it ranges from .07
standard deviations below the mean to 34 standard deviations above the
mean!! My guess is that you are looking at some crazy distribution
like this:

gen x=rgamma(.001,100000)

with some very high values but with the majority of values being zeros
or close to zero. I suspect that there is either something wrong with
this variable or with your entire data set-up.

5) If it turns out that everything in your data is correct, then
trying to fit a linear model to these data is certainly the wrong
approach.
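
To make points 2) and 4) concrete, here is a minimal sketch of the checks I have in mind. It assumes the original data are in memory and that -country-, -genus-, -year- and -quantity- are the variable names, as in the first model quoted below. The seed and the sample size in the simulated part are arbitrary choices for illustration and have nothing to do with the real data.

```stata
* Part 1: checks on the actual data (variable names are assumed)
* Country-years only: 640
display 40*16
* Every possible country-genus-year combination: 136320
display 40*213*16
* Is one row one country-genus-year combination?
duplicates report country genus year
* Skewness, percentiles and extremes of the untransformed outcome
summarize quantity, detail
* Number of exact zeros
count if quantity == 0
* Number of non-integer values
count if quantity != floor(quantity)

* Part 2: the simulated distribution from point 4), purely illustrative
preserve
clear
set obs 6192
set seed 2011
generate x = rgamma(.001, 100000)
* z-score, analogous to the transformation applied to the real data
egen zx = std(x)
summarize x zx, detail
restore
```

If -duplicates report- shows no surplus copies, a row is presumably one country-genus-year combination, and 6192 would then be the number of combinations actually observed out of the 136,320 that are possible. Comparing the -summarize, detail- output for -quantity- with that for the simulated x should show whether the guess about the shape of the distribution is anywhere near the mark.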


Joerg


On Thu, Jun 23, 2011 at 4:40 PM, "Lukas Bösch" <[email protected]> wrote:
> Because I didn't transform the year and the export (named quantity) into z-scores, they kept their original names in the first models.
> I have now done the transformation and run the model again. It still doesn't converge, but it seems to work a little better.
>
> . xtmixed centquantity2 centyear2 centforestarea2 centgdp2 centlandarea2 centpopulation2|| _all: R.country || _all: R.genus
>
> Performing EM optimization:
> Performing gradient-based optimization:
>
> Iteration 0:   log restricted-likelihood = -4875.1075
> Iteration 1:   log restricted-likelihood = -4870.6476
> Iteration 2:   log restricted-likelihood = -4870.5095
> Iteration 3:   log restricted-likelihood = -4870.4438
> Iteration 4:   log restricted-likelihood = -4870.4118  (backed up)
> Iteration 5:   log restricted-likelihood = -4870.4039  (backed up)
> Iteration 6:   log restricted-likelihood = -4870.3999  (backed up)
> Iteration 7:   log restricted-likelihood = -4870.3979  (backed up)
> Iteration 8:   log restricted-likelihood = -4870.3969  (backed up)
> Iteration 9:   log restricted-likelihood = -4870.3967  (backed up)
> numerical derivatives are approximate
> nearby values are missing
> Iteration 10:  log restricted-likelihood = -4870.3966  (backed up)
> numerical derivatives are approximate
> nearby values are missing
> Iteration 11:  log restricted-likelihood = -4870.3966  (backed up)
> numerical derivatives are approximate
> nearby values are missing
> numerical derivatives are approximate
> nearby values are missing
> Hessian has become unstable or asymmetric
>
> Mixed-effects REML regression                   Number of obs      =      6192
> Group variable: _all                            Number of groups   =         1
>                                                Obs per group: min =      6192
>                                                               avg =    6192.0
>                                                               max =      6192
>                                                Wald chi2(5)       =      9.26
> Log restricted-likelihood = -4875.1075          Prob > chi2        =    0.0991
>
> centquanti~2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
>   centyear2 |  -.0169763    .008528    -1.99   0.047     -.033691   -.0002616
> centfores~a2 |  -.0846178   .0568262    -1.49   0.136    -.1959951    .0267595
>    centgdp2 |  -.0173484   .0354612    -0.49   0.625    -.0868509    .0521542
> centlandar~2 |  -.4531947   .5468347    -0.83   0.407    -1.524971    .6185816
> centpopul~n2 |   .1910553   .0876979     2.18   0.029     .0191707      .36294
>       _cons |   .2434439   .4596746     0.53   0.596    -.6575018     1.14439
>
>  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
> _all: Identity               |
>               sd(R.country) |   2.684813          .
> _all: Identity               |
>                 sd(R.genus) |   .0579011          .
>                sd(Residual) |   .5155702          .
> LR test vs. linear regression:       chi2(2) =  7810.42   Prob > chi2 = 0.0000
>
> Note: LR test is conservative and provided only for reference.
> Warning: convergence not achieved; estimates are based on iterated EM
>
>
>
> Here the summarize output of all the variables:
>
> . sum centquantity2
>
>    Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
> centquanti~2 |      6192    2.17e-09           1  -.0732263    34.3665
>
> . sum centyear2
>
>    Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>   centyear2 |      6192           0    1.000024  -1.626886   1.626886
>
> . sum centforestarea2
>
>    Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
> centfores~a2 |      6192   -.0043667     1.00682 -2.396995   2.746216
>
> . sum centgdp2
>
>    Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>    centgdp2 |      6192   -.0835699    .8318088  -.3333735   5.257175
>
> . sum centlandarea2
>
>    Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
> centlandar~2 |      6192   -.0336882    .9528875  -.6987395   2.490177
>
> . sum centpopulation2
>
>    Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
> centpopul~n2 |      6192   -.0018452    1.069818  -.6711841   8.741787
>
> Finally, a short extract of the dataset, with the original quantity, to show the structure:
>
> quantity country genus   centyear2   centagriculturalland2
> 0        USA    Tursiops 1.19305        .6379885
> 0        USA    Tursiops 1.409968       .6239355
> 6.08     USA    Tursiops 1.626886       .6126072
> 40.29    USA    Ursus   -1.626886       .7022238
> 65.375   USA    Ursus   -1.409968       .7022238
> 140.255  USA    Ursus   -1.19305        .6926397
>
> In total there are 40 countries, 213 genera and 21 independent variables over a period of 16 years, with 6192 observations. As there is no hierarchical structure, there are no different levels. There is one level for the quantity exported and two random effects at this level, the country and the genus.
> After having transformed the quantity to z-scores, would you still recommend dividing it by 100k?
>
> Thank you
>
> Lukas
>
>
> -------- Original Message --------
>> Date: Thu, 23 Jun 2011 14:41:44 -0400
>> From: Joerg Luedicke <[email protected]>
>> To: [email protected]
>> Subject: Re: st: xtmixed with nonrtolerance. What happens?
>
>> k stands for 1000 (as in kb=1000 bytes, for instance). What are your
>> Level 1 observations (i.e., the  6192)? If only 72 bears were exported
>> from the US in a given year then figures in the ballpark of hundreds
>> of thousands appear fairly high to me?
>>
>> J.
>>
>> On Thu, Jun 23, 2011 at 2:14 PM, "Lukas Bösch" <[email protected]> wrote:
>> > In my opinion the scales don't differ wildly.
>> > I am not a statistician though, so maybe you have a different opinion.
>> >
>> >
>> > . sum centgdp2
>> >
>> >    Variable |       Obs        Mean    Std. Dev.       Min        Max
>> > -------------+--------------------------------------------------------
>> >    centgdp2 |      6192   -.0835699    .8318088  -.3333735   5.257175
>> >
>> > . sum centlandarea2
>> >
>> >    Variable |       Obs        Mean    Std. Dev.       Min        Max
>> > -------------+--------------------------------------------------------
>> > centlandar~2 |      6192   -.0336882    .9528875  -.6987395   2.490177
>> >
>> > . sum centpopulation2
>> >
>> >    Variable |       Obs        Mean    Std. Dev.       Min        Max
>> > -------------+--------------------------------------------------------
>> > centpopul~n2 |      6192   -.0018452    1.069818  -.6711841   8.741787
>> >
>> > . sum centyear2
>> >
>> >    Variable |       Obs        Mean    Std. Dev.       Min        Max
>> > -------------+--------------------------------------------------------
>> >   centyear2 |      6192           0    1.000024  -1.626886   1.626886
>> >
>> > . sum centforestarea2
>> >
>> >    Variable |       Obs        Mean    Std. Dev.       Min        Max
>> > -------------+--------------------------------------------------------
>> > centfores~a2 |      6192   -.0043667     1.00682 -2.396995   2.746216
>> >
>> > The dependent variable is export: the export of wild animal and plant products from one country to the rest of the world. For example: US export of bears in 1992: 72.
>> > Because I cannot sum up the exports of different species into one export figure (obviously bears and pearls are not the same), I have to deal with these mixed models. Socioeconomic factors are set as fixed effects, and the genus and the country as the random effects.
>> > As one species can be exported by different countries, the data are not hierarchical, and country and genus are cross-classified. Or at least I think this is what it means: two random effects at the same level for all observations. Joerg, can you explain what you mean by dividing by 100k? What does the k stand for?
>> >
>> > Thank you
>> >
>> > Lukas
>> >
>> > mixed models
>> > -------- Original Message --------
>> >> Date: Thu, 23 Jun 2011 09:47:55 -0400
>> >> From: Joerg Luedicke <[email protected]>
>> >> To: [email protected]
>> >> Subject: Re: st: xtmixed with nonrtolerance. What happens?
>> >
>> >> Your model did not converge using the default convergence criteria, and
>> >> with -nonrtolerance- you just turned off that default criterion
>> >> (though I do not know which criterion is used instead). However, you
>> >> should be very cautious with regard to the results.
>> >>
>> >> What is your dependent variable? From your output I gather that its
>> >> predicted mean is roughly 900k at average values of your covariates.
>> >> Maybe you should transform your dependent variable and fit the model
>> >> again (e.g., dividing it by 100k).
>> >>
>> >> A question in regards to your random effects: are -country- and
>> >> -genus- cross-classified?
>> >>
>> >> J.
>> >>
>> >> On Thu, Jun 23, 2011 at 6:21 AM, "Lukas Bösch" <[email protected]> wrote:
>> >> > I transformed the data to z-scores ((score - mean) / standard deviation) before doing the regression.
>> >> > What do you mean by differing scales? I have either percentages, for example % forest area, or absolute figures, for example land area, in my dataset, but they are all transformed and should therefore be uniform.
>> >> > What about nonrtolerance?
>> >> >
>> >> > Thank you
>> >> >
>> >> > Lukas
>> >> >
>> >> > -------- Original Message --------
>> >> >> Date: Wed, 22 Jun 2011 18:48:22 -0400
>> >> >> From: Stas Kolenikov <[email protected]>
>> >> >> To: [email protected]
>> >> >> Subject: Re: st: xtmixed with nonrtolerance. What happens?
>> >> >
>> >> >> It looks like you have data with wildly differing scales. I understand
>> >> >> that you need to interpret the results in the original scales, but
>> >> >> maybe you could rescale your variables so that all of your
>> >> >> coefficients would be about 1. Whether that will help convergence is
>> >> >> anybody's guess, of course, but usually differences in the scales
>> >> >> (and hence coefficients) of the order of 1e3-1e4 are detrimental to
>> >> >> numeric convergence.
>> >> >>
>> >> >> On Wed, Jun 22, 2011 at 4:33 PM, "Lukas Bösch" <[email protected]> wrote:
>> >> >> > Dear Statalist community.
>> >> >> >
>> >> >> > I am using Stata 10.0 and doing a mixed model analysis of export data.
>> >> >> > After trying different options and always having trouble getting a proper output, I finally found a way to get my results. However, I could not find any information about why it works and whether it is all right. But let us first start with the problem:
>> >> >> >
>> >> >> > 1) This is the command I enter and the output Stata creates:
>> >> >> >
>> >> >> > xtmixed quantity year centforestarea2 centgdp2 centlandarea2 centpopulation2 || _all: R.country || _all: R.genus
>> >> >> >
>> >> >> > Performing EM optimization:
>> >> >> >
>> >> >> > Performing gradient-based optimization:
>> >> >> >
>> >> >> > Iteration 0:   log restricted-likelihood = -77051.164
>> >> >> > Iteration 1:   log restricted-likelihood = -77046.704
>> >> >> > Iteration 2:   log restricted-likelihood = -77046.565
>> >> >> > Iteration 3:   log restricted-likelihood =   -77046.5
>> >> >> > Iteration 4:   log restricted-likelihood = -77046.468  (backed up)
>> >> >> > Iteration 5:   log restricted-likelihood =  -77046.46  (backed up)
>> >> >> > Iteration 6:   log restricted-likelihood = -77046.456  (backed up)
>> >> >> > Iteration 7:   log restricted-likelihood = -77046.454  (backed up)
>> >> >> > numerical derivatives are approximate
>> >> >> > nearby values are missing
>> >> >> > Iteration 8:   log restricted-likelihood = -77046.453  (backed up)
>> >> >> > numerical derivatives are approximate
>> >> >> > nearby values are missing
>> >> >> > Hessian has become unstable or asymmetric
>> >> >> >
>> >> >> > Mixed-effects REML regression                   Number of obs      =      6192
>> >> >> > Group variable: _all                            Number of groups   =         1
>> >> >> >                                                 Obs per group: min =      6192
>> >> >> >                                                                avg =    6192.0
>> >> >> >                                                                max =      6192
>> >> >> >                                                 Wald chi2(5)       =      9.26
>> >> >> > Log restricted-likelihood = -77051.164          Prob > chi2        =    0.0991
>> >> >> >
>> >> >> >     quantity |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
>> >> >> >         year |  -429.7599   215.8898    -1.99   0.047    -852.8961   -6.623654
>> >> >> > centfores~a2 |  -9875.264   6631.861    -1.49   0.136    -22873.47    3122.945
>> >> >> >     centgdp2 |  -2024.629   4138.469    -0.49   0.625    -10135.88    6086.621
>> >> >> > centlandar~2 |  -52889.76   63817.96    -0.83   0.407    -177970.7    72191.13
>> >> >> > centpopul~n2 |   22296.98   10234.72     2.18   0.029     2237.304    42356.66
>> >> >> >        _cons |   895402.2   433369.4     2.07   0.039     46013.74     1744791
>> >> >> >
>> >> >> >   Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
>> >> >> > _all: Identity               |
>> >> >> >               sd(R.country) |   313329.2          .
>> >> >> > _all: Identity               |
>> >> >> >                 sd(R.genus) |   6757.304          .
>> >> >> >                sd(Residual) |   60169.26          .
>> >> >> > LR test vs. linear regression:       chi2(2) =  7810.42   Prob > chi2 = 0.0000
>> >> >> >
>> >> >> > Note: LR test is conservative and provided only for reference.
>> >> >> > Warning: convergence not achieved; estimates are based on iterated EM
>> >> >> >
>> >> >> > Obviously Stata has a problem and can't calculate the standard errors of the random factors.
>> >> >> >
>> >> >> > 2) With the option nonrtolerance it works, however:
>> >> >> >
>> >> >> > xtmixed quantity year centforestarea2 centgdp2 centlandarea2 centpopulation2 || _all: R.country || _all: R.genus, nonrtolerance
>> >> >> >
>> >> >> > Performing EM optimization:
>> >> >> >
>> >> >> > Performing gradient-based optimization:
>> >> >> >
>> >> >> > Iteration 0:   log restricted-likelihood = -77051.164
>> >> >> > Iteration 1:   log restricted-likelihood = -77046.704
>> >> >> > Iteration 2:   log restricted-likelihood = -77046.565
>> >> >> > Iteration 3:   log restricted-likelihood =   -77046.5
>> >> >> > Iteration 4:   log restricted-likelihood = -77046.468  (backed up)
>> >> >> > Iteration 5:   log restricted-likelihood =  -77046.46  (backed up)
>> >> >> > Iteration 6:   log restricted-likelihood = -77046.456  (backed up)
>> >> >> >
>> >> >> > Computing standard errors:
>> >> >> >
>> >> >> > Mixed-effects REML regression                   Number of obs      =      6192
>> >> >> > Group variable: _all                            Number of groups   =         1
>> >> >> >                                                 Obs per group: min =      6192
>> >> >> >                                                                avg =    6192.0
>> >> >> >                                                                max =      6192
>> >> >> >                                                 Wald chi2(5)       =      9.22
>> >> >> > Log restricted-likelihood = -77046.456          Prob > chi2        =    0.1008
>> >> >> >
>> >> >> >     quantity |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
>> >> >> >         year |  -429.7645   216.4073    -1.99   0.047     -853.915   -5.614053
>> >> >> > centfores~a2 |  -9885.307    6647.52    -1.49   0.137    -22914.21    3143.592
>> >> >> >     centgdp2 |  -2021.312   4148.464    -0.49   0.626    -10152.15    6109.527
>> >> >> > centlandar~2 |  -52859.75   63778.66    -0.83   0.407    -177863.6    72144.12
>> >> >> > centpopul~n2 |   22276.96   10257.46     2.17   0.030     2172.715     42381.2
>> >> >> >        _cons |   895338.1   434389.3     2.06   0.039     43950.68     1746726
>> >> >> >
>> >> >> >   Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
>> >> >> > _all: Identity               |
>> >> >> >               sd(R.country) |   313133.2    36075.6      249840.9    392459.4
>> >> >> > _all: Identity               |
>> >> >> >                 sd(R.genus) |   3440.288   1355.694      1589.157    7447.712
>> >> >> >                sd(Residual) |   60315.87   545.9681      59255.23     61395.5
>> >> >> > LR test vs. linear regression:       chi2(2) =  7819.83   Prob > chi2 = 0.0000
>> >> >> > Note: LR test is conservative and provided only for reference.
>> >> >> >
>> >> >> > Can someone explain to me why it works with nonrtolerance, and tell me whether these outputs are as reliable as if they had been created without nonrtolerance? I searched the Stata help and stata.com but could not find more information about this.
>> >> >> >
>> >> >> > Kind regards
>> >> >> >
>> >> >> > Lukas
>> >> >> >
>> >> >> > *
>> >> >> > *   For searches and help try:
>> >> >> > *   http://www.stata.com/help.cgi?search
>> >> >> > *   http://www.stata.com/support/statalist/faq
>> >> >> > *   http://www.ats.ucla.edu/stat/stata/
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Stas Kolenikov, also found at http://stas.kolenikov.name
>> >> >> Small print: I use this email account for mailing lists only.
>> >> >>
>> >> >
>> >> >
>> >>
>> >
>> >
>>
>
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/