



Re: st: Convergence never achieved with MI impute chained


From   Lena Lindbjerg Sperling <lenalindbjergsperling@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Convergence never achieved with MI impute chained
Date   Fri, 22 Jun 2012 13:07:03 +0200

Thank you, Maarten.
I am working on developing countries :-) What we are trying to explore is the development of wages and the movement of workers across sectors, so I can't really collapse the industry categories, as these are our main interest. Can't the imputation model work out that if there are no self-employed in the mining sector, it should simply not assign any of them to that sector? I mean, how empty is too empty?

Hmm I have to think of other possibilities...

Best,
Lena


On Jun 21, 2012, at 2:22 PM, Maarten Buis wrote:

> I have just seen that you used only the first digit of the ISIC
> classification. However, it contains lots of sparse categories(*).
> These will cause problems. Also inspect education for sparse
> categories. You'll need to combine those sparse categories with
> "adjacent" categories in order to get sufficiently filled cells. Also
> look at a cross tabulation of industry and education, and see if there
> aren't any cells that are too empty. That will probably mean a second
> round of merging categories.
> 
> I would not use ordered models or mvn for imputing industry, that just
> does not make sense.
> 
> Hope this helps,
> Maarten
> 
> (*) If this is recent data from a western country then you have made a
> coding error. In that case there are way way way too many farmers.
> 
> On Thu, Jun 21, 2012 at 1:46 PM, Lena Lindbjerg Sperling
> <lenalindbjergsperling@gmail.com> wrote:
>>> 
>>> Thank you for your answer!
>>> 
>>> It does seem though that all occupations are represented in both private and public sectors.
>>> And I also have another data set where I only impute educational level, industry (ISIC 3 category) and wage, and I still do not get convergence, even though that's just one mlogit, one ologit and one pmm... so that doesn't seem to be the problem.
>>> 
>>> I did get a result from -mi xeq 0: mlogit- for industry, however, and it looks like this:
>>> -> mlogit industry
>>> 
>>> Iteration 0:   log likelihood = -4875.9554
>>> Iteration 1:   log likelihood = -4875.9554
>>> 
>>> Multinomial logistic regression                 Number of obs =
>>>                                                 LR chi2(0)    = 0
>>>                                                 Prob > chi2   = .
>>> Log likelihood = -4875.9554                     Pseudo R2     =
>>> 
>>> industry                            Coef.       Std. Err.   z        P>z   [95% Conf. Interval]
>>> Agriculture__Hunting__etc_          (base outcome)
>>> Mining
>>>   _cons                             -4.982464   0.2896632   -17.2    0     -5.550194    -4.414735
>>> Manufacturing
>>>   _cons                             -2.671581   0.0939994   -28.42   0     -2.855816    -2.487345
>>> Public_services
>>>   _cons                             -3.42432    0.134593    -25.44   0     -3.688117    -3.160522
>>> Construction
>>>   _cons                             -3.204691   0.1210617   -26.47   0     -3.441968    -2.967415
>>> Retail__Hotels
>>>   _cons                             -1.714798   0.0612048   -28.02   0     -1.834758    -1.594839
>>> Transport_and_telecomnunications
>>>   _cons                             -4.759321   0.2593031   -18.35   0     -5.267546    -4.251096
>>> Finance_and_business_serv_
>>>   _cons                             -6.368759   0.5778449   -11.02   0     -7.501314    -5.236204
>>> Communal_services
>>>   _cons                             -0.830113   0.0433825   -19.13   0     -0.9151412   -0.7450848
>>> Others_not_well_specified
>>>   _cons                             -1.753638   0.0622235   -28.18   0     -1.875594    -1.631683
>>> 
>>> Should I use something else to impute this? It runs from 1 to 10 so maybe ordered is better? I get convergence if I use ordered logit for industry and occupation. They really shouldn't be ordered, but how important is that choice?
>>> 
>>> 
>>> I can get results out if I use mvn, but is that a very bad idea? Seems like the literature disagrees quite a bit on how severe it is to assume normality?
>>> 
>>> Best,
>>> Lena
>>> 
>>> On Jun 21, 2012, at 10:48 AM, Maarten Buis wrote:
>>> 
>>>> On Thu, Jun 21, 2012 at 10:15 AM, Lena Lindbjerg Sperling wrote:
>>>>> I just looked at the mail again, and the data is not as bad as it looks, as I'm only imputing on the employed population (lstatus==1) and when we only look at them mi describe shows:
>>>>> mi describe
>>>>> 
>>>>> Style:  wide
>>>>>         last mi update 21jun2012 10:03:51, 18 seconds ago
>>>>> 
>>>>> Obs.:   complete        2,702
>>>>>         incomplete        912  (M = 0 imputations)
>>>>>         ---------------------
>>>>>         total           3,614
>>>>> 
>>>>> Vars.:  imputed:  7; occup(126) ocusec(144) whours(167) edulevel(171) ocu(228) industry(204) mwage(598)
>>>> 
>>>> Just looking at the variable names I suspect that this is an extremely
>>>> hard model to estimate. How many categories do the variables occup,
>>>> ocusec, ocu, and industry have? Are there combinations of three or
>>>> fewer of these that for some observations perfectly predict one or more
>>>> remaining variables? For example, if we know that someone is a mayor
>>>> then we also know that (s)he is working in the public sector.
>>>> 
>>>> <snip>
>>>>> Iteration 14:  log pseudolikelihood = -2454486.7  (not concave)
>>>>> Not completely sure what this means. Can you see where things are wrong from this?
>>>> 
>>>> It means that this sub-model did not converge, probably because of the
>>>> problems indicated above.
>>>> 
>>>>> When I use -mi xeq 0: mlogit - the result is:
>>>>> m=0 data:
>>>>> -> mlogit
>>>>> last estimates not found
>>>>> r(301);
>>>>> 
>>>>> But I thought it was the observed data...which should be there?
>>>> 
>>>> What you asked for was for Stata to replay the last -mlogit- command,
>>>> and it replied that the last command wasn't -mlogit-. You probably
>>>> pressed break before the model finished estimating, which makes sense
>>>> if it did not converge.
>>>> 
>>>> Hope this helps,
>>>> Maarten
>>>> 
>>>> --------------------------
>>>> Maarten L. Buis
>>>> Institut fuer Soziologie
>>>> Universitaet Tuebingen
>>>> Wilhelmstrasse 36
>>>> 72074 Tuebingen
>>>> Germany
>>>> 
>>>> 
>>>> http://www.maartenbuis.nl
>>>> --------------------------
>>>> 
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/statalist/faq
>>>> *   http://www.ats.ucla.edu/stat/stata/
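
One rough way to see how sparse the industry categories are is to turn the constants from the constant-only -mlogit- output quoted in this thread into implied category shares. In a constants-only multinomial logit the share of a category is exp(_cons) divided by the sum of exp(_cons) over all categories, with the base outcome's constant set to 0. The sketch below is only an illustration in Python, not something posted in the thread; the dictionary and function names are made up for the example, and the numbers are copied from the output as posted.

```python
import math

# Constants copied from the constant-only mlogit output quoted in this thread.
# The base outcome (Agriculture__Hunting__etc_) has an implicit constant of 0.
intercepts = {
    "Agriculture__Hunting__etc_": 0.0,
    "Mining": -4.982464,
    "Manufacturing": -2.671581,
    "Public_services": -3.42432,
    "Construction": -3.204691,
    "Retail__Hotels": -1.714798,
    "Transport_and_telecomnunications": -4.759321,
    "Finance_and_business_serv_": -6.368759,
    "Communal_services": -0.830113,
    "Others_not_well_specified": -1.753638,
}


def implied_shares(consts):
    """Convert multinomial-logit constants into outcome probabilities."""
    denom = sum(math.exp(b) for b in consts.values())
    return {name: math.exp(b) / denom for name, b in consts.items()}


shares = implied_shares(intercepts)

# List the categories from sparsest to most common.
for name, share in sorted(shares.items(), key=lambda kv: kv[1]):
    print(f"{name:<34s}{share:8.4%}")
```

On these constants, agriculture comes out at a little over half of the observed cases, mining at roughly 0.35 percent, and finance and business services at under 0.1 percent. Categories that thin leave very few observations per cell once they are crossed with education or sector, which is the kind of sparseness the advice about merging categories refers to.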




© Copyright 1996–2014 StataCorp LP