# RE: st: programming non-linear squares in Stata

From: "Maarten Buis"
Subject: RE: st: programming non-linear squares in Stata
Date: Fri, 16 Mar 2007 12:19:12 +0100

```
--- Maritza <maritzasotomayor@yahoo.com> wrote:

> Thank you, I'll try glm, in fact my dependent variable is a
> proportion, I need to keep the zeros. I read in FAQ that is better to
> replace zeros for 0.0001 , but that option doesn't work for me.

--- Maarten Buis replied:
> With -glm- you can keep the zeros. With -betafit- you can't.

This point is discussed in Papke and Wooldridge (1996). However, if you
believe that replacing zeros with a small positive proportion doesn't
work for you, then that suggests to me that choosing zero is something
qualitatively different for a respondent than choosing some small
positive proportion. As a result you should model two processes: 1) whether
or not to choose zero, and 2) if you don't choose zero, what proportion
to choose. As far as I know there is no ready-made Stata program out
there that does that.

Such a model could be a ``Zero Inflated Logit'', by analogy with the
``Zero Inflated Poisson''. Below you will find an example of a zero inflated
logit program. This is just something I quickly wrote, so no guarantees.
See -help zip-, the entry for zip in the manual, and Long and Freese (2006)
for more on the zip. You should be able to work out the analogy and the
interpretation of this -zilogit- from that.

Hope this helps,
Maarten

*----------- begin example ---------------
set more off
input      prop str1 site variety
0.0005    A       1
0.0000    A       2
0.0000    A       3
0.0010    A       4
0.0025    A       5
0.0005    A       6
0.0050    A       7
0.0130    A       8
0.0150    A       9
0.0150    A       10
0.0000    B       1
0.0005    B       2
0.0005    B       3
0.0030    B       4
0.0075    B       5
0.0030    B       6
0.0300    B       7
0.0750    B       8
0.0100    B       9
0.1270    B       10
0.0125    C       1
0.0125    C       2
0.0250    C       3
0.1660    C       4
0.0250    C       5
0.0250    C       6
0.0000    C       7
0.2000    C       8
0.3750    C       9
0.2625    C       10
0.0250    D       1
0.0050    D       2
0.0001    D       3
0.0300    D       4
0.0250    D       5
0.0001    D       6
0.2500    D       7
0.5500    D       8
0.0500    D       9
0.4000    D       10
0.0550    E       1
0.0100    E       2
0.0600    E       3
0.0110    E       4
0.0250    E       5
0.0800    E       6
0.1650    E       7
0.2950    E       8
0.2000    E       9
0.4350    E       10
0.0100    F       1
0.0500    F       2
0.0500    F       3
0.0500    F       4
0.0500    F       5
0.0500    F       6
0.1000    F       7
0.0500    F       8
0.5000    F       9
0.7500    F       10
0.0500    G       1
0.0010    G       2
0.0500    G       3
0.0500    G       4
0.5000    G       5
0.1000    G       6
0.5000    G       7
0.2500    G       8
0.5000    G       9
0.7500    G       10
0.0500    H       1
0.1000    H       2
0.0500    H       3
0.0500    H       4
0.2500    H       5
0.7500    H       6
0.5000    H       7
0.7500    H       8
0.7500    H       9
0.7500    H       10
0.1750    I       1
0.2500    I       2
0.4250    I       3
0.5000    I       4
0.3750    I       5
0.9500    I       6
0.6250    I       7
0.9500    I       8
0.9500    I       9
0.9500    I       10
end

encode site, gen(sitenum)

program define zilogit_lf
*! MLB 0.0.1 16 Mar 2006
version 8.2
args lnf xb zg
tempvar mu muprime

quietly gen double `mu' = invlogit(`xb')
quietly gen double `muprime' = invlogit(-`xb')

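* Positive proportions: log probability of not being a structural zero
* plus the Bernoulli quasi-log-likelihood of the logit part.
quietly replace `lnf' = ln(invlogit(-`zg')) +      ///
                        $ML_y1*ln(`mu') +          ///
                        (1-$ML_y1)*ln(`muprime')   ///
                        if ($ML_y1 > 0)
* Zeros: a mixture of a structural zero and a zero from the logit part,
* whose probability is 1 - mu (by analogy with exp(-lambda) in -zip-).
quietly replace `lnf' = ln(invlogit(`zg') +        ///
                        invlogit(-`zg') *          ///
                        `muprime')                 ///
                        if ($ML_y1 == 0)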
end
xi i.site i.variety
ml model lf zilogit_lf (xb:prop = _I*) (zg:_Iv*), robust
ml check
ml search
ml maximize

exit
*--------------- end example ----------------------
(For more on how to use examples I sent to the Statalist, see
http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )

Papke, Leslie E. and Jeffrey M. Wooldridge. 1996. ``Econometric Methods
for Fractional Response Variables with an Application to 401(k) Plan
Participation Rates.'' Journal of Applied Econometrics 11(6):619-632.

Long, J. Scott and Jeremy Freese. 2006. Regression Models for
Categorical Dependent Variables Using Stata. 2nd Edition. Stata Press.

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

```
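
The two-process model described in the post (first, whether the proportion is zero; second, what proportion is chosen if it is not) can be written out as a per-observation log likelihood. What follows is a sketch by analogy with the zero-inflated Poisson, not a formula taken from the post. The symbols $\pi_i$, $\mu_i$, $\gamma$ and $\beta$ are notation introduced here for illustration, and treating $1-\mu_i$ as the logit-part counterpart of the Poisson probability of a zero is an assumption of this sketch. The `cases` environment assumes `amsmath` is loaded.

```latex
% Notation (introduced for this sketch, not taken from the post):
%   y_i    observed proportion for observation i
%   \pi_i  probability of a structural zero   (equation zg in the example)
%   \mu_i  expected proportion otherwise      (equation xb in the example)
\[
\pi_i = \frac{1}{1 + e^{-z_i\gamma}}, \qquad
\mu_i = \frac{1}{1 + e^{-x_i\beta}}
\]
\[
\ln L_i =
\begin{cases}
\ln\bigl[\pi_i + (1-\pi_i)(1-\mu_i)\bigr], & y_i = 0,\\[4pt]
\ln(1-\pi_i) + y_i \ln\mu_i + (1-y_i)\ln(1-\mu_i), & y_i > 0.
\end{cases}
\]
```

In Stata terms, $1-\pi_i$ corresponds to `invlogit(-zg)` and $1-\mu_i$ to `invlogit(-xb)`. The $y_i > 0$ branch is the Bernoulli quasi-log-likelihood used for fractional responses in Papke and Wooldridge (1996), plus the log probability of not being a structural zero.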