Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: AW: AW: GLM family and link (default)

 From "Martin Weiss" To Subject st: AW: AW: GLM family and link (default) Date Mon, 14 Jun 2010 13:33:59 +0200

```<>

You are trying to estimate more than 10 parameters from 33 observations. That
is a problem no amount of wizardry in terms of different commands will be able
to overcome...

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of
mmolina@uniroma3.it
Sent: Monday, 14 June 2010 13:16
To: statalist@hsphsun2.harvard.edu
Subject: st: AW: GLM family and link (default)

Thanks Martin.
I obtain these results with glm and probit commands:

glm newproc edu train skilled quality RD n5a rectech d12b obst_admin tax j30f environ if a4b==1

Iteration 0:   log likelihood = -2.7185812

Generalized linear models                 No. of obs      =         33
Optimization     : ML                     Residual df     =         20
                                          Scale parameter =  0.1139108
Deviance         =  2.278216706           (1/df) Deviance =  0.1139108
Pearson          =  2.278216706           (1/df) Pearson  =  0.1139108

Variance function: V(u) = 1               [Gaussian]
Link function    : g(u) = u               [Identity]

                                          AIC             =  0.9526413
Log likelihood   = -2.718581171           BIC             =  -67.65193

                                OIM
newproc           Coef.   Std. Err.       z   P>|z|   [95% Conf.    Interval]

edu           -1.770208    1.005382   -1.76   0.078    -3.740721    0.2003048
train         0.4414615   0.1819173    2.43   0.015      0.08491    0.7980129
skilled      -0.0943376   0.2310215   -0.41   0.683   -0.5471315    0.3584562
quality       0.1508587   0.1473651    1.02   0.306   -0.1379717    0.4396891
RD            -8.58E-11    3.80E-10   -0.23   0.821    -8.31E-10     6.59E-10
n5a            8.95E-13    8.61E-13    1.04   0.299    -7.92E-13     2.58E-12
rectech       0.1898761   0.2048377    0.93   0.354   -0.2115985    0.5913507
d12b         -0.0024934   0.0028224   -0.88   0.377   -0.0080251    0.0030384
obst_admin     0.001645   0.0019013    0.87   0.387   -0.0020814    0.0053714
tax           0.0093363   0.0395494    0.24   0.813    -0.068179    0.0868516
j30f         -0.0119285   0.0366119   -0.33   0.745   -0.0836865    0.0598294
environ      -0.1940408   0.0896125   -2.17    0.03    -0.369678   -0.0184036
_cons          1.822709   0.8467213    2.15   0.031     0.163166     3.482252

probit newproc edu train skilled quality RD n5a rectech d12b obst_admin tax j30f environ if a4b==1

note: outcome = edu < 0.75 predicts data perfectly except for
      edu == 0.75 subsample:
      edu dropped and 4 obs not used

note: rectech != 0 predicts success perfectly
      rectech dropped and 7 obs not used

Iteration 0:   log likelihood = -11.791118
Iteration 1:   log likelihood = -3.7934494
Iteration 2:   log likelihood = -2.2015708
Iteration 3:   log likelihood = -1.2485201
Iteration 4:   log likelihood = -0.40922628
Iteration 5:   log likelihood = -0.17252348
Iteration 6:   log likelihood = -0.0510138
Iteration 7:   log likelihood = -0.01540391
Iteration 8:   log likelihood = -0.00477621
Iteration 9:   log likelihood = -0.00153132
Iteration 10:  log likelihood = -0.00050221
Iteration 11:  log likelihood = -0.00016739
Iteration 12:  log likelihood = -0.00005647
Iteration 13:  log likelihood = -0.00001923
Iteration 14:  log likelihood = -6.60E-06
Iteration 15:  log likelihood = -2.28E-06
Iteration 16:  log likelihood = -7.90E-07
Iteration 17:  log likelihood = -2.76E-07
Iteration 18:  log likelihood = -2.19E-07
Iteration 19:  log likelihood = -7.50E-08
Iteration 20:  log likelihood = -7.43E-08
Iteration 21:  log likelihood = -2.40E-08
Iteration 22:  log likelihood = -2.33E-08
Iteration 23:  log likelihood = -2.33E-08  (backed up)
Iteration 24:  log likelihood = -2.42E-08  (backed up)

Probit regression                         Number of obs   =         22
                                          LR chi2(10)     =      23.58
                                          Prob > chi2     =     0.0088
Log likelihood = -2.42E-08                Pseudo R2       =          1

newproc           Coef.   Std. Err.       z   P>|z|   [95% Conf.    Interval]

train          35.10353           .       .       .            .            .
skilled       -31.46968           .       .       .            .            .
quality       -5.014139           .       .       .            .            .
RD            -2.77E-08   0.0001747       0       1   -0.0003424    0.0003424
n5a           -1.37E-08    8.75E-06       0   0.999   -0.0000172    0.0000171
d12b         -0.3628383    106.9626       0   0.997    -210.0057       209.28
obst_admin     1.958071    1313.844       0   0.999    -2573.129     2577.045
tax            2.494215           .       .       .            .            .
j30f           11.64055    2769.635       0   0.997    -5416.744     5440.026
environ       -10.03463    12953.29       0   0.999    -25398.02     25377.95
_cons           2.51716    7373.722       0       1    -14449.71     14454.75

Note: 3 failures and 12 successes completely determined.

From   "Martin Weiss" <martin.weiss1@gmx.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: AW: GLM family and link (default)
Date   Mon, 14 Jun 2010 13:01:47 +0200

--------------------------------------------------------------------------------

<>

"Actually this seems to work better than the probit command."

"Work better" is not an expression that conveys much to me. In which respect
did it work better?

Note you can replicate the linear probability model, -probit- and -logit-
via -glm-:

*************
sysuse auto, clear
reg foreign length weight
glm foreign length weight, family(gaussian) link(identity)  nolog

probit foreign length weight, nolog
glm foreign length weight, family(binomial 1) link(probit) nolog

logit foreign length weight, nolog
glm foreign length weight, family(binomial 1) link(logit) nolog
*************

--------------------------- Original message ----------------------------
Subject: GLM family and link (default)
From:    mmolina@uniroma3.it
Date:    Mon, 14 June 2010 12:36 pm
To:      statalist@hsphsun2.harvard.edu
--------------------------------------------------------------------------

Dear Statalist,

Looking at the glm help I found that the distribution of the dependent
variable -by default- is family(gaussian).

I am working with the glm command without specifying a family or link
function, and I have a binary dependent variable. Actually
this seems to work better than the probit command.

Since my responses are binary rather than continuous Gaussian, which
distribution family and link function should I specify for this
command?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
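
The two points made in the thread (what -glm- does by default, and how many parameters are being asked of the data) can be checked with a short sketch. It reuses the auto dataset from Martin's example rather than the poster's data, which is an assumption made only so the commands run anywhere; the final -display- line is an illustrative addition, not part of either post.

```stata
sysuse auto, clear

* With no options, -glm- uses family(gaussian) link(identity), so on a 0/1
* outcome it fits a linear probability model; coefficients match -regress-.
glm foreign length weight, nolog
regress foreign length weight

* For a binary outcome, request the binomial family explicitly; with
* link(probit) this reproduces -probit-.
glm foreign length weight, family(binomial) link(probit) nolog

* Compare the number of estimated coefficients with the estimation sample.
display "coefficients: " colsof(e(b)) "   observations: " e(N)
```

Applied to the output posted above, the same comparison gives 13 coefficients on 33 observations for the default -glm- fit (residual df = 20) and 11 coefficients on 22 observations for -probit-, after two regressors and 11 observations were dropped for perfect prediction.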