# st: RE: why is F statistic still missing even though there is no singleton dummy problem?

From: "Schaffer, Mark E"
Subject: st: RE: why is F statistic still missing even though there is no singleton dummy problem?
Date: Fri, 18 Aug 2006 11:00:15 +0100

```
Jian Zhang,

In fact, it is a singleton dummy problem.  The key variables are var5
and var6.  You can drop var3 and

reg var1 var5 var6, nocons robust

gives you the same problem.

The way this arises is as follows.  var5 and var6 are collinear, with
the exception of observation 5:

     +-------------+
     | var5   var6 |
     |-------------|
  1. |    0      0 |
  2. |    0      0 |
  3. |   -4      4 |
  4. |    0      0 |
  5. |   14     -7 |
     |-------------|
  6. |    0      0 |
  7. |    0      0 |
  8. |    0      0 |
  9. | -123    123 |
 10. |    0      0 |
     |-------------|
 11. |    0      0 |
 12. |    0      0 |
 13. |    0      0 |
     +-------------+

After your regression (or the regression dropping var3), the residual
for observation 5 is essentially zero:

. reg var1 var5 var6, nocons robust

<snip>

. predict double e, resid

. list e in 5

     +-----------+
     |         e |
     |-----------|
  5. | 1.927e-13 |
     +-----------+

Recall that the robust var-cov matrix is a sandwich whose middle term is
X'diag(e^2)X, where e is the residual and X is the matrix of regressors.
Thus each of the observations of var5 and var6 is getting weighted by the
residual for that observation.  But after weighting by e, var5 and var6
are collinear, because the residual for observation 5 is zero and
observation 5 was the only thing that stopped them from being collinear.
The result is a var-cov matrix that is not full rank.

To see that this is the same thing as the singleton dummy problem,
create a new variable var567 which is a linear transformation of var5
and var6:

. gen var567=(var5+var6)/7

. list var5 var6 var567

     +----------------------+
     | var5   var6   var567 |
     |----------------------|
  1. |    0      0        0 |
  2. |    0      0        0 |
  3. |   -4      4        0 |
  4. |    0      0        0 |
  5. |   14     -7        1 |
     |----------------------|
  6. |    0      0        0 |
  7. |    0      0        0 |
  8. |    0      0        0 |
  9. | -123    123        0 |
 10. |    0      0        0 |
     |----------------------|
 11. |    0      0        0 |
 12. |    0      0        0 |
 13. |    0      0        0 |
     +----------------------+

This new variable is a singleton dummy.  But since it's a linear
transformation of var5 and var6, you can replace either var5 or var6
with it and you get the same regression, e.g.,

. qui regress var1 var5 var6, nocons robust

. di _b[var6]
.84687073

. di e(mss)
91.557676

. qui regress var1 var5 var56, nocons robust

. di _b[var56]
.84687073

. di e(mss)
91.557676

--Mark

Prof. Mark E. Schaffer
Director
Centre for Economic Reform and Transformation
Department of Economics
School of Management & Languages
Heriot-Watt University
Edinburgh EH14 4AS  UK
44-131-451-3494 direct
44-131-451-3296 fax
http://www.sml.hw.ac.uk/cert

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Jian Zhang
> Sent: Friday, August 18, 2006 12:51 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: why is F statistic still missing even though
> there is no singleton dummy problem?
>
> Dear Statalisters,
>
> I am running an OLS regression on a small data set.  The
> reported F statistic is still missing although the data set
> doesn't have so-called singleton dummies.  Does anyone
> know what is going on?  Here are the data set and the
> regression results.  Many thanks!
>
>
> . list var1 var3 var5 var6
>
> +---------------------------+
>       var1   var3   var5   var6
> ---------------------------
> 1.     1    789      0      0
> 2.     3     45      0      0
> 3.     5   2358     -4      4
> 4.     4     65      0      0
> 5.     5     12     14     -7
> ---------------------------
> 6.   453     12      0      0
> 7.     6      4      0      0
> 8.    45      2      0      0
> 9.     8      3   -123    123
> 10.   897      5      0      0
> ---------------------------
> 11.    43     87      0      0
> 12.    43     56      0      0
> 13.     4     25      0      0
> +---------------------------+
>
>
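> . reg var1 var3 var5 var6, robust nocons
>
> Linear regression                               Number of obs =      13
>                                                 F(  2,    10) =       .
>                                                 Prob > F      =       .
>                                                 R-squared     =  0.0002
>                                                 Root MSE      =  318.67
>
> ------------------------------------------------------------------------------
>              |               Robust
>         var1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         var3 |   .0046225      .0031     1.49   0.167    -.0022847    .0115297
>         var5 |   .7696625   .0060938   126.30   0.000     .7560848    .7832403
>         var6 |   .8329636    .007418   112.29   0.000     .8164353    .8494919
> ------------------------------------------------------------------------------
>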
> Best regards,
> Jian Zhang
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
```
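
The rank argument in the reply can be checked directly. The following is a minimal sketch, not part of the original thread: it re-enters var1, var5 and var6 from the listing in the question, reruns the two-regressor regression, weights each regressor by the residual, and inspects the middle matrix of the robust sandwich. The names `ev5`, `ev6`, `M` and `Minv` are made up for illustration, and the rank check relies on `invsym()` zeroing out the row and column of any regressor it treats as collinear.

```stata
* Sketch only: rebuild the three variables used in the reply
clear
input var1 var5 var6
  1     0     0
  3     0     0
  5    -4     4
  4     0     0
  5    14    -7
453     0     0
  6     0     0
 45     0     0
  8  -123   123
897     0     0
 43     0     0
 43     0     0
  4     0     0
end

* Same regression as in the reply (var3 dropped)
regress var1 var5 var6, nocons robust
predict double e, resid

* Weight each regressor by its residual (ev5 and ev6 are illustrative names)
generate double ev5 = e*var5
generate double ev6 = e*var6

* Middle matrix of the sandwich: the sum over observations of e^2 * x'x
matrix accum M = ev5 ev6, noconstant
matrix list M

* Rank = number of columns minus the zero diagonal entries of the g-inverse
matrix Minv = invsym(M)
display "rank of middle matrix: " colsof(M) - diag0cnt(Minv)
```

If the residual for observation 5 is zero up to rounding, as in the reply, the weighted columns `ev5` and `ev6` should be exact negatives of each other in observations 3 and 9 and essentially zero elsewhere, so the reported rank should be 1 rather than 2. A rank-deficient middle matrix means the robust var-cov matrix cannot be full rank either, which is consistent with the missing F statistic.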