Statalist



st: AW: RE: AW: RE: RE: RE: RE: RE: Possible bug


From   "Martin Weiss" <[email protected]>
To   <[email protected]>
Subject   st: AW: RE: AW: RE: RE: RE: RE: RE: Possible bug
Date   Sun, 8 Nov 2009 17:54:20 +0100

<> 

Frankly, I cannot see the p. 1648 reference. The "drop" there is not the one
that Apostolos and I discussed yesterday. Note that although "w" is zero
for three observations, the number of obs is still 74 in the
analysis presented on this page.

***
sysuse auto, clear
rreg mpg weight foreign, genwt(w)
di in r "Number of obs: `e(N)'"
l if w==0, noo
***
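
A minimal sketch, assuming the same auto-data setup, that keeps the two cases
apart: observations that end with a weight of zero but still belong to the
estimation sample, and observations that -rreg- leaves out of the sample
altogether.

***
sysuse auto, clear
rreg mpg weight foreign, genwt(w)
//zero final weight, yet still counted in e(N)
count if w==0
//left out of the estimation sample altogether
count if !e(sample)
***

If the second count is zero while the first is not, the zero weights are not
the "drop" discussed in this thread.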


I did not say that the fact was not mentioned in the manuals; my point was
that its prominence should be increased, and possibly augmented by a note...


HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Sunday, 8 November 2009 17:46
To: [email protected]
Subject: st: RE: AW: RE: RE: RE: RE: RE: Possible bug

If the question is: Should users be expected to consult the manual in
case of puzzlement? then my answer is Yes. Evidently, precisely what
-rreg- does cannot be gathered from the help, so the manual is the next
port of call. That's true also for most statistical commands. 

As you know, I am not responsible for the manuals, but I do see the
point being made clearly at [R] rreg, pages 1646, 1648, 1649 in version
11, so where's the mystery? 

Nick 
[email protected] 

Martin Weiss

I can see your point, but the fact that one or more of your observations
feature a Cook's distance exceeding one is probably lost on most users, so
it is not unreasonable to regard it as "...an aspect of your data that you
might not have noticed" as well.

Regarding the "advertisement" of this feature, if you are willing to rely
solely on a pointer in the manuals, give it considerably more prominence -
it took me some time and the reply from Apostolos to figure out what was
going on. An example highlighting this feature both in the manual and the
help file would be a good idea.

If -rreg- is indeed heading for the scrap heap, the whole equation changes
and I spent yesterday afternoon in vain :-(

Nick Cox

I have no objection to a warning being added. (Although it raises
different issues, I am also in favour of -rreg- going undocumented!) 

I think there is a very defensible difference here, however. It is in
essence an advertised feature that -rreg- will do this on occasion. 

In contrast, the -logit- message highlights an aspect of your data that
you might not have noticed. 

Nick 
[email protected] 

Martin Weiss

All of which makes me think that there is a strong case - even taking into
account Stata's reluctance to bombard users with warnings - to alert the
user to the drop and why it occurred. Why does this code (see [R], p. 907)

***
sysuse auto, clear
logit foreign mpg weight gear_ratio if !(foreign==0 & gear_ratio > 3.1)
***

trigger a warning "Note: 4 failures and 0 successes completely determined."
and -rreg- does not?

Martin Weiss

So here is an illustration of the -drop- behavior of -rreg-: I make
observation 23 in the auto dataset more of an outlier, first by multiplying
its weight by 1.5. In this case, as the -summarize- command shows, its
Cook's D stays below 1, so -rreg- uses all observations.
In the last example, I increase the multiplier to 1.6, Cook's D exceeds 1,
and observation 23 now goes unused in -rreg-, as the last -list- command
shows.
You may want to check this for your dataset...

*******
sysuse auto, clear

//normal case: no drop in -rreg-
qui{
	reg mpg weight foreign
	predict cooksdist, cooksd
	su cooksdist, det
}

qreg mpg weight foreign, nolog
rreg mpg weight foreign, nolog

//multiply one obs by 1.5
//no drop in -rreg- yet as max(cooks D)
//still below 1
replace weight=1.5*weight in 23

qui{
	reg mpg weight foreign
	capt drop cooksdist
	predict cooksdist, cooksd
	noi su cooksdist, det
}

qreg mpg weight foreign, nolog
rreg mpg weight foreign, nolog

//reload autos to start anew
sysuse auto, clear

//multiply one obs by 1.6
//now a drop in -rreg- as max(cooks D)
//exceeds 1
replace weight=1.6*weight in 23

qui{
	reg mpg weight foreign
	capt drop cooksdist
	predict cooksdist, cooksd
	noi su cooksdist, det
}

qreg mpg weight foreign, nolog
rreg mpg weight foreign, nolog

//see unused obs (23)
l if !e(sample)
*******
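
A minimal sketch of a check one could run before -rreg-, assuming (as the
example above suggests) that -rreg- leaves out observations whose Cook's D
from an initial OLS fit exceeds 1. The names cooksdist and ndrop are
illustrative only.

*******
sysuse auto, clear
replace weight=1.6*weight in 23

//Cook's D from the initial OLS fit
qui reg mpg weight foreign
predict double cooksdist, cooksd

//count candidates for the drop; missing values compare
//as larger than any number, so exclude them explicitly
count if cooksdist>1 & !missing(cooksdist)
local ndrop=r(N)

//compare with the number of obs -rreg- actually used
qui rreg mpg weight foreign
di "obs with cooksdist>1: `ndrop'; obs used by -rreg-: " e(N)
*******

The count is stored in a local before -rreg- runs so that it can be shown
next to e(N) afterwards. A nonzero count signals that -rreg- may report
fewer observations than -qreg- on the same data.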

Apostolos Ballas
Sent: Saturday, 7 November 2009 13:57

Thanks for the suggestion. You are quite right - I cannot reproduce it. The
observation that is being dropped has the max value in one of the
independent variables. Is there an explanation for this?

Martin Weiss

Can you reproduce this with a built-in dataset? I cannot:

*******
sysuse auto, clear
rreg mpg weight foreign, nolog
qreg mpg weight foreign, nolog
rreg mpg weight foreign length turn, nolog
qreg mpg weight foreign length turn, nolog
*******

Also capture the one observation that -rreg- omits via -l if !e(sample)-
after estimation of -rreg-, and see what is special about it...

Apostolos Ballas

I am running a regression model using both quantile regression and robust
regression. In my output, robust regression reports one fewer observation
than quantile regression (which reports the right number of observations in
my sample). Is this a feature of robust regression, am I missing something,
or is it a bug?


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




