From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: AW: RE: AW: RE: RE: RE: RE: RE: Possible bug
Date: Sun, 8 Nov 2009 17:03:58 -0000

I was referring to "We discover that 3 observations in our data were dropped altogether (they have weight 0)." (p.1648) The general point is still made twice in other places. As there is evidence that this is not enough for two users (presuming Apostolos also consulted the manual), StataCorp should consider flagging it even more vigorously.

Drop if D > 1 is one of the reasons I don't like -rreg-, by the way. I dislike methods based on arbitrary combinations of arbitrary rules, and -rreg- scores high on that.

Nick
n.j.cox@durham.ac.uk

Martin Weiss

Frankly I cannot see the p. 1648 reference. The "drop" there is not the one that I and Apostolos discussed yesterday. Note that in spite of the "w" being zero for three observations, the number of obs is still 74 in the analysis presented on this page.

***
sysuse auto, clear
rreg mpg weight foreign, genwt(w)
di in r "Number of obs: `e(N)'"
l if w==0, noo
***

I did not say that the fact was not mentioned in the manuals; my point was that its prominence should be increased, and possibly augmented by a note...

Nick Cox

If the question is: Should users be expected to consult the manual in case of puzzlement? then my answer is Yes. Evidently, precisely what -rreg- does cannot be gathered from the help, so the manual is the next port of call. That's true also for most statistical commands.

As you know, I am not responsible for the manuals, but I do see the point being made clearly at [R] rreg, pages 1646, 1648, 1649 in version 11, so where's the mystery?

Nick
n.j.cox@durham.ac.uk

Martin Weiss

I can see your point, but the fact that one or more of your observations feature a Cook's distance exceeding one is probably lost on most users, so it is not unreasonable to regard it as "...an aspect of your data that you might not have noticed" as well.

Regarding the "advertisement" of this feature, if you are willing to rely solely on a pointer in the manuals, give it considerably more prominence - it took me some time and the reply from Apostolos to figure out what was going on. An example highlighting this feature both in the manual and the help file would be a good idea. If -rreg- is indeed heading for the scrap heap, the whole equation changes and I spent yesterday afternoon in vain :-(

Nick Cox

I have no objection to a warning being added. (Although it raises different issues, I am also in favour of -rreg- going undocumented!)

I think there is a very defensible difference here, however. It is in essence an advertised feature that -rreg- will do this on occasion. In contrast, the -logit- message highlights an aspect of your data that you might not have noticed.

Nick
n.j.cox@durham.ac.uk

Martin Weiss

All of which makes me think that there is a strong case - even taking into account Stata's reluctance to bombard users with warnings - to alert the user to the drop and why it occurred. Why does this code (see [R], p. 907)

***
sysuse auto, clear
logit foreign mpg weight gear_ratio if !(foreign==0 & gear_ratio > 3.1)
***

trigger a warning "Note: 4 failures and 0 successes completely determined." and -rreg- does not?

Martin Weiss

So here is an illustration of the -drop- behavior of -rreg-: I increase the outlier quality of observation # 23 in the auto dataset, first by multiplying its weight by 1.5. In this case, as the -summarize- command shows, its Cook's D stays below 1, so -rreg- uses all observations. In the last example, I increase the multiplier to 1.6, Cook's D exceeds 1, and number 23 now goes unused in -rreg-, as the last -list- command shows. You may want to check this for your dataset...

*******
sysuse auto, clear

//normal case: no drop in -rreg-
qui {
    reg mpg weight foreign
    predict cooksdist, cooksd
    su cooksdist, det
}
qreg mpg weight foreign, nolog
rreg mpg weight foreign, nolog

//multiply one obs by 1.5
//no drop in -rreg- yet as max(Cook's D)
//still below 1
replace weight=1.5*weight in 23
qui {
    reg mpg weight foreign
    capt drop cooksdist
    predict cooksdist, cooksd
    noi su cooksdist, det
}
qreg mpg weight foreign, nolog
rreg mpg weight foreign, nolog

//reload autos to start anew
sysuse auto, clear

//multiply one obs by 1.6
//now a drop in -rreg- as max(Cook's D)
//exceeds 1
replace weight=1.6*weight in 23
qui {
    reg mpg weight foreign
    capt drop cooksdist
    predict cooksdist, cooksd
    noi su cooksdist, det
}
qreg mpg weight foreign, nolog
rreg mpg weight foreign, nolog

//see unused obs (23)
l if !e(sample)
*******

Apostolos Ballas
Sent: Saturday, 7 November 2009 13:57

Thanks for the suggestion. You are quite right - I cannot reproduce it. The observation that is being dropped has the max value in one of the independent variables. Is there an explanation for this?

Martin Weiss

Can you reproduce this with a built-in dataset? I cannot:

*******
sysuse auto, clear
rreg mpg weight foreign, nolog
qreg mpg weight foreign, nolog
rreg mpg weight foreign length turn, nolog
qreg mpg weight foreign length turn, nolog
*******

Also capture the one observation that -rreg- omits via -l if !e(sample)- after estimation of -rreg-, and see what is special about it...

Apostolos Ballas

I am running a regression model using both quantile regression and robust regression. In my output, robust regression reports 1 less observation than quantile regression (which reports the right number of observations in my sample). Is this a feature of robust regression, am I missing something, or is it a bug?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

