

From: bcoric@efst.hr
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: mmregress question
Date: Sat, 3 Dec 2011 11:16:57 +0100 (CET)

Thank you. I figured out what the problem was. Namely, I did not use the command -set seed- before -sregress- and -mmregress-. When I select the same starting subsets with -set seed-, as you suggest, I get identical results from run to run. Moreover, when I exclude from the sample the observations that are detected as outliers (by running -mmregress (varlist), outlier graph label(varname)-), or include dummy variables for these observations, then I get very similar results for OLS (-regress-) and robust regression (-mmregress-).

Thank you very much, once again.

Bruno

> Bruno, perhaps you don't need -mmreg- or -msreg- and you can take
> another approach for your data.
>
> 1. Identify multivariate outliers in the predictor variables with the
> authors' -mcd- command ("findit"). It also depends on p-subsets but
> there's no harm trying.
>
> 2. If you find observations clearly separate from the main group, they are
> potential high leverage points. Take them out.
>
> 3. Now run -rreg- and -qreg- or, better, -bsqreg- on the reduced data set.
> These do not depend on subset identification and, without likely
> high-leverage points, should have decent resistance properties.
>
> Without knowing more about your data, seeing your commands, and knowing
> exactly where iteration fails (in the estimation of the scale parameter
> or of the regression coefficients), it is difficult to say more.
>
> Good luck,
>
> Steve
>
>
> I should have pointed out, as you did not, that -mmregress- is a
> contributed package written by Verardi and Croux. It can be found by
> -findit-.
>
> I already stated how to get the identical results from run to run; reread
> my post.
>
> I also mentioned that dummy variables will give -mmregress- and
> -sregress- trouble and stated that -msregress-, another part of the
> package, must be used. Since you don't mention trying -msregress-, I
> guess that you have no dummy variables.
>
> Other issues that could cause problems:
>
> * Too many predictors for your number of observations
>
> In that case, try a reduced model.
>
> * The model is misspecified
>
> I can think up data configurations where the wrong model, even for a
> single predictor, could cause -mmregress- and -sregress- to be unstable.
> So you might, for example, try a fractional polynomial instead of linear
> terms, or consider interactions (I know this conflicts with the parsimony
> recommendation).
>
> Lower on the list of causes of your problems:
>
> * Using too many or too few subsets. The -help- shows how to vary the
> number.
>
> * Not using the latest version
>
> I've run -mmregress- repeatedly on several data sets of the same size as
> yours and have never encountered the instability you've seen. So I
> suspect the problem is unique to your set-up, not "how to select
> appropriate starting subsample".
>
> Steve
>
>
> On Dec 2, 2011, at 2:56 AM, bcoric@efst.hr wrote:
>
> Thank you, Steve.
>
> I was using -sregress- and -mmregress- respectively, as stressed on p.
> 444. However, the results change significantly each time I repeat
> these two commands.
>
> You suggest that the reason is that the algorithm implemented in Stata
> starts by randomly picking N subsets of p observations.
> Could you please tell me how to set the same N subsets before running
> -mmregress-?
>
> Could you also please tell me whether you have any recommendation on how
> to select appropriate results? Namely, a different selection of starting
> subsets will probably lead to different results again. If this is the
> case, then the question is how to select an appropriate starting subsample.
>
> Bruno
>
>>
>> The FAQ asks that you give precise references; in this case you are
>> presumably referring to: Verardi, V., & Croux, C. (2009). Robust
>> regression in Stata. Stata Journal, 9(3), 439-453.
>>
>> The article states that -mmregress- starts with an initial S-estimator
>> and (page 444) that
>>
>> "The algorithm implemented in Stata for computing the S-estimator starts
>> by randomly picking N subsets of p observation (defined as p-subset)
>> where p is the number of regression parameters to estimate."
>>
>> In other words, unless the same seed is -set- before each instance of
>> -mmregress-, one would expect the results to differ from run to run.
>>
>> Also, if there are indicator ("dummy") variables in your model,
>> -mmregress- will have problems (p. 445). If you have dummies, you should
>> instead run -msregress-, part of the same package, and list them only
>> with the dummy() option.
>>
>> If you are still having difficulties, I suggest that you contact the
>> package's authors.
>>
>> Steve
>>
>>
>> On Dec 1, 2011, at 12:59 PM, bcoric@efst.hr wrote:
>>
>> I would like to ask for help with the implementation of the Stata
>> command -mmregress-.
>> I am doing a simple cross-section analysis. Since I have just 48
>> observations, I am very much concerned about the possible influence of
>> outliers.
>> Previously I was using the Stata command -rreg- for such cases. However,
>> I came across the paper Robust regression in Stata (2009), which argues
>> that the -rreg- command does not have the expected robustness properties
>> and recommends -mmregress- instead.
>> However, I faced some problems in its implementation. Namely, applying it
>> repeatedly to the same model leads to different values of the
>> regression coefficients. Moreover, the detected outliers also change.
>> I presume that the algorithm uses an iterative procedure taking previous
>> estimates as starting values. However, the results are very different
>> with respect to the sign, size and significance of the coefficients, and
>> do not converge after subsequent applications.
>> Can anyone tell me whether there is some procedure that I should
>> follow to get robust (consistent) results?
>> Bruno Coric

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
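
A minimal sketch of the reproducibility fix discussed in this thread, written as a do-file. The variable names (y, x1, x2) and the seed value are hypothetical, and the sketch assumes that the user-written -mmregress- package (Verardi and Croux, found via -findit-) is installed and that it leaves its coefficient vector in e(b), as Stata estimation commands conventionally do. It is an illustration of the idea, not the package authors' recommended procedure.

```stata
* Hypothetical variables: y (outcome), x1 and x2 (predictors).
* -mmregress- is user-written; install it first via -findit mmregress-.

set seed 12345              // arbitrary fixed seed, chosen for illustration
mmregress y x1 x2
matrix b1 = e(b)            // assumes coefficients are stored in e(b)

set seed 12345              // reset to the SAME seed before the second run
mmregress y x1 x2
matrix b2 = e(b)

* mreldif() reports the largest relative difference between two matrices.
* With identical seeds, the same p-subsets are drawn in both runs, so the
* two coefficient vectors should agree.
display mreldif(b1, b2)
```

Omitting the second -set seed- line lets the random-number stream continue from where the first run left it, which matches the run-to-run differences described in the original question.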

**References**:

- **st: mmregress question** *From:* bcoric@efst.hr
- **Re: st: mmregress question** *From:* Steve Samuels <sjsamuels@gmail.com>
- **Re: st: mmregress question** *From:* bcoric@efst.hr
- **Re: st: mmregress question** *From:* Steve Samuels <sjsamuels@gmail.com>
