The Stata listserver

st: RE: [problem with outliers in regression]

From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: [problem with outliers in regression]
Date   Tue, 6 Aug 2002 18:15:31 +0100

Nazaria Solferino
> Hello! I'm a new Stata user and I'm not very good at
> using it yet. I hope someone can help with my
> problem. I have a large dataset, with some outliers, and
> I'd like to handle the variables that I have only within a
> restricted range (without dropping observations). I
> thought I could give a zero value to all values
> outside a certain range; that is, I would generate
> newvar=oldvar and then replace newvar=0 if outside the
> range. First, is this a statistically correct procedure?

(Please use informative titles on Statalist messages.) 

No. Stata will take the new values of 0 just as literally 
as the old outlying values. 
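
To see this, here is a small sketch using the auto dataset shipped with
Stata. The variable name price0 and the cutoff of 10000 are arbitrary
choices for illustration only; they are not taken from the poster's data.

```stata
* Sketch only: auto data and an arbitrary cutoff of 10000
sysuse auto, clear

* Copy the response and overwrite the large values with 0
generate price0 = price
replace price0 = 0 if price > 10000

* The zeros enter the summary statistics and the regression as real data
summarize price price0
regress price weight
regress price0 weight
```

Comparing the two sets of results should show that the zeros are not
ignored: they are just a different set of extreme values.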

One way to exclude outliers is by an -if- condition: 

regress y x1 x2 x3 if y < 10000 

Naturally, there are other approaches to your problem: 

1. a robust technique. I've found -qreg- very good. 

2. transformation. 

3. -glm- with a nonlinear link (e.g. log). 
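
As a rough sketch of what these three might look like, again with the
auto dataset and with price, weight and length standing in for your own
variables (they are placeholders, not a recommendation for any particular
model):

```stata
* Sketch only: auto data stand in for the poster's dataset
sysuse auto, clear

* 1. Median regression, which is less sensitive to outliers in the response
qreg price weight length

* 2. Transformation: model the logarithm of the response
generate lnprice = ln(price)
regress lnprice weight length

* 3. Generalized linear model with a log link (Gaussian family by default)
glm price weight length, link(log)
```

Which of these suits you depends on the data and on what you want to
estimate.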

> Second, if it's correct, how could I do that
> in Stata without finding each interval with the centile
> command for each variable, but instead write a general
> program that I can apply to each variable?

That depends partly on what your project is. But 
my guess is that -qreg- or -glm- might offer 
a more general approach than what you propose.
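
If you do still want to restrict several variables to a range, one
possibility is to loop over them and build an indicator for use with
-if-, rather than overwriting values. This is only a sketch: the
indicator name touse, the variable list, and the choice of the 5th and
95th percentiles are all assumptions made for illustration.

```stata
* Sketch only: auto data, hypothetical indicator name and cutoffs
sysuse auto, clear

* Start by marking every observation as usable
generate byte touse = 1

foreach v of varlist price weight length {
    * -summarize, detail- leaves percentiles in r(p5) and r(p95)
    quietly summarize `v', detail
    local lo = r(p5)
    local hi = r(p95)
    * Flag observations outside the range (missing values are flagged too)
    quietly replace touse = 0 if `v' < `lo' | `v' > `hi'
}

* No observations are dropped; they are only excluded from this fit
regress price weight length if touse
```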