



st: RE: AW: RE: AW: questions about panel data analysis and outliers.


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: AW: RE: AW: questions about panel data analysis and outliers.
Date   Mon, 17 May 2010 12:59:24 +0100

"Not hard" here could be read as "easy": although I do try to use
simpler words if they serve the purpose, the tone is not the same. 

Anyway, here is a Stata example: 

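clear
set obs 100
set seed 2803
* y follows x plus Gaussian noise
gen y = _n + 10 * rnormal()
gen x = _n
scatter y x
* move one point off the trend; y = 0 and x = 90 are each unremarkable on their own
replace y = 0 if x == 90
scatter y x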
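
The scatter plot makes the point visually. As a rough numerical check, a minimal sketch along the following lines puts the marginal and bivariate views of the modified point side by side. The variable names zx, zy and rstu are made up for illustration, and the studentized residual is only one of several possible measures of bivariate outlyingness.

```stata
* Rebuild the example data (same seed, so the same draws)
clear
set obs 100
set seed 2803
gen x = _n
gen y = _n + 10 * rnormal()
replace y = 0 if x == 90

* Marginal view: standardize each variable on its own
egen double zx = std(x)
egen double zy = std(y)

* Bivariate view: studentized residual from regressing y on x
quietly regress y x
predict double rstu, rstudent

* Compare the three measures for the modified point
list x y zx zy rstu if x == 90
```

If the point is an outlier only in the bivariate sense, zx and zy should be modest in absolute value while rstu is large.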

Nick 
n.j.cox@durham.ac.uk 

P.S. See also 
<http://en.wikipedia.org/wiki/Sidney_Morgenbesser> 

"During a lecture the Oxford linguistic philosopher J. L. Austin made
the claim that although a double negative in English implies a positive
meaning, there is no language in which a double positive implies a
negative. To which Morgenbesser responded in a dismissive tone, "Yeah,
yeah." (Some have it quoted as "Yeah, right.")"

Martin Weiss
============

" True, although it's not hard to find outliers on a bivariate
distribution which aren't outliers on either marginal."


The "double negation" is baffling me. Is it hard or not to find them?

Nick Cox
========

True, although it's not hard to find outliers on a bivariate
distribution which aren't outliers on either marginal. 

Other answers include 

1. Omit the putative outlier and see how much difference it makes.

2. Decide you should be using a transformation or non-identity link
function. 
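
As a minimal sketch of these two suggestions (not a recommendation of any particular model): the first part reuses the simulated data from the example at the top of this message, and the second part uses the auto data purely for illustration. The choice of mpg and weight, the log transformation and the log link are all assumptions made for the sketch.

```stata
* 1. Fit with and without the putative outlier and compare the coefficients
clear
set obs 100
set seed 2803
gen x = _n
gen y = _n + 10 * rnormal()
replace y = 0 if x == 90
regress y x
regress y x if x != 90

* 2. A transformation versus a non-identity link, on the auto data
sysuse auto, clear
gen double lnmpg = ln(mpg)
* transformed response, identity link
regress lnmpg weight
* untransformed response, log link
glm mpg weight, family(gaussian) link(log)
```

The two fits in part 2 are not the same model: the first models the mean of log mpg, the second models the log of mean mpg.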

Martin Weiss
============

" My second question is: how can I estimate correlation without
outliers?"

You can qualify on the candidate variables not being outliers based on
their univariate distribution, if that is what you mean:


*************
sysuse auto, clear
qui su mpg,d
gen byte within=inrange(mpg, r(p5), r(p95))
qui su weight,d
gen byte within2=inrange(weight, r(p5), r(p95))
corr mpg weight if within & within2
corr mpg weight
*************

Amatoallah ouchen
=================

I have panel data (T=3 and N=45) and I want to perform a robust
regression. Would it be acceptable to treat this as a simple
cross-sectional analysis, given that the time dimension is so short?
What do you think?
My second question is: how can I estimate correlation without outliers?

