
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: RE: AW: RE: AW: questions about panel data analysis and outliers.

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: AW: RE: AW: questions about panel data analysis and outliers.
Date	Mon, 17 May 2010 12:59:24 +0100

"Not hard" here could be read as "easy": although I do try to use
simpler words if they serve the purpose, the tone is not the same. 

Anyway, here is a Stata example:

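clear
* 100 observations: y is linear in x plus normal noise
set obs 100
set seed 2803
gen y = _n + 10 * rnormal()
gen x = _n
scatter y x
* plant one point far off the trend, at a y value typical of small x
replace y = 0 if x == 90
scatter y x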
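
A rough way to check this numerically, sketched on the assumption that x and y from the example above are still in memory (resid and absresid are illustrative names, not part of the original example): the planted point is unremarkable in either marginal distribution, but it should stand out in the residuals from a linear fit.

```stata
* Assumes x and y from the example above are in memory.
* Marginal view: y == 0 and x == 90 are each ordinary values on their own.
summarize y x, detail

* Bivariate view: residuals from a linear fit of y on x.
regress y x
predict double resid, residuals

* List the observations with the largest absolute residuals.
gen double absresid = abs(resid)
gsort -absresid
list x y resid in 1/5
```

The residual check is only one of several ways to look at this; the scatter plot already makes the same point visually.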

Nick 

P.S. See also 
<http://en.wikipedia.org/wiki/Sidney_Morgenbesser> 

"During a lecture the Oxford linguistic philosopher J. L. Austin made
the claim that although a double negative in English implies a positive
meaning, there is no language in which a double positive implies a
negative. To which Morgenbesser responded in a dismissive tone, "Yeah,
yeah." (Some have it quoted as "Yeah, right.")"

Martin Weiss
============

" True, although it's not hard to find outliers on a bivariate
distribution which aren't outliers on either marginal."


The "double negation" is baffling me. Is it hard or not to find them?

Nick Cox
========

True, although it's not hard to find outliers on a bivariate
distribution which aren't outliers on either marginal. 

Other answers include 

1. Omit the putative outlier and see how much difference it makes.

2. Decide you should be using a transformation or non-identity link
function. 
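
A minimal sketch of these two suggestions follows. The first part uses simulated data with a planted point at x == 90 standing in for the putative outlier. The second uses the auto dataset shipped with Stata. The log transformation of price is only an illustrative assumption (as is the name lnprice), not a recommendation for any particular data.

```stata
* 1. Omit the putative outlier and see how much difference it makes
clear
set obs 100
set seed 2803
gen y = _n + 10 * rnormal()
gen x = _n
replace y = 0 if x == 90
corr y x
corr y x if x != 90

* 2. Work on a transformed scale (log of price is an assumption here)
sysuse auto, clear
gen double lnprice = ln(price)
regress price weight
regress lnprice weight
```

Comparing the paired results shows how sensitive the conclusions are to the single point or to the choice of scale.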

Martin Weiss
============

" My second question is: how can I estimate correlation without
outliers?"

You can restrict the sample to observations in which neither variable is
an outlier in its univariate distribution, if that is what you mean:


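*************
sysuse auto, clear
* flag observations with mpg between its 5th and 95th percentiles
qui su mpg, d
gen byte within = inrange(mpg, r(p5), r(p95))
* same for weight
qui su weight, d
gen byte within2 = inrange(weight, r(p5), r(p95))
* correlation on the trimmed sample, then on the full sample
corr mpg weight if within & within2
corr mpg weight
*************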

Amatoallah ouchen
=================

I have panel data (T=3 and N=45) and I want to perform a robust
regression. Is it OK to treat this as a simple cross-sectional
analysis, given that the time dimension is so short?
What do you think?
My second question is: how can I estimate correlation without outliers?


