Statalist The Stata Listserver



RE: st: dropping observations (outputs)


From   [email protected]
To   [email protected]
Subject   RE: st: dropping observations (outputs)
Date   Mon, 18 Jun 2007 15:49:27 +0200

Dear all,
Here are the outputs I obtain when I drop observations (after the
heading OUTLIERS) in two different ways. The two should be equivalent in
terms of the regression sample used, but in fact they are not.
Thank you very much for your attention.
Mariarosaria


-------------------------------------------------------------------------------
      
. clear

. set memory 400m

. use c:\data\prin\stockingdatapanel\stockdata

. tsset id year
       panel variable:  id (strongly balanced)
        time variable:  year, 2000 to 2006

. 
. 
.  *OUTLIERS
. centile(earn_shar bookvalshar  closp_jun), centile(0.5, 99.5)

                                                       -- Binom. Interp. --
    Variable |     Obs  Percentile      Centile        [95% Conf. Interval]
-------------+-------------------------------------------------------------
   earn_shar |     535         .5     -155.0676        -1435.97   -3.818947*
             |               99.5        702.09        407.5342     1013.94*
 bookvalshar |     542         .5             0             -.7         .36*
             |               99.5      5846.003        5143.451    10543.75*
   closp_jun |    1214         .5           .05             .04    .3109532
             |               99.5       4660.25        2092.229    5164.371

 Lower (upper) confidence limit held at minimum (maximum) of sample


. 
. drop if earn_shar<  -155.0676 
(2 observations deleted)

. drop if earn_shar>   702.09 
(860 observations deleted)

. drop if bookvalshar< 0 
(1 observation deleted)

. drop if bookvalshar>  5846.003 
(1 observation deleted)

. drop if closp_jun<    .05 
(0 observations deleted)

. drop if closp_jun>      4660.25 
(26 observations deleted)


. *****************************************************************************
. **  Dep Var= closing price of june       ****
. *****************************************************************************
. g p_6mafter=f.closp_jun
(189 missing values generated)

. 
. ************************************
. *******FE model
. ************************************

. 
. xi: xtreg p_6mafter earn_shar  bookvalshar    i.year  , fe
i.year            _Iyear_2000-2006    (naturally coded; _Iyear_2000 omitted)

Fixed-effects (within) regression               Number of obs      =       314
Group variable (i): id                          Number of groups   =       157

R-sq:  within  = 0.1391                         Obs per group: min =         1
       between = 0.4608                                        avg =       2.0
       overall = 0.4804                                        max =         6

                                                F(7,150)           =      3.46
corr(u_i, Xb)  = -0.6242                        Prob > F           =    0.0018

------------------------------------------------------------------------------
   p_6mafter |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   earn_shar |   2.463018    1.20279     2.05   0.042       .08642    4.839617
 bookvalshar |   .3692797   .1047816     3.52   0.001     .1622412    .5763183
 _Iyear_2001 |   17.42329   36.93282     0.47   0.638    -55.55247    90.39905
 _Iyear_2002 |   58.12118   36.98952     1.57   0.118    -14.96661     131.209
 _Iyear_2003 |   38.97863   36.85583     1.06   0.292    -33.84499    111.8023
 _Iyear_2004 |    44.2795   36.24992     1.22   0.224    -27.34691    115.9059
 _Iyear_2005 |   46.22128   42.66067     1.08   0.280    -38.07216    130.5147
 _Iyear_2006 |  (dropped)
       _cons |   -55.7415    36.0336    -1.55   0.124    -126.9405    15.45748
-------------+----------------------------------------------------------------
     sigma_u |  221.96966
     sigma_e |   129.6183
         rho |  .74571609   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(156, 150) =     4.32            Prob > F = 0.0000

. 
. 
end of do-file









. clear

. set memory 400m

. 
. use c:\data\prin\stockingdatapanel\stockdata

. tsset id year
       panel variable:  id (strongly balanced)
        time variable:  year, 2000 to 2006

. 
. 
.  *OUTLIERS
. centile(earn_shar bookvalshar  closp_jun), centile(0.5, 99.5)

                                                       -- Binom. Interp. --
    Variable |     Obs  Percentile      Centile        [95% Conf. Interval]
-------------+-------------------------------------------------------------
   earn_shar |     535         .5     -155.0676        -1435.97   -3.818947*
             |               99.5        702.09        407.5342     1013.94*
 bookvalshar |     542         .5             0             -.7         .36*
             |               99.5      5846.003        5143.451    10543.75*
   closp_jun |    1214         .5           .05             .04    .3109532
             |               99.5       4660.25        2092.229    5164.371

 Lower (upper) confidence limit held at minimum (maximum) of sample


. 
. drop if earn_shar<  -155.0676 & earn_shar!=.
(2 observations deleted)

. drop if earn_shar>   702.09 & earn_shar!=.
(2 observations deleted)

. drop if bookvalshar< 0 & bookvalshar!=.
(1 observation deleted)

. drop if bookvalshar>  5846.003 & bookvalshar!=.
(1 observation deleted)

. drop if closp_jun<    .05 & closp_jun!=.
(4 observations deleted)

. drop if closp_jun>      4660.25 & closp_jun!=.
(6 observations deleted)

. 
. *****************************************************************************
. **  Dep Var= closing price of june       ****
. *****************************************************************************
. g p_6mafter=f.closp_jun
(335 missing values generated)

. 
. 
. ************************************
. *******FE model
. ************************************
. 
. xi: xtreg p_6mafter earn_shar  bookvalshar   i.year  , fe
i.year            _Iyear_2000-2006    (naturally coded; _Iyear_2000 omitted)

Fixed-effects (within) regression               Number of obs      =       463
Group variable (i): id                          Number of groups   =       185

R-sq:  within  = 0.1076                         Obs per group: min =         1
       between = 0.4208                                        avg =       2.5
       overall = 0.4480                                        max =         6

                                                F(7,271)           =      4.67
corr(u_i, Xb)  = -0.2414                        Prob > F           =    0.0001

------------------------------------------------------------------------------
   p_6mafter |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   earn_shar |   .3493612   .5254214     0.66   0.507    -.6850655    1.383788
 bookvalshar |    .335608   .0774586     4.33   0.000     .1831109    .4881051
 _Iyear_2001 |   8.676075    27.0212     0.32   0.748    -44.52208    61.87423
 _Iyear_2002 |   51.34743   27.08677     1.90   0.059    -1.979827    104.6747
 _Iyear_2003 |   31.29411   26.56825     1.18   0.240    -21.01229    83.60052
 _Iyear_2004 |   38.94315   24.84464     1.57   0.118    -9.969898     87.8562
 _Iyear_2005 |   41.47083   25.08507     1.65   0.099    -7.915569    90.85723
 _Iyear_2006 |  (dropped)
       _cons |  -23.16047   23.71827    -0.98   0.330    -69.85596    23.53502
-------------+----------------------------------------------------------------
     sigma_u |  169.01447
     sigma_e |  99.859362
         rho |  .74124375   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(184, 271) =     7.52            Prob > F = 0.0000

 
. 
end of do-file

-------------------------------------------------------------------------------
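
A possible reason the two regression samples differ, offered as an assumption rather than something the logs confirm: p_6mafter is built with the lead operator f., which looks up the row for the following year. The first do-file deletes every row in which a variable is missing (a missing value compares as larger than any number), so a firm-year whose following year was deleted gets a missing lead and falls out of the regression. The second do-file keeps those rows. A minimal sketch on a hypothetical three-row panel (the variables x and price and all values are invented for illustration):

```stata
* Toy panel (hypothetical values): one firm, three years,
* regressor x is missing in 2001
clear
input id year x price
1 2000 10 50
1 2001 .  60
1 2002 12 70
end
tsset id year

* Route A: plain comparison, so the 2001 row (x missing) is deleted
preserve
drop if x > 450
gen lead_a = f.price             // 2000 has no 2001 row left: lead is missing
count if !missing(lead_a, x)     // 0 usable observations
restore

* Route B: exclude missings from the drop, so the 2001 row survives
drop if x > 450 & !missing(x)
gen lead_b = f.price             // 2000 picks up the 2001 price (60)
count if !missing(lead_b, x)     // 1 usable observation (year 2000)
```

In this sketch the year-2000 row has complete data in both routes, yet only Route B can use it, because its lead value lives in a row that Route A deleted.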

Quoting Nick Cox <[email protected]>:

> I think we need to see some precise evidence of 
> what you think is problematic. Thus, we need to 
> see _exactly_ what you typed and _exactly_ what 
> Stata did -- as the FAQ advises. 
> 
> Nick 
> [email protected] 
> 
> [email protected]
>  
> > Thank you for your answers.
> > Evidently my question was not clear.
> > I am aware that Stata does not drop observations with missing values
> > when running the regress command, it just does not use them. But I
> > expect the same regression sample whether I drop the missing
> > observations before the regression or just run the regression
> > without dropping them.
> > Thus, I don't understand why Stata runs the regression on different
> > samples when I use the two commands described before.
> 
> Maarten Buis
> 
> > > --- [email protected] wrote:
> > > > when I use the following command:
> > > > drop if x>450
> > > > Stata drops a lot of observations, while when I exclude missing
> > > > values as follows:
> > > > drop if x>450 & x!=.
> > > > Stata eliminates just a couple of observations
> > > 
> > > This is well-known behaviour: in Stata, missing values are the
> > > largest possible values, so a missing value will be larger than
> > > 450. As a result, if you type -drop if x>450- the missing values
> > > will also be dropped.
> > > 
> > > > I realized this when I ran a regression including x as a
> > > > regressor. If Stata drops missing data with the first command,
> > > > shouldn't it drop the same observations when I run the
> > > > regression after using the second command?
> > > 
> > > I don't think I understand the question. Do you think that -regress-
> > > should influence the way -drop- behaves?
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 
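
The behaviour described in the quoted reply can be checked on a toy dataset. The sketch below uses made-up values for x (not the poster's data) and counts, rather than drops, the rows that each condition would select:

```stata
* Toy data (hypothetical values): two real numbers and three missings
clear
input x
100
500
.
.
.
end

count if x > 450                 // 4: the value 500 plus the three missings
count if x > 450 & x != .        // 1: only the value 500
count if x > 450 & !missing(x)   // 1: same, and also excludes .a to .z
```

Running count before drop is a cheap way to see how many rows a condition would remove. The !missing(x) form (or x < .) is the safer habit, since x != . still lets extended missing values such as .a through.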



