st: weird behavior of append


From   "Feiveson, Alan H. (JSC-SK311)" <alan.h.feiveson@nasa.gov>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: weird behavior of append
Date   Wed, 12 Sep 2012 09:50:43 -0500

Hello. In Stata 12 IC, I am trying to append a file of 259 observations to one of 11165 observations. Both files contain a single variable named "id" (see below). After appending, rather than gaining 259 new observations, it appears that 259 observations have been lost; yet if I reduce the first file to 10000 observations, the append seems to work. Also, if the variables have different names, I get even stranger results (see below). Does anyone have an explanation?

Thanks,

Al Feiveson

========================================================================
. use temp1,clear
. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |     11165    5205.994    98.91063       5000       5389

. use temp2,clear
. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |       259    5206.846    101.6719       5000       5388

. append using temp1
. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |     11424    5206.013     98.9696       5000       5389


========================================================================
Now cut out some observations

. use temp1,clear
. keep in 1/10000
(1165 observations deleted)

. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |     10000    5189.761    91.46947       5000       5331

. append using temp2
. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |     10259    5190.192    91.77468       5000       5388

This appears to be correct.

========================================================================
Now rename the variable in one of the files
. use temp2,clear
. des

Contains data from temp2.dta
  obs:           259                          
 vars:             1                          12 Sep 2012 09:32
 size:           518                          
----------------------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
----------------------------------------------------------------------------------------------
id              int    %10.0g                 ID
----------------------------------------------------------------------------------------------
Sorted by:  id

. rename id id2
. append using temp1
. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         id2 |       259    5206.846    101.6719       5000       5388
          id |     11165    5205.994    98.91063       5000       5389

. count if id==. & id2<.
  259

. count if id2==. & id<.
11165

So it appears that there should be 259 + 11165 observations, since the two conditions are mutually exclusive. Yet:


. des

Contains data from temp2.dta
  obs:        11,424                          
 vars:             2                          12 Sep 2012 09:32
 size:        45,696                          
----------------------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
----------------------------------------------------------------------------------------------
id2             int    %10.0g                 ID
id              int    %10.0g                 ID
----------------------------------------------------------------------------------------------
Sorted by:  
     Note:  dataset has changed since last saved
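
One way to check directly whether any observations are lost is to tag each row with its source file and compare the total against the expected sum, 11165 + 259 = 11424. This is only a sketch: it assumes temp1.dta and temp2.dta are the files shown above, and the variable name "source" is made up for illustration.

```stata
* Sketch only: assumes temp1.dta and temp2.dta are the files shown above.
* The variable name "source" is hypothetical.
use temp2, clear
append using temp1, generate(source)

* source == 0 marks rows from temp2 (in memory), source == 1 rows from temp1
tabulate source

* Expected total if nothing was lost
display 11165 + 259

* Stops with an error if the observation count differs from the expected total
assert _N == 11165 + 259
```

If the assert passes silently, the appended dataset holds every row from both files, and the tabulate output shows how many came from each.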

