



AW: st: weird behavior of append


From   "Klaus Pforr" <kpforr@googlemail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   AW: st: weird behavior of append
Date   Wed, 12 Sep 2012 17:35:01 +0200


I don't see a problem either. Figuratively speaking, in the first append step
you stack the first chunk of data, A, on top of the second chunk, B:

id  other
A   X
B   X

So you have N_A+N_B observations and |Union(A,B)| variables.
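
A minimal Stata sketch of this first case, using two tiny made-up files rather than the poster's temp1 and temp2. The tempfile names a and b, the file sizes (5 and 2), and the id values are assumptions for illustration only, and the lines are meant to be run together as a do-file:

```stata
* Hypothetical stand-ins for the two files; both contain only "id".
clear
set obs 5
generate int id = 5000 + _n
tempfile a
save `a'

clear
set obs 2
generate int id = 6000 + _n
tempfile b
save `b'

* Same variable name in both files: rows are stacked in one column.
use `b', clear
append using `a'
count
describe, short
```

Here count should report N_A+N_B = 5 + 2 = 7 observations, and describe should still list a single variable.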

In the second (or rather third) append step, the contents of id end up in
two different columns:

id  id2  other
A   .    X
.   B    X

So you still have N_A+N_B observations. The number of variables should now
be |Union(A,B)|+1.
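
The renamed-variable case can be sketched the same way. Again the tempfile names, sizes, and id values are made up for illustration, and the generate(source) option is added here only to tag where each row came from; it is not part of the original post and adds one extra variable:

```stata
* Hypothetical stand-ins for the two files.
clear
set obs 5
generate int id = 5000 + _n
tempfile a
save `a'

clear
set obs 2
generate int id = 6000 + _n
rename id id2                 // the two files no longer share a variable name
tempfile b
save `b'

* Different names: rows are still stacked, but each file fills its own column.
use `b', clear
append using `a', generate(source)   // source = 0 for rows in memory, 1 for rows from the using file
count
count if missing(id)  & !missing(id2)
count if missing(id2) & !missing(id)
tabulate source
```

With these sizes the three counts should come out as 7, 2, and 5: the two conditions are mutually exclusive, so their counts add up to the total number of observations and nothing is lost.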

Sorry, no miracles here, I think...

Best

Klaus

__________________________________

Klaus Pforr
GESIS -- Leibniz Institut für Sozialwissenschaft
B2,1
Postfach 122155
D - 68072 Mannheim
Tel: +49 621 1246 298
Fax: +49 621 1246 100 
E-Mail: klaus.pforr@gesis.org
__________________________________


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Joerg Luedicke
Sent: Wednesday, 12 September 2012 17:21
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: weird behavior of append

I cannot spot a problem here? You have 11165 observations in one file and
259 observations in the other. Then 11165 + 259 = 11424 observations is what
you end up with after appending?

Joerg

On Wed, Sep 12, 2012 at 9:50 AM, Feiveson, Alan H. (JSC-SK311)
<alan.h.feiveson@nasa.gov> wrote:
> Hello - In Stata 12 IC, I am trying to append a file of 259 observations
> to one of 11165 observations. Both files contain only one variable named
> "id" (see below). After appending, rather than having 259 new observations,
> it appears that 259 observations have been lost, yet if I reduce the size of
> the first file to 10000, the append seems to work. Also if the variables
> have different names, I get even more weird results (see below). Anyone have
> an explanation?
>
> Thanks,
>
> Al Feiveson
>
> ========================================================================
> . use temp1,clear
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           id |     11165    5205.994    98.91063       5000       5389
>
> . use temp2,clear
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           id |       259    5206.846    101.6719       5000       5388
>
> . append using temp1
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           id |     11424    5206.013     98.9696       5000       5389
>
>
> ========================================================================
> Now cut out some observations
>
> . use temp1,clear
> . keep in 1/10000
> (1165 observations deleted)
>
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           id |     10000    5189.761    91.46947       5000       5331
>
> . append using temp2
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>           id |     10259    5190.192    91.77468       5000       5388
>
> This appears to be correct.
>
> ========================================================================
> Now rename the variable in one of the files
>
> . use temp2,clear
> . des
>
> Contains data from temp2.dta
>   obs:           259
>  vars:             1                          12 Sep 2012 09:32
>  size:           518
>
> -------------------------------------------------------------------------------
>               storage  display     value
> variable name   type   format      label      variable label
> -------------------------------------------------------------------------------
> id              int    %10.0g                 ID
> -------------------------------------------------------------------------------
> Sorted by:  id
>
> . rename id id2
> . append using temp1
> . summ
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>          id2 |       259    5206.846    101.6719       5000       5388
>           id |     11165    5205.994    98.91063       5000       5389
>
> . count if id==. & id2<.
>   259
>
> . count if id2==. & id<.
> 11165
>
> So it appears that there should be 259 + 11165 observations, since 
> both conditions are exclusive. Yet
>
>
> . des
>
> Contains data from temp2.dta
>   obs:        11,424
>  vars:             2                          12 Sep 2012 09:32
>  size:        45,696
>
> -------------------------------------------------------------------------------
>               storage  display     value
> variable name   type   format      label      variable label
> -------------------------------------------------------------------------------
> id2             int    %10.0g                 ID
> id              int    %10.0g                 ID
> -------------------------------------------------------------------------------
> Sorted by:
>      Note:  dataset has changed since last saved
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/


