Statalist The Stata Listserver



st: RE: Merge question.


From   adiallo5@worldbank.org
To   statalist@hsphsun2.harvard.edu
Subject   st: RE: Merge question.
Date   Wed, 30 Aug 2006 17:58:30 -0400

Sure Nick,

Here is my code:

cd
u "$path\EEEFS II\Fichiers CSB\fs"
destring ident codefs_, replace
mer ident codefs_ using community
ta _m
drop if _m==2
drop _m
cou
so ident codefs_
sa "$path\EEEFS II\Fichiers CSB\fscomm", replace


And I get the results below, whether or not I destring, tostring,
sort with the stable option, etc.


First attempt:

. cd
C:\Documents and Settings\My Documents\Archives\data\Madagasc
> ar HFS\EEEFS II\Fichiers COMMUNAUTAIRE

. u "$path\EEEFS II\Fichiers CSB\fs"

. destring ident codefs_, replace
ident already numeric; no replace
codefs_ already numeric; no replace

. mer ident codefs_ using community
codefs_ was int now long
ident was int now long
(note: case_id is str9 in using data but will be long now)
(label yn already defined)

. ta _m

     _merge |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        173       12.12       12.12
          2 |      1,152       80.73       92.85
          3 |        102        7.15      100.00
------------+-----------------------------------
      Total |      1,427      100.00

. drop if _m==2
(1152 observations deleted)

. drop _m

. cou
  275

. so ident codefs_

. sa "$path\EEEFS II\Fichiers CSB\fscomm", replace
file C:\Documents and Settings\My Documents\Archives\data\Mad
> agascar HFS\\EEEFS II\Fichiers CSB\fscomm.dta saved







Second attempt:

. cd
C:\Documents and Settings\My Documents\Archives\data\Madagasc
> ar HFS\EEEFS II\Fichiers COMMUNAUTAIRE

. u "$path\EEEFS II\Fichiers CSB\fs"

. destring ident codefs_, replace
ident already numeric; no replace
codefs_ already numeric; no replace

. mer ident codefs_ using community
codefs_ was int now long
ident was int now long
(note: case_id is str9 in using data but will be long now)
(label yn already defined)

. ta _m

     _merge |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        176       12.31       12.31
          2 |      1,155       80.77       93.08
          3 |         99        6.92      100.00
------------+-----------------------------------
      Total |      1,430      100.00

. drop if _m==2
(1155 observations deleted)

. drop _m

. cou
  275

. so ident codefs_

. sa "$path\EEEFS II\Fichiers CSB\fscomm", replace
file C:\Documents and Settings\My Documents\Archives\data\Mad
> agascar HFS\\EEEFS II\Fichiers CSB\fscomm.dta saved








Third attempt:

. cd
C:\Documents and Settings\My Documents\Archives\data\Madagasc
> ar HFS\EEEFS II\Fichiers COMMUNAUTAIRE

. u "$path\EEEFS II\Fichiers CSB\fs"

. destring ident codefs_, replace
ident already numeric; no replace
codefs_ already numeric; no replace

. mer ident codefs_ using community
codefs_ was int now long
ident was int now long
(note: case_id is str9 in using data but will be long now)
(label yn already defined)

. ta _m

     _merge |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        181       12.61       12.61
          2 |      1,160       80.84       93.45
          3 |         94        6.55      100.00
------------+-----------------------------------
      Total |      1,435      100.00

. drop if _m==2
(1160 observations deleted)

. drop _m

. cou
  275

. so ident codefs_

. sa "$path\EEEFS II\Fichiers CSB\fscomm", replace
file C:\Documents and Settings\My Documents\Archives\data\Mad
> agascar HFS\\EEEFS II\Fichiers CSB\fscomm.dta saved



etc...







Interactively:


. use "C:\Documents and Settings\My Documents\Archives\data\M
> adagascar HFS\EEEFS II\Fichiers COMMUNAUTAIRE\community.dta", clear

. so ident codefs_

. mer ident codefs_ using "C:\Documents and Settings\My Docum
> ents\Archives\data\Madagascar HFS\EEEFS II\Fichiers CSB\fs.dta"
(note: case_id is long in using data but will be str9 now)
(label yn already defined)

. ta _m

     _merge |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |      1,160       80.84       80.84
          2 |        181       12.61       93.45
          3 |         94        6.55      100.00
------------+-----------------------------------
      Total |      1,435      100.00


(This is stable, both ways. I have made minor changes to my data, so now I have 94
matches instead of 96.)



At the moment I am working interactively, but any help would
be most welcome.
Best regards.
Amadou.
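
One check that may be worth running before the merge is whether ident and codefs_ together uniquely identify observations in each file, and whether each file is sorted on them. The sketch below is only an illustration under that assumption; nothing in this thread confirms that duplicate or unsorted keys cause the varying counts. The toy values and the tempfile names (fs_toy, comm_toy) are hypothetical stand-ins for fs.dta and community.dta, and the -unique- option belongs to the older -merge- syntax used above.

```stata
* Toy stand-in for the master file fs.dta (values are hypothetical)
clear
input ident codefs_ x
1 10 5
1 11 6
2 10 7
end
isid ident codefs_              // stops with an error if the key is not unique
sort ident codefs_
tempfile fs_toy
save `fs_toy'

* Toy stand-in for the using file community.dta (values are hypothetical)
clear
input ident codefs_ y
1 10 1
2 10 2
3 12 3
end
duplicates report ident codefs_ // reports any surplus copies of a key
isid ident codefs_
sort ident codefs_
tempfile comm_toy
save `comm_toy'

* Merge on the checked, sorted key; -unique- makes -merge- refuse
* to proceed if either file has repeated keys
use `fs_toy', clear
merge ident codefs_ using `comm_toy', unique
tab _merge
```

If -isid- fails on either real file, -duplicates list ident codefs_- would show which keys are repeated.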



-------------------------------------------------------------------------------------
Nick wrote:

                                                                                 
                                                                                 
                                                                                 
 Without ruling out the possibility that a -merge-                               
 expert can give you useful advice, this still                                   
 looks like a guessing game in which guessing is no                              
 fun.                                                                            
                                                                                 
 You give us lots of details, but still nothing                                  
 concrete about your datasets or your .do file.                                  
                                                                                 
 A small version in which your problem is evident                                
 is the ideal here.                                                              
                                                                                 
 Naturally, I realise that you are inhibited by                                  
 the Statalist rule of not sending attachments,                                  
 but there are alternatives:                                                     
                                                                                 
 0. Include a listing of your .do file.                                          
                                                                                 
 1. Contact tech support at StataCorp.                                           
                                                                                 
 2. Put the files on a website so that anyone                                    
 interested can download.                                                        
                                                                                 
 3. Offer to send the files to volunteer testers                                 
 (not me).                                                                       
                                                                                 
 Nick                                                                            
 n.j.cox@durham.ac.uk                                                            
                                                                                 
 adiallo5@worldbank.org                                                          
                                                                                 
 >  I am trying to merge 2 datasets.                                             
 >  But every time, I get different results
 >  (_m==3 has 83 observations the
 >  first time, 97 the second, 100 the
 >  third, 96 the fourth, and so on).
 >  I tried to set the seed and make my sort stable,
 >  with no success. I also tried to recast my merging
 >  identifier to double. No success. I tried to
 >  tostring it. No success either.
 >  Any hints why I obtain these various results?                                
 >  I verified in both Stata and Excel.                                          
 >  I do not understand why Stata assigned _m==3 to some
 >  observations that belong to both datasets in the
 >  first run but not in the second.
 >  Best regards.                                                                
 >  Amadou.                                                                      
 >                                                                               
 >  PS: When I work interactively, I do not have that problem.
 >  I have 96 observations that matched. So what am I doing
 >  wrong in my Stata do-file?
                                                                                 
 ------------------------------------------------------------------------------- 




