
From: "Hugh Colaco" <hmjc66@gmail.com>
To: statalist <statalist@hsphsun2.harvard.edu>
Subject: st: Converting SAS code into Stata code
Date: Wed, 10 Dec 2008 08:34:35 -0500

Dear Statalisters,

I was given some code in SAS and need to translate it into Stata. My dataset is in Stata. I have attempted the translation, but would appreciate it if someone would check it.

I don't fully understand the files that the author of the SAS code has created (at the beginning of the code), but the bottom line is that the data consist of the years 2002-2007. I have the same variables listed below for all these years, each year in a separate file. In my Stata translation below, I have used the 2002 data (original02.dta) as an example, but I will do the same for the other years as well. Each file is very big (300MB, on average), so I'd rather treat each one separately. I am using Stata 10.

SAS code

    libname tmp1 'c:\original';

    data tr1; set tmp1.original1;
    data tr22; set tmp1.original2;
    data tr33; set tmp1.original3;
    data tmp1.original0207;
      set tmp1.original0203 tmp1.original04 tmp1.original05 tmp1.original06;

    /* create v2 variable & recode largest values */
    data original;
      set tr1 tr22 tr33;
      if v1='5MM+' then v1='5000000';
      if v1='1MM+' then v1='1000000';

    /* remove v1 under 100k */
    data original;
      set original;
      v2=input(v1,8.);
      if v2>=100000;
    run;

    data original; set original;
    proc sort nodupkey; by v3 v4 v5 v6 v7;

    /* remove canceled */
    data canceled (keep= v8 v9 v10);
      set original;
      if v8='C';
    data canceled (drop=v8);
      set canceled;
      rename v9=v4;
      x=1;
    run;
    proc sort data=canceled; by v10 v4;
    proc sort data=original; by v10 v4;
    data original;
      merge original canceled;
      by v10 v4;
      if x=1 then delete;
      if v8='C' then delete;

    /* remove corrected */
    data corrected (keep= v8 v9 v10);
      set original;
      if v8='W';
    data corrected (drop=v8);
      set corrected;
      rename v9=v4;
      x=1;
    run;
    proc sort data=corrected; by v10 v4;
    data original;
      merge original corrected;
      by v10 v4;
      if x=1 then delete;
    run;

    /* remove price values */
    data original;
      set original;
      if v11 = 'N';
    run;

    /* create a file with the cleaned original data */
    data tmp1.original_clean100k;
      set original;
    run;

Equivalent Stata code
    #delimit ;
    use "C:\original02.dta", clear;
    replace v1="5000000" if v1=="5MM+";
    replace v1="1000000" if v1=="1MM+";
    destring v1, gen(v2);
    keep if v2>=100000;
    sort v3 v4 v5 v6 v7;
    duplicates drop v3 v4 v5 v6 v7, force;
    save temp, replace;

    keep if v8=="C";
    keep v9 v10;
    rename v9 v4;
    gen x=1;
    sort v10 v4;
    save temp1, replace;
    use temp, clear;
    sort v10 v4;
    merge v10 v4 using temp1;
    drop if x==1 | v8=="C";

    keep if v8=="W";
    keep v9 v10;
    rename v9 v4;
    gen x=1;
    sort v10 v4;
    save temp2, replace;
    use temp, clear;
    sort v10 v4;
    merge v10 v4 using temp2;
    drop if x==1;

    keep if v11 == "N";
    save original02_clean100k, replace;

Thanks in advance,
--
Hugh
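One point that may be worth checking in the Stata translation above: the second `use temp, clear` reloads the data as they were before the canceled records were dropped, whereas the SAS code keeps working on the already-cleaned `original` dataset. A minimal sketch of one way to carry the cleaned data forward is given here. It assumes the old-style `merge varlist using` syntax of Stata 10, that `temp` and `temp1` have already been created as in the post, and a hypothetical intermediate file `temp3` that does not appear in the original code.

```stata
* Sketch only; temp3 is a hypothetical file name, not part of the original post.
* Assumes temp (deduplicated data) and temp1 (canceled keys, sorted by v10 v4)
* were saved as in the translation above.
use temp, clear
sort v10 v4
merge v10 v4 using temp1
drop if x==1 | v8=="C"
drop x _merge                 // clear the helper variables before the next merge
save temp3, replace           // keep the result of the canceled-record removal

* Build the corrected keys from the cleaned data, as the SAS code does
keep if v8=="W"
keep v9 v10
rename v9 v4
gen x=1
sort v10 v4
save temp2, replace

* Merge against the cleaned data (temp3) rather than the pre-cleaning file (temp)
use temp3, clear
sort v10 v4
merge v10 v4 using temp2
drop if x==1
drop x _merge
keep if v11=="N"
save original02_clean100k, replace
```

Dropping `x` and `_merge` after the first merge matters in this version: the second `merge` would otherwise stop because `_merge` already exists, and without the `update` option Stata keeps the master's (missing) `x` rather than taking the value from `temp2`. Unlike the SAS output, the final file here does not retain `x`.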

**Follow-Ups**: **Re: st: Converting SAS code into Stata code** *From:* "Scott Merryman" <scott.merryman@gmail.com>
