Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# RE: st: Converting a SAS datastep to Stata

From: Daniel Feenberg <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: RE: st: Converting a SAS datastep to Stata
Date: Wed, 15 Dec 2010 16:33:37 -0500 (EST)

On Wed, 15 Dec 2010, Nick Cox wrote:

If I understand that SAS code correctly, and I've never used SAS in my life, an equivalent would be
```
gen lvalue1 = expr1 if flpdyr > 1993 & flpdyr < 1998
gen lvalue2 = expr2 if flpdyr > 1993 & flpdyr < 1998
```
In fact
```
if flpdyr > 1993 & flpdyr < 1998
```
could be translated to
```
if inrange(flpdyr, 1994, 1997)
```
which isn't much shorter but is likely to match the way you think more closely. (Here I am taking it from context that year variables take integer values only.)
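As a quick check of that equivalence, here is a minimal sketch on made-up integer years (the variable names `by_ineq` and `by_inrange` are hypothetical, not from the original code). It also covers a missing year, which both forms should exclude.

```stata
* Toy data: boundary years plus a missing value (made-up values)
clear
input flpdyr
1993
1994
1997
1998
.
end

* Flag the 1994-1997 range both ways
generate byte by_ineq    = flpdyr > 1993 & flpdyr < 1998
generate byte by_inrange = inrange(flpdyr, 1994, 1997)

* For integer-valued years the two flags should agree on every row
assert by_ineq == by_inrange
list
```

The two forms would differ only if a year could take a non-integer value such as 1993.5, which is why the integer assumption matters.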
Yes, integers only, and the range statement is very clear. However, consider that there are 18 lines of code for calculating the tax on capital gains income in 2003, then 15 lines used only for 2004, etc., for 21 years. While I personally blame Congress for the frequent tax law changes, that isn't relevant for this mailing list.
Here is the SAS code for capital gains under the alternative minimum tax for a single year:
```
if FLPDYR eq 2003 then do;
_amt5pc = min(c24533,min(c24532,min(c62700,c24517)));
_amt5pc = max(0,_amt5pc);
c62747 = .05*_amt5pc;
_line49 = max(0,min(c24532,min(c24517,c62700)) - _amt5pc);
_line50 = sum(e24583,0);
_amt8pc = min(_line49,_line50);
c62749 = .08*_amt8pc;
_amt10pc = _line49 - _amt8pc;
c62750 =  .1*_amt10pc;
_line55 = c24533 - _amt5pc;
_line56 = min(c24517,c62700) - min(c24532,min(c24517,c62700));
_amt15pc = min(_line55,_line56);
c62755 =  .15*_amt15pc;
_amt20pc = _line56 - _amt15pc;
c62760 =  .2*_amt20pc;
_amt25pc = min(c62700,min(c24517+e24515,c24516))-min(c62700,c24517);
c62770 =  .25*_amt25pc;
_tamt2  = c62747 + c62749 + c62750 + c62755 + c62760 + c62770;
end;

```

[The purpose of the code is to tax different assets at different rates, where the rates also depend on the taxpayer income, including capital gains, and with non-symmetric treatment of losses]. I think this translates into the following Stata code:
```
* Assumes none of the assigned variables exist yet; use replace instead of gen if they do.
* Observations outside 2003 are left missing.
gen _amt5pc = min(c24533,min(c24532,min(c62700,c24517))) if FLPDYR == 2003
replace _amt5pc = max(0,_amt5pc) if FLPDYR == 2003
gen c62747 = .05*_amt5pc if FLPDYR == 2003
gen _line49 = max(0,min(c24532,min(c24517,c62700)) - _amt5pc) if FLPDYR == 2003
* egen's rowtotal() treats a missing e24583 as 0, like SAS sum(e24583,0)
egen _line50 = rowtotal(e24583) if FLPDYR == 2003
gen _amt8pc = min(_line49,_line50) if FLPDYR == 2003
gen c62749 = .08*_amt8pc if FLPDYR == 2003
gen _amt10pc = _line49 - _amt8pc if FLPDYR == 2003
gen c62750 =  .1*_amt10pc if FLPDYR == 2003
gen _line55 = c24533 - _amt5pc if FLPDYR == 2003
gen _line56 = min(c24517,c62700) - min(c24532,min(c24517,c62700)) if FLPDYR == 2003
gen _amt15pc = min(_line55,_line56) if FLPDYR == 2003
gen c62755 =  .15*_amt15pc if FLPDYR == 2003
gen _amt20pc = _line56 - _amt15pc if FLPDYR == 2003
gen c62760 =  .2*_amt20pc if FLPDYR == 2003
gen _amt25pc = min(c62700,min(c24517+e24515,c24516))-min(c62700,c24517) if FLPDYR == 2003
gen c62770 =  .25*_amt25pc if FLPDYR == 2003
gen _tamt2  = c62747 + c62749 + c62750 + c62755 + c62760 + c62770 if FLPDYR == 2003

```
Repeating the if qualifier means re-evaluating the same condition on every statement, which is an inefficiency, but it also means repeating the code, which is ugly and distracting. That is why I asked about the possibility of a block-level if qualifier. If it doesn't exist, I'll put it in W Gould's suggestion box.
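One partial workaround, short of a true block-level qualifier, is to store the qualifier once in a local macro and append it to each statement. The sketch below uses made-up amounts and only the first few lines of the 2003 calculation; the macro name `y2003` is hypothetical. It reduces the visual repetition but not the repeated evaluation of the condition.

```stata
* Toy data: two returns with identical amounts in different tax years (made-up values)
clear
input FLPDYR c24533 c24532 c62700 c24517
2003 100 80 500 300
2004 100 80 500 300
end

* Store the qualifier once; y2003 is a hypothetical macro name
local y2003 "if FLPDYR == 2003"

* Each statement picks up the qualifier through macro expansion
generate double _amt5pc = min(c24533, c24532, c62700, c24517) `y2003'
replace _amt5pc = max(0, _amt5pc) `y2003'
generate double c62747 = .05*_amt5pc `y2003'

list FLPDYR _amt5pc c62747
```

Because local macros last only for the do-file or program that defines them, the definition and the statements that use it need to be run together.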
One thing I could do is allow more complex assignment statements, with fewer of the intermediate values that are used to clarify purpose and show the correspondence to the tax form. That could reduce the number of statements by half but is otherwise undesirable.
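To illustrate that trade-off, here is a small sketch on made-up amounts that computes c62747 both step by step and as a single nested expression. The variable `c62747_alt` is a hypothetical name used only for the comparison.

```stata
* Toy data: one return with a positive minimum, one with a negative one (made-up values)
clear
input FLPDYR c24533 c24532 c62700 c24517
2003 100  80 500 300
2003 100 -20 500 300
end

* Step-by-step version, with the intermediate value kept visible
generate double _amt5pc = min(c24533, c24532, c62700, c24517) if FLPDYR == 2003
replace _amt5pc = max(0, _amt5pc) if FLPDYR == 2003
generate double c62747 = .05*_amt5pc if FLPDYR == 2003

* Collapsed version: one statement, no intermediate variable
generate double c62747_alt = .05*max(0, min(c24533, c24532, c62700, c24517)) if FLPDYR == 2003

* Both routes should give the same tax amount
assert c62747 == c62747_alt
```

The collapsed form saves two statements here, but the correspondence to the individual lines of the tax form is lost.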
Daniel Feenberg