RE: st: Converting a SAS datastep to Stata


From   Daniel Feenberg <feenberg@nber.org>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Converting a SAS datastep to Stata
Date   Wed, 15 Dec 2010 16:33:37 -0500 (EST)


On Wed, 15 Dec 2010, Nick Cox wrote:

If I understand that SAS code correctly, and I've never used SAS in my life, an equivalent would be

gen lvalue1 = expr1 if flpdyr > 1993 & flpdyr < 1998
gen lvalue2 = expr2 if flpdyr > 1993 & flpdyr < 1998

In fact

if flpdyr > 1993 & flpdyr < 1998

could be translated to

if inrange(flpdyr, 1994, 1997)

which isn't much shorter but is likely to match the way you think more closely. (Here I am taking it from context that year variables take integer values only.)
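
A minimal sketch of that equivalence, assuming a toy dataset with a single integer variable flpdyr (the values and the names a and b are made up for illustration). It checks that the two conditions agree observation by observation, including on a missing year:

```stata
* Toy data (hypothetical values), including boundary years and a missing value
clear
input flpdyr
1993
1994
1997
1998
.
end

* Indicator built from the pair of inequalities
gen byte a = flpdyr > 1993 & flpdyr < 1998

* Indicator built from inrange(), which tests 1994 <= flpdyr <= 1997
gen byte b = inrange(flpdyr, 1994, 1997)

* Stops with an error if the two indicators ever differ
assert a == b
```

The two forms agree only because the year is an integer; with fractional values, the strict inequalities and the closed interval of inrange() would select different observations.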

Yes, integers only, and the range statement is very clear. However, consider that there are 18 lines of code for calculating the tax on capital gains income in 2003, then 15 lines used only for 2004, and so on for 21 years. While I personally blame Congress for the frequent tax law changes, that isn't relevant for this mailing list.

Here is the SAS code for capital gains under the alternative minimum tax for a single year:

  if FLPDYR eq 2003 then do;
     _amt5pc = min(c24533,min(c24532,min(c62700,c24517)));
     _amt5pc = max(0,_amt5pc);
     c62747 = .05*_amt5pc;
     _line49 = max(0,min(c24532,min(c24517,c62700)) - _amt5pc);
     _line50 = sum(e24583,0);
     _amt8pc = min(_line49,_line50);
     c62749 = .08*_amt8pc;
     _amt10pc = _line49 - _amt8pc;
     c62750 =  .1*_amt10pc;
     _line55 = c24533 - _amt5pc;
     _line56 = min(c24517,c62700) - min(c24532,min(c24517,c62700));
     _amt15pc = min(_line55,_line56);
     c62755 =  .15*_amt15pc;
     _amt20pc = _line56 - _amt15pc;
     c62760 =  .2*_amt20pc;
     _amt25pc = min(c62700,min(c24517+e24515,c24516))-min(c62700,c24517);
     c62770 =  .25*_amt25pc;
     _tamt2  = c62747 + c62749 + c62750 + c62755 + c62760 + c62770;
   end;

[The purpose of the code is to tax different assets at different rates, where the rates also depend on the taxpayer's income, including capital gains, and with non-symmetric treatment of losses.] I think this translates into the following Stata code:

     * Assumes none of these variables exists yet; values are missing outside 2003
     gen _amt5pc = min(c24533,min(c24532,min(c62700,c24517))) if FLPDYR == 2003
     replace _amt5pc = max(0,_amt5pc) if FLPDYR == 2003
     gen c62747 = .05*_amt5pc if FLPDYR == 2003
     gen _line49 = max(0,min(c24532,min(c24517,c62700)) - _amt5pc) if FLPDYR == 2003
     * rowtotal() is an egen function; cond() reproduces SAS sum(e24583,0)
     gen _line50 = cond(missing(e24583), 0, e24583) if FLPDYR == 2003
     gen _amt8pc = min(_line49,_line50) if FLPDYR == 2003
     gen c62749 = .08*_amt8pc if FLPDYR == 2003
     gen _amt10pc = _line49 - _amt8pc if FLPDYR == 2003
     gen c62750 =  .1*_amt10pc if FLPDYR == 2003
     gen _line55 = c24533 - _amt5pc if FLPDYR == 2003
     gen _line56 = min(c24517,c62700) - min(c24532,min(c24517,c62700)) if FLPDYR == 2003
     gen _amt15pc = min(_line55,_line56) if FLPDYR == 2003
     gen c62755 =  .15*_amt15pc if FLPDYR == 2003
     gen _amt20pc = _line56 - _amt15pc if FLPDYR == 2003
     gen c62760 =  .2*_amt20pc if FLPDYR == 2003
     gen _amt25pc = min(c62700,min(c24517+e24515,c24516))-min(c62700,c24517) if FLPDYR == 2003
     gen c62770 =  .25*_amt25pc if FLPDYR == 2003
     gen _tamt2  = c62747 + c62749 + c62750 + c62755 + c62760 + c62770 if FLPDYR == 2003

Repeating the if qualifier means re-evaluating the same condition on every statement, which is an inefficiency, but it also means repeating the code, which is ugly and distracting. That is why I asked about the possibility of a block-level if qualifier. If it doesn't exist, I'll put it in W. Gould's suggestion box.
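
One common way to cut the visual repetition, short of a true block-level qualifier, is to store the qualifier once in a local macro and append it to each statement. The following is a minimal sketch, assuming it is run as a single do-file (local macros do not survive across separately executed selections); the macro name y03 and the toy values are hypothetical, and only the first two statements of the 2003 block are shown:

```stata
* Toy data (hypothetical values) with two tax years
clear
input FLPDYR c24517 c24532 c24533 c62700
2003 100 40 60 80
2004 100 40 60 80
end

* Store the year qualifier once; y03 is an illustrative name
local y03 "if FLPDYR == 2003"

* Each statement reuses the stored qualifier via macro expansion
gen _amt5pc = min(c24533,min(c24532,min(c62700,c24517))) `y03'
replace _amt5pc = max(0,_amt5pc) `y03'
gen c62747 = .05*_amt5pc `y03'

* 2003 rows are filled in; 2004 rows stay missing
list
```

This shortens the code but does not remove the inefficiency: Stata still evaluates the condition on every statement. An alternative, under the same assumptions, would be to run each year's block on a subset made with keep if and then append the yearly results.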

One thing I could do is allow more complex assignment statements, with fewer of the intermediate values that are used to clarify purpose and show the correspondence to the tax form. That could reduce the number of statements by half but is otherwise undesirable.

Daniel Feenberg