How do I identify leap years in Stata?

Title   Leap year indicators
Author Nicholas J. Cox, Durham University, UK
Date January 2004; minor revisions February 2014

How do you determine if a year is a leap year using Stata syntax? For example, you know that 2004 was a leap year, but how would you handle leap years generally? Even if you are not especially interested in leap years, the answer provides good examples of some key Stata functions, so do read on.

The rules in the Gregorian calendar for a year to be a leap year are

        YES if      1. year divisible by 4 
        
        (but NO if  2. year divisible by 100 
        
        (but YES if 3. year divisible by 400)) 
        
        and NO otherwise.

Note the nesting of rules. If a leap year is a first-order correction, the third rule is an example of a third-order correction. Scientists and engineers often use such ideas, but they seem in shorter supply in the everyday world.

You will be familiar with rule 1, but rules 2 and 3 are occasionally forgotten. For example, Excel treats 1900 as a leap year, although under rule 2 it was not; Microsoft documents that this was done to provide compatibility with Lotus 1-2-3.

In Stata, suppose that year is a variable. An indicator containing 1 for leap year and 0 otherwise is then given by

        (mod(year,4) == 0 & mod(year,100) != 0) | mod(year,400) == 0

as mod(,) provides the remainder left over from division. Alternatively, we could use cond(,,):

        cond(mod(year,400) == 0, 1, 
        cond(mod(year,100) == 0, 0, 
        cond(mod(year,4)   == 0, 1, 
                                 0)))
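
As a minimal sketch, the two expressions can be checked against each other on a few years whose status follows from the rules above. The variable names leap1 and leap2 and the choice of test years are illustrative only; run this from a do-file so that the /// line continuations are recognized.

        clear
        input year
        1900
        2000
        2003
        2004
        2100
        end

        // logical expression: divisible by 4 but not by 100, or divisible by 400
        generate byte leap1 = (mod(year,4) == 0 & mod(year,100) != 0) | mod(year,400) == 0

        // nested cond(): the most specific rule is tested first
        generate byte leap2 = cond(mod(year,400) == 0, 1, ///
                              cond(mod(year,100) == 0, 0, ///
                              cond(mod(year,4)   == 0, 1, ///
                                                       0)))

        // the two indicators should agree on every observation
        assert leap1 == leap2
        list year leap1 leap2

By the rules, 2000 and 2004 should be flagged 1, and 1900, 2003, and 2100 should be flagged 0.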

The layout of such code may be important to you, if not to Stata. In such expressions, to make sure you have balanced parentheses, exploit the pertinent function in a decent text editor. In the Stata Do-file Editor, it is "Balance", Ctrl-B; in Vim, it is the % key; etc.

Given daily dates, a feature of a leap year is clearly that there is a February 29, so

        mdy(2,29,2004) < .

or

        mdy(2,29,2004) != .

is true. That is, mdy(2,29,whenever) is missing if whenever is not a leap year and nonmissing otherwise. Because Stata treats numeric missing as larger than any nonmissing number, a nonmissing date is less than missing. You could also do that as

        !mi(mdy(2,29,2004))

Similarly, there are 366 days in a leap year, so

        doy(mdy(12,31,2004)) == 366

is true. We could vectorize both of those calculations to say

        mdy(2,29,year) < .

or

        doy(mdy(12,31,year)) == 366
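
A short sketch of how the date-based indicators might be checked against the arithmetic rule follows. The variable names and the range 1600 to 1999 are chosen for illustration; any 400-year cycle of the Gregorian calendar contains 100 - 4 + 1 = 97 leap years, which gives a simple test.

        clear
        set obs 400
        generate year = 1599 + _n        // years 1600 to 1999

        // arithmetic rule
        generate byte leap_mod = (mod(year,4) == 0 & mod(year,100) != 0) | mod(year,400) == 0

        // date-function versions: is there a February 29, and are there 366 days?
        generate byte leap_mdy = mdy(2,29,year) < .
        generate byte leap_doy = doy(mdy(12,31,year)) == 366

        // all three indicators should agree
        assert leap_mod == leap_mdy
        assert leap_mod == leap_doy

        // 97 leap years in a 400-year cycle
        count if leap_mod
        assert r(N) == 97

Note that the date-based versions work only for years that Stata's date functions accept, whereas the mod() version is plain arithmetic and has no such restriction.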

Dershowitz and Reingold (2008) provide definitive explanations of calendrical calculations.

Reference

Dershowitz, N. and E.M. Reingold. 2008.
Calendrical Calculations. Cambridge: Cambridge University Press.