|Title||Leap year indicators|
|Author||Nicholas J. Cox, Durham University, UK|
|Date||January 2004; minor revisions February 2014|
How do you determine if a year is a leap year using Stata syntax? For example, you know that 2004 was a leap year, but how would you handle leap years generally? Even if you are not especially interested in leap years, the answer provides good examples of some key Stata functions, so do read on.
The rules in the Gregorian calendar for a year to be a leap year are
YES if 1. year divisible by 4 (but NO if 2. year divisible by 100 (but YES if 3. year divisible by 400)) and NO otherwise.
Note the nesting of rules. If a leap year is a first-order correction, the third rule is an example of a third-order correction. Scientists and engineers often use such ideas, but they seem in shorter supply in the everyday world.
You will be familiar with rule 1, but rules 2 and 3 are occasionally forgotten. For example, Excel treats 1900 as a leap year, even though rule 2 says it was not; it is documented that this was done to provide compatibility with Lotus 1-2-3.
In Stata, suppose that year is a variable. An indicator containing 1 for leap year and 0 otherwise is then given by
(mod(year,4) == 0 & mod(year,100) != 0) | mod(year,400) == 0
as mod(,) provides the remainder left over from division. Alternatively, we could use cond(,,):
cond(mod(year,400) == 0, 1, cond(mod(year,100) == 0, 0, cond(mod(year,4) == 0, 1, 0)))
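As a minimal sketch of how the two expressions above might be used and cross-checked, the following do-file fragment builds a small dataset and generates both indicators. The example years and the variable names leap1 and leap2 are illustrative assumptions, not part of the original FAQ. The /// line continuation works in do-files, not at the interactive command line.

```stata
* Sketch only: the years and the names leap1/leap2 are illustrative
clear
input year
1900
2000
2003
2004
2100
end

* Indicator from the logical expression: 1 if leap year, 0 otherwise
generate byte leap1 = (mod(year,4) == 0 & mod(year,100) != 0) | mod(year,400) == 0

* Same indicator from nested cond() calls, most specific rule first
generate byte leap2 = cond(mod(year,400) == 0, 1, ///
                      cond(mod(year,100) == 0, 0, ///
                      cond(mod(year,4)   == 0, 1, 0)))

* The two definitions should agree on every observation
assert leap1 == leap2
list year leap1 leap2
```

Note that both expressions evaluate to 0, not missing, when year is missing, because a comparison such as mod(year,4) == 0 is simply false for a missing argument. If year may contain missing values, adding a qualifier such as if !missing(year) to each generate would leave the indicator missing there instead.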
The layout of such code may be important to you, if not to Stata. In such expressions, to make sure you have balanced parentheses, exploit the pertinent function in a decent text editor. In the Stata Do-file Editor, it is "Balance", Ctrl-B; in Vim, it is the % key; etc.
Given daily dates, a feature of a leap year is clearly that there is a February 29, so
mdy(2,29,2004) < .
is true. That is, mdy(2,29,whenever) is missing if whenever is not a leap year but is nonmissing otherwise; in the latter case, it is less than missing. You could also do that as
mdy(2,29,2004) != .
Similarly, there are 366 days in a leap year, so
doy(mdy(12,31,2004)) == 366
is true. We could vectorize both of those calculations to say
mdy(2,29,year) < .
doy(mdy(12,31,year)) == 366
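The vectorized forms above can be tried out along the following lines. This is a sketch under stated assumptions: the range of years and the variable names leap_feb29 and leap_doy are made up for illustration.

```stata
* Sketch only: the year range and variable names are illustrative
clear
set obs 201
generate int year = 1899 + _n          // years 1900 through 2100

* 1 if February 29 exists in that year, 0 otherwise
generate byte leap_feb29 = mdy(2,29,year) < .

* 1 if December 31 is day 366 of that year, 0 otherwise
generate byte leap_doy = doy(mdy(12,31,year)) == 366

* The two date-based indicators should agree on every observation
assert leap_feb29 == leap_doy
tabulate leap_feb29
```

Both indicators rely on mdy() returning missing for an impossible date, so they also return 0 for any year that mdy() cannot handle, including a missing year.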
Dershowitz and Reingold (2008) provide definitive explanations of calendrical calculations.