Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: From weekly to monthly data


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: From weekly to monthly data
Date   Fri, 29 Jun 2012 11:27:54 +0100

In a concurrent thread, two people are asking questions about
conversion from weekly to monthly data. I don't see that either is
asking a precise question, and the main purpose of this posting is to
try to explain what would be a precise question under that heading.
For a variety of reasons it seems best not to engage directly with
those posts. I am also pitching this more widely in the hope of
writing something that might be helpful outside current threads.

Let's go back to basics. Note I assume here the conventional Gregorian
calendar, but even with that restriction matters are more complicated
than might be wished.

What is a week? This is the heart of the matter. A smart
seven-year-old will tell you that

Smart 1. A week is seven days long.

Smart 2. It begins on .... Or equivalently, it ends on ....

But what will they say to close #2?

Before we get to that, not even #1 is correct everywhere! In
particular, Stata has an idiosyncratic definition of weeks as far as
weekly dates are concerned, with rules

Stata 1. Week 1 of any given calendar year begins on 1 January of that
year, regardless. So week 2 starts on 8 January, and so forth.

Stata 2. There are only 52 weeks of the year, regardless. So day 365
of a year is always assigned to week 52. And day366 is also so
assigned whenever it occurs, within a leap year. So week 52 is 8 or 9
days long, always.

. di %tw  wofd(mdy(6,29,2012))
2012w26

. di %tw  wofd(mdy(1,1,2012))
 2012w1

. di %tw  wofd(mdy(1,7,2012))
 2012w1

. di %tw  wofd(mdy(1,8,2012))
 2012w2

. di %tw  wofd(mdy(12,30,2012))
2012w52

. di %tw  wofd(mdy(12,31,2012))
2012w52

These are likely to seem odd definitions, but they are Stata's
definitions. They are most unlikely to match "weeks" as defined by any
data-producing body outside StataCorp. Even if they do, you are not
out of the woods yet.

Going back to #2, Stata has a concept that a week starts on Sunday --
which is a convention familiar to many of us. It is embedded in
Stata's function -dow()- (which note is a completely different beast
from -dofw()-).

Day of week is defined by Stata using the rules 0 = Sunday, 1 =
Monday, ..., 6 = Saturday. If you are reading this at any later date
know that 29 June 2012 was a Friday, 24 June 2012 a Sunday, and so
forth.

. di  dow(mdy(6,29,2012))
5

. di  dow(mdy(6,24,2012))
0

. di  dow(mdy(6,25,2012))
1

Note that this doesn't correspond  in general to the definition of
weeks used for weekly dates. Stata's weekly dates. as defined (e.g.)
by -wofd()-,  will change whenever -dow()- flips back to 0 if and only
if 1 January of a year is a Sunday, and even in such years either 1 or
2 days at the end of the year will be awkward.

Let's start focusing on the practical consequences of all this. This
is likely to sound utterly elementary but it is precisely what is
needed to devise solutions in this territory.

Practice 1. Weeks don't all nest within years _unless_ you use an
idiosyncratic definition such as Stata's which is designed to ensure
that that is so. By nesting I mean that every week is entirely
contained within a single year. With every other definition
(particularly, any based on the idea that a particular day of the week
starts the week), weeks at the beginning and the end of the year are
likely to span different years.

Practice 2. Weeks don't all nest within months. Weeks at the beginning
and the end of the month are likely to cross months (and as above
often years too). The only exceptions are when 28 days in February
happen to include exactly 4 weeks according to whatever definition of
week you are using. Even then that can only be occasional and not the
basis of any conversion.  (Note that this can never happen with
Stata's weekly dates as 1 February is _always_ within week 5 of the
year.)

Practice 3. Outside Stata it seems that definitions of weeks typically
involve starting on a particular day of the week and finishing 6 days
later. If follows from everything so far that analysing weekly data
defined in those terms will need to be based directly or indirectly on
-dow()-. This was spelled out at length within

SJ-10-4 dm0052  . . . . . . . . . . . . . . . . Stata tip 68: Week assumptions
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q4/10   SJ 10(4):682--685                                (no commands)
        tip on Stata's solution for weeks and on how to set up
        your own alternatives given different definitions of the
        week

It also follows that Stata's weekly dates will not help at all _unless
exceptionally_ your data were gathered with precisely Stata's
definition. Some advice implicitly to the contrary is thus bad advice.

Those wanting to convert somehow from weekly data to monthly data,
whether they are producing totals or averages (or anything else), will
need to be explicit about the answers to various questions:

Question set 1.

What precisely is the definition of week used in your data?
When does a week start (or end)?
Are weeks allowed to span years?
Does anything odd happen at the end or beginning of the year?

The answers to these cannot be assumed obvious to anyone _unless_ they
happen to be using the same dataset as you and are familiar with its
documentation. If the questioner doesn't know, it is likely that no
one else can fill in.

Question set 2.

When you want to reduce weekly data to monthly, are you meaning the
calendar months January to December?

If not, you need to spell out your definition of month.

If so, what are your rules for splitting weeks that span months? For example

Split 0. Weeks are not to be split, but assigned as a whole to a
month, if the beginning of the week falls within that month, or the
end of the week ditto, or a majority of days in the week ditto.

Split 1. Weeks are to be split, with weights according to how many
days of the week fall within each month, or weights according to how
many "working" days fall within each month, with a definition of
"working".

If nothing else, I hope to have indicated that conversion from weekly
to monthly data is not just a matter of someone saying they want to do
it and then someone else just giving them code that does it. Also,
it's likely that there are yet other complications with particular
kinds of data that I have forgotten to mention or never knew in the
first place.

Finally, at the risk of being thought especially blunt if not brutal,
I should spell out that I am not volunteering to write code for
anybody's conversions _even if_ they answer all the questions above.
Someone else may want to do that, but it is more likely that people
wanting to do this will need to write their own code, as in this area
there is, in my experience, much fiddliness and no fun.

Nick
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index