Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org** is already up and running.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
Nick Cox <njcoxstata@gmail.com> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
st: From weekly to monthly data |

Date |
Fri, 29 Jun 2012 11:27:54 +0100 |

In a concurrent thread, two people are asking questions about conversion from weekly to monthly data. I don't see that either is asking a precise question, and the main purpose of this posting is to try to explain what would be a precise question under that heading. For a variety of reasons it seems best not to engage directly with those posts. I am also pitching this more widely in the hope of writing something that might be helpful outside current threads. Let's go back to basics. Note I assume here the conventional Gregorian calendar, but even with that restriction matters are more complicated than might be wished. What is a week? This is the heart of the matter. A smart seven-year-old will tell you that Smart 1. A week is seven days long. Smart 2. It begins on .... Or equivalently, it ends on .... But what will they say to close #2? Before we get to that, not even #1 is correct everywhere! In particular, Stata has an idiosyncratic definition of weeks as far as weekly dates are concerned, with rules Stata 1. Week 1 of any given calendar year begins on 1 January of that year, regardless. So week 2 starts on 8 January, and so forth. Stata 2. There are only 52 weeks of the year, regardless. So day 365 of a year is always assigned to week 52. And day366 is also so assigned whenever it occurs, within a leap year. So week 52 is 8 or 9 days long, always. . di %tw wofd(mdy(6,29,2012)) 2012w26 . di %tw wofd(mdy(1,1,2012)) 2012w1 . di %tw wofd(mdy(1,7,2012)) 2012w1 . di %tw wofd(mdy(1,8,2012)) 2012w2 . di %tw wofd(mdy(12,30,2012)) 2012w52 . di %tw wofd(mdy(12,31,2012)) 2012w52 These are likely to seem odd definitions, but they are Stata's definitions. They are most unlikely to match "weeks" as defined by any data-producing body outside StataCorp. Even if they do, you are not out of the woods yet. Going back to #2, Stata has a concept that a week starts on Sunday -- which is a convention familiar to many of us. It is embedded in Stata's function -dow()- (which note is a completely different beast from -dofw()-). Day of week is defined by Stata using the rules 0 = Sunday, 1 = Monday, ..., 6 = Saturday. If you are reading this at any later date know that 29 June 2012 was a Friday, 24 June 2012 a Sunday, and so forth. . di dow(mdy(6,29,2012)) 5 . di dow(mdy(6,24,2012)) 0 . di dow(mdy(6,25,2012)) 1 Note that this doesn't correspond in general to the definition of weeks used for weekly dates. Stata's weekly dates. as defined (e.g.) by -wofd()-, will change whenever -dow()- flips back to 0 if and only if 1 January of a year is a Sunday, and even in such years either 1 or 2 days at the end of the year will be awkward. Let's start focusing on the practical consequences of all this. This is likely to sound utterly elementary but it is precisely what is needed to devise solutions in this territory. Practice 1. Weeks don't all nest within years _unless_ you use an idiosyncratic definition such as Stata's which is designed to ensure that that is so. By nesting I mean that every week is entirely contained within a single year. With every other definition (particularly, any based on the idea that a particular day of the week starts the week), weeks at the beginning and the end of the year are likely to span different years. Practice 2. Weeks don't all nest within months. Weeks at the beginning and the end of the month are likely to cross months (and as above often years too). The only exceptions are when 28 days in February happen to include exactly 4 weeks according to whatever definition of week you are using. Even then that can only be occasional and not the basis of any conversion. (Note that this can never happen with Stata's weekly dates as 1 February is _always_ within week 5 of the year.) Practice 3. Outside Stata it seems that definitions of weeks typically involve starting on a particular day of the week and finishing 6 days later. If follows from everything so far that analysing weekly data defined in those terms will need to be based directly or indirectly on -dow()-. This was spelled out at length within SJ-10-4 dm0052 . . . . . . . . . . . . . . . . Stata tip 68: Week assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox Q4/10 SJ 10(4):682--685 (no commands) tip on Stata's solution for weeks and on how to set up your own alternatives given different definitions of the week It also follows that Stata's weekly dates will not help at all _unless exceptionally_ your data were gathered with precisely Stata's definition. Some advice implicitly to the contrary is thus bad advice. Those wanting to convert somehow from weekly data to monthly data, whether they are producing totals or averages (or anything else), will need to be explicit about the answers to various questions: Question set 1. What precisely is the definition of week used in your data? When does a week start (or end)? Are weeks allowed to span years? Does anything odd happen at the end or beginning of the year? The answers to these cannot be assumed obvious to anyone _unless_ they happen to be using the same dataset as you and are familiar with its documentation. If the questioner doesn't know, it is likely that no one else can fill in. Question set 2. When you want to reduce weekly data to monthly, are you meaning the calendar months January to December? If not, you need to spell out your definition of month. If so, what are your rules for splitting weeks that span months? For example Split 0. Weeks are not to be split, but assigned as a whole to a month, if the beginning of the week falls within that month, or the end of the week ditto, or a majority of days in the week ditto. Split 1. Weeks are to be split, with weights according to how many days of the week fall within each month, or weights according to how many "working" days fall within each month, with a definition of "working". If nothing else, I hope to have indicated that conversion from weekly to monthly data is not just a matter of someone saying they want to do it and then someone else just giving them code that does it. Also, it's likely that there are yet other complications with particular kinds of data that I have forgotten to mention or never knew in the first place. Finally, at the risk of being thought especially blunt if not brutal, I should spell out that I am not volunteering to write code for anybody's conversions _even if_ they answer all the questions above. Someone else may want to do that, but it is more likely that people wanting to do this will need to write their own code, as in this area there is, in my experience, much fiddliness and no fun. Nick * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

- Prev by Date:
**st: RE: Creating a dummy set of values to use in a graph in the presence of under-reported values** - Next by Date:
**Re: st: likihood function is not concave in clogit model** - Previous by thread:
**st: likihood function is not concave in clogit model** - Next by thread:
**st: xtlogit with fe for rare events** - Index(es):