Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Maarten Buis <maartenlbuis@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Restructuring the time dimension in a dataset
Date: Fri, 11 Oct 2013 21:03:59 +0200

Question 1: -help stsplit-

Question 2: that depends on so many things....

Hope this helps,
Maarten

On Fri, Oct 11, 2013 at 8:47 PM, Tunga Kantarcı <tungakantarci@gmail.com> wrote:
> Hello,
>
> I have a dataset where ‘variable one’ indicates a unique
> identification number for each individual in the data. Then there is
> ‘variable two’ which indicates a date (like 01-01-2010) which is the
> start date of a period and ‘variable three’ indicates a date (like
> 05-01-2010) which is the end date of the same period. Then there is
> ‘variable four’ which indicates a number between 0 and 1 (like 0.574)
> that has been realised during the period 01-01-2010 - 05-01-2010.
>
> A snapshot of the data sheet for individual 4115111 looks like this:
>
> 4115111 01-01-2010 05-01-2010 0.574
> 4115111 05-01-2010 31-09-2011 0.321
>
> In this dataset, as the snapshot also shows, the length of a period is
> irregular. It can be as short as a day (like 01-01-2010 – 02-01-2010)
> or as long as a year (like 01-01-2010 - 01-01-2011), or even longer.
> Hence it is not clear how I should treat the time dimension of the
> data. The cases of variable four are not observed on a monthly or
> yearly basis. I plan to restructure the data. That is, I plan to
> fragment each period into multiple periods with a length of one day
> and then aggregate them to, say, a month. This means that the first
> period, which is
>
> 4115111 01-01-2010 05-01-2010 0.574,
>
> would be fragmented into
>
> 4115111 01-01-2010 02-01-2010 0.574
> 4115111 02-01-2010 03-01-2010 0.574
> 4115111 03-01-2010 04-01-2010 0.574
> 4115111 04-01-2010 05-01-2010 0.574,
>
> and the second period, which is
>
> 4115111 05-01-2010 31-09-2011 0.321,
>
> would be fragmented into
>
> 4115111 05-01-2010 06-01-2010 0.321
> .
> .
> 4115111 30-09-2011 31-09-2011 0.321.
>
> After this fragmentation, I plan to collapse the daily series to
> monthly series which would mean that variable four will be averaged
> over the days of a month to make up a monthly number, perhaps using
> the “collapse variable four, by(variable two)” command. In the end I
> would like to have monthly data.
>
> Given this explanation, I would like to ask two questions.
>
> Question one: In Stata, how can I fragment each case (that is each row
> in the data) into multiple cases (multiple rows) with respect to
> variable two and variable three as explained above?
>
> Question two: If it was your own data, how would you treat it? Would
> your approach be the same as mine?
>
> Tunga
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/

--
---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------
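For readers who want to see the fragment-then-collapse plan from the question spelled out, here is a minimal sketch. It is not the -stsplit- route suggested in the reply; it uses -expand- instead, which is one alternative way to get one row per day. The variable names (id, start, stop, value, ndays, day, month) are hypothetical stand-ins for ‘variable one’ to ‘variable four’. Each period is assumed to include its start date and exclude its end date, as in the four one-day pieces listed in the question. The end date of the second row is entered as 30-09-2011 because 31-09-2011 is not a valid calendar date, and date() would return missing for it.

```stata
* Sketch only: all variable names are hypothetical, not from the original data.
clear
input long id str10 start_s str10 stop_s double value
4115111 "01-01-2010" "05-01-2010" 0.574
4115111 "05-01-2010" "30-09-2011" 0.321
end

* Convert the day-month-year strings to Stata daily dates.
gen long start = date(start_s, "DMY")
gen long stop  = date(stop_s, "DMY")
format start stop %td

* Number of one-day pieces in each period (start included, stop excluded).
gen long ndays = stop - start

* Replace each row by ndays copies of itself.
expand ndays

* Within each original period, number the copies to get the calendar day.
* Assumes start dates are unique within id.
bysort id start: gen long day = start + _n - 1
format day %td

* Map each day to its calendar month and average value within id and month.
gen int month = mofd(day)
format month %tm
collapse (mean) value, by(id month)

list, clean
```

Averaging over days weights each period by the number of days it contributes to a month, which is what the plan in the question implies. Whether that weighting is appropriate depends on what ‘variable four’ measures.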

**Follow-Ups**:
- **st: Negative probabilities after a margins command for a categorical variable (post logistic model)** *From:* "Scheetz, Marc" <mschee@midwestern.edu>

**References**:
- **st: Restructuring the time dimension in a dataset** *From:* Tunga Kantarcı <tungakantarci@gmail.com>
