Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: could you please verify the correctness of the code?-tsfill function
Date: Wed, 6 Jun 2012 08:55:45 +0100

The flavour of this is now not about Stata but about whether you are making the right substantive decision on what to do with your data. I am not an economist and not familiar with the pertinent literature, but you cannot be the first person to be faced with the problem that prices in different places are observed on different dates. So, what else do people do?

Your solution is Procrustean in forcing dates on to a grid. Daily dates that round to the same floor(date/28) could be up to 27 days apart, so with what you do I think you need to calculate the error (real date - gridded date) and talk about its distribution.

One alternative is to interpolate the prices to a shared set of dates. Another is to take what you have and calculate monthly average prices and also report how many prices those averages are based on. You will still have gaps and may well need to interpolate too.

As I've said many times on this list, Statalist may not serve well the expectations of those list members who want to be told how best to analyse their data. I keep wondering whether the response to thesis examiners/committee members or paper reviewers to "Why did you do that?" is going to be "Oh, that was what was recommended on an internet discussion list by one person who answered my question".

Nick

On Wed, Jun 6, 2012 at 2:14 AM, stef salvez <loggyedy@googlemail.com> wrote:
> thank you Nick. I really appreciate your help and your patience.
> Let me be more explicit this time
>
> I have a panel data set of prices of goods that vary across time and countries.
> As you can see from the table below
>
> country   dates        price of good k
>
> 1         "23/11/08"   2
> 1         "28/12/08"   3
> 1         "25/01/09"   4
> 1         "22/02/09"   5
> 1         "29/03/09"   6
> 1         "26/04/09"   32
> 1         "24/05/09"   23
> 1         "28/06/09"   32
> 2         "26/10/08"   45
> 2         "23/11/08"   46
> 2         "21/12/08"   90
> 2         "18/01/09"   54
> 2         "15/02/09"   65
> 2         "16/03/09"   77
> 2         "12/04/09"   7
> 2         "10/05/09"   6
>
> the start and end date of the time series for countries 1 and 2 are
> different. For example, for country 1 the time series begins on
> "23/11/08" while for country 2 the time series begins on
> "26-10-2008".
>
> My data on prices are available every 28 days (or equivalently every 4
> weeks). But in some cases I have jumps (35 days or 29 days instead of
> 28 days). For example from the above table we have such jumps: from
> "23/11/08" to "28/12/08", from "22/02/09" to
> "29/03/09", etc
>
> My goal is to have as much as possible the same sequence of dates
> across countries which is a bit difficult because of the two
> "problems" that I mentioned above. I want to have the same sequence of
> dates across countries because eventually what I want to do is see how
> the difference of prices for, say, good k, between two countries
> evolves over time. So I want to set up the following regression
>
> ΔP_{ij,t}^{k} = constant + regressors + error term
>
> where ΔP_{ij,t}^{k} is the difference of prices between countries i
> and j in period t for good k. The ΔP_{t}^{k} is a vector of
> differences of prices for all pairs of countries at time t for good k.
> The whole point is to be able to run the above regression
>
> My initial idea was to use -tsfill- in the code which i display below
> (and which can be easily reproduced with copy paste in stata):
>
> clear all
> cd D:\
> input id str8 (dates) variable
> 1 "23/11/08" 2
> 1 "28/12/08" 3
> 1 "25/01/09" 4
> 1 "22/02/09" 5
> 1 "29/03/09" 6
> 1 "26/04/09" 32
> 1 "24/05/09" 23
> 1 "28/06/09" 32
> 2 "26/10/08" 45
> 2 "23/11/08" 46
> 2 "21/12/08" 90
> 2 "18/01/09" 54
> 2 "15/02/09" 65
> 2 "16/03/09" 77
> 2 "12/04/09" 7
> 2 "10/05/09" 6
> end
>
> gen edate1 = date(dates, "DM20Y")
> gen edate2 = floor(edate1/28)
> tsset id edate2
> tsfill
>
> But I do not know if this approach is correct or not in order to be
> able to run the above regression. Apart from tsfill I have no other
> idea how to run this regression. Any suggestions/codes are welcome.
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
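
The suggestions in the reply (look at the gridding error, or work with monthly averages and their counts, then interpolate across any gaps) might be sketched in Stata roughly as follows. This is only an illustration built on the example data from the quoted message, not code posted in the thread; the variable names ddate, bin, griderr, mdate, meanprice, nprices and meanprice_ip are assumptions made up for the sketch.

```stata
clear
input id str8 dates price
1 "23/11/08" 2
1 "28/12/08" 3
1 "25/01/09" 4
1 "22/02/09" 5
1 "29/03/09" 6
1 "26/04/09" 32
1 "24/05/09" 23
1 "28/06/09" 32
2 "26/10/08" 45
2 "23/11/08" 46
2 "21/12/08" 90
2 "18/01/09" 54
2 "15/02/09" 65
2 "16/03/09" 77
2 "12/04/09" 7
2 "10/05/09" 6
end

* Daily date from dd/mm/yy strings; "DM20Y" reads two-digit years as 20yy
gen ddate = date(dates, "DM20Y")
format ddate %td

* (1) Gridding error: days between each real date and the start of its
*     28-day bin (hypothetical names bin and griderr); ranges over 0..27
gen bin = floor(ddate/28)
gen griderr = ddate - 28*bin
summarize griderr, detail

* (2) Monthly average price and the number of prices behind each average
gen mdate = mofd(ddate)
format mdate %tm
collapse (mean) meanprice = price (count) nprices = price, by(id mdate)

* (3) Expose any missing months, then interpolate linearly within country
tsset id mdate
tsfill
bysort id (mdate): ipolate meanprice mdate, gen(meanprice_ip)

list, sepby(id)
```

In this small example every calendar month happens to contain exactly one price per country, so -tsfill- adds no observations and the interpolated series equals the monthly mean; with longer 28-day series a month can hold two prices or none, which is where nprices and the interpolation step start to matter. Whether monthly averaging or interpolation is defensible remains the substantive question raised in the reply.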

