



Re: st: could you please verify the correctness of the code?-tsfill function


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: could you please verify the correctness of the code?-tsfill function
Date   Wed, 6 Jun 2012 08:55:45 +0100

The flavour of this is now not about Stata but about whether you are
making the right substantive decision on what to do with your data. I
am not an economist and not familiar with the pertinent literature,
but you cannot be the first person to be faced with this problem that
prices in different places are observed on different dates. So, what
else do people do?

Your solution is Procrustean in forcing dates on to a grid. Daily
dates that round to the same floor(date/28) could be up to 27 days
apart, so with what you do I think you need to calculate the error
(real date - gridded date) and talk about its distribution.
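
A minimal sketch of that error calculation, assuming the variables
edate1 (daily date) and edate2 = floor(edate1/28) from the quoted code
already exist. The names gridded and griderror are illustrative only.

```stata
* sketch only: edate1 and edate2 are assumed to exist as in the quoted code
* gridded date = first day of the 28-day bin that each date falls in
gen gridded = 28 * edate2
* error in days; by construction it lies between 0 and 27
gen griderror = edate1 - gridded
format edate1 gridded %td
* distribution of the error, overall and by country
summarize griderror, detail
tabulate id, summarize(griderror)
```

Observations added by -tsfill- have no real date, so their error is
missing.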

One alternative is to interpolate the prices to a shared set of dates.
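
One way this might be sketched, assuming id, edate1 and variable (the
price) as in the quoted code, and starting from the data before any
-tsfill-. Linear interpolation, the 28-day grid anchored at the earliest
date, and the name price_i are illustrative choices, not recommendations.

```stata
* sketch only: run on the data as entered, after edate1 has been created
format edate1 %td
tsset id edate1
* one observation per country per day over the full date range
tsfill, full
* linear interpolation within each country; no extrapolation by default
bysort id (edate1): ipolate variable edate1, generate(price_i)
* keep a shared grid of dates, every 28 days from the earliest date
summarize edate1, meanonly
keep if mod(edate1 - r(min), 28) == 0
```

Dates before a country's first observed price, or after its last, stay
missing unless the epolate option is added.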

Another is to take what you have and calculate monthly average prices
and also report how many prices those averages are based on. You will
still have gaps and may well need to interpolate too.
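
A minimal sketch of the monthly-average route, again assuming id, edate1
and variable as in the quoted code and starting from the data before any
-tsfill-. The names mdate, mean_price and n_prices are illustrative.

```stata
* sketch only: run on the data as entered, after edate1 has been created
* monthly date from the daily date
gen mdate = mofd(edate1)
format mdate %tm
* monthly mean price and the number of prices behind each mean
collapse (mean) mean_price = variable (count) n_prices = variable, by(id mdate)
tsset id mdate
* make any months without prices explicit as missing observations
tsfill
list id mdate mean_price n_prices, sepby(id)
```

Note that -collapse- replaces the data in memory, so work on a copy or
use -preserve- first.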

As I've said many times on this list, Statalist may not serve well the
expectations of those list members who want to be told how best to
analyse their data. I keep wondering whether the answer to thesis
examiners, committee members or paper reviewers who ask "Why did you do
that?" is going to be "Oh, that was what was recommended on an internet
discussion list by one person who answered my question".

Nick

On Wed, Jun 6, 2012 at 2:14 AM, stef salvez <loggyedy@googlemail.com> wrote:
> Thank you, Nick. I really appreciate your help and your patience.
> Let me be more explicit this time.
>
>
> I have a panel data set of prices of goods that vary across time and countries.
>
> As you can see from the table below,
>
>  country   dates         price of good k
>
>  1         "23/11/08"     2
>  1         "28/12/08"     3
>  1         "25/01/09"     4
>  1         "22/02/09"     5
>  1         "29/03/09"     6
>  1         "26/04/09"    32
>  1         "24/05/09"    23
>  1         "28/06/09"    32
>  2         "26/10/08"    45
>  2         "23/11/08"    46
>  2         "21/12/08"    90
>  2         "18/01/09"    54
>  2         "15/02/09"    65
>  2         "16/03/09"    77
>  2         "12/04/09"     7
>  2         "10/05/09"     6
>
> the start and end dates of the time series for countries 1 and 2 are
> different. For example, for country 1 the time series begins on
> "23/11/08", while for country 2 the time series begins on
> "26/10/08".
>
> My data on prices are available every 28 days (or equivalently every 4
> weeks). But in some cases I have jumps (35 days or 29 days instead of
> 28 days). For example, the above table has such jumps from
> "23/11/08" to "28/12/08", from "22/02/09" to "29/03/09", etc.
>
> My goal is to have as much as possible the same sequence of dates
> across countries which is a bit difficult because of the two
> "problems" that I mentioned above. I want to have the same sequence of
> dates across countries because eventually what I want to do is see how
> the difference of prices for,say good k,  between two countries
> evolves over time. So I  want to set up the following regression
>
>
>
>
>
> ΔP_{ij,t}^{k} = constant + regressors + error term
>
> where ΔP_{ij,t}^{k} is the difference in the price of good k between
> countries i and j in period t, and ΔP_{t}^{k} is the vector of these
> price differences for all pairs of countries at time t.
> The whole point is to be able to run the above regression.
>
>
> My initial idea was to use -tsfill- in the code which I display below
> (and which can be easily reproduced with copy and paste in Stata):
>
>
>
> clear all
> cd D:\
> input id  str8 (dates)    variable
> 1   "23/11/08"    2
> 1   "28/12/08"    3
> 1   "25/01/09"    4
> 1   "22/02/09"    5
> 1   "29/03/09"    6
> 1   "26/04/09"   32
> 1   "24/05/09"   23
> 1   "28/06/09"   32
> 2   "26/10/08"   45
> 2   "23/11/08"   46
> 2   "21/12/08"   90
> 2   "18/01/09"   54
> 2   "15/02/09"   65
> 2   "16/03/09"   77
> 2   "12/04/09"    7
> 2   "10/05/09"    6
> end
>
>
>
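> * convert the dd/mm/yy strings to Stata daily dates (years 20xx)
> gen edate1 = date(dates, "DM20Y")
> * index of the 28-day bin each date falls in, used as the time variable
> gen edate2 = floor(edate1/28)
> tsset id edate2
> * add observations for any missing 28-day bins within each country
> tsfill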
>
>
>
>
>
> But I  do not know if this approach is correct or not in order to be
> able to run the above regression. Apart from tsfill I have no other
> idea  how to run this regression. Any suggestions/codes are welcome.
>
