


st: Re: st: Σχετ: st: calculating percentage changes in an unbalanced panel data set

From   Nick Cox
Subject   st: Re: st: Σχετ: st: calculating percentage changes in an unbalanced panel data set
Date   Wed, 6 Mar 2013 10:38:04 +0000

My guess is that -time- is a string variable.

This contradicts the earlier output from -tsset-, which would not have
worked if -time- were a string variable.

So, we need another guess. Perhaps you are really using different
variable names, translated them for Statalist for some reason, and
forgot to take some difference into account. Or this is a slightly
different version of the same data. Either way, there is something
about your dataset that you are not telling us.

Whatever the answer, -mofd()- should work if and only if the argument
is a numeric daily date variable. If it's a string variable, you
should use -date()- to convert it to a numeric daily date variable.
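
A minimal sketch of that conversion, assuming -time- holds strings in
day/month/year order such as "01/10/2008". The toy data and the variable
name -ddate- are invented for illustration and are not from the poster's
dataset.

```stata
* Toy data: -time- is a string in dd/mm/yyyy form (assumed layout)
clear
input str10 time
"01/10/2008"
"01/11/2008"
"01/12/2008"
end

* Check the storage type first: str# means string, so -mofd(time)- would fail
describe time

* String -> numeric daily date; "DMY" gives the order of day, month, year
generate ddate = date(time, "DMY")
format ddate %td

* Numeric daily date -> monthly date
generate month = mofd(ddate)
format month %tm

list
```

If -describe- reports a numeric type instead, -mofd(time)- can be applied
directly and the -date()- step is not needed.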


On Wed, Mar 6, 2013 at 9:08 AM, Tzaloupas Dimitrov wrote:

> Thanks for your reply, Rebecca. The dates in my files are in the format dd/mm/yyyy. Applying the code you provided, specifically
> gen month=mofd(time)
> I get the following error:
> type mismatch
> r(109);
> So I still cannot find the answer to my question. Is there any other suggestion?

Rebecca Pope wrote:

> You have inflation measured on a daily basis? My guess is not. In all
> likelihood, what you have is monthly data that happens to be coded
> 01mmmYYYY. Stata, however, does not know this.
> gen month = mofd(time)  // get date in month format
> format month %tm
> tsset id month
> Now Stata knows you have monthly data, so your panel no longer
> appears to have many missing observations simply because of false
> "gaps" created by how the dates are recorded.
> Once you have -tsset- your data, you can use the lag operator.
> Otherwise, based on what you are doing, there isn't much point in
> -tsset-. Using lags to calculate a change in the inflation rate would
> look like this:
> gen p2 = (inf/L.inf-1)*100  // L. is Stata's lag operator (see -help tsvarlist- if unfamiliar)
> If you want inflation since "baseline" rather than
> period-to-period inflation:
> bys id (month): gen p2_alt = (inf[_n]/inf[1]-1)*100  // note here that the time variable is in ()
> In your original code, you had "bys country time". The problem with
> this is that Stata is looking within country _and_ time and counting
> observations. Because you only have one observation at each time
> period, you get missing values. Placing time in parentheses tells
> Stata to sort by that value but not to count within it.
> p2 will be missing for the first observation of each country and
> wherever the previous month is absent, that is, if your panel still
> has gaps after correcting for monthly observations. p2_alt will give
> you a value at every point in your series. However, the two provide
> fundamentally different information. Your example of 2/1 leaves the
> ultimate question unclear so I've given you code for both.
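
The two calculations above can be tried on a small made-up panel. This is
a sketch only: the values, the two ids, and the gap at December 2008 for
id 1 are invented for illustration, and -time- is assumed to be a
dd/mm/yyyy string.

```stata
* Toy unbalanced panel: id 1 has no observation for December 2008
clear
input id str10 time inf
1 "01/10/2008" 2.0
1 "01/11/2008" 2.2
1 "01/01/2009" 2.4
2 "01/10/2008" 5.0
2 "01/11/2008" 5.5
end

* String date -> daily date -> monthly date, then declare the panel
generate month = mofd(date(time, "DMY"))
format month %tm
tsset id month

* Period-to-period change: missing when the previous month is not observed
generate p2 = (inf/L.inf - 1)*100

* Change since the first observed month within each id
bysort id (month): generate p2_alt = (inf/inf[1] - 1)*100

list id month inf p2 p2_alt, sepby(id)
```

In this toy panel p2 is missing for the first month of each id and for
January 2009 in id 1, because December 2008 is absent. p2_alt is defined
for every row and equals 0 in the first month of each id.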

On Tue, Mar 5, 2013 at 5:27 PM, Tzaloupas Dimitrov wrote:

>> I have some time series observations (inflation) for a set of countries.
>> The panel data set is unbalanced, that is,
>> egen id = group(country), label
>> tsset id time
>>        panel variable:  id (unbalanced)
>>        time variable:  time, 01oct2008 to 01nov2011, but with gaps
>>                delta:  1 day
>> within each country I want to find the percentage change of inflation.
>> I tried
>> bysort country time : gen p2=(inf[2]-inf[1]/inf[1])*100
>> but I get this message
>> (500 missing values generated)
>>  Am I doing something wrong?
>> I use Stata 11
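
The missing values in the original attempt can be reproduced on toy data.
This is a sketch under assumptions: the data are invented, -time- is a
plain integer here, and the numerator is parenthesised as
(inf[2] - inf[1]), which is a guess at what was intended rather than what
was typed.

```stata
* Toy data: two countries, two periods each
clear
input str2 country time inf
"A" 1 2.0
"A" 2 2.2
"B" 1 5.0
"B" 2 5.5
end

* Grouping on country AND time leaves one observation per group
bysort country time: generate n_in_group = _N

* Within a one-observation group, inf[2] does not exist, so the result is missing
bysort country time: generate p2_bad = ((inf[2] - inf[1])/inf[1])*100

* Group on country only; (time) just fixes the sort order within country
bysort country (time): generate p2_ok = ((inf[2] - inf[1])/inf[1])*100

list, sepby(country)
```

Here n_in_group is 1 in every row and p2_bad is missing throughout, while
p2_ok holds the change from the first to the second period, repeated on
each row of the country.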
