Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down at the end of May, and its replacement, **statalist.org**, is already up and running.


From: Nick Cox <njcoxstata@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: st: Re: st: Re: st: Σχετ: st: calculating percentage changes in an unbalanced panel data set

Date: Wed, 6 Mar 2013 11:43:14 +0000

A more general point is embedded here which arises again and again
with date variables.

It is vital to realise that descriptions of the form

my dates are of format dd/mm/yyyy

do not distinguish between

string variables with values such as "25/12/2012"

and

numeric date variables with -format- (Stata's sense) %tdd/n/Cy

That is,

1. The word "format" is overloaded.

2. The needed information is about _types_. The output of -describe-
for the variable concerned is informative. Word descriptions using
your own terminology are often ambiguous.

On Wed, Mar 6, 2013 at 10:38 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> My guess is that -time- is a string variable.
>
> This contradicts the earlier output from -tsset-, which would not have
> worked if -time- were a string variable.
>
> So, we need another guess. Perhaps you are really using different
> names, but translated for Statalist for some reason, but forgot to
> take some difference into account. Or this is a slightly different
> version of the same data. Either way, there is something about your
> dataset which you are not telling us.
>
> Whatever the answer, -mofd()- should work if and only if the argument
> is a numeric daily date variable. If it's a string variable, you
> should use -date()- to convert it to a numeric daily date variable.
>
> Nick
>
> On Wed, Mar 6, 2013 at 9:08 AM, Tzaloupas Dimitrov
> <tzaloupas1232@yahoo.gr> wrote:
>
>> thanks for your reply Rebecca. the dates that I have in my files are of this format dd/mm/yyyy. so by applying the code you provided, specifically
>>
>> gen month=mofd(time) I get the following error
>>
>> type mismatch
>> r(109);
>>
>>
>> So, still I can not find the answer to my question. Is there any other suggestion?
>
> Rebecca Pope <rebecca.a.pope@gmail.com>
>
>> You have inflation measured on a daily basis? My guess is not. In all
>> likelihood, what you have is monthly data that happens to be coded
>> 01mmmYYYY. Stata, however, does not know this.
>>
>> gen month = mofd(time) // get date in month format
>> format month %tm
>> tsset id month
>>
>> Now Stata knows you have monthly changes, so it doesn't appear that
>> you have many missing observations within your panel simply due to
>> false "gaps" because of how your data is recorded.
>>
>> Once you have -tsset- your data, you can use the lag operator.
>> Otherwise, based on what you are doing, there isn't much point in
>> -tsset-. Using lags to calculate a change in the inflation rate would
>> be as so:
>>
>> gen p2 = (inf/L.inf-1)*100 // L. is Stata's lag operator (see -help
>> tsvarlist- if unfamiliar)
>>
>> If you are wanting inflation since "baseline" rather than
>> period-to-period inflation:
>> bys id (mon): gen p2_alt = (inf[_n]/inf[1]-1)*100 // note here that
>> the time variable is in ()
>>
>> In your original code, you had "bys country time". The problem with
>> this is that Stata is looking within country _and_ time and counting
>> observations. Because you only have one observation at each time
>> period, you get missing values. Placing time in parentheses tells
>> Stata to sort by that value but not to count within it.
>>
>> p2 will result in missing values if your panel data are still
>> unbalanced after correcting for monthly observations. p2_alt will give
>> you a value at every point in your series. However, the two provide
>> fundamentally different information. Your example of 2/1 leaves the
>> ultimate question unclear so I've given you code for both.
>
> On Tue, Mar 5, 2013 at 5:27 PM, Tzaloupas Dimitrov
>
>>> I have some time series observations (inflation) for a set of countries
>>>
>>> The panel data set is unbalanced, that is,
>>>
>>> egen id = group(country), label
>>> tsset id time
>>> panel variable: id (unbalanced)
>>> time variable: time, 01oct2008 to 01nov2011, but with gaps
>>> delta: 1 day
>>>
>>> within each country I want to find the percentage change of inflation.
>>>
>>> I tried
>>>
>>> bysort country time : gen p2=(inf[2]-inf[1]/inf[1])*100
>>>
>>> but I get this message
>>> (500 missing values generated)
>>>
>>> Am I doing something wrong?
>>>
>>>
>>>
>>> I use Stata 11
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
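
As a minimal sketch of the distinction drawn at the top of this message, the Stata code below builds a small made-up dataset in which -time- is a string variable holding values like "01/10/2008", checks its type with -describe-, and converts it with -date()- before calling -mofd()-. The data values and the variable name -ddate- are assumptions for illustration and do not come from the thread; the names -id-, -time-, -inf-, -month- and -p2- follow the quoted messages.

```stata
* Hypothetical example data: -time- is a STRING holding dd/mm/yyyy text
clear
input id str10 time inf
1 "01/10/2008" 2.0
1 "01/11/2008" 2.2
1 "01/01/2009" 2.5
2 "01/10/2008" 3.0
2 "01/11/2008" 3.3
end

* The storage type (here str10) shows this is a string, not a numeric date
describe time

* Convert the string to a numeric daily date; -ddate- is an assumed name
gen ddate = date(time, "DMY")
format ddate %tdDD/NN/CCYY      // display format only; stored values unchanged

* -mofd()- needs a numeric daily date as its argument
gen month = mofd(ddate)
format month %tm

* Declare the panel with the monthly date, then use the lag operator
tsset id month
gen p2 = (inf/L.inf - 1)*100    // missing wherever the previous month is absent
```

With -time- held as a string, -mofd(time)- would stop with the type mismatch error r(109) reported in the thread. If -describe- instead shows a numeric storage type with a %td display format, the -date()- step is unnecessary and -mofd()- can be applied to the variable directly.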

**References**:

- **st: calculating percentage changes in an unbalanced panel data set**, *From:* Tzaloupas Dimitrov <tzaloupas1232@yahoo.gr>
- **Re: st: calculating percentage changes in an unbalanced panel data set**, *From:* Rebecca Pope <rebecca.a.pope@gmail.com>
- **st: Σχετ: st: calculating percentage changes in an unbalanced panel data set**, *From:* Tzaloupas Dimitrov <tzaloupas1232@yahoo.gr>
- **st: Re: st: Σχετ: st: calculating percentage changes in an unbalanced panel data set**, *From:* Nick Cox <njcoxstata@gmail.com>
