Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: st: How to aggregate daily data in Stata


From   "Christopher A. Hartwell" <[email protected]>
To   <[email protected]>
Subject   RE: st: How to aggregate daily data in Stata
Date   Mon, 17 Dec 2012 11:28:42 +0400

Thanks Nick, that looks like it will be useful - it helped on my constituent
data sheets in collapsing the data, so many thanks for your assistance.

This unfortunately leads to a second question regarding Stata's use of dates
and how it imports dates from Excel - I'm encountering a problem getting the
data in correctly into Stata.

As before, I have a large unbalanced panel dataset of monthly data, with not
every year/month represented (i.e. Belarus starts at May 1998, Czech
Republic starts in May 1994, etc.). Using Stata 11IC and stat-transferring
from Excel, I get the problem of the monthly date coming in as a string
variable or as a "long" d/m/y variable. The problems, in their order:

1.	When originally imported via stat-transfer, the date comes in
appearing normal (it shows in data editor as a d/m/y variable, the first of
each month, although, not the just month-year that I want). However, when I
tsset it as a monthly variable, it appears to have Islamic timing. For
example, Belarus from May 1998 to January 1999 translates as:

Country	Month
Belarus	3126m9
Belarus	3129m4
Belarus	3131m10
Belarus	3134m5
Belarus	3136m12
Belarus	3139m6
Belarus	3142m1
Belarus	3144m7
Belarus	3147m2

If I keep it as "daily" data, I can't use any operators such as lags or
differences, because stata thinks the previous period is missing.

2.	Alternately, if I attempt to just paste the data into the data
editor, the date shows as a string variable. I try . generate
Date=date(month,"MY") on the month variable, and it generates all missing.
The same thing happens if I 

. generate Date=date(month,"DMY")

. gen dm = mofd(Date)

All are blank.

Finally, if I encode month, gen(Month) off of this, I end up with a series
that LOOKS correct - May-98, Jun-98, but when I click on the cell in data
editor, it shows as a number that is non-sequential. That is, May 98 is 214,
June-98 is 166, Jul-98 is 142... and if I tsset on this variable, it

a) says it has tsset "Month, 1960m2 to 1983m12, but with gaps"
b) sorts the months all together and then the years, so it goes Apr-00,
Apr-01, Apr-02... then Aug-00, Aug-01... for each country

Can anyone help with this? Is this an excel problem or a stata one? I just
need to get the monthly/yearly data imported correctly, or else I can't
really do anything.



Christopher A. Hartwell, PhD
Senior Research Fellow
Institute for Emerging Market Studies (IEMS)
http://iems.skolkovo.ru/ 

In Moscow: +7 916 777 1260 (m)
In the US:    +1 202-415-6030 (m)
                        +1 773-724-2310 (t/f)
Skype: chartwel
[email protected] 


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Sunday, December 16, 2012 2:15 AM
To: [email protected]
Subject: Re: st: How to aggregate daily data in Stata

. help collapse

Nick

On 15 Dec 2012, at 20:46, "Christopher A. Hartwell"
<[email protected]  > wrote:
>
> Have a question that should be easy to figure out for the experts, but 
> I can't wrap my mind around it conceptually.
>
> I have daily stock market data for 28 countries over a 20-odd year 
> period.
> Some countries have 6 years of data, some have 20 years, etc., 
> depending on when they started their stock markets. I want to 
> calculate a volatility metric by using the sum of squares of daily 
> changes for each month (that is, aggregate the daily squared changes 
> for each month). Given that the panel data has different time lengths 
> and there are different monthly periods (i.e. some months have 30 
> days, some countries have holidays in the month so there's no data), 
> how can I use stata to sum the squares of the daily changes by month?
>
> For example, I have
>
> Belarus        March-1-05    1.37
> Belarus        March-2-05    0.69
> .
> .
> .
> Belarus        March 31-05    17.33
>
> And I want to generate one variable per country per month that is the 
> sum of these numbers, so that I have
>
> Belarus        Mar-05        37.20
> Belarus        Apr-05        18.99
>
> Etc.
>
> Is there an easy "by x:" command that will let me do it? Or some 
> iteration that will allow for this?
>
> The problem I see is that I need to sum the data first by day over a 
> specific month, for each month, for each country, on a dataset that is 
> severely unbalanced.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index