Re: st: How to apply sktest to panel data?
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: How to apply sktest to panel data?
Date: Tue, 9 Apr 2013 09:41:28 +0100

-egen-'s -skew()- and -kurt()- functions allow -by <varlist>:-. That
is documented in the help for -egen- and is what you need. In fact, they
also allow a -by()- option. That is not documented, but it gives the
same results and is easier to use.
. sysuse auto
(1978 Automobile Data)
. egen skew = skew(mpg), by(rep78)
. tabdisp rep78, c(skew)
----------------------
Repair    |
Record    |
1978      |       skew
----------+-----------
        1 |          0
        2 |   .2241236
        3 |   .3555228
        4 |  -.1326299
        5 |  -.0151831
        . |  -.4781053
----------------------
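Applied to the layout in the quoted message below, the same idea might
look as follows. This is only a sketch, assuming the variables really
are named Fund, Year, Return and female; the new variable names
(skew_fy and so on) and -tag- are made up for illustration.
. egen skew_fy = skew(Return), by(Fund Year)
. egen kurt_fy = kurt(Return), by(Fund Year)
. egen skew_f = skew(Return), by(Fund)
. egen kurt_f = kurt(Return), by(Fund)
. egen tag = tag(Fund Year)
. list Fund Year female skew_fy kurt_fy if tag
-egen-'s -tag()- function marks one observation in each Fund-Year
group, so that -list- shows each result once rather than once per week.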
Nick
[email protected]
On 9 April 2013 04:54, LI Mengjia <[email protected]> wrote:
> Dear Nick,
>
> Thanks again for the detailed suggestions. The string problem is solved. But since my data are arranged as shown below, I'm confused about how to use skew() and kurt() to calculate the two. I wish to calculate skewness and kurtosis from Week1-26 for each Year (200506, 200512, and so on until 201012) and from the whole period (Week1-26 of all Years), perhaps separately for the female dummy = 1/0. So ideally, each Fund will have 26 skewness and kurtosis values: 12 for the individual Years and 1 for the whole period, doubled by the female dummy.
>
> Fund         | Year             Week    Return  female
> -----------------+--------------------------------------------------------
> 000011.OF       200506  Week1   -.01595214       0
> ……..
> 000011.OF       200506    Week26 -.02965235        0
> 000011.OF       200512  Week1   -.01595214       0
> ……….
> 000011.OF       201012  Week26 .00202634         0
> 000021.OF       200506  Week1   .03485255       1
> …………
> 690003.OF       201012  Week26  .02142162       0
>
> When I tried to use -egen- directly (I'm ashamed to say I don't know much about the code):
>      egen skewness=skew(Return)
> it came up with only one value (I think it was calculated using all the Return values).
>
> If I added:
>         egen skewness=skew(Return) if Year == 200506
> Although only the cells belonging to 200506 were filled, I still got only one number (I guess it was calculated from all the Return data in Year 200506). I can't get a value for each Fund.
>
> So I tried -foreach-:
> . egen group = group(Fund)
> . foreach i=1/299 {
>     if Year == 200506 egen skewness = skew(Return)
>   }
> invalid syntax
> r(198);
>
> I know it must be caused by my poor code! I would appreciate some guidance on this.
>
> Best regards,
> Amy
>
>
>
> On 2013-4-9, at 1:34 AM, Nick Cox <[email protected]> wrote:
>
>> No; attachments should not be sent to Statalist. This is explicit in
>> the FAQ, which you were asked to read before posting. Please do read
>>
>> http://www.stata.com/support/faqs/resources/statalist-faq/#toask
>>
>> From your statement
>>
>> . sktest Return if Fund == 000011.OF
>>
>> I guess that your -Fund- is a string variable and so " " are needed
>> around string values.
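>>
>> For example, assuming the fund code is stored as the string
>> 000011.OF, the condition would be written
>>
>> . sktest Return if Fund == "000011.OF"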
>>
>> From your answers it seems to be that you would be better off trying
>> to measure skewness and kurtosis, rather than to test them, for which
>> -summarize- suffices, but see the convenient -egen- functions -skew()-
>> and -kurt()-, which make a loop unnecessary.
>>
>> But even then there is an extra problem (#5 in my list), documented in
>>
>> SJ-10-3 st0204  . . Speaking Stata: The limits of sample skewness and kurtosis
>>       . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>>       Q3/10   SJ 10(3):482--495                                (no commands)
>>       uses Stata and Mata to show that sample skewness and
>>       kurtosis are limited by sample size and that these limits
>>       impart bias to estimation
>>
>> Note that daily returns would increase any problems of dependence in time.
>>
>> Nick
>> [email protected]
>>
>>
>> On 8 April 2013 16:33, LI Mengjia <[email protected]> wrote:
>>
>>> I'm sorry, I tried to attach the .dta file here but the email bounced. How should I give you the data to make my point clearer?
>>>
>>> Dear Nick,
>>>
>>> Thank you very much for your reply. The attachment is my data. I'm sorry for not providing it in the previous email. This is the first time I have posted to Statalist.
>>>
>>> The "doesn't work" looked like this (perhaps the problem is with the format of this column):
>>> . sktest Return if Fund == 000011.OF
>>> variable OF not found
>>>
>>> The major reason I want to do these tests is to explore whether female- or male-managed funds have (more significant) positive skewness, which is actually preferred by investors. I want to test kurtosis for a similar reason. I wish to test individual funds because I also want the proportion of funds managed by each gender that have positive skewness.
>>>
>>> The return data are collected weekly, while the other data I have are on a half-yearly basis from 2005 to 2010. Since the time span is quite limited, I wish to see the results for each half-year as well as for the whole period.
>>>
>>> I don't quite understand the 3rd point you mentioned. As for the 4th, I might also be able to obtain daily returns, which would give a larger sample.
>>>
>>> I have just started this and I will take your points seriously into consideration.
>>>
>>> Thank you again and regards,
>>> Amy
>>
>>> On 2013-4-8, at 10:22 PM, Nick Cox <[email protected]> wrote:
>>>
>>>> There is no problem in principle about specifying -if- with -sktest-.
>>>> Here is a dopey example:
>>>>
>>>> . sysuse auto , clear
>>>> (1978 Automobile Data)
>>>>
>>>> . sktest mpg if foreign
>>>>
>>>>                  Skewness/Kurtosis tests for Normality
>>>>                                                       ------- joint ------
>>>>  Variable |    Obs   Pr(Skewness)   Pr(Kurtosis)  adj chi2(2)    Prob>chi2
>>>> -------------+---------------------------------------------------------------
>>>>       mpg |     22       0.143          0.477         2.98         0.2250
>>>>
>>>> You don't give your code or explain what "doesn't work" means,
>>>> contrary to request, so I can only guess that you made some syntax
>>>> error.
>>>>
>>>> A bigger question is what are you going to do with the results and why
>>>> they are of interest.
>>>>
>>>> 1. Suppose some panels fall one side and other panels fall the other
>>>> side of your chosen threshold significance level? What then?
>>>>
>>>> 2. -sktest- tacitly assumes independence of observations. If this is
>>>> not valid for an -xt- problem, P-values are suspect, so decisions
>>>> based on them are shaky.
>>>>
>>>> 3. Normality (Gaussianity) of the marginal distribution of the
>>>> response is not a requirement for much. In practice, marked skewness
>>>> for example might on various grounds fit better with (e.g.) fitting on
>>>> a log scale, but that's a rather different story. (As you have returns
>>>> as your response, that is unlikely to be possible.)
>>>>
>>>> 4. 26 is a rather small sample size to estimate skewness and kurtosis.
>>>>
>>>> The short answer to your question is "use a loop" and the mechanics of
>>>> what you are seeking are covered in an FAQ
>>>>
>>>> FAQ     . . . . . . . . . . Making foreach go through all values of a variable
>>>>      . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>>>>      8/05    Is there a way to tell Stata to try all values of a
>>>>              particular variable in a foreach statement without
>>>>              specifying them?
>>>>
>>>> http://www.stata.com/support/faqs/data-management/try-all-values-with-foreach/
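>>>>
>>>> A minimal sketch of such a loop, assuming -Fund- is a string
>>>> variable and -Return- is numeric (the local macro names are
>>>> arbitrary):
>>>>
>>>> levelsof Fund, local(funds)
>>>> foreach f of local funds {
>>>>     display "Fund `f'"
>>>>     sktest Return if Fund == "`f'"
>>>> }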
>>>>
>>>> Nick
>>>> [email protected]
>>>>
>>>> On 8 April 2013 15:06, LI Mengjia <[email protected]> wrote:
>>>>
>>>>> I want to run the skewness and kurtosis normality test on my panel data in the following way:
>>>>>
>>>>> (1) For each Fund, and for each Year, use the Return from Week1 to Week26 to run sktest and also get the values of skewness and kurtosis.
>>>>>
>>>>> (2) For each Fund, use all its Return values to run sktest and also get the values of skewness and kurtosis.
>>>>>
>>>>> The 1st column is Fund, in which each cell is the code of one fund. The 2nd column is Year, from 2005-2010. The 3rd column is Week, from Week1-Week26. The last column is Return.
>>>>>
>>>>> I tried "if" to restrict sktest to each fund code but it didn't work. I have 299 funds, and I wish to run the tests in one go instead of running them separately 299 times.