Statalist



RE: st: How do I test that two subsample have different coefficient of variation?


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: How do I test that two subsample have different coefficient of variation?
Date   Fri, 11 Jul 2008 10:57:30 +0100

There are some references in 

Sokal, R.R. and Rohlf, F.J. 1995. Biometry. New York: W.H. Freeman.

The rough argument for thinking logarithmically goes like this. It makes
sense to work with the coefficient of variation whenever standard
deviation is proportional to mean. That implies that variability is
multiplicative, not additive, which in turn implies working on a
logarithmic scale. 
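To make that step concrete, here is a small sketch under stated assumptions; it is not part of the original argument. Write X for a strictly positive variable with mean \mu and standard deviation \sigma. The first line is only a first-order (delta-method) approximation. The second line is exact, but only if one also assumes that log X is normal with standard deviation s (my assumption, for illustration).

```latex
\[
\operatorname{sd}(\log X) \;\approx\; \frac{\sigma}{\mu} \;=\; \mathrm{CV}
\qquad \text{(first-order approximation, reasonable when CV is small)}
\]
\[
\mathrm{CV} \;=\; \sqrt{\exp(s^{2}) - 1}
\qquad \text{if } \log X \sim N(m, s^{2})
\]
```

Under either reading, groups with equal CV have (approximately or exactly) equal standard deviation on the log scale, which is why comparing spread of the logs is a natural first move.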

Clearly, that in turn is possible only if values are all strictly
positive. (I can also imagine that there might be problems in which cv
looks attractive but in which values are all strictly negative, in which
case just discard the signs for this purpose.) 
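As a rough illustration of the log-scale idea with the auto data mentioned elsewhere in this thread, something like the following might do. It is a sketch, not a recommended procedure: it compares the spread of log price between groups, which matches a comparison of CVs only to the extent that the multiplicative argument above holds. The variable name lnprice is just a name chosen here.

```stata
sysuse auto, clear
* logs need strictly positive values; stop here if that is not true
assert price > 0 if !missing(price)
generate double lnprice = ln(price)
* classical variance-ratio (F) test; leans on normality of lnprice
sdtest lnprice, by(foreign)
* Levene and Brown-Forsythe variants, less sensitive to nonnormality
robvar lnprice, by(foreign)
```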

It might be that there are variables including zeros in which cv appears
also natural, or at least convenient, in which case you do have the
problem, much discussed but never fully solved, of what to do when
log(zero) is implied. 

However, I've seen in at least three publications coefficients of
variation for monthly temperatures measured in Celsius. In each case the
authors were lucky that, to the resolution reported, means were never
zero. But in each case negative means and so negative cvs were
perceptively interpreted by the authors as a sign that the cv was not
fully satisfactory as a measure of relative variability. One of these
authors at least had the bright idea that changing to Fahrenheit would
be a solution. However, other advice springs to mind, notably "Don't do
that!". Suppressing the references I take in this instance to be the
greater service to science. As Jay Verkuilen also points out in this
thread, Celsius (and Fahrenheit) temperatures are interval scale
measures for which ratios are not appropriate. 

However, another argument is worth brief airing. Contemplation of the
gamma distribution shows that the coefficient of variation is a natural
parameter for that family. Thus fitting gammas might be a way forward in
some problems of this kind. Generalised linear models as in -glm- treat
the scale parameter as ancillary and are not especially suitable for
this purpose, but -gammafit- from SSC might be of use. 
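To spell out the gamma remark: in the shape-scale parameterisation (the symbols \alpha and \beta are notation chosen here, not taken from this thread or from the -gammafit- documentation), the moments are

```latex
\[
E(X) = \alpha\beta, \qquad
\operatorname{Var}(X) = \alpha\beta^{2}, \qquad
\mathrm{CV} = \frac{\sqrt{\alpha\beta^{2}}}{\alpha\beta} = \frac{1}{\sqrt{\alpha}} .
\]
```

So within the gamma family the CV depends on the shape parameter alone, and asking whether two groups share a CV amounts to asking whether they share a shape parameter while the scale is free to differ.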

I note also that various arguments point to the cube root as (to a good
approximation) a natural transformation for the gamma. 
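A quick simulation gives a feel for the cube-root remark. This is only a sketch: the shape (2), scale (1), seed, and sample size are arbitrary choices made here, and rgamma() needs a Stata version that provides it.

```stata
* skewness of a gamma variable before and after taking cube roots
clear
set seed 20080711
set obs 10000
* rgamma(a, b) draws gamma variates with shape a and scale b
generate double x = rgamma(2, 1)
generate double cuberoot = x^(1/3)
* compare the skewness and kurtosis reported for the two variables
summarize x cuberoot, detail
```

The raw variable should show marked positive skewness, and the cube root should look much closer to symmetric.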

Nick
n.j.cox@durham.ac.uk 

Austin Nichols wrote:

Antonio Vezzani <antonio.vezzani@uniroma2.it> et al.--
Maarten provides a link to testing equality of variances (of errors)
using -robvar- (help sdtest) and Nick proposes working on a log scale
(for strictly positive variables only), but neither of these is
actually a test of equality of CV.  I suspect Yulia Marchenko could
outline a general procedure using -xtmixed-
(http://www.stata-journal.com/article.html?article=st0095).  I will
propose yet another answer that does not do exactly what you want:
-geivars- on SSC will calculate SEs for the squared coef of variation
(see also http://econpapers.repec.org/paper/bocasug06/16.htm).  As for
a simple command to follow

sysuse auto, clear
tabstat price, stat(cv) by(foreign)

allowing a test of equality of CV, I don't think there is one.
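Lacking a dedicated command, one generic possibility is to bootstrap the difference in CV between the two groups. This is not a method proposed in this thread, just a sketch of the standard -bootstrap- prefix applied to a small wrapper program. The program name cvdiff, the number of replications, and the seed are all arbitrary choices made here, and with groups as small as those in auto.dta the resulting intervals should be read with caution.

```stata
sysuse auto, clear

capture program drop cvdiff
program define cvdiff, rclass
    * CV of price among domestic cars
    quietly summarize price if foreign == 0
    local cv0 = r(sd) / r(mean)
    * CV of price among foreign cars
    quietly summarize price if foreign == 1
    local cv1 = r(sd) / r(mean)
    return scalar diff = `cv1' - `cv0'
end

* resample within each group so that group sizes stay fixed
bootstrap diff = r(diff), reps(1000) strata(foreign) seed(12345): cvdiff
* percentile confidence interval for the difference in CV
estat bootstrap, percentile
```

If the percentile interval excludes zero, that is some evidence of a difference in CV, subject to the usual caveats about bootstrapping ratio-type statistics in small samples.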

I believe the sampling distribution of the CV is tricky...  esp. if
one is unwilling to stipulate that the variable of interest is
normally distributed in the population:
http://www.ripublication.com/ijss/ijssv1n1_5.pdf
Gupta RC, Ma S. Testing the equality of the coefficient of variation
in k normal populations. Communications in Statistics.
1996;25:115-132.
Wilson CA, Payton ME. Modelling the coefficient of variation in
factorial experiments. Communications in Statistics-Theory and
Methods. 2002;31:463-476.

Perhaps working with the reciprocal (mean/sd) offers greater stability?
http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel3/24/8488/00370217.pdf?temp=x
but I can't see that paper, just this abstract:
Sharma, K.K. and H. Krishna. 1994. "Asymptotic sampling distribution
of inverse coefficient-of-variation and its applications." IEEE
Transactions on Reliability, 43(4):630-633. This paper develops the
asymptotic sampling distribution of the inverse of the coefficient of
variation (InvCV). This distribution is used for making statistical
inference about the population CV (coefficient of variation) or InvCV
without making an assumption about the population distribution.

On Thu, Jul 10, 2008 at 1:13 PM, Maarten Buis <maartenbuis@yahoo.co.uk>
wrote:
> --- Antonio Vezzani <antonio.vezzani@uniroma2.it> wrote:
>> If, for example, in auto.dta I want to test whether price has a
>> different coefficient of variation for foreign and domestic autos,
>> which is the right procedure?
>
> Christopher F. Baum (2006) Stata tip 38: Testing for groupwise
> heteroskedasticity, The Stata Journal, 6(4): 590--592.
> http://www.stata-journal.com/article.html?article=st0117
