Statalist The Stata Listserver


RE: st: manipulating matrix elements

From   Steve Vaisey
Subject   RE: st: manipulating matrix elements
Date   Fri, 31 Mar 2006 15:33:06 -0500

I compared the two standardizations (yours and Martin's) on a matrix of 11 variables (55 non-redundant two-variable comparisons in total). The results are highly correlated (~.99), so at one level the choice is probably a matter of taste. Or am I missing something?

Thanks again for your help on this.


Date: Thu, 30 Mar 2006 18:34:16 +0100
From: "Nick Cox"
Subject: RE: st: manipulating matrix elements

Thanks for the references. I found the second reference once I had worked out that the volume number is 107.
Martin does, as you say, use the quantity

    SUM p ln p + ln #cells

In fields I know a bit about, it is more common to use

    - SUM p ln p = H

as a basic quantity. This is what is used in my program -ineq- on SSC, for example.
Also, if this H is based on K categories, it can vary between 0 and ln K, so a simple scaling is H / ln K. (In the limiting case of a single category with p = 1, you have to trap the 0 / 0 calculation.) There is no assumption or approximation in this.
I am not clear that this is what you are doing, but no matter.
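The bounds and the 0 / 0 trap described above can be checked with a short Mata sketch. The share vectors here are made up for illustration and are not from the thread; the sketch also assumes that #cells in Martin's quantity equals K, in which case that quantity is simply ln K - H. Run it from a do-file.

```stata
mata:
// Illustrative shares only (hypothetical, not from the thread)
p = (0.5, 0.25, 0.25)
K = cols(p)
H = -sum(p :* ln(p))
H / ln(K)                                  // scaled entropy, between 0 and 1
(sum(p :* ln(p)) + ln(K)) - (ln(K) - H)    // Martin's quantity minus (ln K - H): 0 up to rounding

// Equal shares reach the upper bound H = ln K
q = J(1, K, 1/K)
-sum(q :* ln(q)) - ln(K)                   // 0 up to rounding

// Single category: H = 0 and ln 1 = 0, so the ratio has to be trapped
s = (1)
Hs = -sum(s :* ln(s))
Hs / ln(cols(s))                           // 0/0, which Mata returns as missing
Hs == 0 ? 0 : Hs / ln(cols(s))             // trapped, as in the program
end
```

Mata's sum() treats missing values as contributing zero, so cells with p = 0 (where ln p is missing) simply drop out of H.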
Looking at my little program, it is easy to generalise it so that it can take one variable or two. This is me modifying the program so it does things I sometimes want to do, no more.
*! 1.0.0 NJC 30 March 2006
program myentropy, rclass
    version 9
    syntax varlist(min=1 max=2) [if] [in] [fweight aweight]
    marksample touse
    qui count if `touse'
    if r(N) == 0 error 2000

    * cell frequencies -> cell proportions
    tempname matname
    tab `varlist' [`weight' `exp'] if `touse', matcell(`matname')
    mat `matname' = `matname' / r(N)

    * entropy and scaled entropy are left in r() by the Mata routine
    mata: subroutine("`matname'")

    di
    di as txt "entropy      " as res %7.4f r(entropy)
    di as txt "scaled [0,1] " as res %7.4f r(scaled)
    return scalar entropy = r(entropy)
    return scalar scaled = r(scaled)
end

mata:
void subroutine(string scalar matname)
{
    real matrix X
    real scalar H, scaled

    X = st_matrix(matname)
    // sum() treats missing as 0, so empty cells (0 * ln 0) drop out
    H = -sum(X :* ln(X))
    // trap 0 / 0 when there is a single cell with p = 1
    scaled = H == 0 ? 0 : H / ln(rows(X) * cols(X))
    st_numscalar("r(entropy)", H)
    st_numscalar("r(scaled)", scaled)
}
end
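
A minimal usage sketch, assuming the program and its Mata routine have been saved together in myentropy.ado on the adopath (or run from a do-file first). The auto dataset and the variables chosen are only for illustration and are not from the thread.

```stata
* Assumes -myentropy- is already defined (hypothetical setup, see above)
sysuse auto, clear

* one variable: entropy of the distribution of rep78
myentropy rep78
return list

* two variables: entropy over the cells of the rep78 x foreign table
myentropy rep78 foreign
display r(scaled)
```

With two variables the scaling divides by the log of the number of cells in the cross-tabulation (rows times columns), so empty cells still count toward K.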

Steve Vaisey

