
# st: Re: st: Need Kullback–Leibler divergence measure

 From Tirthankar Chakravarty To statalist@hsphsun2.harvard.edu Subject st: Re: st: Need Kullback–Leibler divergence measure Date Sat, 8 May 2010 15:21:34 +0530

```<>

Although the user-written -multgof- (Jeroen Weesie, SSC) will do this
for you, it is easy to do yourself in Mata, following the excellent
article at
http://www.stata-journal.com/sjpdf.html?articlenum=pr0024
I am assuming you want the divergence between the two column
distributions of a two-way tabulation (here, rep78 for domestic
versus foreign cars):
*********************************************
clear*
sysuse auto, clear
tabulate rep78 foreign, matcell(newmat)
mata
// Kullback-Leibler divergence
vP1 = st_matrix("newmat")[.,1] :/ sum(st_matrix("newmat")[.,1])
vP2 = st_matrix("newmat")[.,2] :/ sum(st_matrix("newmat")[.,2])
dKLdiv = sum(vP1:*log(vP1:/vP2))
// Kullback-Leibler symmetric divergence
dKLSdiv = 0.5*(dKLdiv+ sum(vP2:*log(vP2:/vP1)))
// Jensen-Shannon divergence (each term weighted by 0.5, as in the usual definition)
dJSdiv = 0.5*sum(vP1:*log(vP1:/(0.5*(vP1+vP2)))) + 0.5*sum(vP2:*log(vP2:/(0.5*(vP1+vP2))))
dKLdiv, dKLSdiv, dJSdiv
end
*********************************************
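See, however, this discussion on the MATLAB Central File Exchange about
handling zero probabilities in discrete-valued distributions. An empty
cell in the table gives a zero probability, and the log ratios above
are then undefined.
http://www.mathworks.com/matlabcentral/fileexchange/13089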

Note that -multgof- will refuse to handle this case for you:
*********************************************
svmatf , mat(newmat) fil(newmat.dta)
use newmat, clear
multgof c1 c2, kl
*********************************************
This uses -svmatf-, by Jan Brogger (SSC).

T

2010/5/8 Michael C. Morrison <Morrimic@niacc.edu>:
> I've searched Stata (with no success) for "Kullback-Leibler divergence", also
> known as the information number, discrimination function, and “distance.”
>
> It's used to measure the divergence between two distributions.
>
> Any help would be appreciated.
>
> Mike
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

--
To every ω-consistent recursive class κ of formulae there correspond
recursive class signs r, such that neither v Gen r nor Neg(v Gen r)
belongs to Flg(κ) (where v is the free variable of r).

```
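The zero-probability caveat in the post can be made explicit in Mata. What follows is only a sketch, assuming the usual conventions that a term with p = 0 contributes nothing and that the divergence is infinite (returned here as missing) whenever p > 0 in a cell where q = 0. The helper name kl_div() is made up for illustration; it is not part of Stata or of -multgof-, and the results have not been checked against -multgof-.

```stata
sysuse auto, clear
tabulate rep78 foreign, matcell(newmat)

mata:
// Hypothetical helper (not a built-in): KL divergence of p from q.
// Cells with p == 0 are skipped; if p > 0 where q == 0, the divergence
// is infinite and missing (.) is returned instead.
real scalar kl_div(real colvector p, real colvector q)
{
    real colvector pos

    pos = (p :> 0)                          // cells that contribute
    if (sum(pos :& (q :== 0)) > 0) return(.)
    return(sum(select(p, pos) :* ln(select(p, pos) :/ select(q, pos))))
}

N   = st_matrix("newmat")
vP1 = N[., 1] :/ sum(N[., 1])               // column 1 proportions
vP2 = N[., 2] :/ sum(N[., 2])               // column 2 proportions
vM  = 0.5 :* (vP1 + vP2)                    // mixture used by Jensen-Shannon

kl_div(vP1, vP2)                            // KL(P1 || P2)
kl_div(vP2, vP1)                            // KL(P2 || P1)
0.5*kl_div(vP1, vM) + 0.5*kl_div(vP2, vM)   // Jensen-Shannon, 0.5 weights
end
```

If a cell is empty in the second column but not in the first, the first call returns missing rather than quietly dropping that term. The Jensen-Shannon value stays finite, because the mixture vM is positive wherever either column is.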