
RE: st: A methodological problem

From   "Feiveson, Alan H. (JSC-SK311)" <>
To   <>
Subject   RE: st: A methodological problem
Date   Thu, 30 Oct 2008 09:37:48 -0500

Carlo - Even though you are analyzing all the games, couldn't the
results from a given year be considered a realization of random outcomes
over years? Then each year's data can be considered one observation of a
190-vector of results (as Stas points out).

Al Feiveson

-----Original Message-----
[] On Behalf Of Stas
Sent: Thursday, October 30, 2008 9:05 AM
Subject: Re: st: A methodological problem

1. You really have a problem of dyadic data analysis: your actual
observations are games between two teams (right?), so you probably
have 20*19/2 = 190 observations if every team played every other team.
Dyadic data analysis is an emerging theme, the methods are somewhat
complex as they need to take symmetry of the problem into account which
produces some weird non-linear constraints and puts parameters in
unwieldy places in the likelihoods. Peter Hoff from U of Washington has
been working on this and proposing some Bayesian solutions. Kit Baum
should know something too, as he has worked on international trade data
that have a similar structure (imports and exports between two countries).
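
The count of 190 can be checked by enumerating the unordered pairs
directly. This is a minimal sketch in Python, assuming a single
round-robin in which each pair of teams meets exactly once; the team
labels are placeholders, not data from the original question.

```python
from itertools import combinations
from collections import Counter

n_teams = 20
teams = [f"team_{i:02d}" for i in range(1, n_teams + 1)]  # hypothetical labels

# Each unordered pair (a, b) is one dyad, i.e. one game under the
# single round-robin assumption.
dyads = list(combinations(teams, 2))

# Count how many dyads each team takes part in.
appearances = Counter(team for pair in dyads for team in pair)

print(len(dyads))                      # number of dyads
print(n_teams * (n_teams - 1) // 2)    # closed-form count, same value
print(set(appearances.values()))       # dyads per team
```

Every team shows up in 19 of the pairs, so the dyads share teams and
cannot be treated as independent observations, which is what makes the
likelihoods awkward.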

2. I don't think there are universally recognized and well established
solutions with complete populations. Certainly most statistical methods
start with "Let's take an i.i.d. sample from distribution f(x,theta)".
In survey statistics, there are some ideas about superpopulations, the
infinite populations from which a given finite population was drawn; and
some methods take those superpopulations into account and explicitly
talk about estimation of parameters of those superpopulations. I am not
totally sure you want to get into this talk.
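
To make the superpopulation idea concrete, here is a rough simulation
sketch in Python. Everything in it is an assumption made for
illustration: the data-generating model, the true slope of 0.5, and the
use of plain least squares instead of the 2SLS estimator in the original
question. Each simulated "season" is a complete league of 20 teams, yet
the estimated slope still varies from one realization to the next.

```python
import random
import statistics

def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_slopes(n_teams=20, n_seasons=1000, beta=0.5, seed=12345):
    """Draw n_seasons complete leagues from a hypothetical superpopulation
    model y = beta * x + noise and return the estimated slope of each."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_seasons):
        x = [rng.gauss(0, 1) for _ in range(n_teams)]
        y = [beta * xi + rng.gauss(0, 1) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes

slopes = simulate_slopes()
print(round(statistics.fmean(slopes), 3))  # average estimate across seasons
print(round(statistics.stdev(slopes), 3))  # spread across seasons
```

The spread in the second printed number is the sampling variability
that remains even though every team in the league is observed each time.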

Alternatively, there's been some amount of work on what's called
apparent populations. The solutions proposed there are mostly along
Bayesian lines, again: since all the data you have appear to be fixed
as a population, that paradigm may indeed be a better way to go.

Of course the whole idea of a fixed population with fixed characteristics
runs contrary, to say the least, to the nature of sports: you can make
some predictions, in some sort of large numbers sense (Manchester United
is more likely to win the Champions League title than any of the Russian
clubs... at least so far :) ), but nobody can predict the results
perfectly, let alone for any particular game. That's an interesting
methodological challenge.

Finally, in the American Statistical Association, there is a Section on
Statistics in Sports, and I believe there's a journal with the same
title. You can take a look and see if they have something sensible for

On 10/30/08, Carlo Amenta <> wrote:
> Dear statalisters,
>  I am studying a sports league with 20 teams in a specific year. I am
>  using a 2SLS estimator because of a simultaneity problem with a
>  specific variable, which was confirmed using the -ivendog- procedure.
>  At this stage the study is cross-sectional and covers a specific
>  season. Is it correct to say that I have no efficiency or consistency
>  problem with the estimator, given that I am studying the entire
>  population and not a sample? As a matter of fact, n=20, even if very
>  small, is not the number of observations in a sample but all the
>  teams in the league, so the entire population. I think I do not have
>  to worry about any inference problem. Can someone confirm that or
>  indicate any specific references?
>  Thank you
>  Carlo Amenta
>  University of Palermo
>  *
>  *   For searches and help try:
>  *
>  *
>  *

Stas Kolenikov, also found at Small print: I
use this email account for mailing lists only.
