# Re: st: A methodological problem

From: "Stas Kolenikov"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: A methodological problem
Date: Thu, 30 Oct 2008 09:04:59 -0500

```
1. You really have a dyadic data analysis problem: your actual
observations are games between two teams (right?), so you probably
have 20*19/2 = 190 observations if every team played every other team once.
Dyadic data analysis is an emerging theme, the methods are somewhat
complex as they need to take symmetry of the problem into account
which produces some weird non-linear constraints and puts parameters
in unwieldy places in the likelihoods. Peter Hoff from U of Washington
has been working on this and proposing some Bayesian solutions. Kit
Baum should know something too as he worked on international trade
data that has similar structure (imports and exports between two
countries).

2. I don't think there are universally recognized and well established
solutions with complete populations. Certainly most statistical
f(x,theta)". In survey statistics, there are some ideas about
superpopulations, the infinite populations from which a given finite
population was drawn; and some methods take those superpopulations
into account and explicitly talk about estimation of parameters of
those superpopulations. I am not totally sure you want to get into
this talk.

Alternatively, there's been some amount of work on what's called
apparent populations
(http://www.citeulike.org/user/ctacmo/article/333410). The solutions
proposed there are mostly along Bayesian lines, again: since all the
data you have appears to be fixed as a population, that paradigm may
indeed be a better way to go.

Of course the whole idea of fixed population with fixed
characteristics runs contrary, to say the least, to the nature of
sports: you can make some predictions, in some sort of large numbers
sense (Manchester United is more likely to win Champions League title
than any of the Russian clubs... at least so far :) ), but nobody can
predict the results perfectly, let alone for any particular game.
That's an interesting methodological challenge.

Finally, in the American Statistical Association, there is a Section on
Statistics in Sports, and I believe there's a journal with the same
title. You can take a look and see if they have something sensible for
you.

On 10/30/08, Carlo Amenta <carlo.amenta@gmail.com> wrote:
> Dear statalisters,
>
>  I am studying a sports team league with 20 teams in a specific year. I
>  am using a 2sls estimator because of simultaneity problem with a
>  specific variable which was confirmed using the -ivendog- procedure.
>  At this stage the study is cross sectional and regards a specific
>  season. Is it correct to say that I do not have any efficiency or
>  consistency problem with the estimator, considering the fact that I am
>  studying the entire population and not a sample? As a matter of fact,
>  n=20, even if very small, is not the number of observations but all
>  the teams in the league, so the entire population. I think I do not
>  have to worry about any inference problem. Can someone confirm that or
>  indicate any specific references?
>  Thank you
>
>
>  Carlo Amenta
>  University of Palermo
>  *
>  *   For searches and help try:
>  *   http://www.stata.com/help.cgi?search
>  *   http://www.stata.com/support/statalist/faq
>  *   http://www.ats.ucla.edu/stat/stata/
>

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
```
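
The dyad count in point 1 can be checked with a short sketch. It is written in Python rather than Stata purely for illustration; the team labels and the helper name `dyads` are hypothetical, and it assumes each pair of teams counts as one observation regardless of who plays at home.

```python
from itertools import combinations
from math import comb

def dyads(teams):
    """Return every unordered pair of distinct teams (one row per dyad)."""
    return list(combinations(teams, 2))

# Hypothetical labels for the 20 teams in the league.
teams = [f"team{i:02d}" for i in range(1, 21)]
pairs = dyads(teams)

# All three expressions give the same count of unordered pairs.
print(len(pairs), comb(20, 2), 20 * 19 // 2)
```

If home and away fixtures were instead treated as distinct observations, the pairs would be ordered and the count would be 20*19 = 380.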