There are also interpretations of the census parameters in terms of
meta-populations and the like... I won't attempt a groundbreaking
interpretation of the regression line for the whole-population means,
although I would agree that, in the very end, the inference here would
hinge on that interpretation.
What you need to do is express your model in such a way that race
becomes an explanatory variable, so that you can test its coefficient
(and probably its interactions with other variables). You would need
to -reshape- your data for that purpose. One feasible option is then
to re-express this as a Poisson regression with the number of people
admitted as the dependent variable and the base population as the
offset (exposure); race can then be made one of the explanatory
variables. Another option is to run this as a multivariate (meaning,
with several response variables) regression model where the
coefficients sit side by side -- and then you can also compare
genders, Hispanics, whatever. If you really care about those temporal
autocorrelations, you would need to build a vector autoregression
model.
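For instance, something along these lines might work (untested; the
variable names -- year, x1, and the race-suffixed admitted_b/admitted_w
and pop_b/pop_w -- are made up for illustration):

   * go from one admissions column per race to long form
   reshape long admitted_ pop_, i(year) j(race) string
   generate byte black = (race == "b")
   generate blackXx1 = black*x1
   * Poisson regression of admissions on race and its interaction
   * with x1, using the base population as the exposure
   poisson admitted_ black x1 blackXx1, exposure(pop_)
   test black blackXx1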
I would also be concerned that your rate variables have a limited range
and are heavily skewed. You might need to transform them, say by taking
logs -- -boxcox- might be yet another thing to look at.
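For example (again untested, with a hypothetical rate variable
admit_rate and covariates x1 and x2):

   * log-transform the rate, assuming it is strictly positive
   generate ln_admit_rate = ln(admit_rate)
   regress ln_admit_rate x1 x2
   * or let -boxcox- estimate the transformation parameter instead
   boxcox admit_rate x1 x2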
On 1/11/07, Maarten Buis <[email protected]> wrote:
--- Traci Schlesinger <[email protected]> wrote:
> By saying I should 'fudge it and compare them indirectly' do you
> simply mean that I should compare them 'by eye' -- stating that not
> only is the coefficient for mt_b larger in the model for blacks but
> that the t-score is also higher? This is not so terrible for this
> data, since according to the Bureau of Justice Statistics, this is
> the full population of people admitted to prison (not a sample), but
> I still think many people (esp. reviewers) would not be convinced
If you have the population, then "comparing them by eye" is all you can
do (or go Bayesian, but you probably don't want to go there).
Frequentist testing assumes that all uncertainty about a parameter
comes from the fact that we only observed a random sample from the
population, not the population itself. It then imagines what the
population would look like if the null hypothesis were true, and asks
how likely it would be to draw a random sample with a test statistic as
extreme as, or more extreme than, the one observed: that probability is
the p-value. If such a sample is very unlikely to arise "by accident",
we conclude that there must be something wrong with the null
hypothesis. Since you don't have a sample but the population itself,
concepts like the p-value lose their meaning.
The idea behind testing is that what you find in a sample gives you
information about what happens in the population, but that information
is uncertain because you are only looking at a sample. Testing
quantifies the uncertainty due to sampling, and you have no uncertainty
on that account. You may have other sources of uncertainty -- is my
model correct, are my variables measured without error, etc. -- but
testing can't help you with those.
-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
--
Stas Kolenikov
http://stas.kolenikov.name
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/