Statalist The Stata Listserver



st: Chi-square goodness of fit with grouped counts


From   "Krier, Betty" <Betty.Krier@oig.dot.gov>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: Chi-square goodness of fit with grouped counts
Date   Wed, 1 Mar 2006 10:04:37 -0500


This is actually a statistical question rather than a programming one. I have market-level data (e.g., LA to NYC, CHI to LA) on the number of flights cancelled in a given time period. For example, market A may have 400 cancellations, market B 1327 cancellations, and so on. (I also have the total number of scheduled flights for each market in that same time period.)

I want to test whether cancellations are distributed differently across short-, medium-, and long-distance markets. I'm thinking of using a chi-square goodness-of-fit test that compares an expected distribution of cancellations across these market categories with the observed one. The problem is that I don't have standard frequency data: I have no data on individual flights, only the number of cancellations by market.

At first, I thought that I could add up the cancellations in all short-distance markets to get the observed number of short-distance flight cancellations, and do the same for the medium- and long-distance markets. However, something about this doesn't seem right, and I get huge chi-square statistics if I do the calculations this way.
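
The pooled calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the post does not say how the expected distribution was formed, so the sketch assumes expected cancellations are proportional to each category's share of scheduled flights. The function name `pooled_gof` and all data values are hypothetical; only the counts 400 and 1327 echo the post.

```python
import math

# Hypothetical market-level data: (distance category, cancellations, scheduled flights).
# Only the counts 400 and 1327 echo the post; every other number is invented.
markets = [
    ("short", 400, 20000),
    ("short", 1327, 50000),
    ("medium", 610, 30000),
    ("medium", 250, 15000),
    ("long", 300, 25000),
    ("long", 113, 10000),
]


def pooled_gof(markets):
    """Pool market counts by category and compute Pearson's chi-square.

    Assumption: expected cancellations in a category are proportional to
    that category's share of all scheduled flights.
    """
    observed, scheduled = {}, {}
    for category, cancelled, flights in markets:
        observed[category] = observed.get(category, 0) + cancelled
        scheduled[category] = scheduled.get(category, 0) + flights
    total_cancelled = sum(observed.values())
    total_flights = sum(scheduled.values())
    expected = {
        c: total_cancelled * scheduled[c] / total_flights for c in observed
    }
    chi2 = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)
    return observed, expected, chi2


observed, expected, chi2 = pooled_gof(markets)
df = len(observed) - 1
# With three categories df is 2, where the chi-square upper-tail
# probability has the closed form exp(-x / 2).
p_value = math.exp(-chi2 / 2) if df == 2 else None
print(observed, expected, round(chi2, 2), p_value)
```

Because this version treats every cancellation as an independent observation, the statistic grows in proportion to the total count: multiplying every count by 10 multiplies the chi-square by 10 while the proportions stay the same. That is one reason pooled totals in the thousands can produce very large statistics even when the category shares differ only moderately.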

Is there a way to use a chi-square goodness of fit test in this context, and, if so, how should I account for my actual number of observations being equal to the number of markets and not the number of scheduled flights?



