Re: st: too many base levels specified?


From   jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: too many base levels specified?
Date   Mon, 06 Dec 2010 14:19:55 -0600

Traci Schlesinger <traci.schlesinger@gmail.com> is using -suest- with the
results from two -logistic- model fits and is getting the "too many base
levels specified" error:

> I am analyzing racial disparities in criminal justice outcomes using
> both logistic and reg on individual-level data split by sentencing
> structure.  I first run the models on states with traditional codes
> and then on states with presumptive guidelines.  I am using the SUEST
> command so that I can compare the values of the variables "black" and
> "latino" across these two models.  In other words, I want to know if
> the value of "black", for example, is different in states with
> traditional codes than in states with presumptive guidelines.
> 
> here is (part of) my code:
> 
> logistic incarcerate black latino young1 young2 drugt totchgs2 chg1att
> second_fel cjstatus priarr prifconv primconv pripris y1998 y2000 y2002
> y2004 y2006 i.county if guidelines_cat == 0;
> est store a;
> logistic incarcerate black latino young1 young2 drugt totchgs2 chg1att
> second_fel cjstatus priarr prifconv primconv pripris y1998 y2000 y2002
> y2004 y2006 i.county if guidelines_cat == 2;
> est store b;
> suest a b;
> test [a_incarcerated]black=[b_incarcerated]black;
> test [a_incarcerated]latino=[b_incarcerated]latino;
> 
> While this code is very similar to codes I have used in the past, I am
> getting the error code "too many base levels specified r(198);" when
> stata gets to the "suest a b;" line.  What does this mean?  And, is
> there anything I can do about it?

It appears that the smallest value of the 'county' variable is different when
'guidelines_cat' is equal to 0 compared to when it is equal to 2.  This
results in a different base level being chosen for the 'county' factor variable
in the two -logistic- model fits.  When these two models are combined by
-suest-, Stata complains about the two different base level choices.

Traci can use -fvset- to set the base for the 'county' factor variable so that
it is consistent between model fits.
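
Applied to Traci's models, that might look like the sketch below.  It is
written as a do-file fragment without the ';' delimiter, and it assumes
that 'county' is numeric and that at least one county code occurs in both
subsamples.  The value 1001 is a hypothetical placeholder, not a value
taken from her data, and the local macro 'xvars' is introduced only to
shorten the commands.

```stata
* Sketch only: 1001 is a hypothetical county code.  Replace it with a
* level that -tabulate- shows in both the guidelines_cat == 0 and the
* guidelines_cat == 2 columns.
tabulate county guidelines_cat if inlist(guidelines_cat, 0, 2)
fvset base 1001 county

* Covariates copied from the original commands
local xvars black latino young1 young2 drugt totchgs2 chg1att ///
    second_fel cjstatus priarr prifconv primconv pripris      ///
    y1998 y2000 y2002 y2004 y2006

logistic incarcerate `xvars' i.county if guidelines_cat == 0
estimates store a
logistic incarcerate `xvars' i.county if guidelines_cat == 2
estimates store b
suest a b
```

If 'county' has a very large number of levels, the two-way -tabulate- may
exceed Stata's table limits; running -levelsof county- separately for each
subsample is another way to find a shared level.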

Here is a fabricated example using the auto dataset.  Output is omitted for
the sake of brevity.

	. sysuse auto
	. tabulate rep78 foreign, nolabel

Reviewing the output from -tabulate- shows that 'rep78' takes on the integer
values from 1 to 5, but there are no observations where 'rep78' is 1 or 2 and
'foreign' is 1.
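
The same check can be made without reading a table.  As a small sketch,
-levelsof- lists the distinct values of 'rep78' in each subsample, which
makes it easy to spot a level that is observed in both.

```stata
sysuse auto, clear

* Distinct values of rep78 among foreign cars
levelsof rep78 if foreign == 1

* Distinct values of rep78 among domestic cars
levelsof rep78 if foreign == 0
```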

Let's use -fvset- to fix the base level for 'rep78' to 5:

	. fvset base 5 rep78

Now we can fit two linear regression models (or any models we want) without
having to also specify the common base level.

	. regress mpg turn i.rep78 if foreign==1
	. estimates store Foreign
	. regress mpg turn i.rep78 if foreign==0
	. estimates store Domestic

Now we can use -suest- with the stored estimation results and test whether the
coefficient on 'turn' differs between the two fits.

	. suest Foreign Domestic
	. test [Foreign_mean]turn = [Domestic_mean]turn

Notice that we chose a base of 5 for 'rep78' instead of 1, since there are no
observations where 'rep78' is 1 and 'foreign' is 1.  Had we chosen 1, the above
-test- result would be the same, but the interpretation of the 'rep78'
coefficients would not be consistent between the model fits.  This would only
matter if we were interested in comparing the 'rep78' coefficients between the
model fits.
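
For a one-off comparison, an alternative to -fvset- is to name the base
level in each command with the ib#. factor-variable operator.  The sketch
below repeats the auto example that way.  It assumes no base has been set
with -fvset-, so it first clears any existing setting for 'rep78'.

```stata
sysuse auto, clear

* Remove any base previously declared with -fvset- for rep78
fvset clear rep78

* ib5. makes level 5 the base in both fits
regress mpg turn ib5.rep78 if foreign == 1
estimates store Foreign
regress mpg turn ib5.rep78 if foreign == 0
estimates store Domestic

suest Foreign Domestic
test [Foreign_mean]turn = [Domestic_mean]turn
```

The difference is mainly one of convenience: -fvset- records the base once
for every later command that uses i.rep78, whereas ib5. has to be repeated
in each command.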

--Jeff
jpitblado@stata.com