Re: st: Predicted probabilities after oprobit w/robust  standard errors
At 02:40 PM 6/2/2006, Nick Winter wrote:
No.
You are confusing the (sampling) variance of the various estimates, 
with the variance of the underlying distribution.  The latter is 
normalized to one regardless of the technique used to estimate the 
sampling variances.
--NW
Well put.  And to try it one other way - let's say a particular case 
has a predicted probability of 30% of being in category 1.  But that 
30% is itself an estimate.  The 95% confidence interval for it might 
run from, say, 24% to 36%.
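
To put rough numbers on that example, here is a minimal Python sketch (Python rather than Stata only so that it is self-contained). It assumes a hypothetical fitted oprobit in which z = cut1 - xb, so that Pr(category 1) = Phi(z), and it builds an approximate delta-method interval around that probability. The value of z and both standard errors are made up for illustration; they do not come from any real model.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # oprobit normalizes the error variance to 1

def prob_category1(z):
    """Pr(y = 1) = Phi(cut1 - xb), where z = cut1 - xb."""
    return STD_NORMAL.cdf(z)

def delta_method_ci(z, se_z, level=0.95):
    """Approximate CI for Phi(z), given a standard error for z."""
    p = prob_category1(z)
    se_p = STD_NORMAL.pdf(z) * se_z            # d Phi(z) / dz = phi(z)
    crit = STD_NORMAL.inv_cdf(0.5 + level / 2)
    return p - crit * se_p, p + crit * se_p

# Hypothetical values: z is chosen so the point prediction is 30%.
z = STD_NORMAL.inv_cdf(0.30)
p = prob_category1(z)
ci_small_se = delta_method_ci(z, se_z=0.088)   # assumed smaller SE
ci_large_se = delta_method_ci(z, se_z=0.264)   # assumed larger (e.g. robust) SE

print(f"predicted probability: {p:.2f}")
print(f"95% CI, smaller SE: {ci_small_se[0]:.2f} to {ci_small_se[1]:.2f}")
print(f"95% CI, larger SE:  {ci_large_se[0]:.2f} to {ci_large_se[1]:.2f}")
```

The point prediction is computed from z alone; the standard error enters only the width of the interval.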
And, in an OLS regression, you have a single predicted value.  In 
oprobit and other multi-outcome techniques, you have more than one 
predicted value.  In all the techniques, the predicted value is your 
"best guess" as to the true value.  But, because of sampling 
variability, your best guess may be too high or too low.
In terms similar to how Matt is putting it - suppose your OLS 
predicted value was $10,000, with a confidence interval that ran 
$1,000 either way.  Then you specify robust standard errors, and all of 
a sudden the predicted value is still $10,000 but the confidence 
interval runs a million dollars either way. 
(Hopefully this would never actually happen!)  Well, I suppose you 
could say that, in the latter case, there is a greater probability 
that the person is actually a millionaire than in the first 
case.  But, our "best guess" is still $10,000.  Likewise, in an 
oprobit, our best guess of being in category 1 is going to stay at, 
say, 15%, but huge standard errors are going to make us less 
confident of how accurate that prediction is.
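
The same point can be checked by hand for OLS. The sketch below is only an illustration under stated assumptions: a one-predictor regression on made-up data, with the robust standard error computed as the plain HC0 sandwich form (Stata's own robust option applies a small-sample adjustment that is not reproduced here). The function name predict_with_ses is hypothetical.

```python
from math import sqrt

def predict_with_ses(x, y, x0):
    """Simple OLS of y on x; return the prediction at x0 with two SEs."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)

    # The OLS prediction at x0 is a weighted sum of the observed y values.
    w = [1 / n + (x0 - xbar) * (xi - xbar) / sxx for xi in x]
    yhat0 = sum(wi * yi for wi, yi in zip(w, y))

    # Residuals from the fitted line.
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # Conventional SE: one common residual variance for every case.
    s2 = sum(e ** 2 for e in resid) / (n - 2)
    se_conventional = sqrt(s2 * sum(wi ** 2 for wi in w))
    # Robust (HC0 sandwich) SE: each case keeps its own squared residual.
    se_robust = sqrt(sum((wi * e) ** 2 for wi, e in zip(w, resid)))
    return yhat0, se_conventional, se_robust

# Made-up data in which the spread of y grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.6, 10.9, 10.8, 15.9, 13.4]
yhat0, se_conv, se_rob = predict_with_ses(x, y, x0=8)
print(f"prediction at x0=8: {yhat0:.3f}")
print(f"conventional SE: {se_conv:.3f}   robust SE: {se_rob:.3f}")
```

The prediction is computed once and never touched by the choice of standard error; only the measure of uncertainty around it changes.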
The ideas of sampling variability and heterogeneity may also be 
getting confounded here.  You may have reason for believing there is 
heterogeneity in the residual variances, e.g. there is more residual 
variability for women than for men.  If so, a location-scale (aka 
heterogeneous choice) model may be appropriate.  But heterogeneity is 
different from 
sampling variability.  Sampling variability is a characteristic of 
the sample, and things like drawing a larger sample will generally 
reduce it.  But heterogeneity is a characteristic of the population; 
and even if you had the entire population in your sample, a failure 
to control for heterogeneity could bias your parameter estimates in a 
logit or probit analysis.  See, for example,
Allison, Paul.  1999.  "Comparing Logit and Probit Coefficients 
Across Groups." Sociological Methods and Research 28(2): 186-208.
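
A small population-level calculation, with no sampling involved, may help show why unmodeled heterogeneity matters in a probit. The sketch assumes a latent-variable setup y* = intercept + slope*x + sigma*e with e standard normal, and two hypothetical groups that share the same latent slope but differ in residual standard deviation. The group labels and all numbers are illustrative assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def prob_y1(x, intercept, slope, sigma):
    """Pr(y = 1 | x) when y* = intercept + slope*x + sigma*e, e ~ N(0, 1)."""
    return STD_NORMAL.cdf((intercept + slope * x) / sigma)

def implied_probit_slope(intercept, slope, sigma):
    """Change in the probit index per unit of x, as a probit would see it."""
    z1 = STD_NORMAL.inv_cdf(prob_y1(1.0, intercept, slope, sigma))
    z0 = STD_NORMAL.inv_cdf(prob_y1(0.0, intercept, slope, sigma))
    return z1 - z0

# Hypothetical groups: same latent slope, different residual SD.
slope_men = implied_probit_slope(intercept=-0.5, slope=1.0, sigma=1.0)
slope_women = implied_probit_slope(intercept=-0.5, slope=1.0, sigma=2.0)
print(f"implied probit slope, men:   {slope_men:.3f}")
print(f"implied probit slope, women: {slope_women:.3f}")
```

Because the probit normalizes the residual standard deviation to one, what it identifies is slope/sigma. The two groups therefore show different coefficients even though the underlying effect of x is the same, and no amount of additional data changes that.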
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
FAX:    (574)288-4373
HOME:   (574)289-5227
EMAIL:  [email protected]
WWW (personal):    http://www.nd.edu/~rwilliam
WWW (department):    http://www.nd.edu/~soc 