Lee Sieswerda <Lee.Sieswerda@tbdhu.com> took my suggestion on one of George Hoffman's questions in a direction that completely surprised me. I was answering George's 2nd question, whereas Lee was answering George's 1st. Even so, Lee found an interesting way to twist my answer toward the 1st question.

To paraphrase Lee's response rather freely: he suggests using -ci- to get the CIs for two different categories and graphing those along with the original data using -twoway-. (Lee actually used -egen- but notes that the results are the same as -ci-.)

Lee then picked up on my suggestion to use -predictnl- to get CIs for INDIVIDUAL observations after a -regress-ion and cleverly used an indicator variable as the regressor, so that those CIs would be the same for all observations in a group. He then compared the results of -ci- to those from -predictnl- and found that they were different.

Using the auto data, Lee gets the following CIs using -predictnl- after -regress-.

    . regress weight foreign
    . predictnl yhat=predict(), ci(lb ub)
    . bysort foreign: sum ub lb

    ____________________________________________________________________________
    -> foreign = Domestic

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
              ub |        52    3491.338           0   3491.338   3491.338
              lb |        52    3142.893           0   3142.893   3142.893

    ____________________________________________________________________________
    -> foreign = Foreign

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
              ub |        22     2583.76           0    2583.76    2583.76
              lb |        22    2048.058           0   2048.058   2048.058

Lee notes that these are different from -ci-,

    . ci weight, by(foreign)

    ____________________________________________________________________________
    -> foreign = Domestic

        Variable |       Obs        Mean    Std. Err.       [95% Conf. Interval]
    -------------+--------------------------------------------------------------
          weight |        52    3317.115     96.4296        3123.525    3510.706

    ____________________________________________________________________________
    -> foreign = Foreign

        Variable |       Obs        Mean    Std. Err.       [95% Conf. Interval]
    -------------+--------------------------------------------------------------
          weight |        22    2315.909    92.31665        2123.926    2507.892

Let me get the same results as -predictnl- more directly, by using foreign and domestic indicator variables in -regress-.

    . gen domestic = !foreign
    . regress weight foreign domestic, noconstant

    [...]
    ------------------------------------------------------------------------------
          weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         foreign |   2315.909   134.3649    17.24   0.000     2048.058    2583.761
        domestic |   3317.115   87.39676    37.95   0.000     3142.893    3491.338
    ------------------------------------------------------------------------------

We see that the 95% CIs from -regress- match those from -predictnl- after -regress-, as they should. Now, however, it is easier to see why the CIs differ from those of -ci-.

-ci- with the -by()- option assumed independent samples for domestic and foreign, one with 52 observations and one with 22 observations, and it also assumed that two variances were to be estimated, one for domestic and the other for foreign. -regress-, on the other hand, assumed that a single variance was to be estimated and that this variance had 72 degrees of freedom (74 observations less 2 estimated coefficients).

In the parlance of regression, the -ci- estimates of variance allowed for heteroskedasticity across the domestic and foreign groups, while -regress- did not. Basically, we make different assumptions when using -regress- than when using -ci, by()-.

-- Vince
   vwiggins@stata.com
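
The single-variance explanation can be checked by hand from the numbers reported above. What follows is only a sketch of that arithmetic, written in Python rather than Stata so it can be run anywhere; it is not anything Lee or I ran. The variable names are made up for illustration, and the t critical value for 72 degrees of freedom is a rounded table value that I am assuming rather than computing. The idea: back out each group's variance from its -ci- standard error, pool the two variances with weights equal to their degrees of freedom, and see whether the resulting standard errors and intervals agree with what -regress- reported.

```python
import math

# Group sizes, means, and standard errors as reported by -ci weight, by(foreign)-.
n_dom, mean_dom, se_dom = 52, 3317.115, 96.4296
n_for, mean_for, se_for = 22, 2315.909, 92.31665

# Recover each group's sample variance from its standard error: s^2 = n * se^2.
var_dom = n_dom * se_dom ** 2
var_for = n_for * se_for ** 2

# Pool the two variances, weighting each by its degrees of freedom (n - 1).
# This is the single variance that -regress- estimates.
df_pooled = (n_dom - 1) + (n_for - 1)
var_pooled = ((n_dom - 1) * var_dom + (n_for - 1) * var_for) / df_pooled

# Standard errors of the two group means under the pooled variance.
se_dom_pooled = math.sqrt(var_pooled / n_dom)
se_for_pooled = math.sqrt(var_pooled / n_for)

# ASSUMPTION: rounded 97.5th percentile of the t distribution with 72 df,
# taken from a table rather than computed here.
T_CRIT_72 = 1.9935


def interval(mean, se, t_crit):
    """Return (lower, upper) for mean +/- t_crit * se."""
    half_width = t_crit * se
    return mean - half_width, mean + half_width


ci_dom = interval(mean_dom, se_dom_pooled, T_CRIT_72)
ci_for = interval(mean_for, se_for_pooled, T_CRIT_72)

print("pooled df:", df_pooled)
print("domestic: se = %.4f, CI = (%.3f, %.3f)" % (se_dom_pooled, *ci_dom))
print("foreign:  se = %.4f, CI = (%.3f, %.3f)" % (se_for_pooled, *ci_for))
```

Up to rounding in the inputs, the pooled standard errors should land on the 87.39676 and 134.3649 shown in the -regress- table, and the intervals on the -predictnl- bounds. Note how pooling shrinks the domestic standard error and inflates the foreign one relative to -ci-: the foreign cars have the smaller variance but are charged the common one.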

