Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: standard errors after xtmixed, predit.., fitted

From	Steve Samuels <[email protected]>
To	[email protected]
Subject	Re: st: standard errors after xtmixed, predit.., fitted
Date	Wed, 4 Jan 2012 18:14:42 -0500
No-the covariance is for the actual random effects, not for the estimated effects.  Moreover, by subtracting the mean for years, the correlation is reduced to near zero, at least in the example I gave.

If you still have problems you _must_ show us all the output, including the e(b) matrix, the scalar lists, and the results of describing the estimated effects (to check on their names).

If there are relatively few top level regions, I still do not see why they should be given random effects.


Steve
[email protected]


On Jan 4, 2012, at 3:07 AM, Jennyfer Wolf wrote:

Sorry, I thought I still would need to add the covariance contribution
to the example below.
Again many, many thanks for all the great help!
Jennyfer

2012/1/3 Steve Samuels <[email protected]>:
> Jennyfer:
> 
> You must have missed http://www.stata.com/statalist/archive/2011-12/msg00852.html
> 
> 
> Steve
> 
> On Jan 3, 2012, at 7:58 AM, Jennyfer Wolf wrote:
> 
> Thank you!! And sorry for getting back so late, but I did not have
> access to the dataset for a couple of days.
> As (hopefully) a last step could you explain me for this approach how
> to add the covariance term (for covariance(unstructured)) to the
> fitted standard error? I tried it like you explained to me in one of
> your earlier mails:
> 
> scalar atrho = [atr2_1_1_2]_cons
> scalar  rho =   (exp(2*atrho)-1)/(exp(2*atrho)+1)
> scalar  cov = rho*sd_country*sd_country_u
> 
> gen se_fitted=  ///
> sqrt(se_fix^2 + (se_u1^2+(se_u2*year_cat)^2 + se_u3^2+year_cat*2*cov )
> 
> 
> but I get so many missing values for the confidence intervals that i
> cannot give estimates with uncertainty.
> 
> Thank you!!
> 
> 
> 
> 
> 2011/12/26 Steve Samuels <[email protected]>:
>> 
>> Jennyfer:
>> 
>> 1. Here's a version that users -predict, rses- to get standard errors of the random effects. This is the right one, I think. This model has just the linear predictor "year", which is akin to your spline prediction.  In this case there is one observation for each state and year, as I believe you probably have one observation per year and country.
>> 
>> 2. The estimated SEs of fitted are indeed sensitive to the location (not scale) of year.  I believe that I've seen references to this phenomenon with longitudinal data before, but I don't have a text handy.
>> 
>> I recommend that you center the year term (subtract off the mean).  Compare below to using year0 = year-1978.
>> 
>> Steve
>> 
>> *******************************************
>> set more off
>> webuse productivity, clear
>> 
>> xtmixed gsp  year ///
>>     || region: || state: year, cov(unstructured) reml
>> 
>> predict fitted, fitted
>> predict se_fix,  stdp
>> predict se_u*, reses
>> des se_u*     // se_u2 is interaction term
>> gen se_fitted=  ///
>> sqrt(se_fix^2 + se_u1^2 + (se_u2*year)^2 + se_u3^2)
>> 
>> sum  se_fitted, det
>> 
>> sum fitted gsp*
>> corr gsp fitted
>> ****************************************************
>> 
>> On Dec 26, 2011, at 6:59 AM, Jennyfer Wolf wrote:
>> 
>> Dear Steve,
>> 
>> thank you for your email and please:
>> 
>> no need to apologize, it is absolutely great help that you offer!!
>> 
>> I am actually wondering at the moment about the following thing:
>> 
>> when I use "your" formula "gen se_fitted..." to calculate the standard errors and then the CIs for the predictions of the supply for a certain year the formula is sensitive to a change of scale (probably not the right words, excuse my English). I mean I use year as the dependant variable and when I take year as ranging from 1980 to 2015 I get a very different fitted standard error than when I set 1980 to Zero and therefore work on a scale from 0 to 35 (35==2015).
>> 
>> In my understanding this should not change the standard error for the predicted value of water supply??
>> I hope my question is understandable.
>> 
>> Very good Christmas!
>> 
>> Jennyfer
>> 
>> Am 25.12.2011 16:45, schrieb Steve Samuels:
>>> Jennyfer:
>>> 
>>> On second thought, there is no need to add the term for the standard error of the survey point estimates (se_logit^2), nor for the error terms (se_error^2).  So in my example, the equation should be:
>>> 
>>> ************************************************
>>> gen se_fitted=  ///
>>> sqrt(se_fix^2 +sd_region^2 +  sd_state^2 ///
>>> + (unemp*sd_state_u)^2 +unemp*2*cov)
>>> ************************************************
>>> 
>>> I apologize for the confusion.
>>> 
>>> Steve
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Correction:
>>> If q = 1 - p
>>> se_logit =  se_p/(p*q)
>>> se_logit^2 = (se_p/(p*q))^2
>>> 
>>> Steve
>>> 
>>> Jennyfer.
>>> 
>>> If there are not many different regions at your highest level, I doubt that you should be fitting each a random effect-in what sense are they random?; fixed effects for the highest levels would probably be better.
>>> 
>>> In addition to a covariance term (below), you will need to add a term for the  survey standard errors. If  se_p is the estimated survey standard error for a proportion p, then the squared standard error for the logit to add would be:  se_logit^2 = se_p^2/(p*(1-p)).
>>> 
>>> And yes, compute interval endpoints on the logit scale and convert back with the invlogit() function. And no, back-transformed standard errors (or SDs) need not look like those for the original data.
>>> 
>>> ***********************
>>> sysuse auto, clear
>>> gen lprice = log(price)
>>> mean price lprice
>>> di exp(0.0455814)
>>> ******************
>>> 
>>> You have an additional problem if the yearly estimates for a single country are correlated by virtue of the survey design. I would have considered adding a correlation structure to the residual s.
>>> 
>>> Here is how to add the covariance contribution to the standard error ifor  Stata -productivity- example. Note that the "atr" term is the hyperbolic arctangent, not the log, of the correlation.
>>> 
>>> ***********************************************
>>> webuse productivity, clear
>>> 
>>> xtmixed gsp private emp hwy water other unemp ///
>>>   || region: || state: unemp, cov(unstructured) reml
>>> matrix list e(b)  //names of terms
>>> scalar sd_err  = exp([lnsig_e]_cons )
>>> scalar sd_region = exp([lns1_1_1]_cons)
>>> scalar sd_state_u  = exp([lns2_1_1]_cons)
>>> scalar sd_state   = exp([lns2_1_2]_cons)
>>> 
>>> scalar atrho = [atr2_1_1_2]_cons
>>> scalar  rho =   (exp(2*atrho)-1)/(exp(2*atrho)+1)
>>> scalar  cov = rho*sd_state*sd_state_u
>>> 
>>> scalar dir // check these quantities against results
>>> 
>>> predict fitted, fitted
>>> predict se_fix,  stdp
>>> 
>>> gen se_fitted=  ///
>>> sqrt(se_fix^2 +sd_region^2 +  sd_state^2 ///
>>> + (unemp*sd_state_u)^2 +unemp*2*cov + sd_err^2)
>>> sum  se_fit*
>>> *****************************************************
>>> 
>>> 
>>> When you post in the future, please describe what you really did (Statalist FAQ Section 3.3). It will save a lot of time.  Just to warn you: I'll have only infrequent looks at Statalist for the next 10 days.
>>> 
>>> Steve
>>> [email protected]
>>> 
>>> 
>>> On Dec 22, 2011, at 6:11 AM, Jennyfer Wolf wrote:
>>> 
>>> Dear Steve, thanks again so much!
>>> 
>>> I have data from different surveys (survey point estimates) and I use
>>> a term for unstructured covariance in my model:
>>> 
>>> xtmixed wat_tot year_spline1*|| reg1: || reg2: || country2:year_cat,
>>> cov(unstructured).
>>> 
>>> so I will add an error term for this in my calculation of the fitted
>>> standard error:
>>> scalar sd_cov = exp([atr3_1_1_2]_cons)
>>> 
>>> Actually wat_tot (the dependent variable) is transformed with logit(),
>>> to restrict observations between 0 and 1 as I am modelling
>>> proportions.
>>> 
>>> After "predict A, fitted" I use the inlogit() command to get the
>>> backtransformed estimates. However, I had problems to backtransform
>>> the standard errors because when I compared standard errors received
>>> without any transformation of the dependent variable and
>>> backtransformed standard errors received from a transformed dependent
>>> variable, these values were very different. Would you know a solution
>>> to this (is it correct to also backtransform the fitted standard
>>> error) and also (of course) I would like to restrict my confidence
>>> intervals to values between 0 and 1.
>>> 
>>> One concern rests relating to the confidence intervals I calculate
>>> with the fitted standard errors:
>>> The model fits the individual country data very well and the
>>> predictions for the estimates and the fitted values seem very
>>> sensible, however, the standard error and the CIs calculated with the
>>> method you proposed for the fitted values are huge and actually take
>>> any sense away from making a prediction..
>>> Does that mean I need to mak my model simpler?
>>> 
>>> Thank you very much for the great support!
>>> Jennyfer
>>> 
>>> 
>>> 2011/12/22 Steve Samuels<[email protected]>:
>>>> Jennyfer,
>>>> 
>>>> I misunderstood your request: my solution was for an observation chosen at random and it incorrectly omitted the residual SD term, to boot.  Try this.
>>>> 
>>>> *******************************************
>>>> webuse productivity, clear
>>>> xtmixed gsp private emp hwy water other unemp ///
>>>>  || region: || state: unemp
>>>> 
>>>> matrix list e(b)  //names of terms
>>>> scalar sd_res = exp([lnsig_e]_cons)
>>>> 
>>>> predict se_fix,  stdp
>>>> predict se_region se_state_u se_state, reses
>>>> des se* //check against variable labels
>>>> gen se_fitted =  ///
>>>> sqrt(se_fix^2 +se_region^2 +  se_state^2 ///
>>>> + (unemp*se_state_u)^2 +sd_res^2)
>>>> *******************************************
>>>> 
>>>> I think that in your case the last three statements will be:
>>>> ******************************************************************
>>>> predict se_region1 se_region2  se_country_year se_country, rses
>>>> des se*  //check against variable labels
>>>> sqrt(se_fix^2 +se_region1^2 +  se_region2^2 + ///
>>>> + se_country^2 + (year_cat*se_country_year)^2 +sd_res^2)
>>>> ******************************************************************
>>>> 
>>>> Note that these statements assume that there is no correlation between the country and countryXyear random effects, which is what your model implies.  If there is such correlation (and you can test for it), then a covariance term must be added to the estimated standard error.
>>>> 
>>>> If you happen to have sample survey data, then be sure to read the section of Survey Data in the manual entry for -xtmixed-.
>>>> 
>>>> Steve
>>>> [email protected]
>>>> 
>>>> 
>>>> On Dec 21, 2011, at 10:01 AM, Jennyfer Wolf wrote:
>>>> 
>>>> Thank you very much for your answer. I've tried it in many different
>>>> variations but I guess there are problems with this approach:
>>>> 
>>>> 1. the squared standard deviations that we are adding up are
>>>> describing variation from the fixed effects but, when I understand
>>>> right, not the error of the model
>>>> 
>>>> 2. the CIs I need describe the uncertainty for the estimates for each
>>>> country so countries with more datapoints have a narrower CI and also
>>>> for future predictions the CI should get wider (which does not happen
>>>> with the approach you suggested.
>>>> 
>>>> I tried gllamm and used the "ci_marg_mu" command after "gllapred x, mu
>>>> marg fsample" but this does not fit to my individual country data and
>>>> still gives me the same CIs no matter how many survey points I have
>>>> per country.
>>>> 
>>>> Any more ideas on how to get confidence intervals after "xtmixed" and
>>>> "predict x, fitted" for the predicted values in multilevel modeling?
>>>> (Alternatively with gllamm)
>>>> Thank you very, very much.
>>>> 
>>>> Jennyfer
>>>> 
>>>> 
>>>> 2011/12/17 Steve Samuels<[email protected]>:
>>>>> Correction: I should not have included the SD for the error term, as it is not part of the fitted value.
>>>>> 
>>>>> 
>>>>> Here's an example more like yours, but with two levels, not three. I expect that you can take it
>>>>> from here
>>>>> *******************************************
>>>>> webuse productivity, clear
>>>>> xtmixed gsp private emp hwy water other unemp  || region: || state: unemp
>>>>> 
>>>>> matrix list e(b)  //names of terms
>>>>> scalar sd_region = exp([lns1_1_1]_cons)
>>>>> scalar sd_state_u  = exp([lns2_1_1]_cons)
>>>>> scalar sd_state   = exp([lns2_1_2]_cons)
>>>>> scalar dir // check these SDs against results
>>>>> 
>>>>> predict se_fix,  stdp
>>>>> 
>>>>> gen se_fitted =  ///
>>>>> sqrt(se_fix^2 +sd_region^2 +  sd_state^2 + (unemp*sd_state_u)^2)
>>>>> *******************************************
>>>>> 
>>>>> Steve
>>>>> [email protected]
>>>>> 
>>>>> On Dec 16, 2011, at 11:34 AM, Jennyfer Wolf wrote:
>>>>> 
>>>>> Dear Statalist,
>>>>> 
>>>>> sorry for asking the question again, but we are a bit desperate so it
>>>>> would be great if anybody has a solution for my question:
>>>>> 
>>>>> Is it possible to get standard errors for the fitted values of a
>>>>> multilevel-model (three levels, random slope and intercept) after
>>>>> 
>>>>> xtmixed dep_var indep_var || region1: || region2: || country :year_cat
>>>>> predict var, fitted
>>>>> 
>>>>> ?
>>>>> 
>>>>> We would like to present the estimated values with a confidence interval?
>>>>> If it is not possible to get the standard errors for the predicted
>>>>> values from Stata, is it possible to calculate these values from the
>>>>> Standard Errors from the individual estimates?
>>>>> 
>>>>> Thank you very very much.
>>>>> With kind regards,
>>>>> Jennyfer
>>>>> *
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/statalist/faq
>>>> *   http://www.ats.ucla.edu/stat/stata/
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>>> 
>>> 
>>> 
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>> 
>> 
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
References:
- Re: st: standard errors after xtmixed, predit.., fitted
  - From: Jennyfer Wolf <[email protected]>
- Re: st: standard errors after xtmixed, predit.., fitted
  - From: Steve Samuels <[email protected]>
- Re: st: standard errors after xtmixed, predit.., fitted
  - From: Jennyfer Wolf <[email protected]>
Prev by Date: RE: st: multiple CI/mean plots in one graphic
Next by Date: st: logit transformation of RHS variable
Previous by thread: Re: st: standard errors after xtmixed, predit.., fitted
Next by thread: st: Difference in values within the same variable
Index(es):
- Date
- Thread