FAQ:  Statistics 


Last updated:  02 April 2015 
How do I get the Euler–Mascheroni constant gamma = 0.57721 ... in Stata?
How do I calculate values of the beta function?
How are the chi-squared and F distributions related?
How do I bootstrap a vector of results?
How can I use Stata to calculate power by simulation?
How large should the bootstrapped samples be relative to the total number of cases in the dataset?
What are some of the small-sample adjustments to the sandwich estimate of variance?
Why does test sometimes produce chi-squared and other times F statistics?
How can I compute the Chow test statistic?
Should the p-value given with a paired t-test always be lower than the signrank?
Does Stata provide a test for trend?
Why do I get different results when running an ml procedure on Stata/SE and Stata/MP?
Why do I see different p-values, etc., when I change the base level for a factor in my regression?
How do I calculate row medians?
How can I get an R-squared value when a Stata command does not supply one?
How can I calculate percentile ranks?
How can I calculate plotting positions?
How do I estimate a nonlinear model using ml?
Why does bootstrap give a warning message for non-eclass commands?
How do you fit a model when the dependent variable is a proportion?
How can I take random samples from an existing dataset?
How can I get the variance–covariance matrix or coefficient vector?
What are some of the problems with stepwise regression?
Why doesn't summarize accept pweights?
What does summarize calculate when you use aweights?
Why do estimation commands sometimes omit variables?
How do I fit a linear regression with interval (inequality) constraints in Stata?
How do I fit a regression with interval constraints in Stata?
For two-stage least-squares (2SLS/IV/ivregress) estimates,
Why is the R-squared statistic not printed in some cases?
Why is the model sum of squares sometimes negative?
Why are the R-squared and model sum of squares sometimes negative?
What is the effect of specifying aweights with regress?
Why is the pseudo-R^{2} for tobit negative or greater than one?
Why do I get an error message when I try to run a repeated measures ANOVA?
How does the anova command handle collinearity?
How do I fit a bivariate probit model with partial observability and only one dependent variable?
How are the standard errors and confidence intervals computed for odds ratios (ORs) by logistic?
How do I obtain confidence intervals for predicted probabilities after logistic regression?
How can I get confidence intervals for predicted probabilities after probit?
How can I do logistic regression or multinomial logistic regression with grouped data?
Why do I get the message "outcome does not vary" when I perform a logistic or logit regression?
What is the difference between odds and odds ratio?
Why is there no intercept in the clogit model?
In clogit, why can't I use covariates that are constant within panel?
Why does clogit sometimes report a coefficient but missing value for the standard error, confidence interval, etc.?
How do I impose the restriction that rho is zero using the heckman command with full ml?
What is the difference between “endogeneity” and “sample selection bias”?
How are estimates of rho outside the bounds [-1,1] handled in the two-step Heckman estimator? (Technical FAQ)
Why are there so many formulas for the inverse of Mills' ratio?
What if I have censoring from above/below in my Heckman selection model?
Where can I find a description of the various time-series operators?
How do I obtain bootstrapped standard errors with panel data?
How can I generate a variable relating panel data to a reference panel?
How should I interpret changing quadchk results?
What is the difference between random-effects and population-averaged estimators?
Why don't the decomposed variances in xtsum add up?
Why does xtgls not report an R^{2} statistic?
How do I test for panel-level heteroskedasticity and autocorrelation?
What is the between estimator?
How does xtgls differ from regression clustered with robust standard errors?
Why does xtreg with the mle option produce different results from xtreg with only the re option?
How can there be an intercept in the fixed-effects model estimated by xtreg, fe?
What role does the time variable play in xtgls?
Why isn't the calculation of R^{2} the same for areg and xtreg, fe?
Why do I obtain different results when executing xttobit on the same data in different sessions?
Why does xtgee sometimes report that convergence was not achieved?
How can I calculate the pseudo R^{2} for xtprobit?
What are the divisors used in xtgee? (Technical FAQ)
Can Stata estimate a Rasch model?
How does Stata's implementation of GEE differ from other implementations?
What is the relationship between baseline hazard and baseline hazard contribution?
How can I obtain the standard error of the regression with streg?
How do I convert my spell-type data into a survival dataset?
How do I stset my spell-type data?
How do I analyze multiple failure-time data using Stata?
Why does stsum sometimes report missing values for the percentiles of survival time?
Why can't a subject die at time 0?
Why can't a subject enter and die at the same time in the Cox model?
What is the difference between sts list and ltable?
How is the number of observations computed for subpopulation estimation?
How do I obtain percentiles for survey data?
Is there a way in Stata to do stepwise regression with svy: logit or any of the svy commands?
Do the svy commands handle zero weights differently than non-svy commands do?
Are the estimates produced by probit and logit with the vce(cluster clustvar) option true maximum likelihood estimates?
Is there a difference between the estimates produced by svy: probit, with the psu variable specified in the svyset command, and probit, vce(cluster clustvar) (and, similarly, between svy: logit, with the psu variable specified in svyset, and logit, vce(cluster clustvar))?
Why doesn't summarize accept pweights? What does summarize calculate when you use aweights?
Why does Fisher’s exact test disagree with the confidence interval for the odds ratio?
Can I do n:1 matching with the mcc command?
How do I estimate recursive systems using a subset of available instruments?
What meta-analysis features are available in Stata?
How can I combine results other than coefficients in e(b) with multiply imputed data?
How can I account for clustering when creating imputations with mi impute?
How can I estimate correlations and their level of significance with survey data?
How do I obtain the standard error of the predicted probability with logistic regression analysis?
Why do Stata’s xtgee standard errors differ from those reported by SAS’s PROC GENMOD?
I am using a model with interactions. How can I obtain marginal effects and their standard errors?
Can I use mfx on survey data with unweighted means?
I am using mfx after an estimation that has an offset. How does mfx take that into account?
What is the difference between the linear and nonlinear methods that mfx uses?
How do I calculate least square means in Stata?
What does “completely determined” mean in my logistic regression output?
How can I produce adjusted means after ANOVA?
Why does stcox sometimes produce missing standard errors?
What are the differences between predict and adjust?
How can I obtain the correlation matrix as a Stata matrix?
Why does my mlogit take so long to converge?
How can I get robust standard errors for tobit?
Is there any difference between using tsset and iis and tis before xt commands?
Why do I get an "unbalanced data" error message when I run nlogit?
How can I obtain the correlation between the factors after an oblique rotation?
Is it possible to analyze survey data with two or more levels of clustering with the svy commands?
How can I calculate moving averages for panel data?
Does Stata support any multiple comparison tests following two-way ANOVA?
How do I get the correct variance–covariance matrix from the bs routine?
How can I estimate stepwise Cox models?
How can I estimate a fixed-effects regression with instrumental variables?
Why were the timings in the American Statistician (August 1997) review of the svy commands so slow?
How do I estimate a Cox model with a continuously time-varying parameter?
What are completely determined panels?
What is the difference between biprobit/heckprob and the STB commands?
Where are the Wald tests for zinb that appear in the manual?
Why do Stata's cc and cci commands report different confidence intervals than Epi Info?
How can I get one-tailed probabilities for the Student's t distribution?
How can I simulate random multivariate normal observations from a given correlation matrix?
Why does Weibull with entry and exit times produce different results from Weibull with duration?
How does Stata's xtgee handle singletons with exchangeable correlation?
Can Stata's ml routine converge and produce answers that look good even when it shouldn't?
Why don't the old huber results match the new robust versions?
How can I get predicted probabilities for different x values after probit?
How can I get predicted probabilities after svylogit, svyprobt, svymlog, svyolog, or svyoprob?
What is the pseudo R^{2} in the weibull output?
How can I get the Mills' ratios for my heckman model?
How do I test endogeneity?
How do I perform a Durbin–Wu–Hausman test?