# 马尔科夫链

$$\pi_{t+n} = \pi_{t+n-1} P=\pi_{t+n-2} P^2=\cdots=\pi_{t} P^n.$$
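这一恒等式可以数值验证：逐步演化$$n$$步与一次性乘以$$P^n$$结果相同（Python/NumPy 示意，转移矩阵与初始分布均为假设的数值）：

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])      # 假设的转移概率矩阵（行和为 1）
pi_t = np.array([0.3, 0.7])     # 假设的初始分布 pi_t

# 逐步演化 n 步：pi_{t+1} = pi_t P
n = 5
pi_step = pi_t.copy()
for _ in range(n):
    pi_step = pi_step @ P

# 一次性计算 pi_t P^n
pi_power = pi_t @ np.linalg.matrix_power(P, n)

print(pi_step, pi_power)        # 二者相同
```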

平稳分布满足$$\pi P = \pi$$，即$$\pi (I-P) = 0$$，所以$$\pi$$为矩阵$$P$$特征值1对应的左特征向量（即$$P'$$的特征向量）。
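据此可以直接用特征分解求平稳分布（Python/NumPy 示意，矩阵$$P$$为假设数值）：

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])       # 假设的转移概率矩阵

# pi P = pi，即 pi 是 P' 特征值 1 对应的特征向量
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi = pi / pi.sum()               # 归一化为概率分布

print(pi)                        # 满足 pi @ P == pi
```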

# 马尔科夫链

记$$n$$步转移概率矩阵为$$P^{(n)} = P^n$$。如果存在$$n$$使得$$P^{(n)}_{ij}>0$$，称$$j$$由$$i$$可达（accessible）。
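可达性可以通过检查幂矩阵$$P^n$$的元素是否为正来判断（Python/NumPy 示意，链的结构为假设）：

```python
import numpy as np

# 假设的 3 状态链：1 -> 2 -> 3，且状态 3 为吸收态
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]])

P2 = np.linalg.matrix_power(P, 2)   # 两步转移概率矩阵 P^(2)
print(P2[0, 2] > 0)                  # 状态 3 由状态 1 两步可达
print(P2[2, 0] > 0)                  # 吸收态 3 无法到达状态 1
```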

# MCMC

MCMC: Gibbs抽样(Geman and Geman, 1984), Metropolis-Hastings抽样(Metropolis et al., 1953; Hastings, 1970).

MH抽样：设目标分布为$$f(x)$$。

1. 给定$$x_t$$，定义工具分布$$q$$，从$$q(y|x_t)$$抽取随机数$$y_t$$。
2. 定义接受概率(acceptance probability)：

$$\rho(x_t, y_t)=\min \left[ \frac{f(y_t)}{f(x_t)} \frac{q(x_t|y_t)}{q(y_t|x_t)}, 1 \right]$$

$$x_{t+1} = \begin{cases} y_{t} & \text{以概率 } \rho(x_t, y_t) \\ x_{t} & \text{以概率 } 1- \rho(x_t, y_t) \end{cases}$$

对称情形：$$\rho(x_t, y_t)=\min \left[ \frac{f(y_t)}{f(x_t)}, 1 \right]$$

独立情形：$$\rho(x_t, y_t)=\min \left[ \frac{f(y_t)}{f(x_t)} \frac{q(x_t)}{q(y_t)}, 1 \right]$$

# MH抽样

target: Beta(2.7, 6.3); proposal: uniform(0,1).

```
. clear
. set obs 5000
Number of observations (_N) was 0, now 5,000.
. gen x=.
(5,000 missing values generated)
. mata:
───────────────── mata (type end to exit) ─────────────────
: n = 5000
: x = runiform(n,1)
: for (i=2; i<=n; i++) {
>     y = runiform(1,1)
>     rho = betaden(2.7, 6.3, y)/betaden(2.7, 6.3, x[i-1])
>     x[i] = runiform(1,1)<rho ? y : x[i-1]
> }
: st_store((1,n), "x", x)
: end
───────────────────────────────────────────────────────────
. twoway (function y=betaden(2.7,6.3,x), range(0 1)) (kdensity x, lp(dash))
```

# MH抽样

target: Cauchy; proposal: t(1).

```
. clear
. set obs 2000
Number of observations (_N) was 0, now 2,000.
. gen x=.
(2,000 missing values generated)
. mata:
───────────────── mata (type end to exit) ─────────────────
: n = 5000                           // burn-in sample: 3000
: x = J(n,1,0)                       // initializing
: for (i=2; i<=n; i++) {
>     y = rt(1,1,1)                  // proposal density: t(1)
>     denom = cauchyden(0,1,x[i-1])*tden(1,y)
>     rho = cauchyden(0,1,y)*tden(1,x[i-1])/denom
>     x[i] = runiform(1,1)<rho ? y : x[i-1]
> }
: st_store((1,2000), "x", x[3001..n])
: end
───────────────────────────────────────────────────────────
. twoway (function y=1/(c(pi)*(1+x^2)), range(-10 10)) ///
>        (kdensity x if abs(x)<10, lp(dash))
```

# Random-walk MH抽样

给定$$x_t$$，抽取随机数$$y_t = x_t + u_t, \ u_t \sim q(u)$$。

target: Beta(2.7, 6.3); proposal: 随机游走，$$u_t \sim N(0,1)$$。

```
. clear
. set obs 5000
Number of observations (_N) was 0, now 5,000.
. gen x=.
(5,000 missing values generated)
. mata:
───────────────── mata (type end to exit) ─────────────────
: n = 5000
: x = runiform(n,1)
: for (i=2; i<=n; i++) {
>     y = x[i-1] + rnormal(1,1,0,1)
>     rho = betaden(2.7, 6.3, y)/betaden(2.7, 6.3, x[i-1])
>     x[i] = runiform(1,1)<rho ? y : x[i-1]
> }
: st_store((1,n), "x", x)
: end
───────────────────────────────────────────────────────────
. twoway (function y=betaden(2.7,6.3,x), range(0 1)) (kdensity x, lp(dash)), legend(off) xtitle("")
```

# Gibbs抽样

将随机变量分为两组$$(x_t, y_t)$$，联合密度为$$f(x,y)$$。

1. 抽取$$x_0$$。
2. 对$$t=1,2,\cdots$$，抽取$$y_{t+1} \sim f(y|x_t)$$，$$x_{t+1} \sim f(x|y_{t+1})$$。

例：二元正态分布。

1. 抽取$$x_0=0$$。
2. 对$$t=1,2,\cdots$$，抽取

$$\begin{eqnarray*} y_{t+1} | x_t & \sim & N \left( \frac{\sigma_1}{\sigma_2} \rho x_t,(1-\rho^2) \sigma_1^2 \right) \\ x_{t+1} | y_{t+1} & \sim & N \left( \frac{\sigma_2}{\sigma_1} \rho y_{t+1},(1-\rho^2) \sigma_2^2 \right) \end{eqnarray*}$$

可以直接扩展到随机变量分为多组的情况。

# Gibbs抽样

对协方差矩阵的抽样一般采用Gibbs抽样，inverse Wishart为最广泛采用的先验分布。

# 适应性MCMC

设$$T_0$$为burn-in样本量，$$T$$为MCMC的样本量，则整个MCMC迭代的次数为$$T_{total}=T_0+(T-1)\times thinning + 1$$。

MH抽样生成试验值的公式为$$\theta_{*} = \theta_{t-1} + e, \ e \sim N(0, \rho_k^2 \Sigma_k)$$。所谓适应性MH抽样，是指每隔一定的区间调整$$\rho_k$$和$$\Sigma_k$$，以实现最优的目标接受率(TAR)。Gelman, Gilks and Roberts (1997)证明，单一参数的TAR为0.44，多个参数的TAR为0.234。

设Alen为每个适应性区间的长度（adaptation(every())）。可以设置最小迭代次数（adaptation(miniter())）和最大迭代次数（adaptation(maxiter())）。

# 适应性MH算法的步骤

初始化：设定初始值$$\theta_0$$，$$\rho_0=2.38/\sqrt{d}$$，其中$$d$$为参数个数；$$\Sigma_0 = I$$，或者由covariance()选项自行设定。

1. 生成试验值：$$\theta_{*} = \theta_{t-1} + e, \ e \sim N(0, \rho_k^2 \Sigma_k)$$
2. 接受概率为

$$\alpha(\theta_{*}|\theta_{t-1}) = \min \left\{ \frac{p(\theta_{*}|y)}{p(\theta_{t-1}|y)}, 1 \right\}.$$

其中，$$p(\theta|y) \propto f(y|\theta) p(\theta)$$为$$\theta$$的后验分布。

$$\rho_k$$与$$\Sigma_k$$的更新公式为：

$$\rho_k = \rho_{k-1} \exp\left( \beta_k [\Phi^{-1}(AR_k/2) - \Phi^{-1}(TAR/2) ] \right)$$

其中，

$$AR_k = (1-\alpha) AR_{k-1} + \alpha \cdot AP$$

# 适应性MH算法的步骤

AP为接受概率$$\alpha(\theta_{*}|\theta_{t-1})$$的均值。$$\alpha$$的默认值为0.75，或者由adaptation(alpha())自行设定。

$$\Sigma_k = (1-\beta_k) \Sigma_{k-1} + \beta_k \hat{\Sigma}_k, \quad \beta_k = \beta_0/k^{\gamma}.$$

其中，$$\Sigma_0 = I$$或者由covariance()选项自行设定；$$\hat{\Sigma}_k$$为MCMC样本的协方差。$$\beta_0$$的默认值为0.8，或者由adaptation(beta())自行设定；$$\gamma$$的默认值为0，或者由adaptation(gamma())自行设定。

# 内容

引言
马尔科夫链蒙特卡洛模拟（MCMC）
计量模型的贝叶斯估计与检验
特定模型的似然函数或后验分布
贝叶斯预测

# 贝叶斯定理

$$\pi(\theta|y) = \frac{f(y|\theta) \pi(\theta)}{f(y)} \propto f(y|\theta) \pi(\theta).$$

其中，$$f(y|\theta) \pi(\theta)$$叫做贝叶斯核(kernel)。

# 贝叶斯估计

如果先验分布$$\pi(\theta)$$对后验分布的影响非常小，称其是无信息的(noninformative, vague, diffuse, flat)。

如果$$\pi(\theta)$$的积分不等于1，称其是不恰当的(improper)。

Jeffreys先验：$$\pi(\theta) \propto |I(\theta)|^{1/2}$$，其中$$I(\theta)$$为Fisher信息矩阵，

$$I(\theta) = -E\left[ \frac{\partial^2 \log p(y|\theta)}{\partial \theta^2} \right].$$

# 线性模型的贝叶斯估计

模型：$$y=x\beta + u$$

似然函数：$$y_i \sim \text{Normal}(x_i \beta, \sigma^2)$$

先验：

$$\begin{eqnarray*} \beta & \sim & \text{Multivariate-Normal}(\beta_0, V_0) \\ \sigma^2 & \sim & \text{Inverse-Gamma}(\alpha_0/2, \delta_0/2) \end{eqnarray*}$$

# note: inverse-gamma distribution

一般的逆Gamma分布InverseGamma[$$\alpha, \beta, \gamma, \mu$$]的定义域为$$(\mu, \infty)$$，其中$$\mu$$为位置参数，$$\alpha, \gamma$$为形状参数，$$\beta$$为刻度参数。

InverseGamma[$$\alpha, \beta$$]表示InverseGamma[$$\alpha, \beta, 1, 0$$]，其pdf为

$$pdf(x) = \frac{\exp(-\beta/x) (\beta/x)^{\alpha}}{x \Gamma(\alpha)}$$

$$E(x) = \frac{\beta}{\alpha-1} \ (\alpha>1); \quad Var(x) = \frac{\beta^2}{(\alpha-2)(\alpha-1)^2} \ (\alpha>2).$$

如果$$x\sim \text{Gamma}(\alpha, \beta)$$（$$\beta$$为刻度参数），那么$$1/x \sim \text{Inverse-Gamma}(\alpha, 1/\beta)$$.

# 线性模型

条件后验分布：

$$\beta | \sigma^2,y \sim \text{Multivariate-Normal} \left( \beta_1, V_1 \right)$$

其中，

$$V_1 = \left[ \sigma^{-2} X'X + V_0^{-1} \right]^{-1}, \quad \beta_1 = V_1 \left[ \sigma^{-2} X'y + V_0^{-1} \beta_0 \right]$$

$$\sigma^2 | \beta,y \sim \text{Inverse-Gamma}\left( (\alpha_0+n)/2, \ [\delta_0+(y-x\beta)'(y-x\beta)]/2 \right).$$

# 线性模型Gibbs抽样

$$\beta$$ are sampled in one block (Greenberg, 2013).

1. 设定初始值$$\sigma^{2(0)}$$.
2. 第$$t$$次迭代，从下面的分布抽样：

$$\begin{eqnarray*} \beta^{(t)} | \sigma^2,y & \sim & \text{MNormal} \left( \beta_1, V_1 \right) \\ \sigma^2 | \beta,y & \sim & \text{IG}\left( (\alpha_0+n)/2, \ [\delta_0+(y-x\beta^{(t)})'(y-x\beta^{(t)})]/2 \right) \end{eqnarray*}$$

# Stata指令

bayes [, bayesopts] : estimation_command [, estopts]

| bayesopts | note |
| --- | --- |
| gibbs | 只适用于regress, mvreg等模型 |
| normalprior(#) | 回归系数的正态先验分布的标准差，默认值为100 |
| igammaprior(# #) | 回归模型方差的先验分布：inverse-gamma分布(shape, scale)，默认值为igammaprior(0.01 0.01) |
| iwishartprior(#) | 随机效应协方差矩阵的先验分布：inverse-Wishart分布(degrees of freedom) |
| prior(priorspec) | 自己设定先验分布 |

其中，priorspec的形式为：{[eqname:]param[, matrix]}, priordist。比如，

• prior(educ, normal(0, 9))
• prior(age, uniform(-3,3))
• prior({dep:}, mvnormal(3, mvec, vmat))

# 例: MH抽样

```
. use mroz, clear
. bayes, normalprior(10) igammaprior(0.01 0.01) : regress lwage educ age exper

Burn-in ...
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
  lwage ~ regress(xb_lwage,{sigma2})

Priors:
  {lwage:educ age exper _cons} ~ normal(0,100)                            (1)
                      {sigma2} ~ igamma(.01,.01)
──────────────────────────────────────────────────────────────────────────────
(1) Parameters are elements of the linear form xb_lwage.
```
```
Bayesian linear regression                       MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        428
                                                 Acceptance rate  =      .3342
                                                 Efficiency:  min =     .06385
                                                              avg =      .0982
Log marginal-likelihood = -467.9285                           max =      .2099

─────────────┬────────────────────────────────────────────────────────────────
             │                                                Equal-tailed
             │      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
lwage        │
        educ │  .1087602   .0140838   .000557   .1092518    .081044   .1356251
         age │ -.0011846   .0048595   .000184  -.0011781  -.0108816    .008066
       exper │  .0160051   .0044727   .000167   .0162019   .0069124   .0243769
       _cons │ -.3458634   .2641602   .009615  -.3501628   -.885464   .1888002
─────────────┼────────────────────────────────────────────────────────────────
      sigma2 │  .4509639   .0315607   .000689   .4500597    .394099   .5204984
─────────────┴────────────────────────────────────────────────────────────────
Note: Default priors are used for some model parameters.
```

# 例: Gibbs抽样

```
. use mroz, clear
. bayes, gibbs normalprior(10) igammaprior(0.01 0.01) : regress lwage educ age exper

Burn-in ...
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
  lwage ~ normal(xb_lwage,{sigma2})

Priors:
  {lwage:educ age exper _cons} ~ normal(0,100)                            (1)
                      {sigma2} ~ igamma(.01,.01)
──────────────────────────────────────────────────────────────────────────────
(1) Parameters are elements of the linear form xb_lwage.

Bayesian linear regression                       MCMC iterations  =     12,500
Gibbs sampling                                   Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        428
                                                 Acceptance rate  =          1
                                                 Efficiency:  min =      .9761
                                                              avg =      .9952
Log marginal-likelihood = -467.84857                          max =          1

─────────────┬────────────────────────────────────────────────────────────────
             │                                                Equal-tailed
             │      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
lwage        │
        educ │  .1092317   .0142794   .000143   .1093622   .0815355   .1376194
         age │ -.0013859   .0048177   .000048  -.0013814  -.0109141   .0080745
       exper │  .0163677   .0046453   .000046   .0163429   .0072553   .0253992
       _cons │ -.3475648   .2653856   .002654  -.3446139  -.8709864   .1654805
─────────────┼────────────────────────────────────────────────────────────────
      sigma2 │  .4504266   .0311056   .000315   .4491257    .393057   .5144107
─────────────┴────────────────────────────────────────────────────────────────
Note: Default priors are used for some model parameters.
```

# 内容

引言
马尔科夫链蒙特卡洛模拟（MCMC）
计量模型的贝叶斯估计与检验
• 诊断与检验
特定模型的似然函数或后验分布
贝叶斯预测

# MCMC诊断

收敛性（是否达到平稳分布）：

• 踪迹图（围绕常数均值波动）
• 核密度图（模拟样本分为前后两段，比较其分布，或者做等均值等方差检验）
• 自相关
• Gelman-Rubin (shrink factor)统计量：用多条马尔科夫链进行抽样，那么链之间的平均差异与链内的平均差异应该大致相等。二者的比例称作收缩因子（shrink factor）。作为经验规则，收缩因子应低于1.1。

Efficiency：大于0.1，视作良好；低于0.01，需严重关注。

# MCMC诊断

bayesgraph type which

其中，type包括：diag, trace, histogram, kdensity, ac, cusum, matrix。which包括：_all, {[*eqname*:]*param*}。

# MCMC诊断:综合

. bayesgraph diag {lwage:educ}

# MCMC诊断: 自相关图

. bayesgraph ac {lwage:}, byparm

# 贝叶斯估计的描述统计

所有参数：

```
. bayesstats ic

Bayesian information criteria
─────────────┬────────────────────────────────
             │       DIC    log(ML)    log(BF)
─────────────┼────────────────────────────────
      active │  877.4558  -467.8486          .
─────────────┴────────────────────────────────
Note: Marginal likelihood (ML) is computed using
      Laplace–Metropolis approximation.

. bayesstats ess

Efficiency summaries    MCMC sample size =    10,000
                        Efficiency:  min =     .9761
                                     avg =     .9952
                                     max =         1
─────────────┬──────────────────────────────────────
             │        ESS   Corr. time    Efficiency
─────────────┼──────────────────────────────────────
lwage        │
        educ │   10000.00         1.00        1.0000
         age │   10000.00         1.00        1.0000
       exper │   10000.00         1.00        1.0000
       _cons │   10000.00         1.00        1.0000
─────────────┼──────────────────────────────────────
      sigma2 │    9761.25         1.02        0.9761
─────────────┴──────────────────────────────────────
```

# 贝叶斯推断

所有参数：

• bayesgraph trace {lwage:}, byparm
• bayesgraph kdensity {lwage:}, byparm
• bayesgraph hist {lwage:}, byparm
• bayesgraph ac {lwage:}, byparm

单个参数：

• bayesgraph trace {lwage:educ}
• bayesgraph kdensity {lwage:educ}
• bayesgraph hist {lwage:educ}
• bayesgraph ac {lwage:educ}

# 模型选择

bayesstats ic: 三个统计指标。

• 边际似然函数
• dic（应选择dic最低的模型）
• bf（贝叶斯因子 = 两个模型（相同样本）的边际似然函数之比）

# 模型选择

贝叶斯因子 (Jeffreys, 1961)：

$$BF_{12} = \frac{P(y|M_1)}{P(y|M_2)} = \frac{m_1(y)}{m_2(y)}.$$

posterior odds:

$$\frac{P(M_1|y)}{P(M_2|y)} = \frac{P(y|M_1)P(M_1)}{P(y|M_2)P(M_2)} = BF_{12} \frac{P(M_1)}{P(M_2)}.$$

# 模型选择

贝叶斯因子的解释 (Jeffreys, 1961)：

| $$\log_{10}(bf)$$ | bf | 结论 |
| --- | --- | --- |
| 0 - 1/2 | 1 - 3.2 | 不值一提 (not worth more than a bare mention) |
| 1/2 - 1 | 3.2 - 10 | 较强 (substantial) |
| 1 - 2 | 10 - 100 | 强 (strong) |
| >2 | > 100 | 一致 (decisive) |

Kass and Raftery (1995)：

| $$2\log_{e}(bf)$$ | bf | 结论 |
| --- | --- | --- |
| 0 - 2 | 1 - 3 | 不值一提 (not worth more than a bare mention) |
| 2 - 6 | 3 - 20 | positive |
| 6 - 10 | 20 - 150 | strong |
| >10 | > 150 | very strong |

# 模型比较

bayestest model ...
可以用于比较不同的先验、不同的后验分布、不同的回归函数、不同的解释变量等。

条件：不同模型使用完全相同的样本；具有恰当的后验分布；MCMC收敛。

对参数落在某个区间的检验：

bayestest interval ({y:x1},lower(.9) upper(1.02)) ({y:x2},lower(-1.1) upper(-.8)), joint

# 先验分布的设定

Stata对不同模型的不同类型的参数设置了不同的先验分布。

回归系数的先验分布默认为独立的正态分布$$N(0, \sigma_{prior}^2)$$，其中$$\sigma_{prior}^2=10{,}000$$。

• 在多数情况下，$$N(0, 10{,}000)$$是无信息的（uninformative），但如果估计量的数值非常大（比如2000），那么先验标准差100仍然含有较强的信息（informative）。这时，需要自己通过normalprior()设定先验分布的方差。

正值的参数（比如方差）默认先验分布为Inverse-Gamma($$\alpha, \beta$$)，默认值为$$\alpha=0.01$$, $$\beta=0.01$$。

排序选择模型的切点：先验分布为Uniform(0,1)。

多水平模型：随机效应协方差矩阵

• independent, identity: Inverse-Gamma($$\alpha, \beta$$)
• unstructured: Inverse-Wishart(d+1, I(d))，其中d为协方差矩阵的维度，d+1为自由度。

# 内容

引言
马尔科夫链蒙特卡洛模拟（MCMC）
计量模型的贝叶斯估计与检验
• Stata
特定模型的似然函数或后验分布
贝叶斯预测

# bayesmh的用法

bayesmh可以用于估计单方程线性模型、单方程非线性模型、多方程线性模型、多方程非线性模型、多水平模型、概率分布等。

bayesmh depvar [indepvarspec] [if] [in] [weight], likelihood(modelspec) prior(priorspec) llevaluator(log-likelihood) evaluator(log-postdist) [options]

三种用法：

• likelihood() + prior(): 内置似然函数 + 内置先验分布
• llevaluator() + prior(): 自定义的对数似然函数 + 内置先验分布
• evaluator(): 自定义的对数后验分布

# likelihood()

似然函数即概率密度函数，不同分布涉及到的参数也不同。比如，正态分布有均值和方差两个参数；泊松分布只有一个参数，既是均值也是方差。

bayesmh中，分布的均值参数通过模型来设定，分布的形式和其它参数则通过likelihood()来设定。比如，

• bayesmh y x1 x2, likelihood(normal({var}))，表示$$y\sim N(b_0+b_1 x_1 + b_2 x_2, var)$$
• bayesmh y x1 x2, likelihood(poisson)，表示$$y\sim \text{Poisson}(\exp(b_0+b_1 x_1 + b_2 x_2))$$
• bayesmh y x1 x2, likelihood(probit)，表示$$y\sim \text{Bernoulli}(\Phi(b_0+b_1 x_1 + b_2 x_2))$$

# likelihood()

| likelihood() | 似然函数 | 分布形式 |
| --- | --- | --- |
| normal(var) | 正态分布 | $$y\sim N(xb, var)$$ |
| lognormal(var) | 对数正态分布 | $$\ln(y)\sim N(xb, var)$$ |
| exp | 指数分布 | $$y \sim \exp(xb)$$ |
| probit | 贝努里分布 | $$P(y=1)=\Phi(xb)$$ |
| logit | 贝努里分布 | $$P(y=1)=\Lambda(xb)$$ |
| oprobit | 多项式分布，概率为排序probit | $$P(y=j)=P(c_{j-1}<y^*<c_j)$$ |
| ologit | 多项式分布，概率为排序logit | $$P(y=j)=P(c_{j-1}<y^*<c_j)$$ |
| poisson | 泊松分布 | $$y\sim \text{Poisson}(xb)$$ |
| binomial(n) | 二项分布 | $$y\sim \text{Binomial}(n, xb)$$ |
| stexp | 指数生存函数 | |
| stgamma(lns) | Gamma生存函数（对数标度参数lns） | |
| stloglogistic(lns) | log-logistic生存函数（对数标度参数lns） | |
| stlognormal(lnstd) | 对数正态生存函数（对数标准差lnstd） | |
| stweibull(lnp) | Weibull生存函数（对数形状参数lnp） | |
| llf() | 对数似然函数的表达式 | |

# 先验分布(一元)

| priordist | note |
| --- | --- |
| normal(mu, var) | normal with mean mu and variance var |
| t(mu, s2, df) | location-scale t with mean mu, squared scale s2, and degrees of freedom df |
| lognormal(mu, var) | log-normal(mu, var) |
| uniform(a,b) | uniform on (a,b) |
| gamma(a,b) | gamma with shape a and scale b |
| igamma(a,b) | inverse-gamma with shape a and scale b |
| exponential(b) | exponential with scale b |
| beta(a,b) | beta with shape parameters a and b |
| laplace(mu,b) | Laplace with location mu and scale b |
| cauchy(loc,b) | Cauchy with location loc and scale b |
| chi2(df) | chi-squared with df degrees of freedom |
| pareto(a,b) | Pareto with shape a and scale b |
| jeffreys | Jeffreys prior |

# 先验分布(多元)

| priordist | note |
| --- | --- |
| mvnormal(d, mu, var) | d-dimensional with mean (mu) and variance (var) |
| mvnormal0(d, var) | d-dimensional with zero mean and variance (var) |
| zellnersg(d, g, mu, {var}) | Zellner-g prior of dimension d with g dof, mean mu and variance {var} |
| zellnersg0(d, g, mu, {var}) | Zellner-g prior of dimension d with g dof, zero mean and variance {var} |
| dirichlet(a1, a2, …, ad) | Dirichlet (multivariate beta) of dimension d with shape (a1,a2,…,ad) |
| wishart(d, df, V) | Wishart of dimension d with DOF df and scale matrix V |
| iwishart(d, df, V) | inverse-Wishart of dimension d with DOF df and scale matrix V |
| jeffreys(d) | Jeffreys prior of d-variate normal dist. |

# 先验分布 (离散)

| priordist | note |
| --- | --- |
| bernoulli(p) | 0 or 1 (with probability p) |
| geometric(p) | 0, 1, 2, … with geometric(p) |
| index(p1,…,pk) | 1, 2, …, k with prob. (p1, p2, …, pk) |
| poisson(mu) | 0, 1, 2, … with Poisson(mu) |

# 先验分布 (其它形式)

| priordist | note |
| --- | --- |
| flat | |
| density(f) | generic density f |
| logdensity(logf) | generic log-density logf |

note: 扁平(flat)先验要谨慎使用。flat先验是不恰当的(improper)，即先验分布的积分并不为1（事实上为无穷大），可能导致不恰当的后验分布而使得无法进行贝叶斯推断。

Jeffreys先验：均值参数的Jeffreys先验即扁平先验：$$\pi(\mu)=1$$；标准差参数的Jeffreys先验为倒数先验(inverse prior)：$$\pi(\sigma)=1/\sigma$$.

# likelihood() + prior()

```
. use mroz, clear
. bayesmh lwage educ age exper, likelihood(normal({var})) prior({lwage:}, flat) prior({var}, igamma(0.01, 0.01))

Burn-in ...
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
  lwage ~ normal(xb_lwage,{var})

Priors:
  {lwage:educ age exper _cons} ~ 1 (flat)                                 (1)
                         {var} ~ igamma(0.01,0.01)
──────────────────────────────────────────────────────────────────────────────
(1) Parameters are elements of the linear form xb_lwage.

Bayesian normal regression                       MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        428
                                                 Acceptance rate  =      .2752
                                                 Efficiency:  min =     .03065
                                                              avg =     .04804
Log marginal-likelihood = -455.02345                          max =     .07165

─────────────┬────────────────────────────────────────────────────────────────
             │                                                Equal-tailed
             │      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
lwage        │
        educ │  .1093384   .0144143   .000675   .1094341   .0807039   .1364698
         age │  -.001229   .0048322   .000221  -.0012917  -.0106893   .0081376
       exper │  .0163394   .0045813   .000171   .0162681   .0074164   .0253667
       _cons │ -.3510312   .2650894   .012581  -.3514681  -.8570201   .1873868
─────────────┼────────────────────────────────────────────────────────────────
         var │  .4498116    .030934   .001767   .4483777   .3933691   .5150745
─────────────┴────────────────────────────────────────────────────────────────
Note: Adaptation tolerance is not met.
```

# likelihood() + prior(): informative先验
```
. bayesmh lwage educ age exper, likelihood(normal({var})) prior({lwage:}, normal(0,10000)) prior({var}, igamma(0.01, 0.01))

Burn-in ...
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
  lwage ~ normal(xb_lwage,{var})

Priors:
  {lwage:educ age exper _cons} ~ normal(0,10000)                          (1)
                         {var} ~ igamma(0.01,0.01)
──────────────────────────────────────────────────────────────────────────────
(1) Parameters are elements of the linear form xb_lwage.

Bayesian normal regression                       MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        428
                                                 Acceptance rate  =       .253
                                                 Efficiency:  min =     .03411
                                                              avg =     .04046
Log marginal-likelihood = -477.13055                          max =     .05554

─────────────┬────────────────────────────────────────────────────────────────
             │                                                Equal-tailed
             │      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
lwage        │
        educ │  .1086604   .0143401   .000608   .1092476   .0799753   .1362948
         age │ -.0018785   .0048822   .000256  -.0019494  -.0110175   .0084529
       exper │  .0169774   .0044795   .000243   .0168361    .008705   .0264318
       _cons │ -.3258351   .2698549   .013326  -.3212487   -.857303   .1921515
─────────────┼────────────────────────────────────────────────────────────────
         var │   .450493   .0319438   .001698   .4494177   .3939383   .5191762
─────────────┴────────────────────────────────────────────────────────────────
```

# likelihood() + prior()

```
. bayesmh inlf age educ kids*, likelihood(logit) prior({inlf:}, flat)

Burn-in ...
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
  inlf ~ logit(xb_inlf)

Prior:
  {inlf:age educ kidslt6 kidsge6 _cons} ~ 1 (flat)                        (1)
──────────────────────────────────────────────────────────────────────────────
(1) Parameters are elements of the linear form xb_inlf.

Bayesian logistic regression                     MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        753
                                                 Acceptance rate  =      .1607
                                                 Efficiency:  min =     .02098
                                                              avg =     .04698
Log marginal-likelihood = -475.37675                          max =     .05996

─────────────┬────────────────────────────────────────────────────────────────
             │                                                Equal-tailed
        inlf │      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
         age │ -.0647113   .0128096   .000581  -.0650733   -.089756  -.0393152
        educ │  .2020967   .0383368   .001757   .2038444   .1280694   .2722752
     kidslt6 │ -1.496497   .1902067   .013131  -1.483813  -1.880425  -1.124643
     kidsge6 │ -.0939495   .0671387   .002742  -.0927716  -.2155951   .0316578
       _cons │  1.041166   .7975184   .033216   1.062803  -.5794387   2.618541
─────────────┴────────────────────────────────────────────────────────────────
```

# likelihood() + prior(): 分别设定参数的先验

```
. bayesmh lwage educ age exper, likelihood(normal({var})) prior({lwage:educ}, normal(0.1, 0.4)) prior({lwage:age}, normal(-0.01, 1)) prior({lwage:exper}, normal(0.1, 1)) prior({lwage:_cons}, normal(2, 10)) prior({var}, igamma(0.01, 0.01))

Burn-in ...
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
  lwage ~ normal(xb_lwage,{var})

Priors:
   {lwage:educ} ~ normal(0.1,0.4)                                         (1)
    {lwage:age} ~ normal(-0.01,1)                                         (1)
  {lwage:exper} ~ normal(0.1,1)                                           (1)
  {lwage:_cons} ~ normal(2,10)                                            (1)
          {var} ~ igamma(0.01,0.01)
──────────────────────────────────────────────────────────────────────────────
(1) Parameters are elements of the linear form xb_lwage.

Bayesian normal regression                       MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        428
                                                 Acceptance rate  =      .2158
                                                 Efficiency:  min =    .001864
                                                              avg =     .02366
Log marginal-likelihood = -459.584                            max =     .08343

─────────────┬────────────────────────────────────────────────────────────────
             │                                                Equal-tailed
             │      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
lwage        │
        educ │  .1080999   .0163228   .003015   .1085795    .076095   .1408135
         age │ -.0016291   .0051764   .000886  -.0016677  -.0122493   .0081089
       exper │  .0164365    .004882   .000169   .0164146   .0069865   .0262597
       _cons │ -.3244789   .3187475   .073838  -.3318201   -.977361   .3921511
─────────────┼────────────────────────────────────────────────────────────────
         var │  .4539588    .030861    .00189   .4522007   .3942803   .5171512
─────────────┴────────────────────────────────────────────────────────────────
Note: There is a high autocorrelation after 500 lags.
```

# likelihood() + prior(): 多个参数的联合先验分布

```
. qui reg lwage educ age exper
. matrix vec=e(b)
. matrix cov=e(V)
. bayesmh lwage educ age exper, likelihood(normal({var})) prior({lwage:}, mvnormal(4,vec,cov)) prior({var}, igamma(0.01, 0.01))

Burn-in ...
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
  lwage ~ normal(xb_lwage,{var})

Priors:
  {lwage:educ age exper _cons} ~ mvnormal(4,vec,cov)                      (1)
                         {var} ~ igamma(0.01,0.01)
──────────────────────────────────────────────────────────────────────────────
(1) Parameters are elements of the linear form xb_lwage.

Bayesian normal regression                       MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        428
                                                 Acceptance rate  =      .1926
                                                 Efficiency:  min =    .001502
                                                              avg =     .02856
Log marginal-likelihood = -442.16765                          max =     .07191

─────────────┬────────────────────────────────────────────────────────────────
             │                                                Equal-tailed
             │      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
lwage        │
        educ │  .1078397   .0078649   .001125   .1079189   .0934599   .1234291
         age │ -.0019469   .0027718   .000339  -.0018452  -.0075171   .0033644
       exper │  .0165112    .003203   .000119    .016684   .0101089   .0225106
       _cons │ -.3074617   .1048001   .027043  -.3049968  -.5511863  -.1230595
─────────────┼────────────────────────────────────────────────────────────────
         var │  .4480066   .0291848   .001213   .4463937   .3955503   .5106466
─────────────┴────────────────────────────────────────────────────────────────
Note: There is a high autocorrelation after 500 lags.
```

# likelihood() + prior(): Zellner-g prior

prior({eq:}, normal(0,10000))设定各个参数是独立的。可以通过两种方式允许多个参数相关：

• 设定多个参数服从多元分布（多元正态分布等）。
• Zellner-g prior:

$$\beta | \sigma^2 \sim MVN(\mu_0, g\sigma^2 (x'x)^{-1})$$

其中，$$g$$为g-prior的自由度。

Stata设定：zellnersg(d, g, mu, sigsq)。其中，sigsq往往直接设定为{var}，即zellnersg(d, g, mu, {var})。

# likelihood() + prior(): Zellner-g prior

```
. bayesmh lwage educ age exper, likelihood(normal({var})) prior({lwage:}, zellnersg(4,30,vec,{var})) prior({var}, igamma(0.01, 0.01))

Burn-in ...
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
  lwage ~ normal(xb_lwage,{var})

Priors:
  {lwage:educ age exper _cons} ~ zellnersg(4,30,vec,{var})                (1)
                         {var} ~ igamma(0.01,0.01)
──────────────────────────────────────────────────────────────────────────────
(1) Parameters are elements of the linear form xb_lwage.

Bayesian normal regression                       MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        428
                                                 Acceptance rate  =      .1603
                                                 Efficiency:  min =      .00119
                                                              avg =     .01802
Log marginal-likelihood = -446.16778                          max =     .04551

─────────────┬────────────────────────────────────────────────────────────────
             │                                                Equal-tailed
             │      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
lwage        │
        educ │  .1012417   .0115183   .001895   .1013191   .0791962   .1233749
         age │ -.0037958   .0041686   .000711  -.0037038  -.0119122   .0040406
       exper │    .01706   .0043192   .000227    .017137   .0083099    .025188
       _cons │ -.1531522   .1799528   .052174  -.1626343    -.46208   .1529861
─────────────┼────────────────────────────────────────────────────────────────
         var │  .4450226    .032087   .001504   .4429504    .387583   .5111025
─────────────┴────────────────────────────────────────────────────────────────
Note: There is a high autocorrelation after 500 lags.
```

# likelihood() + prior(): user-defined prior

Gelman et al. (2014)：Logit回归模型中，系数的先验分布可取Cauchy分布，位置参数为$$a = 0$$，刻度参数为$$b = 2.5$$（截距项的刻度参数取10）：

```
bayesmh y x, likelihood(logit) ///
    prior({y:x}, logdensity(ln(2.5)-ln(2.5^2+{y:x}^2)-ln(_pi))) ///
    prior({y:_cons}, logdensity(ln(10)-ln(10^2+{y:_cons}^2)-ln(_pi)))
```

# 内容

引言
马尔科夫链蒙特卡洛模拟（MCMC）
计量模型的贝叶斯估计与检验
特定模型的似然函数或后验分布
贝叶斯预测

# 非线性模型

非线性模型需要明确指定参数和变量构成的表达式。参数以大括号括起来，比如{b0}, {b1}等。

例：bayesmh y={b0} + {b1}*x1^{p} + {b2}*x2，表示$$y=b_0+b_1 x_1^{p} + b_2 x_2$$。

非线性单方程模型：eqname : dep = subexp

例: bayesmh mpg = {b0}+{b1}*price^{b2}, likelihood(…) prior(…)

非线性多方程模型：(eqname : dep = subexp) (…)

# 非线性模型

估计CES生产函数；预测当capital=1.5, labor=2时的产出。

```
. use production, clear
. bayesmh lnoutput = ({b0}-1/{rho=1}*ln({delta=0.5}*capital^(-1*{rho})+(1-{delta})*labor^(-1*{rho}))), ///
>     likelihood(normal({sig2})) prior({sig2}, igamma(0.01, 0.01)) prior({b0}, normal(0, 100)) ///
>     prior({delta}, uniform(0, 1)) prior({rho}, gamma(1, 1)) ///
>     block({b0 delta rho sig2}, split) rseed(16) saving(ces_mcmc, replace)

Burn-in ...
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
  lnoutput ~ normal({b0}-1/{rho=1}*ln({delta=0.5}*capital^(-1*{rho})
             +(1-{delta})*labor^(-1*{rho})),{sig2})

Priors:
   {sig2} ~ igamma(0.01,0.01)
     {b0} ~ normal(0,100)
  {delta} ~ uniform(0,1)
    {rho} ~ gamma(1,1)
──────────────────────────────────────────────────────────────────────────────

Bayesian normal regression                       MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        100
                                                 Acceptance rate  =      .4385
                                                 Efficiency:  min =     .02283
                                                              avg =      .1225
Log marginal-likelihood = -93.731032                          max =      .2461

─────────────┬────────────────────────────────────────────────────────────────
             │                                                Equal-tailed
             │      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
          b0 │  3.829453   .1047645     .0059   3.829384   3.633927   4.027828
       delta │  .4830306   .0636595   .001283   .4820599   .3540333   .6146103
         rho │  1.838675   .8665057   .057343   1.624436   .8088482   4.053971
        sig2 │  .3098183    .043935   .001009   .3062381    .233141   .4054633
─────────────┴────────────────────────────────────────────────────────────────
file ces_mcmc.dta saved.
```
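对应的点预测可以手工验证：把报告的后验均值代入CES均值函数$$\ln(output) = b_0 - \frac{1}{\rho}\ln\left(\delta \cdot capital^{-\rho} + (1-\delta)\, labor^{-\rho}\right)$$（Python 示意，参数直接取上表的后验均值，仅作说明，并非完整的贝叶斯预测分布）：

```python
import math

# 上表报告的后验均值（仅作示意）
b0, delta, rho = 3.829453, 0.4830306, 1.838675

def ces_lnoutput(capital, labor):
    """CES 均值函数：lnoutput = b0 - (1/rho)*ln(delta*K^(-rho) + (1-delta)*L^(-rho))"""
    return b0 - math.log(delta * capital**(-rho) + (1 - delta) * labor**(-rho)) / rho

lnout = ces_lnoutput(1.5, 2.0)       # capital=1.5, labor=2 时的对数产出点预测
print(lnout, math.exp(lnout))
```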
# 多方程模型

线性多方程模型：(eqname : dep indepvars [, noconstant]) (……)

例：bayesmh (eq:mpg price weight) (eq2:price weight foreign), likelihood(…) prior(…) 表示模型：

• mpg = {eq:price}*price + {eq:weight}*weight + {eq:_cons}
• price = {eq2:weight}*weight + {eq2:foreign}*foreign + {eq2:_cons}

# 先验分布中对参数的指定

先验分布设定：prior( parameter, dist )

• 例：定义方程eq中x1的系数的先验分布：prior({y:x1}, normal(0,100))
• bayesmh y x1 x2, likelihood(normal({var})) prior({y:x1}, normal(0,100))

基本规则：参数以大括号括起来。比如，

• 单个参数：{mu}
• 特定方程的单个参数：{eq:mu}
• 参数矩阵（一般为协方差矩阵）：{Sigma, matrix}
• 特定方程的参数矩阵：{Cov:Sigma, matrix}

# evaluator

如果似然函数或先验分布比较复杂，不能由内置的分布函数或表达式来表示，可以自己写似然函数或后验分布程序来实现。用evaluator()设置对数后验分布，用llevaluator()设置对数似然函数。

设置自己的对数后验分布：

bayesmh depvar indepvars [if] [in] , evaluator(program, spec)

设置自己的对数似然函数：

bayesmh depvar indepvars [if] [in] , llevaluator(program, spec)

# evaluator的基本格式

基本架构：

```
program progname
        args lnden xb1 [xb2 ...] [modelparams]
        ... computations ...
        scalar `lnden' = ...
end
```

1. args后面依次为对数密度(lnden)、线性表达式(xb1, xb2, …)、模型其它参数。
2. bayesmh设定的模型必须要与evaluator程序的args保持一致。比如，args lnden xb1 var，对应的bayesmh指令：

. bayesmh y x1 x2, evaluator(normaljeffreys, parameters({var}))

• y x1 x2：因变量为y，自变量为x1和x2。
• b0+b1*x1+b2*x2对应args的xb1。
• parameters({var})则对应args中的var。

# evaluator的基本格式

例：args lnden xb1 xb2 tau var1 var2，对应的bayesmh指令：

. bayesmh (y1 x1 x2) (y2 x1 z2), evaluator(normaljeffreys, parameters({tau} {var1} {var2}))

其中，(y1 x1 x2) (y2 x1 z2)：模型包含两个方程，因变量分别为y1、y2，自变量分别为(x1, x2)和(x1, z2)。

• b0+b1*x1+b2*x2对应args的xb1，c0+c1*x1+c2*z2对应args的xb2。
• parameters({tau} {var1} {var2})则对应args中的tau var1 var2。

# evaluator定义的global macro

| | global macro | note |
| --- | --- | --- |
| 单方程 | $MH_y | 因变量 |
| | $MH_x1, $MH_x2, … | 第1个方程的第1、2、…个自变量 |
| | $MH_xn | 第1个方程的自变量的个数 |
| 多方程 | $MH_y1, $MH_y2, $MH_y3, … | 第1、2、3、…个方程的因变量 |
| | $MH_y1x1, $MH_y1x2, … | 第1个方程的第1、2、…个自变量 |
| | $MH_y2x1, $MH_y2x2, … | 第2个方程的第1、2、…个自变量 |
| | $MH_y1xn, $MH_y2xn, … | 第1、2、…个方程的自变量的个数 |
| | $MH_b | 系数向量 |
| | $MH_bn | 参数个数 |
| | $MH_m1, $MH_m2, … | 第1、2、…个矩阵参数 |
| | $MH_mn | 矩阵参数的个数 |
| | $MH_extravars | 其它变量 |

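evaluator的逻辑（对数似然 + 对数先验 = 对数后验）与语言无关。下面用Python写一个结构与后文normaljeffreys相同的对数后验函数作示意（系数取扁平先验、方差取Jeffreys先验；数据、函数名均为本文假设的示例，并非Stata的实现）：

```python
import numpy as np

def log_posterior(y, X, beta, var):
    """正态线性模型的对数后验：flat prior(系数) + Jeffreys prior(方差)。"""
    if var <= 0:
        return -np.inf                       # 无效参数，相当于 scalar `lnp' = .
    sd = np.sqrt(var)
    xb = X @ beta                            # 线性指标，对应 evaluator 的 xb
    # 对数似然：各观测的 lnnormalden 之和
    lnf = np.sum(-0.5*np.log(2*np.pi*var) - 0.5*(y - xb)**2/var)
    lnprior = -2.0*np.log(sd)                # Jeffreys: pi(sigma^2) ∝ 1/sigma^2
    return lnf + lnprior

# 用法示意（模拟数据）
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(size=50), np.ones(50)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=50)
print(log_posterior(y, X, np.array([1.0, 2.0]), 1.0))
```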
# 例

```
. type normaljeffreys.ado
program normaljeffreys
        version 16.1
        args lnp xb var
        /* compute log likelihood */
        tempname sd
        scalar `sd' = sqrt(`var')
        tempvar lnfj
        quietly generate double `lnfj' = lnnormalden($MH_y,`xb',`sd') if $MH_touse
        quietly summarize `lnfj', meanonly
        if r(N) < $MH_n {
                scalar `lnp' = .
                exit
        }
        tempname lnf
        scalar `lnf' = r(sum)
        /* compute log prior */
        tempname lnprior
        /* case 1: variance: Jeffreys prior, coefficient: flat prior. */
        scalar `lnprior' = -2*ln(`sd')
        /* compute log posterior */
        scalar `lnp' = `lnf' + `lnprior'
end
```

# 例

由于已经在evaluator中设置了对数后验分布，bayesmh指令中不能再设置prior()或likelihood()。

```
. use mroz, clear
. bayesmh lwage educ age, evaluator(normaljeffreys, parameters({var}))

Burn-in ...
note: invalid initial state.
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Posterior:
  lwage ~ normaljeffreys(xb_lwage,{var})
──────────────────────────────────────────────────────────────────────────────

Bayesian regression                              MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =        428
                                                 Acceptance rate  =      .2765
                                                 Efficiency:  min =     .06077
                                                              avg =     .07804
Log marginal-likelihood = -452.12848                          max =     .09761

─────────────┬────────────────────────────────────────────────────────────────
             │                                                Equal-tailed
             │      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
lwage        │
        educ │  .1099913    .013842   .000443   .1105567   .0814029   .1356931
         age │  .0067848   .0043219   .000175   .0067199  -.0016034   .0153288
       _cons │ -.4874439   .2635526   .009471  -.4873747  -1.001772   .0526065
─────────────┼────────────────────────────────────────────────────────────────
         var │  .4626251   .0312279    .00113   .4615565   .4051353   .5263745
─────────────┴────────────────────────────────────────────────────────────────
```

# 例

等价的做法：likelihood() + prior()

bayesmh lwage educ age, likelihood(normal({var})) prior({lwage:}, flat) prior({var}, jeffreys)

# 例：llevaluator

可以单独写对数似然函数，先验分布采用内置的分布。例：注意被注释掉的几行语句。
. type normallnf.ado
program normallnf
version 16.1
args lnf xb var   // lnp --> lnf
tempname sd
scalar `sd' = sqrt(`var')
tempvar lnfj
quietly generate double `lnfj' = lnnormalden($MH_y,`xb',`sd') if $MH_touse
quietly summarize `lnfj', meanonly
if r(N) < $MH_n {
scalar `lnf' = .   // lnp --> lnf
exit
}
// tempname lnf
scalar `lnf' = r(sum)
// tempname lnprior
// scalar `lnprior' = -2*ln(`sd')
// scalar `lnp' = `lnf' + `lnprior'
end
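The arithmetic inside normaljeffreys — a normal log likelihood plus the Jeffreys log prior ln(1/σ²) = −2 ln(σ) — can be checked outside Stata. A minimal Python sketch (illustration only; `log_posterior` and its arguments are made-up names, not part of the bayesmh interface):

```python
import math

def log_posterior(y, xb, var):
    """Log posterior for a normal likelihood with a flat prior on the
    coefficients and the Jeffreys prior p(var) ∝ 1/var on the variance."""
    sd = math.sqrt(var)
    # log likelihood: sum of pointwise log normal densities,
    # the analogue of the evaluator's r(sum)
    lnf = sum(-0.5 * math.log(2 * math.pi) - math.log(sd)
              - 0.5 * ((yi - xbi) / sd) ** 2 for yi, xbi in zip(y, xb))
    lnprior = -2 * math.log(sd)    # ln(1/var) = -2 ln(sd)
    return lnf + lnprior
```

As in the evaluator, the result is the summed log likelihood plus the log prior.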


# Example: llevaluator() + prior()

. bayesmh lwage educ age, llevaluator(normallnf, parameters({var})) prior({lwage:}, flat) prior({var}, jeffreys)

Burn-in ...
note: invalid initial state.
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
lwage ~ normallnf(xb_lwage,{var})

Priors:
{lwage:educ age _cons} ~ 1 (flat)                                        (1)
{var} ~ jeffreys
──────────────────────────────────────────────────────────────────────────────
(1) Parameters are elements of the linear form xb_lwage.

Bayesian regression                              MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
MCMC sample size =     10,000
Number of obs    =        428
Acceptance rate  =      .1388
Efficiency:  min =    .001144
avg =    .005087
Log marginal-likelihood = -464.96066                          max =    .009786

─────────────┬────────────────────────────────────────────────────────────────
│                                                Equal-tailed
│      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
lwage        │
educ │  .1713085    .013087   .002016   .1711292   .1469124   .1960288
age │  .0262109   .0036357   .000504   .0263546   .0189688   .0330134
_cons │ -2.103083   .1503325    .04444  -2.137789  -2.353303  -1.825682
─────────────┼────────────────────────────────────────────────────────────────
var │  .5224401    .038405   .003882   .5194123    .448735    .603441
─────────────┴────────────────────────────────────────────────────────────────
Note: There is a high autocorrelation after 500 lags.


# Example: Bayesian estimation of the logit model

The log-likelihood function of the logit model.

. type logitlnf.ado
program logitlnf
version 16.1
args lnf xb
tempvar lnfj
quietly generate `lnfj' = ln(invlogit(`xb')) if $MH_y == 1 & $MH_touse
quietly replace `lnfj' = ln(invlogit(-`xb')) if $MH_y == 0 & $MH_touse
quietly summarize `lnfj', meanonly
if r(N) < $MH_n {
scalar `lnf' = .
exit
}
scalar `lnf' = r(sum)
end
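logitlnf sums ln(invlogit(xb)) over observations with y = 1 and ln(invlogit(−xb)) over those with y = 0. The same computation as a Python sketch (hypothetical names, for illustration only):

```python
import math

def invlogit(z):
    """Logistic CDF: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logit_loglik(y, xb):
    """Logit log likelihood, mirroring logitlnf's two branches."""
    return sum(math.log(invlogit(x)) if yi == 1 else math.log(invlogit(-x))
               for yi, x in zip(y, xb))
```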


# Example: Bayesian estimation of the logit model

. bayesmh inlf kids* educ age, llevaluator(logitlnf) prior({inlf:}, flat)

Burn-in ...
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
inlf ~ logitlnf(xb_inlf)

Prior:
{inlf:kidslt6 kidsge6 educ age _cons} ~ 1 (flat)                         (1)
──────────────────────────────────────────────────────────────────────────────
(1) Parameters are elements of the linear form xb_inlf.

Bayesian regression                              MCMC iterations  =     12,500
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,500
MCMC sample size =     10,000
Number of obs    =        753
Acceptance rate  =      .2605
Efficiency:  min =     .03826
avg =      .0633
Log marginal-likelihood = -475.38324                          max =      .0815

─────────────┬────────────────────────────────────────────────────────────────
│                                                Equal-tailed
inlf │      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
kidslt6 │ -1.483831   .1929396   .007068  -1.476455  -1.871804  -1.096678
kidsge6 │  -.096295   .0680707    .00348  -.0951521  -.2328876   .0339874
educ │   .201044   .0379359   .001761    .200977    .129362    .275341
age │ -.0638044   .0124056   .000435  -.0633938    -.08888  -.0399602
_cons │  1.024522   .7731857   .028078     1.0043  -.4649703    2.55663
─────────────┴────────────────────────────────────────────────────────────────


# Example: Bayesian estimation of threshold regression

. sysuse taylor, clear

. gen ygap1 = L.ygap
(1 missing value generated)

. bayesmh (eq1:ffr l.ffr l.pi l.ygap) (eq2: ffr l.ffr l.pi l.ygap), ///
>  llevaluator(normalthresh, parameters({tau} {var1} {var2}) extravars(ygap1)) ///
>  prior({eq1:}, normal(0,0.5)) prior({var1}, igamma(2,1))  ///
>  prior({eq2:}, normal(0,0.5)) prior({var2}, igamma(2,1))  ///
>  prior({tau}, uniform(-2,2))  ///
>  initial({eq1:} 0 {eq2:} 0 {tau} -2 {var1} 1 {var2} 1)  ///
>  block({eq1:}) block({eq2:}) block({var1}) block({var2}) blocksummary ///
>  burnin(2000) mcmcsize(2000)

Burn-in ...
Simulation ...

Model summary
──────────────────────────────────────────────────────────────────────────────
Likelihood:
ffr ffr ~ normalthresh(xb_eq1,xb_eq2,{tau},{var1},{var2})

Priors:
{eq1:L.ffr L.pi L.ygap _cons} ~ normal(0,0.5)                            (1)
{eq2:L.ffr L.pi L.ygap _cons} ~ normal(0,0.5)                            (2)
{tau} ~ uniform(-2,2)
{var1 var2} ~ igamma(2,1)
──────────────────────────────────────────────────────────────────────────────
(1) Parameters are elements of the linear form xb_eq1.
(2) Parameters are elements of the linear form xb_eq2.

Block summary
──────────────────────────────────────────────────────────────────────────────
1:  {eq1:L.ffr L.pi L.ygap _cons}
2:  {eq2:L.ffr L.pi L.ygap _cons}
3:  {var1}
4:  {var2}
5:  {tau}
──────────────────────────────────────────────────────────────────────────────

Bayesian regression                              MCMC iterations  =      4,000
Random-walk Metropolis–Hastings sampling         Burn-in          =      2,000
MCMC sample size =      2,000
Number of obs    =        113
Acceptance rate  =      .3433
Efficiency:  min =     .03825
avg =     .08808
Log marginal-likelihood =  -132.2379                          max =       .261

─────────────┬────────────────────────────────────────────────────────────────
│                                                Equal-tailed
│      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
eq1          │
ffr │
L1. │  .4263503   .1543339   .013155   .4467987   .1165634   .7892567
│
pi │
L1. │  1.005484   .2931961   .025911   .9806879   .3473836   1.622831
│
ygap │
L1. │  .1055092   .2932414   .032391   .0986647  -.4638702   .6225725
│
_cons │ -.3022746   .5587455   .063881  -.2833094  -1.369574   .7045184
─────────────┼────────────────────────────────────────────────────────────────
eq2          │
ffr │
L1. │  .9153278   .0236862   .002067   .9171459   .8683087   .9619808
│
pi │
L1. │  .3238712   .0551187   .004669   .3243831   .2182006   .4343945
│
ygap │
L1. │  .0801297   .0641431    .00581   .0773504  -.0436599   .1989345
│
_cons │ -.3283278   .1315534   .011183  -.3253656  -.6088118  -.0825758
─────────────┼────────────────────────────────────────────────────────────────
tau │ -1.144866   .0370001   .002212  -1.157775  -1.182468  -1.030312
var1 │  2.230633   .7411225   .055102   2.107737   1.227005   4.183789
var2 │  .2537051   .0384719   .001684   .2511057   .1879426   .3328634
─────────────┴────────────────────────────────────────────────────────────────


# Example: Bayesian estimation of threshold regression

. bayesgraph diagnostics {tau}


# Example: Bayesian estimation of threshold regression

. bayesstats ess

Efficiency summaries    MCMC sample size =     2,000
Efficiency:  min =    .03825
avg =    .08808
max =      .261

─────────────┬──────────────────────────────────────
│        ESS   Corr. time    Efficiency
─────────────┼──────────────────────────────────────
eq1          │
ffr │
L1. │     137.64        14.53        0.0688
│
pi │
L1. │     128.04        15.62        0.0640
│
ygap │
L1. │      81.96        24.40        0.0410
│
_cons │      76.50        26.14        0.0383
─────────────┼──────────────────────────────────────
eq2          │
ffr │
L1. │     131.38        15.22        0.0657
│
pi │
L1. │     139.38        14.35        0.0697
│
ygap │
L1. │     121.88        16.41        0.0609
│
_cons │     138.39        14.45        0.0692
─────────────┼──────────────────────────────────────
tau │     279.67         7.15        0.1398
var1 │     180.90        11.06        0.0905
var2 │     521.93         3.83        0.2610
─────────────┴──────────────────────────────────────
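The three columns are linked: Efficiency = ESS / MCMC sample size, and Corr. time = 1/Efficiency (e.g., for {tau}: 279.67/2,000 ≈ 0.1398 and 1/0.1398 ≈ 7.15). A rough Python sketch of an ESS estimate via a truncated autocorrelation sum (an illustration, not bayesstats ess's exact estimator):

```python
def ess(x):
    """Effective sample size: ESS = n / (1 + 2 * sum_k rho_k),
    truncating the sum at the first nonpositive autocorrelation."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n      # lag-0 autocovariance
    tau = 1.0                                     # correlation time
    for k in range(1, n):
        rho = sum((x[i] - mean) * (x[i + k] - mean)
                  for i in range(n - k)) / (n * c0)
        if rho <= 0:
            break
        tau += 2 * rho
    return n / tau
```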


# Bayesian prediction

bayespredict {_ysim#[numlist]}

numlist: specifies the observations for which predictions are computed.
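Conceptually, each _ysim row is one replicated outcome vector: for a normal likelihood, bayespredict draws y_sim ~ N(fitted value, sig2) at every retained MCMC draw of the parameters. A hypothetical Python sketch of that loop (made-up names, illustration only):

```python
import random

def simulate_ysim(post_draws, seed=1):
    """post_draws: list of (mu, sig2) pairs, one per MCMC draw, where mu
    is the vector of fitted values at that draw. Returns one replicated
    outcome vector per draw, as _ysim does for a normal likelihood."""
    rng = random.Random(seed)
    return [[rng.gauss(m, sig2 ** 0.5) for m in mu]
            for mu, sig2 in post_draws]
```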

. use production, clear

. qui bayesmh lnoutput = ({b0}-1/{rho=1}*ln({delta=0.5}*capital^(-1*{rho})+(1-{delta})*labor^(-1*{rho}))), likelihood(normal(
> {sig2})) prior({sig2},  igamma(0.01, 0.01))  prior({b0}, normal(0, 100)) prior({delta}, uniform(0, 1))  prior({rho}, gamma(
> 1, 1))  block({b0 delta rho sig2}, split)  rseed(16) saving(ces_mcmc, replace)

.
. bayespredict {_ysim}, saving(ces_pred, replace) rseed(16)

Computing predictions ...

file ces_pred.dta saved.
file ces_pred.ster saved.

. describe _ysim1_1 _ysim1_2 _ysim1_3  using ces_pred

Variable      Storage   Display    Value
name         type    format    label      Variable label
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
_ysim1_1        double  %10.0g                Simulated lnoutput, obs #1
_ysim1_2        double  %10.0g                Simulated lnoutput, obs #2
_ysim1_3        double  %10.0g                Simulated lnoutput, obs #3


# Bayesian prediction

. bayesstats summary {_ysim[1/5]} using ces_pred

Posterior summary statistics                      MCMC sample size =    10,000

─────────────┬────────────────────────────────────────────────────────────────
│                                                Equal-tailed
│      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
_ysim1_1 │  3.111305    .564188   .005642   3.114486   2.013625   4.230432
_ysim1_2 │   4.47848   .5647274   .006264   4.487651   3.362394   5.576457
_ysim1_3 │  2.033675   .5562368   .005692   2.033444   .9360682   3.115028
_ysim1_4 │  2.234464   .5692011   .005783   2.234235   1.130026   3.361621
_ysim1_5 │  2.894212   .5655813   .005948   2.894129   1.789959   4.013839
─────────────┴────────────────────────────────────────────────────────────────


# Bayesian prediction

bayesgraph histogram [label:]@func(arg1[, arg2]), saving(file)

bayesgraph histogram [label:]@userprog arg1 [arg2], saving(file)

arg1 and arg2 may be {_ysim[#]}, {_resid[#]} (residuals), or {_mu[#]} (expected value of the dependent variable). Functions of the predictions (e.g., mean, variance, skewness):

. bayesgraph histogram (resvar:@variance({_resid})) using ces_pred


# Bayesian prediction

bayespredict newvar, {mean | median | std} [outcome(depvar)]

bayespredict newvarlow newvarupp, cri

. bayespredict pmean, mean rseed(123)

Computing predictions ...

. twoway (kdensity lnoutput) (kdensity pmean, lp(dash)), legend(off) title(Observed vs replicated data)


# Bayesian prediction

. set obs 101
Number of observations (_N) was 100, now 101.

. qui replace capital = 1.5 in 101

. qui replace labor = 2 in 101

. bayespredict pmean2, mean rseed(123)

Computing predictions ...

. bayespredict cilow ciupp, cri rseed(123)

Computing predictions ...

. list labor capital lnoutput pmean2 cilow ciupp in 96/101

┌─────────────────────────────────────────────────────────────────┐
│    labor    capital   lnoutput     pmean2      cilow      ciupp │
├─────────────────────────────────────────────────────────────────┤
96. │ .4023916   .2614668   2.737468   2.678401   1.561842   3.798991 │
97. │ 3.801785   .6329353   3.619768   3.782753   2.692533   4.906854 │
98. │ .3353206   .2557932   2.292713   2.597483   1.527138   3.708973 │
99. │  31.5595    2.72437   5.026271   5.273558      4.144   6.378542 │
100. │  3.80443    .946865   3.363704   4.154781   3.045156   5.251562 │
├─────────────────────────────────────────────────────────────────┤
101. │        2        1.5          .   4.362791   3.242146   5.457026 │
└─────────────────────────────────────────────────────────────────┘
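The cri bounds are equal-tailed: for each observation they are the 2.5th and 97.5th percentiles of its _ysim draws. A Python sketch using a simple nearest-rank percentile (Stata's exact interpolation may differ):

```python
def equal_tailed_ci(draws, level=0.95):
    """Equal-tailed credible interval from MCMC draws (nearest rank)."""
    s = sorted(draws)
    n = len(s)
    lo = s[int((1 - level) / 2 * (n - 1))]   # 2.5th percentile for level=.95
    hi = s[int((1 + level) / 2 * (n - 1))]   # 97.5th percentile
    return lo, hi
```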


# Bayesian prediction

bayespredict [label:]@func(arg1[, arg2]), saving(file)

bayespredict [label:]@userprog arg1 [arg2], saving(file)

. bayespredict (mu:@mean({_mu})), saving(mc2, replace)

Computing predictions ...

file mc2.dta saved.
file mc2.ster saved.


# Bayesian prediction

. bayesstats ppvalues (mean:@mean({_ysim})) (min:@min({_ysim})) (max:@max({_ysim})) using ces_pred

Posterior predictive summary   MCMC sample size =    10,000

─────────────┬─────────────────────────────────────────────
T │      Mean   Std. dev.  E(T_obs)  P(T>=T_obs)
─────────────┼─────────────────────────────────────────────
mean │  3.045143   .0787588   3.044554        .5026
min │  .5130189   .3401942   1.049675        .0365
max │   5.84806   .3703789   5.703145         .626
─────────────┴─────────────────────────────────────────────
Note: P(T>=T_obs) close to 0 or 1 indicates lack of fit.
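The last column is the posterior predictive p-value: the share of MCMC draws in which the replicated statistic T(y_rep) is at least the observed T(y_obs). A minimal Python sketch:

```python
def ppp(t_rep, t_obs):
    """Posterior predictive p-value: P(T(y_rep) >= T(y_obs)),
    estimated as a fraction of the MCMC draws."""
    return sum(t >= t_obs for t in t_rep) / len(t_rep)
```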


# Bayesian prediction

. mata:
───────────────────────────────────────────────── mata (type end to exit) ───────────────────────────────────────────────────
: real scalar skew(real colvector x) {
>     return(sqrt(length(x))*sum((x:-mean(x)):^3)/(sum((x:-mean(x)):^2)^1.5))
> }
: end
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
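For reference, the same skewness statistic as a Python sketch, term for term the Mata formula above (illustration only):

```python
import math

def skew(x):
    """Sample skewness: sqrt(n) * sum((x-mean)^3) / sum((x-mean)^2)^1.5,
    matching the Mata skew() above."""
    n = len(x)
    mean = sum(x) / n
    m3 = sum((v - mean) ** 3 for v in x)
    m2 = sum((v - mean) ** 2 for v in x)
    return math.sqrt(n) * m3 / m2 ** 1.5
```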

. capture prog drop kurt

. program kurt
1.   version 17
2.   args kurtosis x
3.   quietly summarize `x', detail
4.   scalar `kurtosis' = r(kurtosis)
5. end
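r(kurtosis) from summarize, detail is m4/m2², with mk the k-th central moment averaged over n (about 3 for normal data). A Python sketch of the same quantity:

```python
def kurt(x):
    """Kurtosis as Stata's summarize reports it: m4 / m2^2,
    where mk = sum((x - mean)^k) / n."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2
```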


# Bayesian prediction

. bayespredict (skewness:@skew({_ysim})) (kurtosis:@kurt {_ysim}), saving(ces_teststat, replace) rseed(16)

Computing predictions ...

file ces_teststat.dta saved.
file ces_teststat.ster saved.


# Bayesian prediction

. bayesstats summary {skewness} {kurtosis} using ces_teststat

Posterior summary statistics                      MCMC sample size =    10,000

─────────────┬────────────────────────────────────────────────────────────────
│                                                Equal-tailed
│      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
─────────────┼────────────────────────────────────────────────────────────────
skewness │  .1352403   .1606844   .001607   .1350271   -.181804   .4555612
kurtosis │  2.646662   .2969499   .003685   2.613873   2.165103   3.318124
─────────────┴────────────────────────────────────────────────────────────────

. bayesstats ppvalues {skewness} {kurtosis} using ces_teststat

Posterior predictive summary   MCMC sample size =    10,000

─────────────┬─────────────────────────────────────────────
T │      Mean   Std. dev.  E(T_obs)  P(T>=T_obs)
─────────────┼─────────────────────────────────────────────
skewness │  .1352403   .1606844   .1562801        .4449
kurtosis │  2.646662   .2969499   2.252134        .9359
─────────────┴─────────────────────────────────────────────
Note: P(T>=T_obs) close to 0 or 1 indicates lack of fit.


# Bayesian prediction

. bayesreps yrep*, nreps(3) rseed(123)

Computing predictions ...

. list lnoutput yrep1 yrep2 yrep3 in 1/5

┌───────────────────────────────────────────┐
│ lnoutput      yrep1      yrep2      yrep3 │
├───────────────────────────────────────────┤
1. │ 2.933451   3.397611   3.147219   4.039656 │
2. │ 4.613716   4.896418   4.197628   5.014223 │
3. │ 1.654005   1.758073   1.464527   2.596067 │
4. │ 2.025361    2.30658    1.71803   2.535869 │
5. │ 3.165065   2.221415   2.840843   3.330061 │
└───────────────────────────────────────────┘