Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

# st: Mata: Calculating conditional expectation

 From Ivan Png To statalist@hsphsun2.harvard.edu Subject st: Mata: Calculating conditional expectation Date Thu, 25 Apr 2013 18:22:37 +0800

```Let me try to explain how I did it.  I did not use Mata to  direct
computing the vector of conditional expectation of the explanatory
variables.  Rather, I constructed the conditional expectation of each
explanatory variable separately, and then collected them into the
vector.  Not very elegant, but seems to work.

== code fragment ==

. count if Y == 1 & Y_T == 0  /* Y = observed Y; Y_T = true Y */
. gen fppct = r(N)/total      /* total = no. of observations */
/* fppct = percent false positives */
. count if Y == 0 & Y_T == 1
. gen fnpct = r(N)/total      /* fppct = percent false negatives */

. gen const = 1

*** create X including dummy variables (fixed effects) ***

. local dummies
. forvalues i=1/1000 {
local dummies "`dummies' id`i'"
}

. mata : st_view(X = .,., ("hhr", "const", "`dummies'"))  /* hhr =
higher degree */
. mata : rows(X), cols(X)  /* check matrix */
. mata : X[|1,1 \ 7,7|]    /* check matrix for empty rows */

** construct conditional mean(X) **
. sort lower year

. foreach var in const hhr {
egen `var'_fpmean = mean(`var') if Y == 1 & Y_T == 0
gen  `var'_fp = fppct * `var'_fpmean
egen `var'_fnmean = mean(`var') if Y == 0 & Y_T == 1
gen  `var'_fn = fnpct * `var'_fnmean
drop `var'_fnmean
}

** construct P = mean(X) conditional on false positive **
. local dummies
. forvalues i = 1/1000 {
egen fpmean`i' = mean(id`i') if Y == 1 & Y_T == 0
gen  idfp`i' = fppct * fpmean`i'
local dummies "`dummies' idfp`i'"
drop fpmean`i'
}

. mata : st_view(F = .,., ("hhr_fp", "const_fp", "`dummies'"))
. mata : rows(F), cols(F)  /* check matrix */
. mata : F[|1,1 \ 15,15|]    /* check matrix for empty rows */

. mata : st_subview(P = . , F, 1 , .)  /* choose first non-empty row */
. mata : rows(P), cols(P)
. mata : P[|1,1 \ 1,20|]    /* check vector */

** construct N = mean(X) conditional on false negative **
. sort lower year

. local dummies
. forvalues i = 1/1000 {
egen fnmean`i' = mean(id`i') if Y == 0 & Y_T == 1
gen  idfn`i' = fnpct * fnmean`i'
local dummies "`dummies' idfn`i'"
drop fnmean`i'
}

. mata : st_view(G = .,., ("hhr_fn", "const_fn", "`dummies'"))
. mata : rows(G), cols(G)  /* check matrix */
. mata : G[|1,1 \ 20,20|]    /* check matrix for empty rows */

. mata : st_subview(N = . , G, 3 , .)  /* choose first non-empty row */
. mata : rows(N), cols(N)
. mata : N[|1,1 \ 1,20|]    /* check vector */

** estimate bias **
. mata : D = invsym(cross(X,X))*(P' - N')
. mata : rows(D), cols(D)
. mata : D[|1,1 \ 9,1|]    /* check vector */

. mata : st_subview(E = . , D, (1::5) , .)  /* coefficients of focal
variables */
. mata : E

. mata : H = 9898 * E      /* total = 9898 */
. mata : H

=== end fragment ===

On 23 April 2013 11:45, Ivan Png <iplpng@gmail.com> wrote:
>
> Many thanks to Statalist members for their previous help on
> constructing the matrix.
>
> To recall, my dependent variable is categorical: Mobility = 1 if
> inventor changed employer, else = 0.  I'm investigating the effect of
> classification error.  Obviously, this cannot be classical.  Let Y =
> true mobility and Z = recorded mobility. If Y = 0 and Z = 1, then
> error = -1, while if Y = 1 and Z = 0, error = +1.
>
> Meyer and Mittag, U of Chicago (2012) characterize the bias as
> N(X'X)^{-1} [ Pr( Y = 0 & Z = 1) E(X :  Y = 0 & Z = 1) - Pr(Y = 1 & Z
> = 0) E(X :  Y = 1 & Z = 0) ].  I have a benchmark data set with both
> the true and inaccurate mobility data, and would like to compute the
> bias.
>
> So, I need to compute the conditional expectations, E(X :  Y = 0 & Z =
> 1) and E(X :  Y = 1 & Z = 0), and then weight by the probabilities,
> take the difference, and pre-multiply by N(X'X)^{-1}.  My idea:
>
> . keep if Y = 0 & Z = 1
> . mata : F = mean(X)
> . mata : mata matsave filename
>
> and repeat for Y = 1 & Z = 0.
>
> But, sadly, I don't how to proceed.  How to combine the original data
> with the two new files containing the conditional expectations, and
> then going back to MATA to calculate the bias.  Grateful to
> Statalisters for help.
>

--
Best wishes
Ivan Png
Skype: ipng00
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
```

© Copyright 1996–2017 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index