Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org** is already up and running.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
Nick Cox <njcoxstata@gmail.com> |

To |
"statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |

Subject |
Re: st: Mata: Calculating conditional expectation |

Date |
Thu, 25 Apr 2013 12:09:02 +0100 |

In this there is a lot of putting constants into variables, only to drop them later. In addition, whenever you want only one mean, -egen-'s -mean()- function is overkill. If you want to handle constants, that usually calls for scalars, not variables. -summarize, meanonly- leaves r(mean) behind in memory, which you can exploit more than you do. Your first block I would rewrite as count if Y == 1 & Y_T == 0 scalar fp = r(N)/_N count if Y == 0 & Y_T == 1 scalar fn = r(N)/_N Indeed su Y_T if inlist(Y_T, 0, 1), meanonly scalar fp = r(mean) scalar fn = 1 - fppct is better technique, I suggest. It does not presume there are no missing values on -Y_T-. (In addition, you calculated fractions, not percents.) The block local dummies forvalues i = 1/1000 { egen fpmean`i' = mean(id`i') if Y == 1 & Y_T == 0 gen idfp`i' = fppct * fpmean`i' local dummies "`dummies' idfp`i'" drop fpmean`i' } could be simplified to local dummies forvalues i = 1/1000 { su id`i' if Y == 1 & Y_T == 0, meanonly gen idfp`i' = fppct * r(mean) local dummies "`dummies' idfp`i'" } These are small changes. I've not tried to understand the whole of what you are doing. Nick njcoxstata@gmail.com On 25 April 2013 11:22, Ivan Png <iplpng@gmail.com> wrote: > Let me try to explain how I did it. I did not use Mata to direct > computing the vector of conditional expectation of the explanatory > variables. Rather, I constructed the conditional expectation of each > explanatory variable separately, and then collected them into the > vector. Not very elegant, but seems to work. > > == code fragment == > > . count if Y == 1 & Y_T == 0 /* Y = observed Y; Y_T = true Y */ > . gen fppct = r(N)/total /* total = no. of observations */ > /* fppct = percent false positives */ > . count if Y == 0 & Y_T == 1 > . gen fnpct = r(N)/total /* fppct = percent false negatives */ > > . gen const = 1 > > > *** create X including dummy variables (fixed effects) *** > > . local dummies > . forvalues i=1/1000 { > local dummies "`dummies' id`i'" > } > > . mata : st_view(X = .,., ("hhr", "const", "`dummies'")) /* hhr = > higher degree */ > . mata : rows(X), cols(X) /* check matrix */ > . mata : X[|1,1 \ 7,7|] /* check matrix for empty rows */ > > > ** construct conditional mean(X) ** > . sort lower year > > . foreach var in const hhr { > egen `var'_fpmean = mean(`var') if Y == 1 & Y_T == 0 > gen `var'_fp = fppct * `var'_fpmean > egen `var'_fnmean = mean(`var') if Y == 0 & Y_T == 1 > gen `var'_fn = fnpct * `var'_fnmean > drop `var'_fnmean > } > > ** construct P = mean(X) conditional on false positive ** > . local dummies > . forvalues i = 1/1000 { > egen fpmean`i' = mean(id`i') if Y == 1 & Y_T == 0 > gen idfp`i' = fppct * fpmean`i' > local dummies "`dummies' idfp`i'" > drop fpmean`i' > } > > > . mata : st_view(F = .,., ("hhr_fp", "const_fp", "`dummies'")) > . mata : rows(F), cols(F) /* check matrix */ > . mata : F[|1,1 \ 15,15|] /* check matrix for empty rows */ > > . mata : st_subview(P = . , F, 1 , .) /* choose first non-empty row */ > . mata : rows(P), cols(P) > . mata : P[|1,1 \ 1,20|] /* check vector */ > > > ** construct N = mean(X) conditional on false negative ** > . sort lower year > > . local dummies > . forvalues i = 1/1000 { > egen fnmean`i' = mean(id`i') if Y == 0 & Y_T == 1 > gen idfn`i' = fnpct * fnmean`i' > local dummies "`dummies' idfn`i'" > drop fnmean`i' > } > > . mata : st_view(G = .,., ("hhr_fn", "const_fn", "`dummies'")) > . mata : rows(G), cols(G) /* check matrix */ > . mata : G[|1,1 \ 20,20|] /* check matrix for empty rows */ > > . mata : st_subview(N = . , G, 3 , .) /* choose first non-empty row */ > . mata : rows(N), cols(N) > . mata : N[|1,1 \ 1,20|] /* check vector */ > > > ** estimate bias ** > . mata : D = invsym(cross(X,X))*(P' - N') > . mata : rows(D), cols(D) > . mata : D[|1,1 \ 9,1|] /* check vector */ > > . mata : st_subview(E = . , D, (1::5) , .) /* coefficients of focal > variables */ > . mata : E > > . mata : H = 9898 * E /* total = 9898 */ > . mata : H > > === end fragment === > > > On 23 April 2013 11:45, Ivan Png <iplpng@gmail.com> wrote: >> >> Many thanks to Statalist members for their previous help on >> constructing the matrix. >> >> To recall, my dependent variable is categorical: Mobility = 1 if >> inventor changed employer, else = 0. I'm investigating the effect of >> classification error. Obviously, this cannot be classical. Let Y = >> true mobility and Z = recorded mobility. If Y = 0 and Z = 1, then >> error = -1, while if Y = 1 and Z = 0, error = +1. >> >> Meyer and Mittag, U of Chicago (2012) characterize the bias as >> N(X'X)^{-1} [ Pr( Y = 0 & Z = 1) E(X : Y = 0 & Z = 1) - Pr(Y = 1 & Z >> = 0) E(X : Y = 1 & Z = 0) ]. I have a benchmark data set with both >> the true and inaccurate mobility data, and would like to compute the >> bias. >> >> So, I need to compute the conditional expectations, E(X : Y = 0 & Z = >> 1) and E(X : Y = 1 & Z = 0), and then weight by the probabilities, >> take the difference, and pre-multiply by N(X'X)^{-1}. My idea: >> >> . keep if Y = 0 & Z = 1 >> . mata : F = mean(X) >> . mata : mata matsave filename >> >> and repeat for Y = 1 & Z = 0. >> >> But, sadly, I don't how to proceed. How to combine the original data >> with the two new files containing the conditional expectations, and >> then going back to MATA to calculate the bias. Grateful to >> Statalisters for help. >> > > > > -- > Best wishes > Ivan Png > Skype: ipng00 > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/faqs/resources/statalist-faq/ > * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/faqs/resources/statalist-faq/ * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: Mata: Calculating conditional expectation***From:*Nick Cox <njcoxstata@gmail.com>

**References**:**st: Mata: Calculating conditional expectation***From:*Ivan Png <iplpng@gmail.com>

**st: Mata: Calculating conditional expectation***From:*Ivan Png <iplpng@gmail.com>

- Prev by Date:
**Re: st: Question about generating the weight of WLS** - Next by Date:
**Re: st: Mata: Calculating conditional expectation** - Previous by thread:
**st: Mata: Calculating conditional expectation** - Next by thread:
**Re: st: Mata: Calculating conditional expectation** - Index(es):