Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org, is already up and running.

# RE: st: Multivariate Normal CDF

| From | "David Roodman (droodman@cgdev.org)" |
| --- | --- |
| To | "statalist@hsphsun2.harvard.edu" |
| Subject | RE: st: Multivariate Normal CDF |
| Date | Fri, 29 Mar 2013 10:49:51 +0000 |

```
Ali, for your bivariate example, you'll also need to divide the two coordinates by the standard deviations. Something like:

scalar rho= -.1 / (.2*.4)^.5
scalar x1s = (x1 - 3) / sqrt(.2)
scalar x2s = (x2 - (-1)) / sqrt(.4)
scalar cdf = binormal( x1s, x2s , rho )
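
* A minimal sketch of the same standardization applied to all observations at
* once, assuming x1 and x2 are numeric variables holding the 1000 points.
* The names r, z1, z2 and p are illustrative and not from the original post.
* -generate- evaluates the expression for every observation, whereas -scalar-
* with a variable name uses only the first observation.
scalar r = -.1 / sqrt(.2*.4)
gen double z1 = (x1 - 3) / sqrt(.2)
gen double z2 = (x2 - (-1)) / sqrt(.4)
gen double p  = binormal(z1, z2, r)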

For higher dimensions, Stata offers the ghk() and ghkfast() routines, which are a bit more complicated to use. I've written my own version, ghk2(), which has some additional features and is often faster.
--David

-----Original Message-----
From: Ali Hashemi <hashemi@vt.edu>
Sent: Thu, 28 Mar 2013 12:11:39 -0400
To: statalist@hsphsun2.harvard.edu
Subject: st: Multivariate Normal CDF

Dear listserv members,

I'm trying to compute the normal cdf at 1000 points (each point is
defined by a combination of x1 and x2) using the following mean (mu)
and standard deviation (sigma).

mu=[ 3, -1 ]
sigma=[ 0.2, -0.1 \  -0.1, 0.4 ]

I know this can be easily done in MATLAB by P = normcdf(X,mu,sigma).
In Stata, I have used binormal(x1,x2,ro) function as follows:
gen ro= -.1 / (.2*.4)^.5
gen cdf=binormal( x1-3 , x2-(-1) , ro )

I have two questions:
1) Is this correct?
2) How can this be done for cases with more than two variables
(M-variate instead of  bivariate)? Is there a more general approach
(like in MATLAB) that can be used for generating the joint cumulative
distribution of an M-variate normal distribution?

Your help is greatly appreciated
Ali
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

------------------------------

Date: Thu, 28 Mar 2013 16:40:07 +0000
From: Ronald McDowell <McDowell-R3@email.ulster.ac.uk>
Subject: st: ST: overdisperson with binary outcomes and survey data

Hello list members

I wish to test for overdispersion in my data (binary outcome), which is taken from a survey with weights and stratification, hence the svy: prefix. I've tried using the scale(x2) option as I would usually, but this isn't available with the svy prefix. Does anyone know of ways round this or alternatives?

Many thanks

Ron

------------------------------

Date: Thu, 28 Mar 2013 13:17:18 -0400
From: Matthew Baker <matthew.baker@hunter.cuny.edu>
Subject: Re: st: Multivariate Normal CDF

Ali --

You might take a look at Cappellari and Jenkins's mvnp package (net
search mvnp). In fact, the 2nd Quarter 2006 Issue of the Stata Journal
describes how it works, and also contains a description of an
implementation of a GHK multivariate normal probability simulator in
Mata by Gates.
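
* A minimal sketch of a crude frequency simulator, which is neither GHK nor
* mvnp: it estimates a trivariate normal CDF at a single point by drawing from
* the distribution and counting the draws that fall below the point.
* The mean vector, covariance matrix, evaluation point and number of draws
* are all illustrative assumptions, not values from this thread.
preserve
clear
set seed 12345
local draws 100000
matrix m = (3, -1, 0)
matrix V = (.2, -.1, 0 \ -.1, .4, .1 \ 0, .1, .3)
drawnorm y1 y2 y3, n(`draws') means(m) cov(V)
* share of draws with every coordinate at or below the evaluation point
count if y1 <= 3.2 & y2 <= -.5 & y3 <= .4
display "simulated CDF = " r(N) / `draws'
restore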

Best,

Matt Baker

--
Dr. Matthew J. Baker
Department of Economics
Hunter College and the Graduate Center, CUNY

------------------------------

Date: Thu, 28 Mar 2013 19:06:34 +0100
From: thomas bourveau <thomas.bourveau@gmail.com>
Subject: Re: st: Drop columns based on an argument

Dear all,

Thanks a lot; your solutions worked perfectly to solve my two problems.

To answer Jeph: it turns out that I was extracting data using a set
of identifiers from Datastream (stock price data), and if Datastream
does not recognize the firm ID it still produces an empty column with
only an error message in place of the ID.

Best,
Thomas

2013/3/27 Jeph Herrin <stata@spandrel.net>:
> Not sure exactly what makes your identifiers "not valid", but here's how to
> fix your code and strip out your PEs all at once:
>
> forvalues i=2(1)1285{
>         replace var`i'=subinstr(var`i',"(PE)","",.)   in 1
>         if var`i'[1]=="#ERROR" {
>                 drop var`i'
>         }
> }
>
> cheers,
> Jeph
>
>
>
>
> On 3/27/2013 3:15 PM, thomas bourveau wrote:
>>
>> Dear Statalist members,
>>
>> I have a dataset that I will have to reshape but I am facing two
>> challenges for now. My dataset is currently organized as follows:
>>
>> The first column (var1) contains different dates.
>>
>> The next columns (var2-var1285) contains my variable of interest.
>> However, the first line contains the identifier of the firm and then
>> the value of the variable of interest.
>>
>>
>> My first problem is that some identifiers are not valid, and then the
>> first line returns an error message (string format). I want to drop
>> all columns with an error message. I have tried the following code:
>>
>> forvalues i=2(1)1285{
>> drop var`i' if var`i'[1]=="#ERROR"
>> }
>>
>> However, it returns the following message: "invalid syntax".
>>
>> My second problem is that the valid identifiers are of the following
>> form, 6 figures ending by (PE):
>>
>> example : 123456(PE)
>>
>> I need to eliminate the (PE) at the end of each identifier. I have
>> tried to use the regex command but it did not work well.
>>
>> I welcome any idea !
>>
>> Thanks in advance
>> Best
>> Thomas
>>
>> --
>> Thomas Bourveau
>> thomas.bourveau@gmail.com
>> 0637573925

--
Thomas Bourveau
PhD Candidate - HEC Paris
+ 33 6 37 57 39 25
Skype: thomas.bourveau

------------------------------

Date: Thu, 28 Mar 2013 19:13:35 +0000
From: "Mahometa, Michael J" <michael.mahometa@ssc.utexas.edu>
Subject: st: Summer Statistics Institute at UT Austin, May 20-23, 2013

The Division of Statistics + Scientific Computation at The University of Texas at Austin will be hosting the University's sixth annual UT Summer Statistics Institute on the UT Austin campus from May 20 to May 23, 2013.  Short courses are offered at all levels including introductory statistics, software, and statistical methods and applications. We are offering *Introduction to Stata* and *Introduction to Regression*, which both use Stata, as well as other courses that would be of interest to Stata users.

Learn the statistics you've always wanted to know from some of the very finest faculty at UT!

Registration closes May 3. Students receive a 60% discount and groups can receive a 20% discount off the regular $550 course fee. Visit our website at http://ssc.utexas.edu/programs/summer-statistics-institute to download the UT Summer Statistics Institute brochure and learn more. Short courses are offered at all levels including introductory statistics, software, and statistical methods and applications. New this year:

*Applied Text-Mining and Text-Analysis with R
*Introduction to Visual Analytics
*Pattern Analysis, Predictive Analytics and Big Data: Theory and Methods
*A Unifying Statistical Framework for Big Data: Graphical Models
*Introduction to MapReduce Programming Model with Hadoop
*Writing Competitive Federal Grant Proposals

We are offering these introductory courses in common statistical software:
*Introduction to Microsoft Access
*Introduction to R
*Introduction to Stata [sponsored by www.stata.com]
*Introduction to SPSS
*Data Analysis Using SAS

--
Kat Snyder
Program Coordinator
Summer Statistics Institute
Division of Statistics + Scientific Computation
College of Natural Sciences
The University of Texas at Austin
ksnyder@utexas.edu

----------------------------------------------
Michael J. Mahometa, Ph.D.
Manager, Consulting Services
Division of Statistics and Scientific Computation
College of Natural Sciences - G2500
University of Texas at Austin
512.471.4542

------------------------------

Date: Thu, 28 Mar 2013 20:26:13 +0000
From: Fatma Romeh <fromeh1@student.gsu.edu>
Subject: st: foreach command with multiple varlists

Dear statalist,

I have a question regarding the "foreach" command.  I'm wondering whether I can use multiple varlists with "foreach"? More specifically, I'm looking for something like this:

foreach x of varlist  x1-x4  &  y of varlist y1-y4   &   z of varlist z1-z4 {

gen  w= `x' if `y'==2 & `z'==1
}

When I did that, I got this error message: "& invalid name". What I want to do is to use the "foreach" command to replace these commands:
gen  w= x1 if y1==2 & z1==1
gen  w= x2 if y2==2 & z2==1
gen  w= x3 if y3==2 & z3==1
gen  w= x4 if y4==2 & z4==1

Is there a way to do that?

Many thanks,

Fatma,

------------------------------

Date: Thu, 28 Mar 2013 14:33:09 -0600
From: Steve Nakoneshny <scnakone@ucalgary.ca>
Subject: Re: st: foreach command with multiple varlists

Fatma,

One initial problem I see is that you can only -gen- a var once. Every subsequent manipulation of that var will have to be done using -replace-. I suspect that you might be able to get away with something like the following (untested):

g w=.
forval i = 1/4 {
replace w=x`i' if y`i'==2 & z`i'==1
}

Steve

------------------------------

Date: Thu, 28 Mar 2013 13:42:19 -0700
From: "Sarah Edgington" <sedging@ucla.edu>
Subject: RE: st: foreach command with multiple varlists

Fatma,
Steve's solution will work if you really have variables with names of the
form x1 x2 x3 x4.  If your variable names don't have numbers in them then
you probably want a single loop using extended macro functions to reference
multiple lists containing the actual variable names.

For example:

gen w=.

forvalues i=1/4 {
local x : word `i' of x_one x_two x_three x_four
local y : word `i' of y_one y_two y_three y_four
local z : word `i' of z_one z_two z_three z_four

replace w=`x' if `y'==2 & `z'==1
}

Note that, as with Steve's example, you want to -replace w- rather than -gen
w- in the loop because otherwise you'll get an error message that w already
exists in your second iteration.
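
* Another minimal sketch of the same idea: loop over one list with -foreach-
* and peel the matching name off the other two lists with -gettoken-.
* The variable names reuse the illustrative ones above; the locals xs, ys, zs
* and the result variable w2 are hypothetical names chosen for this sketch.
local xs x_one x_two x_three x_four
local ys y_one y_two y_three y_four
local zs z_one z_two z_three z_four

gen w2 = .
foreach x of local xs {
    * take the first remaining name from each parallel list
    gettoken y ys : ys
    gettoken z zs : zs
    replace w2 = `x' if `y'==2 & `z'==1
}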

-Sarah

------------------------------

Date: Thu, 28 Mar 2013 17:43:43 -0400
From: SAM MCCAW <sam2stata@gmail.com>
Subject: Re: st: Using natural logs on RHS of maximum likelihood models

Dear David,

I am using a logistic regression model where the dependent variable is
a (0,1) dichotomous variable which stands for whether a firm
undertakes innovation or not. The predictor variables include a number
of firm level and industry level characteristics, including firm R&D
share, industry presence of foreign firms, number of competitors in
the industry and others.

When I am working with a non-dichotomous variable on the LHS I prefer
using natural logs as I am interested in estimating elasticities.
Since I am not very experienced with logistic models I was wondering
if there was a rule of thumb in using transformations.

Please let me know if I can provide any further information. I would

Thanks a lot!

SAM

On Tue, Mar 26, 2013 at 10:18 PM, David Hoaglin <dchoaglin@gmail.com> wrote:
> Dear Sam,
>
> Before anyone can make a constructive suggestion, you need to share
> more information on the details of your model.  Maximum likelihood is
> a method of estimating the parameters in a model.  It applies to a
> very wide range of models, some of which have a dichotomous (0, 1)
> outcome variable.  Is your model a logistic regression (logit) model,
> a probit model, or another type entirely?  Please be specific.
>
> Your question about the predictor variables does not have any single,
> simple answer.  The aim is usually to express each predictor variable
> in a form that appropriately captures its relation to the outcome
> variable (after adjusting for the contributions of the other predictor
> variables).  Generically, we could write something like
>
> g(y) = b0 + b1*f1(x1) + b2*f2(x2) + (more predictors)
>
> The functions g, f1, f2, etc. may differ as needed, and common choices
> include "leave it alone," take the logarithm, take the square root,
> and "square it."  Part of your challenge in analyzing data is to make
> appropriate choices of such functions.  For some classes of models,
> people have developed a variety of diagnostic tools that help this
> process.  Once you have explained more about your model and the
> context of your data, I or someone else reading this list may be able
> to recommend a book that discusses this and other steps in the
> model-building process and shows how they work on actual sets of data.
>
> I hope these comments are helpful.
>
> Regards,
>
> David Hoaglin
>
> On Tue, Mar 26, 2013 at 5:19 PM, SAM MCCAW <sam2stata@gmail.com> wrote:
>> Hello All,
>>
>> I am running a maximum likelihood model with a (0,1) categorical
>> dependent variable.
>>
>> On the right hand side, is it better to use natural logs of
>> non-categorical variables or to leave them as they are, as real numbers?
>>
>> Thanks a bunch.
>>
>> SAM

------------------------------

Date: Thu, 28 Mar 2013 18:31:58 -0400
From: Steve Samuels <sjsamuels@gmail.com>
Subject: Re: st: ST: overdisperson with binary outcomes and survey data

Please describe the sampling process, and the analysis you've planned. You've omitted any mention of multiple stages and clusters (PSUs) from your description. Are they absent from the survey design?

------------------------------

Date: Thu, 28 Mar 2013 18:11:55 -0500
From: Phil Schumm <pschumm@uchicago.edu>
Subject: Re: st: ST: overdisperson with binary outcomes and survey data

On Mar 28, 2013, at 11:40 AM, Ronald McDowell wrote:
> I wish to test for overdispersion in my data (binary outcome), which is taken from a survey with weights and stratification, hence the svy: prefix. I've tried using the scale(x2) option as I would usually, but this isn't available with the svy prefix. Does anyone know of ways round this or alternatives?

If your response is truly binary, then you cannot have overdispersion (i.e., in the case of the Bernoulli distribution, the mean necessarily determines the variance), so the -scale()- option would not be appropriate anyway.  If the response is a proportion, or if there is clustering in your data not attributable to the sample design, then that is another matter.  If so, you'll need to provide more information in order to get help on how to proceed.
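
* A minimal sketch illustrating the point above on simulated data (the seed,
* sample size and success probability are illustrative assumptions): for a
* 0/1 variable the sample variance is fully determined by the sample mean,
* so there is no separate dispersion left to estimate.
* Note that -clear- discards the data in memory.
clear
set seed 2013
set obs 1000
gen byte y = runiform() < .3
quietly summarize y
* the two displayed numbers agree: Var = mean*(1 - mean)*N/(N - 1)
display r(Var) "  vs  " r(mean)*(1 - r(mean))*r(N)/(r(N) - 1)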

-- Phil

------------------------------

Date: Thu, 28 Mar 2013 19:17:55 -0700 (PDT)
From: Zara Li <zara.25@gmail.com>
Subject: st: Probit with Endogenous Ordinal Regressors

Hi Statalist users!

I am trying to run a Probit model which has two endogenous ordinal
independent variables (they take a value of 0, 1, 2, 3 or 4). Since
-ivprobit- cannot be used for discrete endogenous independent variables
(both under MLE or two-step), how can I run an instrumental variable
regression in this case, and carry out the usual tests for the validity of
my instruments post-estimation? My regression is not overidentified, so
-overid- is not what I want to use. Of course, I could run the first-stage
regressions using -oprobit- for my two endogenous variables, and -probit- in
the second stage, but then what would be the best way to test (and show) the
validity of my instruments?

Thank you very much!

------------------------------

End of statalist-digest V4 #4835
********************************

```