Re: AW: st: AW: Fitted probabilities using prvalue for logit model

 From Steven Samuels
Subject Re: AW: st: AW: Fitted probabilities using prvalue for logit model
Date Mon, 19 Jul 2010 10:20:50 -0400

A more accessible reference for pseudo-R-squareds (a branch of proportional-reduction-in-error measures), which can also be used to define "partial" R-squareds, is: http://www.ats.ucla.edu/stat/mult_pkg/faq/general/Psuedo_RSquareds.htm

Steve

On Jul 19, 2010, at 8:53 AM, Marc Michelsen wrote:

Many thanks for the various alternatives mentioned by Steve and Maarten. I
will try to figure out which one is well suited for my kind of analysis.

Marc Michelsen wrote:
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Steve Samuels
Sent: Friday, 16 July 2010 17:02
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: AW: Fitted probabilities using prvalue for logit model

I don't know what you mean by "determine the relative importance of my
additional dummy variables relative to the benchmark model with its
explanatory variables?"  But in this case -prvalue- is obviously not
working for you.

How are you measuring "importance"?

If you mean "significance", have you tested the joint significance of
the two variables with -test-? (Adding variables will always increase
the log-likelihood, so "improvement" is not a guide).   If the
criterion of importance is "predictive accuracy", then compare ROC
curves for the two models with -roccomp-.  Unfortunately, the ROCs for
both models will be systematically optimistic, but the differences
could still be revealing. For better accuracy, some kind of
cross-validation approach is needed.
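
As a rough illustration of what an ROC comparison measures: the area
under the ROC curve is the share of (event, non-event) pairs in which
the event receives the higher predicted probability. The Python sketch
below computes that area for two sets of fitted probabilities. The
outcomes and probabilities are made-up illustrative numbers, not results
from this thread, and the function name `auc` is hypothetical. It does
not reproduce the formal test that -roccomp- performs, and because it
would score the same observations a model was fitted on, it shares the
optimism noted above.

```python
def auc(outcomes, scores):
    """Area under the ROC curve, computed pair by pair.

    Counts the share of (event, non-event) pairs in which the event has
    the higher score; tied pairs count one half.
    """
    events = [s for y, s in zip(outcomes, scores) if y == 1]
    nonevents = [s for y, s in zip(outcomes, scores) if y == 0]
    wins = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(events) * len(nonevents))


# Hypothetical data: 1 = event occurred, 0 = no event.
y = [0, 0, 0, 0, 1, 0, 1, 1]
# Hypothetical fitted probabilities from a smaller and a larger model.
p_base = [0.02, 0.04, 0.05, 0.06, 0.05, 0.08, 0.07, 0.10]
p_full = [0.01, 0.03, 0.04, 0.05, 0.07, 0.08, 0.09, 0.12]

auc_base = auc(y, p_base)
auc_full = auc(y, p_full)
print(f"AUC, smaller model: {auc_base:.3f}")
print(f"AUC, larger model:  {auc_full:.3f}")
```

In this toy example the larger model ranks events above non-events more
often, so its area is larger. A difference like this says nothing about
out-of-sample accuracy unless the probabilities come from held-out data.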

For cross-validation approaches, see:
http://www.stata.com/statalist/archive/2008-02/msg00686.html
An unreferenced Stata program for cross-validation is contained in:
http://www.mail-archive.com/r-help@r-project.org/msg82508.html

There is also a literature on "proportional reduction in error"
approaches, including partial r-squares.  See: Agresti, Categorical
Data Analysis, 2nd Ed. (2002), Wiley, Chapter 6.  Measures of
r-square based on the log-likelihood are difficult to interpret (p.
227). A Google search will turn up many references.

(By the way, -prvalue- is not an official Stata command.  I presume it
is user-written. Please, as the FAQ requests, give references for all
the non-Stata commands you use.)

Steve

On Fri, Jul 16, 2010 at 9:54 AM, Marc Michelsen
<marcmichelsen@t-online.de> wrote:

Steve,

of course there are four possible combinations -- however, in my set-up
there are only three valid combinations. 1/1 is not possible.

Does your statement mean that -prvalue- is not an appropriate measure to
determine the relative importance of my additional dummy variables
relative to the benchmark model with its explanatory variables?

Marc

Marc Michelsen wrote:
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Steve Samuels
Sent: Friday, 16 July 2010 15:18
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: AW: Fitted probabilities using prvalue for logit model

There are four combinations of two dummy variables, not three, so your
statements don't make sense. The coefficients of the variables, if you
hold others constant, refer only to the relative associations among
those four categories, not to any absolute levels.  Those are
determined by the values at which you fix the other covariates and by
the constant term. It is well known that prediction at the means of
covariates will not even reproduce the mean prediction, which in turn
is the raw prevalence.  It is quite possible that all four
predictions could be lower than the crude prevalence rate. So, there's
no reason to expect those predictions to match those of any
"benchmark" model and a (single?) benchmark probability.

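The point about prediction at the means can be checked with a few lines
of arithmetic. The Python sketch below uses a made-up constant, slope,
and covariate values (none of them come from the poster's model) and
compares the logit prediction at the mean of the covariate with the mean
of the individual predictions. Because the coefficients are invented
rather than estimated by maximum likelihood, the mean prediction here is
not tied to any sample prevalence. The sketch only shows that the two
quantities differ because the logistic function is nonlinear.

```python
import math


def invlogit(z):
    """Logistic function: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical logit coefficients (illustrative values, not estimates).
b0, b1 = -3.0, 1.0
# Hypothetical covariate values for five observations.
x = [-2.0, -1.0, 0.0, 1.0, 7.0]

mean_x = sum(x) / len(x)

# Prediction evaluated at the mean of the covariate.
p_at_mean = invlogit(b0 + b1 * mean_x)
# Mean of the predictions evaluated at each observation.
mean_p = sum(invlogit(b0 + b1 * xi) for xi in x) / len(x)

print(f"prediction at the mean of x: {p_at_mean:.3f}")
print(f"mean of the predictions:     {mean_p:.3f}")
```

With these numbers the prediction at the mean is well below the mean of
the predictions, so a probability computed at covariate means need not
resemble the average event rate.
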
Steve

On Fri, Jul 16, 2010 at 6:02 AM, Marc Michelsen
<marcmichelsen@t-online.de> wrote:

Dear all,

as I didn't get an answer to my problem below, I am trying to rewrite
the question more precisely/generally. The reference for the approach is
the following: DeAngelo, H., L. DeAngelo, and R. M. Stulz. "Seasoned
equity offerings, market timing, and the corporate lifecycle." Journal
of Financial Economics 95 (2009), 275-295. I am referring to the table
on page 284.

I am estimating the fitted probabilities of a logit model at fixed
levels of the explanatory variables using -prvalue-. I have a benchmark
model and therefore also a benchmark probability of the event. Including
my two dummy variables in a second model specification (which improves
Pseudo-R2 and Chi2) actually lowers the probability of the event.
However, the probability should increase if the dummy variables are
coded 0 (dummy 1)/1 (dummy 2). The probabilities are lower in all three
possible combinations of the two dummies, although the coefficients of
the logit model show the correct signs and are statistically significant
for one of the dummy variables.

Does anybody have a view on this?

Many thanks for considering this posting

Marc

Marc Michelsen wrote:
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Marc Michelsen
Sent: Thursday, 15 July 2010 11:12
To: statalist@hsphsun2.harvard.edu
Subject: st: Fitted probabilities using prvalue for logit model

Dear Statalist users,

I am running a logit model to estimate the effect and relative
importance of market timing and rating concerns on the decision to
conduct a seasoned equity offering (panel data).

Including my rating concern proxy variables in the regressions improves
the fit of the logit model (Pseudo-R2 and Chi2) compared to the standard
model (including only market timing and control variables). One of the
two rating concern proxies (positive rating momentum) is statistically
significant at 5% with a marginal effect of -1.7%. The other one
(negative rating momentum) shows a positive marginal effect but has no
significant influence.

In order to gauge the relative importance of market timing versus rating
concerns, I am trying to obtain predicted probabilities of conducting a
seasoned equity offering (SEO) in a given year. Therefore, I am using
the "prvalue" command to calculate the probabilities at representative
values of the explanatory variables (control variables at sample means,
good vs. poor market timing opportunities). Neutral market timing
opportunities translate into an SEO probability of 5.2%, which is
comparable to the study by DeAngelo/DeAngelo/Stulz (2009) p. 284. But if
I measure the probabilities for positive, negative and neutral rating
momentum (the other explanatory variables are set equal to the former
model specification), the probabilities are always lower compared to the
benchmark model (3.8% / 5.0% / 4.9%). While it is reasonable to assume
that positive rating momentum lowers the SEO probability, the results
for the two other rating variables are surprising.

Obviously, this weakens my hypothesis that rating concerns are one of the
drivers of seasoned equity offerings.

Does anybody have an idea why the fitted probabilities are lower in all
three cases although the model fit is improved if I include the
respective explanatory variables?

Many thanks
Marc



--
Steven Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax:    206-202-4783


