
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Re: Marginsplot on backtransformed data


From   Richard Williams <richardwilliams.ndu@gmail.com>
To   statalist@hsphsun2.harvard.edu, statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: Marginsplot on backtransformed data
Date   Thu, 19 Dec 2013 21:47:58 -0500

At 09:10 PM 12/19/2013, Daniel Herbert Opi wrote:
Thanks everyone,

Since my transformation was on the dependent variable, I've ended up
using the -expression()- option in -margins- as suggested by Scott
Merryman and marginsplot gives the desired plots with the dependent
variable now transformed.

An additional question, if I am to then add a plot on the same graph,
specifically a scatter plot, what variable name would I need to call
the dependent variable that has now been backtransformed in the
margins command?

So in Richard Williams's initial example, the scatter plot on the data
before backtransforming would be something like:

marginsplot, name(regress, replace) addplot(scatter sqweight race)
(graph doesn't look as pretty but I can sort that out in my case)

This does sound like an ugly graph! Three or four straight lines with a bunch of points on them? But in any event, if you want to backtransform why would you use sqweight instead of weight?
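
A minimal sketch of what the overlay might look like on the original scale, assuming the -regress- fit on the square root of weight from the nhanes2f example in this thread together with Scott Merryman's -expression()- suggestion. The graph name regress2 is taken from his post; nothing here has been tuned for appearance.

```stata
* Sketch only: assumes the -regress- fit on the square-root scale, as in
* the nhanes2f example elsewhere in this thread.
webuse nhanes2f, clear
generate sqweight = sqrt(weight)
regress sqweight i.race

* Back-transform the linear prediction to the original weight scale.
margins race, expression(predict(xb)^2)

* Overlay the raw data on the same (original) scale: weight, not sqweight.
marginsplot, name(regress2, replace) addplot(scatter weight race)
```

Because the margins are now in units of weight, the scatter uses weight itself, so no new variable name is needed for the backtransformed outcome.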



From the explanation by Nick Cox on -glm- versus -regress- I think I
am right in assuming that I would rather first square transform the
data then run the regress.

Perhaps I misunderstand him, but I got the impression Nick preferred the glm approach. Of course, if it makes little difference in the results, you may want to go with whatever approach makes it easiest to generate the graph you want.


Herbert



From  Nick Cox <njcoxstata@gmail.com>
To  "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject  Re: st: Marginsplot on backtransformed data
Date  Thu, 19 Dec 2013 17:03:29 +0000
________________________________

The pros and cons of generalized linear models (e.g. as implemented in
-glm-) versus other approaches are a rather large subject, but one key
point is that in -glm- it is the predicted mean which is fitted on a
transformed scale, which is not the exact equivalent of transforming
the data.

You need only realise that

* logit(0) and logit(1) are undefined whereas logit(mean) is perfectly
well defined for mean in (0,1)

or

* log(0) is also undefined, but Poisson models can accommodate observed zeros

to see the advantage. Another key point is that functional form and
distribution family are separate choices.
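
One way to make the first point concrete for the square-root case discussed in this thread. The symbols x, \beta and \gamma are introduced here for illustration only and do not appear in the original posts; the two coefficient vectors are written differently because the two models are not the same.

```latex
\begin{align*}
\text{-regress- on } \sqrt{y}: \quad
  & E\big[\sqrt{Y} \mid x\big] = x'\beta \\
\text{-glm- with link(power .5)}: \quad
  & \sqrt{E[Y \mid x]} = x'\gamma \\
\text{since } \operatorname{Var}\big(\sqrt{Y} \mid x\big) \ge 0: \quad
  & E[Y \mid x] = \big(E[\sqrt{Y} \mid x]\big)^2
    + \operatorname{Var}\big(\sqrt{Y} \mid x\big) \;\ge\; (x'\beta)^2
\end{align*}
```

So squaring the linear prediction from the transformed-data regression falls short of the conditional mean of the untransformed response by the conditional variance of its square root, whereas the link-function model targets that mean directly.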

A very good expository paper, often cited on this list by Scott
Merryman and myself, is

Lane, P.W. 2002. Generalized linear models in soil science.
European Journal of Soil Science 53: 241-251.

Anyone going "soil science???" should know that the examples are not
difficult and seize this opportunity to impress your colleagues with
your eclectic erudition.

Abstract: Classical linear models are easy to understand and fit.
However, when assumptions are not met, violence should not be used on
the data to force them into the linear mould. Transformation of
variables may allow successful linear modeling, but it affects several
aspects of the model simultaneously. In particular, it can interfere
with the scientific interpretation of the model. Generalized linear
models are a wider class, and they retain the concept of additive
explanatory effects. They provide generalizations of the
distributional assumptions of the response variable, while at the same
time allowing a transformed scale on which the explanatory effects
combine. These models can be fitted reliably with standard software,
and the analysis is readily interpreted in an analogous way to that of
linear models. Many further generalizations to the generalized linear
model have been proposed, extending them to deal with smooth effects,
non-linear parameters, and extra components
of variation. Though the extra complexity of generalized linear
models gives rise to some additional difficulties in analysis, these
difficulties are outweighed by the flexibility of the models and ease
of interpretation. The generalizations allow the intuitively more
appealing approach to analysis of adjusting the model rather than
adjusting the data.

Nick
njcoxstata@gmail.com


On 19 December 2013 16:13, Richard Williams
<richardwilliams.ndu@gmail.com> wrote:

> At 10:50 AM 12/19/2013, Scott Merryman wrote:
>>
>> One could also use the -expression()- option in -margins-
>>
>>  margins race, expression(predict(xb)^2)
>>  marginsplot, name(regress2,replace)
>
>
> Good point. I've used the expression option to do things like multiply
> numbers by 100 so you get 37.3 instead of .373.
>
> That still leaves open the question of whether you should use regress
> (computing the square root of the dv yourself) or use glm (using the power
> link.) In my example it doesn't make too much difference. In general is it
> better to use glm or are there pros and cons of each approach?
>
>
>> Scott
>>
>>
>> On Thu, Dec 19, 2013 at 9:30 AM, Richard Williams
>> <richardwilliams.ndu@gmail.com> wrote:
>> > Patrick Royston's -marginscontplot- (available from SSC) can be used when
>> > you've done a log or other transformation of an independent variable. See
>> > the help file example entitled "Example using a log-transformed
>> > covariate".
>> >
>> > For a dependent variable, I think you can use the glm command, at least
>> > some of the time. You should get a 2nd opinion on this, e.g. Austin Nichols
>> > is much better with these sorts of things than I am. When the dependent
>> > variable has been transformed I believe it is often better to use glm
>> > anyway. In the following you don't get exactly the same results from
>> > regress and glm but I don't think you are supposed to (and the results are
>> > similar).
>> >
>> > webuse nhanes2f, clear
>> > gen sqweight = weight ^.5
>> > reg sqweight i.race
>> > margins race
>> > marginsplot, name(regress)
>> > glm weight i.race, link(power .5)
>> > margins race
>> > marginsplot, name(glm)
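
To see why the two sets of results differ, it may help to put both on the original weight scale next to the raw group means. The following is a sketch under the assumption that race is the only predictor, as in the example above; the -tabstat- line is an addition for reference and was not part of the original post.

```stata
webuse nhanes2f, clear
generate sqweight = sqrt(weight)

* Raw group means of weight, for reference.
tabstat weight, by(race) statistics(mean)

* -regress- on the square root, squared back: (mean of sqrt(weight))^2.
regress sqweight i.race
margins race, expression(predict(xb)^2)

* -glm- with a power link: -margins- defaults to the predicted mean of weight.
glm weight i.race, link(power .5)
margins race
```

With a single factor as predictor, the -glm- margins should track the raw group means of weight, while the squared -regress- predictions should sit slightly below them, since squaring a mean of square roots cannot exceed the mean itself.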

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
WWW:    http://www.nd.edu/~rwilliam
