Statalist


RE: st: RE: Constrained Lowess


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: Constrained Lowess
Date   Fri, 2 May 2008 18:26:49 +0100

Good. Your variable isn't binary, but a proportion, and you are treating
age as numerical, so your problem is fit for -lowess- after all. (I
prefer restricted cubic splines, but a transformation is probably useful
there too.) 

Nick
n.j.cox@durham.ac.uk 

Sergiy Radyakin

Hello Nick,

I am sorry for being imprecise. Indeed, I smooth the rates (e.g. of
unemployment) by groups defined by age (which is truncated to
integers, and thus I consider it categorical).

So I start with a table like the following:

Age     Unemployment rate
10         0.01
11         0.02
...
99         0.01

Here unemployment rate is naturally between 0 and 1. It is the average
of the 0/1-responses within the group, defined by age.

If I just run -lowess-, it produces a picture similar to the one from this example:
sysuse auto
generate z=1/headroom^16
lowess z mpg
Note that the tails go below zero, and this is what I am trying to
avoid.

Your advice to apply a logit transformation before smoothing and to
transform back afterwards worked.
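
For readers of the archive, a minimal sketch of the transform, smooth, back-transform idea, reusing the auto example above. This is one possible way to do it, not necessarily the exact commands used in this thread; the names logit_z, sm_logit and sm_z are made up for illustration. It assumes the proportions lie strictly inside (0,1), since logit() returns missing at exactly 0 or 1.

```stata
sysuse auto, clear
generate z = 1/headroom^16

* Move to the logit scale, where the smooth is unconstrained.
generate logit_z = logit(z)

* Smooth on the logit scale and keep the smoothed values.
lowess logit_z mpg, generate(sm_logit) nograph

* Map the smooth back to the original scale; invlogit() stays within [0,1].
generate sm_z = invlogit(sm_logit)

summarize sm_z
twoway (scatter z mpg) (line sm_z mpg, sort)
```

Because the back-transformation is monotone and bounded, the tails of the plotted curve cannot drop below zero, whatever the smooth does on the logit scale.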


On 5/2/08, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> Quite how to get useful results from smoothing a binary response is not
> clear to me.
>
> If the data were proportions on (0,1) or even [0,1] I would suggest
> some kind of transformation approach. -lowess, logit- is presumably
> intended to help.
> Otherwise consider something like an angular or folded root
> transformation, applying -lowess- and then transforming back.
>
> But for binary data any transformation just maps two distinct values to
> two other distinct values and so cannot help, so far as I can see.
>
> In the case of unemployment data, presumably you are dealing with
> individuals? If they are aggregate data for lots of individuals I would
> collapse by age to get proportion of unemployed, and then smooth if
> necessary. It sounds as if you want something quite different, however.
> Also, as you regard -age- as categorical I probably don't understand
> what you are trying to do.
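
A rough sketch of the angular (arcsine square root) route mentioned above, again on the auto example from this thread. The variable names ang_z, sm_ang and sm_z2 are hypothetical. The back-transform sin()^2 always lands in [0,1], but it is not monotone outside [0, pi/2], so if the smooth wanders below 0 or above pi/2 on the angular scale the curve folds back rather than flattening; treat this as an illustration under that caveat.

```stata
sysuse auto, clear
generate z = 1/headroom^16

* Angular transformation of a proportion in [0,1].
generate ang_z = asin(sqrt(z))

* Smooth on the angular scale and keep the smoothed values.
lowess ang_z mpg, generate(sm_ang) nograph

* Transform back; sin()^2 cannot leave [0,1].
generate sm_z2 = sin(sm_ang)^2

summarize sm_z2
```

Unlike the logit, this transformation is defined at proportions of exactly 0 and 1.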

 Sergiy Radyakin

> I am plotting a smoothed graph (-lowess-) of a binary variable (e.g.
> unemployed) by a categorical variable (e.g. age). However, the smoothed
> values are not necessarily in the [0,1] range, where unemployment must
> be by definition. I can save the smoothed values into a new variable
> with the option -generate(newvar)- and then truncate the negatives and
> values larger than one, but I believe the smooth would look different
> if I could tell -lowess- to look for such a constrained value in the
> first place. As far as I can tell from the description of -lowess-, it
> has no such feature. Is there any user-written command or simple
> algorithm for this purpose?
