
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: RE: Constrained Lowess
Date: Fri, 2 May 2008 18:26:49 +0100

Good. Your variable isn't binary, but a proportion, and you are treating age as numerical, so your problem is fit for -lowess- after all. (I prefer restricted cubic splines, but a transformation is probably useful there too.)

Nick
n.j.cox@durham.ac.uk

Sergiy Radyakin

Hello Nick,

I am sorry for being imprecise. Indeed, I smooth the rates (of e.g. unemployment) by groups defined by age (which is truncated to integers, and thus I consider it categorical). So I start with a table like the following:

Age    Unemployment rate
10     0.01
11     0.02
...
99     0.01

Here the unemployment rate is naturally between 0 and 1. It is the average of the 0/1 responses within the group defined by age. If I just run lowess, it produces a picture similar to the one here:

    sysuse auto
    generate z=1/headroom^16
    lowess z mpg

Note that the tails go below zero, and this is what I am trying to avoid. Your advice of a logit transformation before/after smoothing worked.

On 5/2/08, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> Quite how to get useful results from smoothing a binary response is not
> clear to me.
>
> If the data were proportions on (0,1) or even [0,1] I would suggest
> some kind of transformation approach. -lowess, logit- is presumably
> intended to help.
> Otherwise consider something like an angular or folded root
> transformation, applying -lowess- and then transforming back.
>
> But for binary data any transformation just maps two distinct values to
> two other distinct values and so cannot help, so far as I can see.
>
> In the case of unemployment data, presumably you are dealing with
> individuals? If they are aggregate data for lots of individuals I would
> collapse by age to get proportion of unemployed, and then smooth if
> necessary. It sounds as if you want something quite different, however.
> Also, as you regard -age- as categorical I probably don't understand
> what you are trying to do.

Sergiy Radyakin

> I am plotting a smoothed graph (-lowess-) of a binary variable (e.g.
> unemployed) by a categorical variable (e.g. age). However, the smoothed
> values are not necessarily in the [0;1] range, where unemployment must be
> by definition. I can save the smoothed values into a new variable with
> the option -generate(newvar)- and then truncate the negatives and
> values larger than one, but I believe the smooth would look different
> if I could tell -lowess- to look for such a constrained value in the
> first place. As follows from the description of -lowess-, it doesn't
> have such a feature. Is there any user-written command or simple
> algorithm for this purpose?

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
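
The thread reports that a logit transformation before and after smoothing worked, but does not show the commands. One way this might look, using the toy variable from the example in the thread, is sketched below. The variable names logit_z, sm_logit and sm_z are made up for illustration, and this is an assumption about the approach, not the code the posters actually ran.

```stata
* Sketch only: smooth on the logit scale, then transform back to (0,1).
* Variable names logit_z, sm_logit and sm_z are hypothetical.
sysuse auto, clear
generate double z = 1/headroom^16           // toy response from the thread, strictly inside (0,1)

generate double logit_z = logit(z)          // ln(z/(1-z)); missing if z is exactly 0 or 1
lowess logit_z mpg, generate(sm_logit) nograph
generate double sm_z = invlogit(sm_logit)   // back-transformed smooth, cannot leave (0,1)

summarize sm_z
twoway (scatter z mpg) (line sm_z mpg, sort)
```

Because invlogit() maps any real number into (0,1), the back-transformed smooth cannot dip below zero or rise above one. Proportions that are exactly 0 or 1 have no finite logit and would need separate handling, for example one of the other transformations mentioned in the thread.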

**References**:

- **st: Constrained Lowess** *From:* "Sergiy Radyakin" <serjradyakin@gmail.com>
- **st: RE: Constrained Lowess** *From:* "Nick Cox" <n.j.cox@durham.ac.uk>
- **Re: st: RE: Constrained Lowess** *From:* "Sergiy Radyakin" <serjradyakin@gmail.com>

