
From: Reza C Daniels <rdaniels@commerce.uct.ac.za>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Generating skewed distributions on closed intervals
Date: Thu, 29 Sep 2005 13:28:39 +0200

Hi Maarten,

My problem is exactly one of data coarsening, as explained by Heitjan and Rubin (JASA, 1991). The difference is that they applied it to heights, whereas I want to apply it to age.

I am also aware of the need to multiply impute. However, I wanted the uniform, normal and skewed distributions first before imputing, so that once I obtained the multiply imputed estimates, I would have something to compare them to.

Reza

Maarten Buis wrote:

Hi Reza,

Will you be using this new age variable as a dependent/explained/y-variable or as an independent/explanatory/x-variable?

If you are using age as an explained variable you will probably end up in survival analysis, and they have good techniques of dealing with discrete time, so I see no need to invent something new there. See: "An Introduction to Survival Analysis Using Stata" by Mario Cleves, William W. Gould, and Roberto Gutierrez, available from Stata Press.

If you will be using age as an explanatory variable, then it is good to know that even very coarsely categorized variables often produce good estimates. If you still want to do something about the categorisation, then you would probably want to do some form of multiple imputation. The way to think about it is that there is one age distribution, which was chopped up into bits. You don't want to use different distributions for each age band, since then you would assume a very bumpy overall age distribution. So you would first estimate the parameters of this age distribution. Then, if you wanted to draw an age for a person in category 20-30, you would draw a value from this distribution truncated between 20 and 30. You would create multiple datasets this way, estimate the regression or whatever other parameter of interest for each of these datasets, and the mean of these effects would be your estimate controlling for the categorisation of age. However, I repeat that this is probably more trouble than it's worth.

I'd like to be sure that this is what you want before I spend an afternoon writing Stata code for you.

Maarten
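The procedure Maarten describes (one underlying age distribution, truncated draws within each person's band, several completed datasets, averaged estimates) could look roughly like the sketch below. It is only an illustration, not code from the thread: the data are simulated, the log-normal parameters in the scalars mu and sigma are assumed known rather than estimated, the variable names (true_age, lower, upper, age_imp) and the choice of 20 imputations are hypothetical, and it uses runiform(), rnormal(), normal() and invnormal() from Stata releases later than the one current in 2005.

```stata
clear
set seed 2005
set obs 500

* Simulated example data; every number here is an illustrative assumption
generate double true_age = exp(rnormal(3.4, 0.4))
generate double y = 1 + 0.05 * true_age + rnormal()

* Coarsen age into 10-year bands [lower, upper)
generate double lower = 10 * floor(true_age / 10)
generate double upper = lower + 10

* Parameters of the single underlying log-normal age distribution.
* Assumed known here; in practice they would be estimated from the bands.
scalar mu = 3.4
scalar sigma = 0.4

* CDF of ln(age) at each band limit (a lower limit of 0 has probability 0)
generate double p_lo = cond(lower > 0, normal((ln(lower) - mu) / sigma), 0)
generate double p_hi = normal((ln(upper) - mu) / sigma)

* Draw M completed datasets and re-estimate the regression on each
local M = 20
scalar bsum = 0
forvalues m = 1/`M' {
    capture drop age_imp
    * Inverse-CDF draw from the distribution truncated to the person's band
    generate double age_imp = exp(mu + sigma * invnormal(p_lo + runiform() * (p_hi - p_lo)))
    quietly regress y age_imp
    scalar bsum = bsum + _b[age_imp]
}

display "Mean coefficient on age over `M' imputations: " bsum / `M'
```

Only the point estimate is averaged here, as in the description above; combining the standard errors across imputations is a separate step that this sketch leaves out.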

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Reza C Daniels
Sent: Thursday 29 September 2005 12:34
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Generating skewed distributions on closed intervals

Hi Maarten,

I tried this in the following way:

set obs 100
-gen z1=invnorm(uniform())- where z>0
-gen z2=ln(z1)- for positively skewed
-gen z3=exp(z1)- for negatively skewed

As I'm sure you know, this gives me the correct shape of the distributions I'm looking for, but the incorrect range.
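One way to control both the shape and the range is to draw on the unit interval and rescale to the band. The sketch below is a hedged illustration rather than anything proposed in the thread: the band limits 20 and 30, the normal parameters 25 and 3, and the Beta shape parameters are arbitrary assumptions, and runiform(), rbeta(), normal() and invnormal() come from Stata releases later than the one current in 2005.

```stata
clear
set seed 20050929
set obs 1000

* Illustrative band limits (assumed, not taken from the thread)
local lo = 20
local hi = 30

* Uniform on the band
generate double age_unif = `lo' + (`hi' - `lo') * runiform()

* Normal truncated to the band, drawn by inverting the CDF
* (mean 25 and standard deviation 3 are arbitrary choices)
local mu = 25
local sd = 3
generate double p_lo = normal((`lo' - `mu') / `sd')
generate double p_hi = normal((`hi' - `mu') / `sd')
generate double age_norm = `mu' + `sd' * invnormal(p_lo + runiform() * (p_hi - p_lo))

* Positively skewed: Beta(2,5) has its long tail to the right
generate double age_pos = `lo' + (`hi' - `lo') * rbeta(2, 5)

* Negatively skewed: Beta(5,2) has its long tail to the left
generate double age_neg = `lo' + (`hi' - `lo') * rbeta(5, 2)

summarize age_unif age_norm age_pos age_neg, detail
```

Because each variable is a rescaled draw from the unit interval (or an inverse-CDF draw restricted to the band), every value stays between 20 and 30 while the skewness reported by -summarize, detail- differs across the four variables.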

So, I still can't solve it.

Thanks anyway, Reza

Maarten Buis wrote:

It reminds me of an ordered probit problem: you have one unobserved distribution, which is being carved up. Only now you also have information about where the cuts are made. This should be solvable.

You might want to look at the log normal instead of the normal though, since no one can get, or has ever been, -2 (even with plastic surgery).
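When the cut points are known, interval regression is one way to estimate such a latent distribution. The sketch below is an assumption-laden illustration, not Maarten's code: it simulates log-normal ages, bands them in 10-year intervals, and fits a constant-only -intreg- to the logged band limits, so that the constant estimates the mean and e(sigma) the standard deviation of ln(age). The variable names and parameter values are hypothetical.

```stata
clear
set seed 12345
set obs 2000

* Simulated "true" ages, used only to create banded data for the example
generate double age = exp(rnormal(3.4, 0.5))

* Known cut points: 10-year bands [lower, upper)
generate double lower = 10 * floor(age / 10)
generate double upper = lower + 10

* Work on the log scale. ln(0) is missing, which -intreg- reads as
* left-censoring, i.e. ln(age) lies somewhere below ln(upper).
generate double ln_lo = ln(lower)
generate double ln_hi = ln(upper)

* Constant-only interval regression for the latent normal of ln(age)
intreg ln_lo ln_hi

display "Estimated mean of ln(age): " _b[_cons]
display "Estimated sd of ln(age):   " e(sigma)
```

With the simulated values above, the two displayed estimates should land near 3.4 and 0.5, which is a quick check that the bands alone carry enough information to recover the parameters.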

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Thursday 29 September 2005 11:09
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: RE: Generating skewed distributions on closed intervals

Well, I guess wildly the literature you are unaware of holds better solutions, but that's an empty comment as I don't know what it is.

The idea that an age distribution is a bunch of little truncated Gaussians sitting next to each other on a line sounds at best strange to me, but as I said I don't understand what your problem is.

Nick
n.j.cox@durham.ac.uk

Reza C Daniels wrote:

There is a literature on this problem that I am aware of. I'm just having trouble with the code in Stata to generate my required results.

Whatever your problem is, it is difficult to believe that there is not a literature on it, e.g. in demography, actuarial science, population ecology.

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/


