Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# R: st: Regression with multiple age groups

Subject: R: st: Regression with multiple age groups
Date: Wed, 25 Apr 2012 18:19:56 +0200

```
Dear Shirley,
just to top off David's sound advice, you may find it useful to run a
Poisson regression with the -irr- option, which reports incidence-rate
ratios, with the number of divorces in a given year as the dependent
variable. If I understand the description of your problem correctly, you
should then go back to the total number of divorces in a given year (i.e., a
count variable) instead of using the divorce rate that you calculated
yourself from the Office of National Statistics data.
As usual with Poisson regression, you should check for possible
overdispersion. Should it turn up, -nbreg- is the way to go.
Best wishes,
Carlo

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Shirley Sy
Sent: Wednesday, 25 April 2012 17:22
To: statalist@hsphsun2.harvard.edu
Subject: FW: st: Regression with multiple age groups

Hi David,
I took the data from the Office of National Statistics website for the years
1980 to 2000. My dependent variable is the divorce rate (which I
calculated myself as the total number of divorces in a given year divided
by the total number of marriages in the same year) and my explanatory
variables are: husband's age at divorce, wife's age at divorce, husband's
previous marital status, wife's previous marital status, combination of
husband's and wife's previous marital status (i.e. first marriage for both,
one party previously divorced, both previously divorced), duration of
marriage (under 2yrs, 2-5, 6-9, 10-14, 15-19, 20-24, 25-29, 30+, not
stated), average number of children per couple, female unemployment rate and
male unemployment rate.
I was planning to do OLS and have not yet considered Poisson or negative
binomial regression. Unfortunately I was given this project to do without any
supervision and with absolutely minimal help, and I was only taught the very
basics of Stata a year ago, so I didn't intend to try other models for fear
of doing it completely wrong. Would an ARIMA model be appropriate for this
data?
Shirley
----------------------------------------
> Date: Wed, 25 Apr 2012 09:09:33 -0400
> Subject: Re: st: Regression with multiple age groups
> From: dchoaglin@gmail.com
> To: statalist@hsphsun2.harvard.edu
>
> Dear Shirley,
>
> Others will agree that you need to tell the list more about the data
> and the analysis that you intend to do, before we can make useful
> suggestions.
>
> It would be appropriate to handle the age categories, at least
> initially, by using a separate dummy variable for each category except
> the first (which will be fitted by the intercept term in your model).
> You can then plot the coefficients against the midpoint of the
> category and consider whether to revise the model.
>
> You said that you have the age category, separately, for the husband
> and the wife. Thus, you would use two separate sets of dummy
> variables, one for the husband's age and the other for the wife's age.
> You may want to explore the possibility of interactions between the
> two ages.
>
> Please say more about the counts that you listed for 1985 and 1986 (I
> assume that those are two of the 20 years). They appear to be the
> frequency distribution of the numbers of divorces by one of the sets
> of age categories. If you are projecting divorce rates, and those
> counts represent the number of divorces in one year, what is the
> denominator for the rate? Do you have the denominator for each age
> category (and even the combination of husband's age category and
> wife's age category) or only for the year as a whole?
>
> Do the data come from a survey? If so, you will need to take the
> sampling design and the weights into account.
>
> What type of regression are you planning to use? Ordinary regression
> will probably not be appropriate for rates. You should consider
> Poisson regression (and perhaps negative binomial regression).
>
> So far, I can envision a model that contains a time pattern (initially
> based on a dummy variable for each year after the first), effects for
> husband's age, and effects for wife's age (and perhaps some form of
> husband-wife interaction). Do you have covariates that you are
> planning to include?
>
>
> David Hoaglin
>
> On Wed, Apr 25, 2012 at 12:36 AM, Shirley Sy <shirleysy@hotmail.co.uk> wrote:
> > Dear Statalisters,
> > I am a complete beginner at Stata so my question is very basic but I am
> > having trouble finding an answer on the web. I am doing a time series
> > regression project forecasting divorce rates. My data spans 20 years and for
> > both husband and wife the 'age at divorce' variable is split into groups, i.e.
> > it looks something like this:
> >
> > Year  under 20  20 to 29  30 to 39  40 to 49  50 to 59  60plus  not stated
> > 1985       458      1154        78        52         3       2           3
> > 1986       221       956        50        59         9       5           0
> >
> > How would I run a regression with the total numbers in each age group?
> > Would I use dummy variables? I understand how I would do it if I had
> > individual ages but since this is a time series model and I have the total
> > number in each age group, I am finding it slightly more complicated.
> >
> > Thanks
> > Shirley
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
```
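The suggestions in this thread (dummy variables for the age categories, Poisson regression with -irr-, an overdispersion check, and -nbreg- as a fallback) could be sketched roughly as follows. This is only an illustration, not anything the posters wrote: the counts are the two example rows from Shirley's original message, while the variable names (`n1`-`n7`, `agegrp`, `divorces`) and the choice of `estat gof` as the overdispersion check are assumptions. With just 14 observations the estimates mean little; the point is the sequence of steps.

```stata
clear
* Counts copied from the two example rows in the original post;
* n1-n7 are illustrative names, one per age group in column order
input year n1 n2 n3 n4 n5 n6 n7
1985 458 1154 78 52 3 2 3
1986 221  956 50 59 9 5 0
end

* Go from one row per year to one row per year x age group
reshape long n, i(year) j(agegrp)
rename n divorces

label define agegrp 1 "under 20" 2 "20 to 29" 3 "30 to 39" 4 "40 to 49" ///
    5 "50 to 59" 6 "60plus" 7 "not stated"
label values agegrp agegrp

* i.agegrp and i.year enter a dummy for every category except the first;
* irr displays exponentiated coefficients (incidence-rate ratios)
poisson divorces i.agegrp i.year, irr

* Goodness-of-fit tests after poisson; a small p-value suggests the
* Poisson model fits poorly, with overdispersion one possible cause
estat gof

* Negative binomial alternative; its output includes an LR test of alpha = 0
nbreg divorces i.agegrp i.year, irr
```

If a denominator is available for each row (say, a hypothetical variable `marriages`), adding `exposure(marriages)` to the -poisson- or -nbreg- call would model the rate while keeping the count as the dependent variable.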