Re: st: predicting consumption


From   Joerg Luedicke <[email protected]>
To   [email protected]
Subject   Re: st: predicting consumption
Date   Wed, 9 Mar 2011 11:34:24 -0500

On Wed, Mar 9, 2011 at 11:03 AM, gemini mtei <[email protected]> wrote:
> I am trying to predict total household consumption for a small survey that we conducted, which did not collect consumption data, using the national household budget survey. I have used a linear model (OLS) as follows:
>
> log(consumption) = B0 + B1*wealth + B2*log(household size) + B3*wealth*log(household size) + B4*wealth*location, where
>
> wealth is measured by an asset index constructed from ownership of assets, housing characteristics, source of utilities, and household-head characteristics (i.e., education and employment). Location captures urban-rural differences.
>
> The model gives an R-squared of 0.55, and I have run all the diagnostic tests and it seems fine. I used the split-half method to validate the predicted consumption (i.e., selecting a random sample from the household budget survey, fitting the consumption model on it, predicting into the remaining sample, and comparing with actual consumption). The problem I am facing is that the model overpredicts consumption for households with low consumption and underpredicts it for households with higher consumption.
>
> I need the predicted consumption to analyze the incidence of out-of-pocket financing in the small survey mentioned above. The two surveys were implemented at slightly different times, and my assumption is that, since the household budget survey is nationally representative, I can use it to predict consumption in the small survey. Can you advise whether I am making a mistake in the model specification? Is there anything special about predicting with interactions?
>

I am merely guessing but an OLS model might not be the right choice.
The fact that:

Quote:
 "the model overpredicts consumption for households with low
consumption and underpredicts it for households with higher
consumption"

seems to indicate that another distribution may fit better. Have
you tried anything beyond OLS? For instance, a gamma GLM with a log link:

glm consumption [your indep vars], family(gamma) link(log)

might be a better fit.
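
To make the comparison concrete, here is a minimal sketch of a split-half check that puts the log-OLS model and the gamma GLM side by side. It runs on simulated data, and the variable names (consumption, wealth, hhsize, urban), the coefficients used to generate the data, and the 50/50 split are all assumptions made for illustration, not taken from the actual surveys.

```stata
* Sketch only: simulated data stand in for the household budget survey.
* All variable names and data-generating values below are hypothetical.
clear
set seed 12345
set obs 2000
generate wealth = rnormal()
generate hhsize = 1 + rpoisson(4)
generate byte urban = runiform() < 0.4
generate lnhh = ln(hhsize)

* Gamma-distributed outcome with a log-linear mean (shape 2, scale mu/2)
generate mu = exp(8 + 0.5*wealth + 0.6*lnhh + 0.3*urban)
generate consumption = rgamma(2, mu/2)
generate lncons = ln(consumption)

* Split-half: estimate on a random half, predict into the other half
generate byte train = runiform() < 0.5

* (a) OLS on log consumption with a naive back-transformation;
*     exp(xb) ignores retransformation bias and tends to understate the mean
regress lncons c.wealth##c.lnhh c.wealth#i.urban if train
predict xb_ols, xb
generate pred_ols = exp(xb_ols)

* (b) Gamma GLM with log link; predictions are on the original scale
glm consumption c.wealth##c.lnhh c.wealth#i.urban if train, family(gamma) link(log)
predict pred_glm, mu

* Holdout half: mean actual vs. predicted consumption by quintile of actual
xtile q = consumption if !train, nquantiles(5)
tabstat consumption pred_ols pred_glm if !train, by(q) statistics(mean)
```

With real data, the simulation lines would be replaced by loading the budget survey. Keep in mind that when households are grouped by their actual consumption, any model with an R-squared below 1 will show some overprediction at the bottom and underprediction at the top, so it may also be worth repeating the table by quintiles of the predicted values.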

J.
