st: Am I using the correct Regression command/method?


From   "Polson, Jasmine" <polsonj@interdent.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Am I using the correct Regression command/method?
Date   Fri, 20 Jul 2012 15:52:03 -0700

Hi Everyone,

I work as a Real Estate Analyst and focus my analysis on site selection.


My Objective: Using customer demographic/psychographic variables, I'm
trying to derive descriptive statistic "thresholds" for potential
locations. Here is an example of a descriptive statistic I'm trying to
solve for:  Within a 5-mile radius, potential sites should have a
minimum of XX,XXX households (lower bound); the optimal amount is
approximately XX,XXX households (upper bound). 

I'm using quintiles for this analysis. The dependent variable is
customer count per location and the independent variable is family
households. To fit a regression for the lower and upper bounds, I used
the following interquantile regression commands:

For Lower Bound (LB): iqreg customercount familyhouseholds, quantiles(1 .20)
For Upper Bound (UB): iqreg customercount familyhouseholds, quantiles(.8 99)

Once I had the coefficients, I set customer count equal to our median
customer count per existing location (3,300) and solved for family
households, but the results were clearly wrong. I should mention that
family households in my dataset range from 1,760 to 44,113. Given that
range, it makes no sense to look for a site with at least 78,560
households, with an optimal number of -12,080 households.

LB: 3,300 = 90.024 + 0.04086x
    x = 78,560 households

UB: 3,300 = 2561 + (-0.0611731)x
    x = -12,080 households
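
The two back-solved values above can be reproduced with a short sketch.
Python is used here purely as a calculator; the function and variable
names are illustrative assumptions and are not part of the Stata session.
The coefficients, the target of 3,300, and the household range are taken
from this post as reported.

```python
# Coefficients as reported in the post: (intercept, slope on family households).
LB = (90.024, 0.04086)
UB = (2561.0, -0.0611731)

TARGET = 3300                 # median customer count per existing location
HH_MIN, HH_MAX = 1760, 44113  # observed range of family households


def solve_households(intercept, slope, target=TARGET):
    """Invert target = intercept + slope * x to get x."""
    return (target - intercept) / slope


def in_sample_range(x):
    """True if x lies inside the observed household range."""
    return HH_MIN <= x <= HH_MAX


lb_x = solve_households(*LB)
ub_x = solve_households(*UB)

print(f"LB: x = {lb_x:,.0f} households, in range: {in_sample_range(lb_x)}")
print(f"UB: x = {ub_x:,.0f} households, in range: {in_sample_range(ub_x)}")
```

Both solutions fall outside the observed range of 1,760-44,113 family
households, which is why they cannot serve as site-selection thresholds.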


Is this the correct method for deriving a descriptive statistic such as
this? Am I on the right track? I've also considered truncated
regression, sample splitting, regression using dummy variables or
categorical variables (1 if in lower range, 5 if in upper range, etc.).
Does anyone suggest any of these methods or any alternatives?  I'm
fairly new to econometrics/Stata so I hope my questions aren't too
rudimentary. Any guidance/suggestions are greatly appreciated.

Thank you,
Jasmine

      


