


st: Am I using the correct Regression command/method?

From	"Polson, Jasmine" <[email protected]>
To	<[email protected]>
Subject	st: Am I using the correct Regression command/method?
Date	Fri, 20 Jul 2012 15:52:03 -0700

Hi Everyone,

I work as a Real Estate Analyst and focus my analysis on site selection.


My objective: using customer demographic/psychographic variables, I'm
trying to derive descriptive-statistic "thresholds" for potential
locations. Here is an example of the kind of statement I'm trying to
produce: within a 5-mile radius, potential sites should have a
minimum of XX,XXX households (lower bound); the optimal number is
approximately XX,XXX households (upper bound).

I'm using quintiles for this analysis. The dependent variable is
customer count per location and the independent variable is family
households. To fit a regression for the lower and upper bounds, I used
the following interquantile regression commands:

For Lower Bound (LB): iqreg customercount familyhouseholds, quantiles(.01 .20)
For Upper Bound (UB): iqreg customercount familyhouseholds, quantiles(.80 .99)

Once I had the coefficients, I set each fitted equation equal to our
median customer count per existing location (3,300) and solved for
family households (x), but the results were clearly wrong. I should
mention that the range of family households in my dataset is
1,760-44,113. Knowing this, it makes no sense to look for a site with at
least 78,560 households, with an optimal number of -12,080 households.

LB: 3,300 = 90.024 + 0.04086x
    x = 78,560 households

UB: 3,300 = 2,561 + (-0.0611731)x
    x = -12,080 households
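
For reference, the back-solving step above can be reproduced in Stata
with display. This is only a minimal sketch of the arithmetic, using
the intercepts and slopes quoted in this post; it does not rerun the
regressions and assumes nothing about the dataset.

```stata
* Rearrange y = a + b*x to x = (y - a)/b, with y = 3300 (median customer count)

* Lower bound: a = 90.024, b = 0.04086
display (3300 - 90.024) / 0.04086

* Upper bound: a = 2561, b = -0.0611731
display (3300 - 2561) / (-0.0611731)
```

The two results round to roughly 78,560 and -12,080, the figures
reported above.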


Is this the correct method for deriving a descriptive statistic such as
this? Am I on the right track? I've also considered truncated
regression, sample splitting, regression using dummy variables or
categorical variables (1 if in lower range, 5 if in upper range, etc.).
Would anyone recommend any of these methods, or an alternative?  I'm
fairly new to econometrics and Stata, so I hope my questions aren't too
rudimentary. Any guidance or suggestions would be greatly appreciated.

Thank you,
Jasmine

      
