
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Regression question


From   Cameron McIntosh <cnm100@hotmail.com>
To   STATA LIST <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Regression question
Date   Sat, 26 May 2012 20:28:25 -0400

Yes, for looking at the effect of AD spending, aggregation may be the most feasible option. But you may also want to try a different, or at least supplemental, approach. I think it would be quite interesting to know which media types cluster together most often across companies. How many companies are there?

Zhang, S., & Wu, X. (2011). Fundamentals of association rules in data mining and knowledge discovery. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(2), 97-116. http://onlinelibrary.wiley.com/doi/10.1002/widm.10/pdf

Hahsler, M., Buchta, C., Gruen, B., & Hornik, K. (April 23, 2012). Mining Association Rules and Frequent Itemsets: package arules, version 1.0-8. http://cran.r-project.org/web/packages/arules/index.html

Liu, G., Zhang, H., & Wong, L. (2011). Controlling false positives in association rule mining. Proceedings of the VLDB Endowment, 5(2), 145-156. http://vldb.org/pvldb/vol5/p145_guimeiliu_vldb2012.pdf

Adamo, J.-M. (2001). Data Mining for Association Rules and Sequential Patterns: Sequential and Parallel Algorithms. New York, NY: Springer Verlag. 

Cam
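
As a rough illustration of the "which media types cluster together" idea, the simplest association-rule quantity is the support of a pair: the share of companies that spend on both media types. The sketch below is only that, not a full rule-mining procedure such as the one in the arules package. The company labels, the media sets, and the function name pair_support are all made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: company id -> set of media types it spends on.
media_by_company = {
    "A": {"TV", "Newspaper", "Radio"},
    "B": {"TV", "Newspaper"},
    "C": {"TV", "Online"},
    "D": {"Newspaper", "Radio"},
}

def pair_support(baskets):
    """Return {(m1, m2): share of companies that use both media types}."""
    counts = Counter()
    for media in baskets.values():
        # sorted() gives every unordered pair a single canonical key
        for pair in combinations(sorted(media), 2):
            counts[pair] += 1
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items()}

support = pair_support(media_by_company)

# List pairs from most to least frequently co-occurring.
for pair, share in sorted(support.items(), key=lambda kv: -kv[1]):
    print(pair, round(share, 2))
```

With only a handful of companies, such shares will be noisy, which is why the number of companies matters here.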

> From: kalisperos@gmail.com
> To: statalist@hsphsun2.harvard.edu
> Subject: RE: st: Regression question
> Date: Sat, 26 May 2012 16:43:20 -0500
> 
> Hi David,
> 
> Thank you for your opinion. In fact, the data structure is more complicated.
> Say there are 50 different media types, and each company (i) spends on a
> different number of them (from 1 to 50). So setting all of these up as
> independent variables is not possible.
> 
> Anyway, the regression form I specified below does not seem correct.
> Probably the only way is to aggregate information about (j) and make all
> variables specific to only (i).
>  
> Thank you,
> Mike.
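
The aggregation described above, collapsing the (i, j) rows to one row per company, could look roughly like the following. This is a minimal sketch under stated assumptions: the numbers reuse the small Y / X1 / X2 table from the original question, while the company labels, the media labels, and the function name collapse_to_company are hypothetical.

```python
# Hypothetical long-format rows:
# (company, revenue, rd_spending, media, ad_spending)
rows = [
    ("A", 10, 2, "TV", 1),
    ("A", 10, 2, "Newspaper", 2),
    ("A", 10, 2, "Radio", 3),
    ("B", 20, 3, "TV", 4),
    ("B", 20, 3, "Newspaper", 5),
]

def collapse_to_company(rows):
    """Collapse (i, j) rows to one record per company i.

    Revenue and R&D spending are constant within a company, so the first
    value seen is kept; ad spending is summed over media types.
    """
    out = {}
    for company, revenue, rd, _media, ad in rows:
        rec = out.setdefault(
            company,
            {"revenue": revenue, "rd": rd, "ad_total": 0, "n_media": 0},
        )
        rec["ad_total"] += ad
        rec["n_media"] += 1
    return out

by_company = collapse_to_company(rows)
for company, rec in by_company.items():
    print(company, rec)
```

After the collapse there is one observation per company, so a regression of revenue on R&D spending and total ad spending no longer repeats each Y(i) across rows. The count of media types is kept only as one possible extra company-level summary.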
> 
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of David Hoaglin
> Sent: Friday, May 25, 2012 10:38 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Regression question
> 
> Hi, Mike.
> 
> I may not understand the structure of your data, but it seems that the
> explanatory variable that you denote by X2(ij) is actually several related
> variables.  That is, in your example, j seems to index the various media (j
> = 1 for TV, j = 2 for Newspaper, etc.).  In that situation, you should treat
> each of the media as a separate explanatory variable, with its own
> regression coefficient.  You might learn from the analysis that those
> coefficients are essentially equal, in which case the interpretation would
> be that what matters is the total amount of AD spending.  Then you could
> simplify the model by using total AD spending as the explanatory variable,
> instead of the amount spent on each of the types.  It seems more likely,
> however, that the coefficients for the types of media will differ.
> 
> In interpreting the regression coefficients, please keep in mind that the
> set of other explanatory variables in the model is part of the definition of
> each coefficient, and that each estimated coefficient reflects the
> contribution of its explanatory variable after adjusting for the
> contributions of the other explanatory variables.
> 
> I hope you are planning to make plots of the data and use various regression
> diagnostics to spot influential observations.
> 
> David Hoaglin
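
Treating each medium as its own explanatory variable amounts to reshaping the data from long to wide, with one column per media type. The sketch below shows only that reshaping step. The rows, labels, and the function name to_wide are hypothetical, and filling unused media with 0 assumes that a missing row means no spending, which may not hold in the real data.

```python
# Hypothetical long-format rows: (company, media, ad_spending)
rows = [
    ("A", "TV", 1),
    ("A", "Newspaper", 2),
    ("A", "Radio", 3),
    ("B", "TV", 4),
    ("B", "Newspaper", 5),
]

def to_wide(rows):
    """Return (media_types, {company: {media: spending}}), one column per medium.

    Assumption: a medium with no row for a company is treated as zero spending.
    """
    media_types = sorted({media for _, media, _ in rows})
    wide = {}
    for company, media, ad in rows:
        # Start each company with a zero for every medium, then add its spending.
        wide.setdefault(company, dict.fromkeys(media_types, 0))[media] += ad
    return media_types, wide

media_types, wide = to_wide(rows)
print(media_types)
for company, spending in wide.items():
    print(company, spending)
```

With many media types and few companies, the wide layout produces many sparse columns relative to the number of observations.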
> 
> On Fri, May 25, 2012 at 9:52 AM, Mike Kim <kalisperos@gmail.com> wrote:
> > Hi all,
> >
> > This question is not about Stata, but I would appreciate your opinion. 
> > I wonder whether the following regression (e.g., OLS) makes sense.
> >
> > Y(i) = b0 + b1*X1(i) + b2*X2(ij) + e(i)
> >
> > That is, Y varies only by i, but some independent variables vary by
> > both i and j. Each Y(i) is repeated once for each j, so the data
> > structure is:
> >
> > Y     X1   X2
> > 10   2    1
> > 10   2    2
> > 10   2    3
> > 20   3    4
> > 20   3    5
> > ...
> >
> > For example:
> > i: company, j: advertising medium (TV, Newspaper, etc.)
> > REVENUE(i) = b0 + b1*R&D SPENDING(i) + b2*AD SPENDING BY MEDIA(ij) + e(i)
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 
> 

 		 	   		  

