Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: "Mike Kim" <kalisperos@gmail.com>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Regression question
Date: Mon, 28 May 2012 09:45:29 -0500

Hi Cam and David,

Thank you for your suggestions. The number of companies is not large (around 150), but I will try your suggestions.

Mike.

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Cameron McIntosh
Sent: Saturday, May 26, 2012 7:28 PM
To: STATA LIST
Subject: RE: st: Regression question

Yes, for looking at the effect of AD spending, aggregation may be the most feasible option. But you may also want to try a different, or at least supplemental, approach. I think that it would be quite interesting to know which media types clustered most often across companies -- how many companies are there?

Zhang, S., & Wu, X. (2011). Fundamentals of association rules in data mining and knowledge discovery. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(2), 97-116.
http://onlinelibrary.wiley.com/doi/10.1002/widm.10/pdf

Hahsler, M., Buchta, C., Gruen, B., & Hornik, K. (April 23, 2012). Mining Association Rules and Frequent Itemsets: package arules, Version 1.0-8.
http://cran.r-project.org/web/packages/arules/index.html

Liu, G., Zhang, H., & Wong, L. (2011). Controlling false positives in association rule mining. Proceedings of the VLDB Endowment, 5(2), 145-156.
http://vldb.org/pvldb/vol5/p145_guimeiliu_vldb2012.pdf

Adamo, J.-M. (2001). Data Mining for Association Rules and Sequential Patterns: Sequential and Parallel Algorithms. New York, NY: Springer Verlag.

Cam

> From: kalisperos@gmail.com
> To: statalist@hsphsun2.harvard.edu
> Subject: RE: st: Regression question
> Date: Sat, 26 May 2012 16:43:20 -0500
>
> Hi David,
>
> Thank you for your opinion. The data structure is in fact more
> complicated. Say there are 50 different media types, and each company (i)
> spends on a different number of them (from 1 to 50). So setting all
> of these as independent variables is not possible.
>
> Anyway, the regression form I specified below does not seem correct.
> Probably the only way is to aggregate the information over (j) and make
> all variables specific to (i) only.
>
> Thank you,
> Mike.
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of David Hoaglin
> Sent: Friday, May 25, 2012 10:38 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Regression question
>
> Hi, Mike.
>
> I may not understand the structure of your data, but it seems that the
> explanatory variable that you denote by X2(ij) is actually several
> related variables. That is, in your example, j seems to index the
> various media (j = 1 for TV, j = 2 for Newspaper, etc.). In that
> situation, you should treat each of the media as a separate
> explanatory variable, with its own regression coefficient. You might
> learn from the analysis that those coefficients are essentially equal,
> in which case the interpretation would be that what matters is the
> total amount of AD spending. Then you could simplify the model by
> using total AD spending as the explanatory variable, instead of the
> amount spent on each of the types. It seems more likely, however, that
> the coefficients for the types of media will differ.
>
> In interpreting the regression coefficients, please keep in mind that
> the set of other explanatory variables in the model is part of the
> definition of each coefficient, and that each estimated coefficient
> reflects the contribution of its explanatory variable after adjusting
> for the contributions of the other explanatory variables.
>
> I hope you are planning to make plots of the data and use various
> regression diagnostics to spot influential observations.
>
> David Hoaglin
>
> On Fri, May 25, 2012 at 9:52 AM, Mike Kim <kalisperos@gmail.com> wrote:
> > Hi all,
> >
> > This question is not about Stata, but I would appreciate your opinion.
> > I wonder whether the following regression (e.g., OLS) makes sense.
> >
> > Y(i) = b0 + b1*X1(i) + b2*X2(ij) + e(i)
> >
> > That is, Y varies by i, but some independent variables vary by i and j.
> > Each Y(i) is repeated j times, so the data structure is:
> >
> > Y   X1  X2
> > 10  2   1
> > 10  2   2
> > 10  2   3
> > 20  3   4
> > 20  3   5
> > ...
> >
> > For example:
> > i: company, j: advertising spending by media (TV, Newspaper, etc.)
> > REVENUE(i) = b0 + b1*R&D SPENDING(i) + b2*AD SPENDING BY MEDIA(ij) + e(i)

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
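
The aggregation discussed in the thread (collapse the (i, j) rows to one row per company, then regress on company-level variables) can be sketched as follows. This is only an illustration, written in Python rather than Stata. The company labels, the numbers, and the helper names (`aggregate_by_company`, `solve`, `ols`) are all hypothetical and do not come from the thread. The data are constructed so that revenue = 1 + 2*R&D + 0.5*(total AD spending) holds exactly, which makes the fitted coefficients easy to check.

```python
from collections import defaultdict

# Hypothetical long-format rows: (company, revenue, rd_spending, media, ad_spending).
# Revenue and R&D repeat on every row of a company; only AD spending varies by media.
rows = [
    ("A", 8.0, 2.0, "TV", 1.0),
    ("A", 8.0, 2.0, "Newspaper", 2.0),
    ("A", 8.0, 2.0, "Radio", 3.0),
    ("B", 11.5, 3.0, "TV", 4.0),
    ("B", 11.5, 3.0, "Newspaper", 5.0),
    ("C", 13.0, 5.0, "TV", 4.0),
    ("D", 8.0, 1.0, "TV", 2.5),
    ("D", 8.0, 1.0, "Radio", 7.5),
]


def aggregate_by_company(rows):
    """Collapse (i, j) rows to one row per company: (revenue, rd, total_ad)."""
    total_ad = defaultdict(float)
    fixed = {}
    for company, revenue, rd, _media, ad in rows:
        total_ad[company] += ad
        fixed[company] = (revenue, rd)
    return [(fixed[c][0], fixed[c][1], total_ad[c]) for c in sorted(fixed)]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, X):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# One observation per company, so each Y(i) enters the regression exactly once.
company_rows = aggregate_by_company(rows)
y = [revenue for revenue, _rd, _ad in company_rows]
X = [[1.0, rd, ad] for _revenue, rd, ad in company_rows]
b0, b1, b2 = ols(y, X)
print(f"b0={b0:.3f}  b1={b1:.3f}  b2={b2:.3f}")
```

Summing over media imposes a single coefficient on every media type, which is the simplification David describes as appropriate only if the separate coefficients turn out to be essentially equal. With real data, the plots and regression diagnostics he recommends would still be needed.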

**References**:

- **st: Regression question** *From:* "Mike Kim" <kalisperos@gmail.com>
- **Re: st: Regression question** *From:* David Hoaglin <dchoaglin@gmail.com>
- **RE: st: Regression question** *From:* "Mike Kim" <kalisperos@gmail.com>
- **RE: st: Regression question** *From:* Cameron McIntosh <cnm100@hotmail.com>
