



RE: st: exploratory data analysis for finding substitutes and complements


From   Cameron McIntosh <cnm100@hotmail.com>
To   STATA LIST <statalist@hsphsun2.harvard.edu>
Subject   RE: st: exploratory data analysis for finding substitutes and complements
Date   Fri, 30 Sep 2011 13:11:16 -0400

Hi Dimitriy,
This type of analysis might be a bit dicey without basket data (one record per customer with a transaction date and the items purchased), but I don't imagine aggregated (ecological) data is completely prohibitive, either; this is discussed in the Nestorov and Jukić (2003) paper below. I don't know of anything in Stata specifically, but the following references on association-rule mining may help.
Hahsler, M., Buchta, C., Gruen, B., & Hornik, K. (September 19, 2011). Mining Association Rules and Frequent Itemsets: Package 'arules', Version 1.0-6. http://cran.r-project.org/web/packages/arules/arules.pdf http://cran.r-project.org/web/packages/arules/index.html http://cran.r-project.org/web/packages/arules/vignettes/arules.pdf
Hahsler, M., Chelluboina, S., Hornik, K., & Buchta, C. (2011). The arules R-Package Ecosystem: Analyzing Interesting Patterns from Large Transaction Data Sets. Journal of Machine Learning Research, 12, 2021-2025. http://jmlr.csail.mit.edu/papers/volume12/hahsler11a/hahsler11a.pdf
Zhang, S., & Wu, X. (2011). Fundamentals of association rules in data mining and knowledge discovery. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(2), 97-116. http://onlinelibrary.wiley.com/doi/10.1002/widm.10/pdf
Ben Messaoud, R., Loudcher Rabaséda, S., Missaoui, R., & Boussaid, O. (2008). OLEMAR: an On-Line Environment for Mining Association Rules in Multidimensional Data. In D. Taniar (Ed.), Data Mining and Knowledge Discovery Technologies (pp. 1-35). IGI Global. http://eric.univ-lyon2.fr/~sabine/adwm_2007.pdf
Khan, A., Baharudin, B., & Khan, K. (2011). Mining customer data for decision-making using new hybrid classification algorithm. Journal of Theoretical and Applied Information Technology, 27(1), 54-61. http://www.jatit.org/volumes/research-papers/Vol27No1/7Vol27No1.pdf
Nestorov, S., & Jukić, N. (2003). Ad-Hoc Association-Rule Mining within the Data Warehouse. Proceedings of the 36th Annual Hawaii International Conference on System Sciences (HICSS'03) - Track 8 - Volume 8. Washington, DC, USA: IEEE Computer Society.
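To make the basket-data idea concrete, here is a minimal sketch of one standard association-rule measure, lift, computed for every pair of items. It is plain Python rather than the arules API, and the baskets and the `pair_lift` helper are invented for illustration. Lift is P(A and B) / (P(A) * P(B)): values above 1 mean two items are bought together more often than independence would predict (complement-like), and values below 1 mean less often (substitute-like).

```python
from itertools import combinations

def pair_lift(baskets):
    """Map each item pair to its lift: P(A and B) / (P(A) * P(B))."""
    n = len(baskets)
    sets = [set(b) for b in baskets]
    items = sorted(set().union(*sets))
    # Support of an item: share of baskets that contain it.
    support = {i: sum(i in s for s in sets) / n for i in items}
    lift = {}
    for a, b in combinations(items, 2):
        joint = sum(a in s and b in s for s in sets) / n
        lift[(a, b)] = joint / (support[a] * support[b])
    return lift

# Hypothetical baskets, one list per customer transaction.
baskets = [
    ["burger", "fries"],
    ["burger", "fries"],
    ["burger", "fries"],
    ["hot_dog", "fries"],
    ["hot_dog", "fries"],
    ["hot_dog"],
    ["drink"],
    ["drink"],
]

lift = pair_lift(baskets)
for pair, value in sorted(lift.items()):
    print(pair, round(value, 3))
```

With store-level weekly totals instead of baskets, the joint counts are not observed, so a measure like this cannot be computed directly; that is the sense in which the analysis is harder without basket data.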
Cam
> Date: Fri, 30 Sep 2011 11:34:50 -0400
> Subject: st: exploratory data analysis for finding substitutes and complements
> From: dvmaster@gmail.com
> To: statalist@hsphsun2.harvard.edu
> 
> I have a panel data set with store-level sales data for 125 items at a
> chain restaurant. Each variable is the quantity of an item sold in a
> particular store and week. My data looks like this: store_id, week,
> hot_dogs, burgers, fries, and drinks. For each item, I would like to
> figure out which other items are substitutes or complements. For
> example, I would expect burgers and fries, and hot dogs and fries, to
> be complements, and hot dogs and burgers to be substitutes. I would
> like to group items into clusters to make some time-series graphs,
> because plotting all 125 items on the same graph is messy.
> 
> My first attempt at this involved calculating pairwise correlations
> between items, and grabbing those where the correlation is above some
> threshold X in absolute value. This works reasonably well, but I don't
> want to do this by hand for all the items and my loop-over-items
> approach is slow and inefficient.
> 
> Is there a command that can accomplish this for me? Or is there a
> better way of doing this using some sort of clustering algorithm?
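The correlation-threshold approach described above can be sketched in a few lines. This is only an illustration in plain Python, not a Stata command: the sales figures, the 0.8 threshold, and the helper names (`pearson`, `flag_pairs`, `group_items`) are all made up. It computes every pairwise correlation once, flags pairs at or above the threshold as candidate complements and pairs at or below its negative as candidate substitutes, and then groups items that are linked by complement pairs (connected components of the threshold graph).

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant series: correlation undefined, treat as no signal
    return sxy / sqrt(sxx * syy)

def flag_pairs(sales, threshold):
    """Return (complements, substitutes): pairs with r >= t and r <= -t."""
    complements, substitutes = [], []
    for a, b in combinations(sorted(sales), 2):
        r = pearson(sales[a], sales[b])
        if r >= threshold:
            complements.append((a, b))
        elif r <= -threshold:
            substitutes.append((a, b))
    return complements, substitutes

def group_items(items, pairs):
    """Connected components of the graph whose edges are the given pairs."""
    parent = {i: i for i in items}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in pairs:
        parent[find(a)] = find(b)
    groups = {}
    for i in items:
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(g) for g in groups.values())

# Made-up weekly quantities for a single store (illustrative only).
sales = {
    "burgers":  [30, 35, 40, 45, 50],
    "fries":    [42, 51, 50, 49, 58],
    "hot_dogs": [24, 27, 20, 13, 16],
    "drinks":   [64, 58, 56, 58, 64],
}

complements, substitutes = flag_pairs(sales, threshold=0.8)
groups = group_items(sales, complements)
print(complements, substitutes, groups)
```

A raw correlation is only a screening device here. With several stores pooled, store size and seasonality push most items to move together, so it may be worth demeaning by store and week before correlating.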
> 
> DVM
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
 		 	   		  

