Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: David Peng <david.peng99@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: generate a column of a summary statistic conditioning on the comparison of the values in two other columns
Date: Thu, 6 Dec 2012 10:57:39 -0600

Nick,

Thank you very much. Your hint on combining orderdate and deliverydate is great. I was able to generate the summary statistic.

David

On Wed, Dec 5, 2012 at 11:55 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> -egen- is the wrong place to look for this. You need to drill down to
> a more fundamental level.
>
> As a first stab at this
>
> bysort A (C) : gen mean_so_far = sum(B) / sum(B < .)
>
> except that it's previous values that you want.
>
> This doesn't really grapple with the difference between order date and
> delivery date. I expect you may need to combine order dates and
> delivery dates in a single date variable.
>
> See http://www.stata.com/statalist/archive/2012-11/msg01186.html for a
> similar (but not identical) problem.
>
> Nick
>
> On Wed, Dec 5, 2012 at 4:29 PM, David Peng <david.peng99@gmail.com> wrote:
>> I have a dataset with four variables, A, B, C, and D. A is a variable
>> representing the customer number. B is the main variable of interest
>> (in this case the dollar amount of a customer order). C is the date a
>> customer order was placed and D is the date the same customer order
>> was delivered.
>>
>> I would like to generate a column of a summary statistic (let's say I
>> want the mean) in the table. Basically, for each customer order, I
>> would like to generate a mean value of the dollar amount for all of
>> the orders placed by a customer prior to the date the order is placed.
>>
>> For each observation (i.e., a customer order) in the data table, I
>> would like to get:
>>
>> bysort A: egen mean_dollar_amount=mean(B) if B is associated with a
>> delivery date D that is earlier than the order date C of the customer
>> order in question.
>>
>> As an example, if I have an observation representing an order placed by
>> the customer x with the order date of 12/30/2011, I would like to
>> generate the mean of the dollar amount for all of the orders that were
>> delivered earlier than the order date of 12/30/2011 for the order
>> mentioned above. I need a mean value like this for each observation
>> (i.e., customer order) in the data.
>>
>> Thanks in advance for your help.
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
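The thread does not show the code David ended up using. One way Nick's hint (putting order dates and delivery dates into a single date variable) might be carried out is sketched below. It assumes the variables A, B, C, and D described in the question, that C and D are numeric Stata daily dates with no missing values, and that "earlier" means strictly earlier. The names order_id, is_delivery, event_date, run_sum, run_n, and mean_prior are made up for illustration.

```stata
* Sketch only: one possible reading of the "single date variable" hint.
* Assumes A = customer, B = dollar amount, C = order date, D = delivery date.

* Tag each order so its two event rows can be told apart later
gen long order_id = _n

* Give every order two rows: an order event and a delivery event
expand 2
bysort order_id : gen byte is_delivery = (_n == 2)

* The single date variable: D for delivery events, C for order events
gen long event_date = cond(is_delivery, D, C)

* Within customer, sort events by date. On tied dates, order events
* (is_delivery == 0) sort first, so same-day deliveries are not counted.
bysort A (event_date is_delivery order_id) : ///
    gen double run_sum = sum(cond(is_delivery, B, 0))
by A : gen long run_n = sum(is_delivery & B < .)

* At each order event, the running totals cover only earlier deliveries
gen double mean_prior = run_sum / run_n if !is_delivery

* Return to one row per order
drop if is_delivery
drop is_delivery event_date run_sum run_n
```

Orders with no earlier delivery for the same customer get a missing mean_prior, because run_n is 0 there and Stata returns missing for division by zero.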

**References**:

- **st: generate a column of a summary statistic conditioning on the comparison of the values in two other columns** *From:* David Peng <david.peng99@gmail.com>
- **Re: st: generate a column of a summary statistic conditioning on the comparison of the values in two other columns** *From:* Nick Cox <njcoxstata@gmail.com>
