Re: st: generate a column of a summary statistic conditioning on the comparison of the values in two other columns


From   David Peng <david.peng99@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: generate a column of a summary statistic conditioning on the comparison of the values in two other columns
Date   Thu, 6 Dec 2012 10:57:39 -0600

Nick,

Thank you very much. Your hint on combining orderdate and deliverydate
is great. I was able to generate the summary statistic.

David

On Wed, Dec 5, 2012 at 11:55 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> -egen- is the wrong place to look for this. You need to drill down to
> a more fundamental level.
>
> As a first stab at this
>
> bysort A (C) : gen mean_so_far = sum(B) / sum(B < .)
>
> except that it's previous values that you want.
>
> This doesn't really grapple with the difference between order date and
> delivery date. I expect you may need
> to combine order dates and delivery dates in a single date variable.
>
> See http://www.stata.com/statalist/archive/2012-11/msg01186.html for a
> similar (but not identical) problem.
>
> Nick
>
> On Wed, Dec 5, 2012 at 4:29 PM, David Peng <david.peng99@gmail.com> wrote:
>> I have a dataset with four variables, A, B, C, and D. A is a variable
>> representing the customer number. B is the main variable of interest
>> (in this case the dollar amount of a customer order). C is the date a
>> customer order was placed and D is the date the same customer order
>> was delivered.
>>
>> I would like to generate a column of a summary statistic (let's say I
>> want the mean) in the table. Basically, for each customer order, I
>> would like to generate a mean value of the dollar amount for all of
>> the orders placed by a customer prior to the date the order is placed.
>>
>> For each observation (i.e., a customer order) in the data table, I
>> would like to get:
>>
>> bysort A: egen mean_dollar_amount=mean(B) if B is associated with a
>> delivery date D that is earlier than the order date C of the customer
>> order in question.
>>
>>
>> As an example, if I have an observation representing an order placed by
>> customer x with the order date of 12/30/2011, I would like to
>> generate the mean of the dollar amount for all of the orders that were
>> delivered earlier than the order date of 12/30/2011 for the order
>> mentioned above. I need a mean value like this for each observation
>> (i.e., customer order) in the data.
>>
>> Thanks in advance for your help.
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
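
Following up on Nick's hint, here is one way the combined date variable
might be set up. This is only a sketch under assumptions, not necessarily
what David ran: it uses the variable names A, B, C, and D from the original
question, assumes C and D are non-missing numeric Stata dates, and
introduces id, isdeliv, date, sumB, nB, and mean_prior as hypothetical
names. Each order is split into an order event and a delivery event. Within
a customer, events are sorted on the combined date, with order events placed
ahead of delivery events that fall on the same date, so that only strictly
earlier deliveries are counted.

```stata
* Sketch only: A = customer, B = amount, C = order date, D = delivery date.
* All other variable names are hypothetical.
gen long id = _n
expand 2
bysort id : gen byte isdeliv = (_n == 2)
gen double date = cond(isdeliv, D, C)

* Running total and count of delivered amounts within customer.
* On tied dates, order events (isdeliv == 0) sort before delivery events.
bysort A (date isdeliv id) : gen double sumB = sum(cond(isdeliv & B < ., B, 0))
by A : gen long nB = sum(isdeliv & B < .)

* Keep the result on the order records only, then restore one row per order.
gen double mean_prior = sumB / nB if !isdeliv & nB > 0
drop if isdeliv
drop isdeliv date sumB nB
```

Orders with no earlier delivery for the same customer end up with a missing
mean_prior.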

