Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Dealing with panel data


From   Nick Cox <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: Dealing with panel data
Date   Tue, 6 Aug 2013 16:54:45 +0100

You need to let the other observations in each group know what
happened in race 1. That sounds like

egen fav_won_1 = total(favourite == 1 & race_no == 1), by(meeting)

which will give you a meeting-wide indicator variable.

Then you go

egen mean_overround = mean(overround/(fav_won_1 & race_no > 1)), by(meeting)

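The division sign / is not a typo. Where the condition in parentheses
is true (1), overround/1 is just overround; where it is false (0),
overround/0 is missing, and mean() ignores missing values.
Fortuitously, but fortunately, / works here as if it were a condition
sign |.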
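
To see the two steps at work, here is a minimal sketch on a made-up toy dataset. The values are hypothetical, there is one observation per race, and it is assumed that meeting alone identifies a single day's card; if the same venue recurs on several dates, by(meeting date) would presumably be needed instead.

```stata
* Toy data (hypothetical values): two meetings, three races each
clear
input str1 meeting race_no favourite overround
"A" 1 1 1.10
"A" 2 0 1.20
"A" 3 1 1.30
"B" 1 0 1.15
"B" 2 1 1.25
"B" 3 0 1.35
end

* Step 1: spread the race-1 outcome to every observation in the meeting
egen fav_won_1 = total(favourite == 1 & race_no == 1), by(meeting)

* Step 2: overround/1 is overround, overround/0 is missing,
* and mean() skips the missings
egen mean_overround = mean(overround/(fav_won_1 & race_no > 1)), by(meeting)

list, sepby(meeting)

* Overall mean across all qualifying races, once the indicator exists
summarize overround if fav_won_1 & race_no > 1
```

In meeting A the favourite won race 1, so mean_overround is the mean of races 2 and 3 (1.20 and 1.30). In meeting B it did not, so every term is missing and mean_overround is missing too.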

For a more systematic review of such tricks, see

http://www.stata-journal.com/article.html?article=dm0055

P.S. -summarize- is a command, not a function.
Nick
[email protected]


On 6 August 2013 16:44, John Kenny <[email protected]> wrote:
> Dear Statalist,
>
> I'm relatively new to Stata and I cannot find a standard way to solve
> my problem and I may need to write a .do file.
>
> I'm dealing with a very large data set of horse racing results that
> has about 20 variables and 711,000 observations. The problem
> that I am having is that I cannot get the mean of one variable,
> 'overround', if two other variables take certain values.
>
>  To be more specific, within the data set I am using, the 'overround'
> determines the bookmaker's profit margin. What I want to do is get the
> mean of the 'overround' for each race from 2 to the last race, 7, if
> the 'favorite' (which is a dummy variable) won the first race. I have
> a number of variables that outline the time of each race and when it
> occurs at a certain meeting. For each race, the data set has a
> variable that says where the race is held ['Meeting'], the date and
> time ['date', 'time'], the odds given for each horse ['odds'], the
> race number at that meeting ['race_no' (lists the races from 1-7)],
> whether the favorite won that race ['favorite'] and the overround,
> which is the bookmaker's profit ['overround'].
>
> What I have tried is using the summarize function to get the
> mean of the 'overround' if 'race_no'==1 & 'favorite'==1. However, every
> combination of variables I tried always just gave the mean of
> the 'overround' for race 1 if the favorite won, and not the mean for
> the second race or third race if the favorite won the first race.
>
> Any help would be greatly appreciated as I have been stuck on this for a while.
>
> Thanks in advance.
>
> John
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

