The Stata listserver

st: RE: how to generate variable for non-standard data


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: how to generate variable for non-standard data
Date   Mon, 17 Nov 2003 11:33:24 -0000

Oleksandr Shepotylo asked:

> I have data on soccer game results. I want to generate a variable
> that reflects how a team played in its last 2 games:
> win_streak = sum of points in the last 2 games.
>
> Simple example, with 4 teams and 3 rounds:
>
> Round  Home_team  Away_team  Points_home_team  Points_away_team
> 1      A          B          3                 0
> 1      C          D          1                 1
> 2      B          D          0                 3
> 2      A          C          1                 1
> 3      D          A          3                 0
> 3      B          C          1                 1
>
> Then I want to create the variable:
>
> Round  Team  Win_streak
> 3      A     4
> 3      B     0
> 3      C     2
> 3      D     4
>
>
> The problem is that in the data a team could be in column 2 or
> column 3, depending on whether it plays at home or away. Also, when
> I add points I must check whether to look at the 4th or the 5th
> column. Therefore, I cannot just use:
> bysort team (round): gen win_streak = points[_n-1] + points[_n-2]

Oleksandr Talavera posted a solution:

> Try the following:
> ****************
> * soccer.dta has your data
> use soccer, clear
> list
> keep round h_team h_points
> rename h_team team
> rename h_points points
> * additional dataset  soccerH that contains home team data
> save soccerH, replace
> use soccer, clear
> keep round a_team a_points
> rename a_team team
> rename a_points points
> append using soccerH
> egen teams=group(team)
> tsset teams round
> g win_streak = L.points + L2.points
> sort round team
> list

Here is a similar approach, but one which
is done in place, without any file manipulation:

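* one copy of each game for the home team, one for the away team
expand 2
bysort Round Home_team : gen Team = cond(_n == 1, Home_team, Away_team)
by Round Home_team : gen points = cond(_n == 1, Points_home_team, Points_away_team)
* tsset needs a numeric panel identifier
encode Team, gen(team)
tsset team Round
* points from the previous two rounds
gen win_streak = L.points + L2.points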
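
As a check, here is a sketch that types in the six games from the
question and applies the steps above. The variable win_streak2 is an
addition of mine, not part of either posted solution: it computes the
same sum with subscripts within team, which is what the original
question was reaching for once the data are in long form.

```stata
clear
input Round str1 Home_team str1 Away_team Points_home_team Points_away_team
1 A B 3 0
1 C D 1 1
2 B D 0 3
2 A C 1 1
3 D A 3 0
3 B C 1 1
end

* one copy of each game for the home team, one for the away team
expand 2
bysort Round Home_team : gen Team = cond(_n == 1, Home_team, Away_team)
by Round Home_team : gen points = cond(_n == 1, Points_home_team, Points_away_team)

* tsset needs a numeric panel identifier
encode Team, gen(team)
tsset team Round
gen win_streak = L.points + L2.points

* assumption, not in the posted solutions: the same sum via subscripts,
* taking the two previous games actually played by each team
bysort team (Round) : gen win_streak2 = points[_n-1] + points[_n-2]

list Round Team win_streak win_streak2 if Round == 3, noobs
```

With these data both variables give 4, 0, 2 and 4 for teams A to D in
round 3, as requested. The two would differ only if a team skipped a
round: L.points and L2.points would then be missing for the skipped
round, while the subscript version would still use the last two games
played.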

Nick
n.j.cox@durham.ac.uk



