Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

st: Selecting specific observations based on two variables

From   henrik andersson <>
To   "" <>
Subject   st: Selecting specific observations based on two variables
Date   Wed, 19 Jun 2013 13:15:11 +0000


I would like to select specific observations from my data set and wonder whether anyone knows the most efficient and reliable way to do it.

I have data on properties, and some individual properties have several observations. I want to use only one observation per property. The data and variables of interest look like this:

	Id	Year	
1.	1	2001
2.	1	2002
3.	1	2005
4.	1	2009
5.	1	2011
6.	2	2001
7.	2	2002
8.	2	2003
9.	2	2004
10.	2	2006
11.	2	2007
12.	2	2011

As explained, I want to use only one observation per property, defined by Id, and I want to use the observation closest to the value 2009 in the variable Year. For instance, for Id==1 I would use observation 4. For Id==2 there is a tie between observations 11 and 12. Here I would like to use the earlier observation, i.e. the one with the lowest value for Year, which means that I would use observation 11.

Does anyone have an idea how to create a dummy equal to one if the criteria below are met and zero otherwise?

I want to:

(a) Use only one observation per Id
(b) Use the observation whose value of Year is closest in absolute terms to 2009 (if Year==2009, then that observation(s) should be chosen).
(c) If there is a tie in (b), use the observation with the lowest value of Year.

In addition, if the above criteria cannot single out one observation per Id, e.g. if there are two observations in the year 2009, it would be great if Stata could then randomly decide which one to pick.
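
For concreteness, criteria (a) to (c) plus the random tie-break could be sketched roughly as follows. This is only one possible approach, not a tested solution: it assumes the variables are named Id and Year exactly as in the listing above, and the names dist, u and pick, as well as the seed value, are made up for illustration.

```stata
* Sketch only: assumes numeric variables Id and Year as in the listing above.
* dist, u and pick are hypothetical helper names; the seed is arbitrary.
set seed 12345                          // makes the random tie-break reproducible
generate dist = abs(Year - 2009)        // (b) absolute distance from 2009
generate double u = runiform()          // random key, matters only if Id, dist and Year all tie
sort Id dist Year u                     // (c) within equal distance, the earlier Year sorts first
by Id: generate byte pick = (_n == 1)   // (a) flag exactly one observation per Id
drop dist u
```

With the example data, this ordering would put observation 4 (Year 2009, distance 0) first for Id==1, and observation 11 (Year 2007, distance 2, earlier than 2011) first for Id==2, so pick would equal one for those two observations and zero elsewhere. The sort changes the order of the data, so an observation-number variable would need to be created beforehand if the original order matters.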

I am using Stata SE 11.2 and Stata MP 12.1.

Thank you in advance

