
From  "Gauri Khanna" <gwkhanna@hotmail.com> 
To  statalist@hsphsun2.harvard.edu 
Subject  RE: st: Dropping/Keeping observations25 April 
Date  Thu, 27 Apr 2006 13:56:04 +0000 
Thank you Maarten. Gauri
From: "Maarten Buis" <M.Buis@fsw.vu.nl>
ReplyTo: statalist@hsphsun2.harvard.edu
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Dropping/Keeping observations25 April
Date: Tue, 25 Apr 2006 18:45:15 +0200
Gauri:
Michael pointed this out already, but our discussion might
have been a bit too terse.
My answer was coloured by the typical reason why I would
use such keep or drop commands: in combination with
graphs. I keep just one case for each combination of x
and y, since multiple observations would just be
plotted on top of one another, so they don't show but do
make your graph file a lot larger. This is probably not the
case for you, so Michael's solution is just fine.
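A minimal sketch of that graphing use, assuming the auto dataset that ships with Stata; weight and length are only stand-ins for y and x and are not taken from the original example:

```stata
* Illustrative only: weight and length stand in for y and x
sysuse auto, clear
* preserve the full data so the reduction only affects the graph
preserve
* keep one case for each combination of x and y
bysort weight length: keep if _n == _N
scatter weight length
* bring back the full dataset
restore
```

The plot looks the same as one drawn from the full data, because the deleted cases would have been drawn exactly on top of the ones that remain.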
The command I used that dropped cases was:
by rep78: keep if _n==_N & _N>1 & rep78 <. 
So my command contained three conditions a case must
meet in order to be retained in the dataset:
_n==_N: _n is the current observation number within the
by group, and _N is the total number of cases within the by
group. So if a group consists of 5 cases, then only the last
case (with observation number 5) will be retained. This is
why my command deletes more cases than Michael's does.
_N>1: only groups with more than one case are retained.
rep78 < .: only cases with nonmissing values on
the grouping variable are retained.
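To see the difference between the two approaches side by side, here is a rough sketch on the auto data, with rep78 as the grouping variable and the same artificial one-case group as in the example. The count commands are added only so the remaining number of cases can be compared; no particular counts are claimed here.

```stata
* Michael's approach: drop only the groups that consist of a single case
sysuse auto, clear
* create one group with a single case, as in the example
replace rep78 = 10 in 10
bysort rep78: drop if _N == 1
count

* Three-condition version: keep the last case of each nonmissing
* group that has more than one case
sysuse auto, clear
replace rep78 = 10 in 10
bysort rep78: keep if _n == _N & _N > 1 & rep78 < .
count
```

The first version leaves every case of the larger groups in the data; the second leaves exactly one case per such group, so it always deletes at least as many cases.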
In my example code I needed a group with only one case
to show that it would drop that group. The grouping
variable (rep78) in the example case did not contain
such a group, so I created one by replacing one value
with a unique number. This was done with the command
replace rep78 = 10 in 10
So in your case you should not include this line.
HTH,
Maarten

Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z214
+31 20 5986715
http://home.fsw.vu.nl/m.buis/

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Gauri Khanna
Sent: dinsdag 25 april 2006 18:15
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Dropping/Keeping observations25 April
I tried Michael's approach and typed
bysort fid: drop if _N==1
*113 farmers remained in the dataset and 49 plots with only one farmer id
were dropped.
I decided to try Maarten's method but admit that I don't understand
completely what I am doing. So I followed the instructions exactly
(pseudopanel is the name of my dataset)
. sysuse pseudopanel, clear
. drop if stype==2
(164 observations deleted)
(I needed to do this for another reason: to eliminate plots on which
another crop was being grown. So now I am down to the crop of interest
and have data on plots where some farmers' ids appear only once, which
I need to get rid of.)
. replace fid = 10 in 10 /*create a unique value in fid*/
(1 real change made)
Gauri: what does this mean ?
. sort fid
. list fid in 1/10
     +-----+
     | fid |
     |-----|
  1. |   1 |
  2. |   2 |
  3. |   2 |
  4. |   3 |
  5. |   3 |
     |-----|
  6. |   3 |
  7. |   4 |
  8. |   4 |
  9. |   5 |
 10. |   6 |
     +-----+
. by fid: keep if _n==_N & _N>1 & fid <.
(110 observations deleted)
Gauri: only 49 plots had one fid (farmerid) associated with it. So this has
dropped more...
. list fid
     +-----+
     | fid |
     |-----|
  1. |   2 |
  2. |   3 |
  3. |   4 |
<snip>
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/