Statalist The Stata Listserver


RE: st: Dropping/Keeping observations-25 April

From   "Gauri Khanna" <[email protected]>
To   [email protected]
Subject   RE: st: Dropping/Keeping observations-25 April
Date   Thu, 27 Apr 2006 13:56:04 +0000

Thank you Maarten.


From: "Maarten Buis" <[email protected]>
Reply-To: [email protected]
To: <[email protected]>
Subject: RE: st: Dropping/Keeping observations-25 April
Date: Tue, 25 Apr 2006 18:45:15 +0200

Michael pointed this out already, but our discussion might
have been a bit too terse.

My answer was tainted by the typical reason why I would
use such -keep- or -drop- commands in combination with
graphs. I keep just one case for each combination of x
and y, since multiple observations would just be
plotted on top of one another: they don't show, but they
do make your graph file a lot larger. This is probably not
the case for you, so Michael's solution is just fine.
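
To illustrate the graph use case, here is a minimal sketch. It assumes
the auto data shipped with Stata, with rep78 and foreign standing in for
x and y; neither variable comes from your dataset.

```stata
sysuse auto, clear
* rep78 and foreign are both discrete, so many points coincide exactly
* keep one case per combination of the two plotted variables
bysort rep78 foreign: keep if _n == 1
* the plot looks the same as before, but holds far fewer markers
scatter foreign rep78
```

Because this throws away data, I would only do it on a copy of the
dataset made for the graph.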

The command I used that dropped cases was:
-by rep78: keep if _n==_N & _N>1 & rep78 <.-

So my command contained three conditions a case should
match in order to be retained in the dataset:

_n==_N: _n is the current observation number within the
by group, and _N is the total number of observations within
the by group. So if a group consists of 5 cases, then only
the last case (with observation number 5) will be retained.

_N>1: only groups with more than one case are retained.

rep78 < .: only retain cases with non-missing values on
the grouping variable.
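
The three conditions can also be written out one at a time, which may
make it easier to see what each does. This is only a sketch on the auto
data; the variable names position, groupsize, is_last, not_single and
nonmissing are made up for the illustration.

```stata
sysuse auto, clear
* within-group observation number and group size, stored before dropping
bysort rep78: generate position = _n
by rep78: generate groupsize = _N
* one indicator per condition
generate byte is_last    = position == groupsize
generate byte not_single = groupsize > 1
generate byte nonmissing = rep78 < .
* same selection as the one-line -keep- command
keep if is_last & not_single & nonmissing
list rep78 position groupsize
```

Listing position and groupsize next to rep78 shows that each retained
case is the last one of a group with more than one case.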

In my example code I needed a group with only one case
to show that it would drop that group. The grouping
variable (rep78) in the example case did not contain
such a group, so I created one by replacing one value
with a unique number. This was done with the command
-replace rep78 = 10 in 10-. So in your case you should not
include this line.
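
A sketch of what that line does in the example, again on the auto data:
after the -replace-, rep78 has one group with a single case, and that
group is gone after the -keep-. The variable groupsize is made up for
the illustration.

```stata
sysuse auto, clear
* example only: give the 10th observation a value no other case has
replace rep78 = 10 in 10
* store the group sizes so the single-case group can be inspected
bysort rep78: generate groupsize = _N
tabulate rep78 if groupsize == 1, missing
* the group with rep78 == 10 fails _N>1 and is dropped
by rep78: keep if _n == _N & _N > 1 & rep78 < .
list rep78 groupsize
```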


Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z214

+31 20 5986715

-----Original Message-----
From: [email protected] [mailto:[email protected]]On Behalf Of Gauri Khanna
Sent: Tuesday 25 April 2006 18:15
To: [email protected]
Subject: Re: st: Dropping/Keeping observations-25 April

I tried Michael's approach and typed
bysort fid: drop if _N==1
*113 farmers remained in the dataset and 49 plots with only one farmer id
were dropped.

I decided to try Maarten's method but admit that I don't understand
completely what I am doing. So I followed the instructions exactly
(pseudopanel is the name of my dataset)

. sysuse pseudopanel, clear

. drop if stype==2
(164 observations deleted) (I need to do this for another reason: I needed
to eliminate plots on which another crop was being grown. So now I am down
to the crop of interest and have data on plots where some farmers' ids
appear only once, which I need to get rid of.)

. replace fid = 10 in 10 /*create a unique value in fid*/
(1 real change made)
Gauri: what does this mean ?

. sort fid

. list fid in 1/10

| fid |
1. | 1 |
2. | 2 |
3. | 2 |
4. | 3 |
5. | 3 |
6. | 3 |
7. | 4 |
8. | 4 |
9. | 5 |
10. | 6 |

. by fid: keep if _n==_N & _N>1 & fid <.
(110 observations deleted)
Gauri: only 49 plots had one fid (farmerid) associated with it. So this has
dropped more...
. list fid

| fid |
1. | 2 |
2. | 3 |
3. | 4 |
