


st: RE: Easy Question? Counting cases based on a "target" case

From   "David Radwin" <>
Subject   st: RE: Easy Question? Counting cases based on a "target" case
Date   Wed, 26 Dec 2012 10:24:58 -0800 (PST)


I don't think you need to loop over observations, but you can loop over
values, which is fairly efficient. Something like this:

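levelsof price, local(prices)
foreach p of local prices {
	// indicator: 1 if this car's price is within 2000 of the current level
	gen near`p' = inrange(price, `=`p'-2000', `=`p'+2000')
}
// number of distinct price levels within 2000, the car's own price included
egen countnear = rowtotal(near*)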

In the example above I use all prices, but you could instead replace the
-levelsof- and -foreach- lines with this single line:

	foreach p of numlist 1900 2500 4000 6500 10000 {
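
As a quick check, here is a minimal sketch that runs the same loop on the
five prices from the original question instead of the auto data. The
variable countothers is a name I made up for the count that excludes the
car itself; it is not part of the original post.

```stata
clear
input price
1900
2500
4000
6500
10000
end

levelsof price, local(prices)
foreach p of local prices {
	// indicator: 1 if this row's price is within 2000 of the current level
	gen near`p' = inrange(price, `=`p'-2000', `=`p'+2000')
}
egen countnear = rowtotal(near*)   // count includes the row itself
gen countothers = countnear - 1    // hypothetical name: count excluding itself
list price countnear countothers
```

By my arithmetic this should list countnear as 2, 3, 2, 1, 1 and
countothers as 1, 2, 1, 0, 0, since 6500 has no other price within 2000
of it.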

David Radwin
Senior Research Associate
MPR Associates, Inc.
2150 Shattuck Ave., Suite 800
Berkeley, CA 94704
Phone: 510-849-4942
Fax: 510-849-0794

> -----Original Message-----
> From: [mailto:owner-] On Behalf Of Ben Hoen
> Sent: Wednesday, December 26, 2012 10:06 AM
> To:
> Subject: st: Easy Question? Counting cases based on a "target" case
> I want to perform a function that I think would be easy but I can't wrap my
> head around how to perform it without looping through each case.
> I want to create a count of the number of records in the file that meet
> certain criteria based on a respective case's value.  So, for example,
> using the auto dataset:
> *====================begin
> sysuse auto, clear
> g id=_n
> egen nearprice2000=count(id) if... // count the number of other cases in the
> // dataset whose price is within $2000 of this case's (i.e., target) car's price
> *====================end
> The egen command is how I thought I would resolve this, but I can't figure
> it out exactly.  For each case, nearprice2000 would equal the number of
> other cases in the dataset that have a price within +/- $2000 of that
> case's price.  So if the full dataset had only 5 prices: 1900, 2500, 4000,
> 6500, and 10000, their respective values would be: 2, 3, 2, 1, and 1 (if
> itself would be included in the count) or 1, 2, 1, 0, and 0 (if itself
> would NOT be included in the count).
> I might be able to do this by looping through the cases, but I know that is
> not encouraged by other more experienced users.
> Any advice would be greatly appreciated.
> Ben
