Statalist The Stata Listserver


Re: st: Re: data management - loop?

From   n j cox
Subject   Re: st: Re: data management - loop?
Date   Tue, 22 May 2007 21:53:35 +0100

The same approach is taken further in -egen, mode()-. It is
careful about missing values, and offers some handles for
alternative definitions of modes in the presence of ties.

A way in would be

bysort household : egen modevar = mode(var), <options>
gen byte dummyvar = var == modevar if var < .
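
As a rough sketch of how that might look in practice, here is a toy
dataset (the values are invented purely for illustration) in which
household 2 has a tie. The choice of -minmode- is an assumption made
for the example; -maxmode- and -nummode()- are the other tie-handling
options of -egen, mode()-, and which is appropriate depends on the
problem.

```stata
* Toy data (hypothetical): household 1 has a clear mode (15),
* household 2 has a tie between 3 and 7
clear
input household var
1 15
1 15
1 20
2 3
2 3
2 7
2 7
end

* minmode resolves ties by taking the smallest modal value;
* without a tie option, -egen, mode()- returns missing when modes tie
bysort household : egen modevar = mode(var), minmode

* 1 if the observation equals the household mode, 0 otherwise,
* and missing if var itself is missing
gen byte dummyvar = var == modevar if var < .

list, sepby(household)
```

With -minmode-, only the observations with var equal to 3 are flagged
in household 2; the observations with var equal to 7 get 0.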

I don't think there is any obvious way to handle ties
for the mode. There might be a substantive solution to that.


Michael Blasnik wrote:

There's a much easier way -- without creating the 250 dummies or any looping:

bysort household var: gen count=_N
bysort household (count): gen byte dummy=count==count[_N]

In the event of ties (two values of var with equal frequency), both values
will be flagged by the dummy.
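
To illustrate the behaviour with ties, here is a minimal sketch of the
two lines above run on a made-up toy dataset (the values are
hypothetical). Note that observations with missing var would form a
group of their own under -bysort household var-, so they may need
separate handling.

```stata
* Toy data (hypothetical): household 1 has a clear mode (15),
* household 2 has a tie between 3 and 7
clear
input household var
1 15
1 15
1 20
2 3
2 3
2 7
2 7
end

* frequency of each value of var within each household
bysort household var : gen count = _N

* sort by frequency within household; the last observation carries
* the highest frequency, so flag every observation that matches it
bysort household (count) : gen byte dummy = count == count[_N]

list, sepby(household)
```

In household 2 every observation gets dummy equal to 1, because 3 and 7
both occur twice.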

Alexander Staus wrote:

> In my panel dataset I want a dummy for the most frequently occurring value
> of a variable. E.g., for a household a variable can take values from 1 to
> 250. If value 15 is the most frequently named value in one household, I want
> a dummy which is 1 when the household named 15 and 0 otherwise.
