Statalist The Stata Listserver



Re: st: Re: data management - loop?


From   n j cox <n.j.cox@durham.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: data management - loop?
Date   Tue, 22 May 2007 21:53:35 +0100

The same approach is taken further in -egen, mode()-. It is
careful about missing values, and offers some handles for
alternative definitions of modes in the presence of ties.

A way in would be

bysort household : egen modevar = mode(var), <options>
gen byte dummyvar = var == modevar if var < .
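
As a minimal sketch of what <options> might look like, the toy example below uses the -minmode- option of -egen, mode()-, which picks the lowest mode when there are ties. The data values are invented purely for illustration, and -minmode- is only one possible choice (-maxmode- and -nummode()- are the others).

```stata
* Toy data: hypothetical households and values, for illustration only
clear
input household var
1 15
1 15
1 7
2 3
2 4
2 .
end

* minmode resolves ties by taking the lowest mode; without a tie-breaking
* option, -egen, mode()- returns missing for groups with several modes
bysort household : egen modevar = mode(var), minmode

* dummy is 1 for the modal value, 0 otherwise, and missing when var is missing
gen byte dummyvar = var == modevar if var < .

list, sepby(household)
```

In household 2 the values 3 and 4 are tied, so under -minmode- only 3 should be flagged. Whether the lowest tied value is the right choice is a substantive decision, not a technical one.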

I don't think there is any obvious general rule for handling ties
for the mode. There might be substantive solution(s) to that.

Nick
n.j.cox@durham.ac.uk

Michael Blasnik

There's a much easier way -- without creating the 250 dummies or any looping:

bysort household var: gen count=_N
bysort household (count): gen byte dummy=count==count[_N]

In the event of ties (two or more values of var sharing the highest frequency),
all of the tied values will be flagged by the dummy.
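
The two-line approach above counts missing values of var like any other value, so a household with many missings could end up with missing as its "mode". A hedged variant, assuming missing values should not compete for the mode (that assumption, and the toy data, are not from the original posts), might look like this:

```stata
* Toy data: hypothetical households and values, for illustration only
clear
input household var
1 15
1 15
1 7
2 3
2 4
2 .
2 .
end

* Frequency of each value of var within household
bysort household var : gen count = _N

* Assumption: missing values should never be the most frequent "value"
replace count = 0 if missing(var)

* After sorting by count within household, the last observation holds the
* highest frequency; flag every nonmissing value that attains it
bysort household (count) : gen byte dummy = count == count[_N] if !missing(var)

list, sepby(household)
```

In household 2 the values 3 and 4 each occur once and are tied, so both get dummy equal to 1, while the two missing observations get a missing dummy.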

Alexander Staus

> in my panel dataset I want a dummy for the most occurred value in a variable.
>
> e.g. for a household a variable can take values from 1 to 250, value 15 is
> the most named value in one household, so I want a dummy which is 1 when the
> household named 15, otherwise 0.




