Statalist


Re: st: Marking Levels of Categorical Variable


From   "Friedrich Huebler" <fhuebler@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Marking Levels of Categorical Variable
Date   Wed, 24 Sep 2008 22:09:11 -0400

Is your variable in string or numeric format? The following example
assumes string format. Missing values (empty strings) are excluded
from the total, and newvar is left missing for those observations.

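* Count nonmissing observations; this is the denominator for the shares
count if var!=""
local n = r(N)
* Start at 0 for every nonmissing observation
gen newvar = 0 if var!=""
* Loop over the distinct values of var
levelsof var, local(levels)
foreach l of local levels {
  count if var=="`l'"
  * Strict inequality: a level at exactly 10% keeps newvar = 0.
  * Use >= instead of > to include it, as for C in your example.
  replace newvar = 1 if var=="`l'" & r(N)/`n' > 0.1
}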
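
If the variable might be numeric, or if you would rather avoid the loop,
a minimal sketch along the following lines may also work. It relies on
missing(), which accepts both string and numeric arguments, and on _N,
which under by: is the number of observations in the current group. The
names nonmiss, share, and newvar2 are made up for illustration and are
not part of the example above.

```stata
* Flag nonmissing observations (missing() handles strings and numbers)
gen byte nonmiss = !missing(var)
quietly count if nonmiss
local n = r(N)
* Share of each level among the nonmissing observations
bysort var: gen double share = _N/`n' if nonmiss
* 1 if the level's share exceeds 10%, 0 otherwise, missing if var is missing
gen byte newvar2 = share > 0.1 if nonmiss
```

As in the loop version, the comparison is strict, so a level at exactly
10% gets 0 unless > is changed to >=. Note that bysort re-sorts the data
by var.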

Friedrich

On Wed, Sep 24, 2008 at 9:14 PM,  <jasonm@ucla.edu> wrote:
> I have a categorical variable with 30 levels. How do I create a variable
> that is equal to 1 if a category of the variable shows up more than 10% of
> the time.
>
> For example:
> var  Percent
> A      5
> B      5
> C      10
> D      20
> E      60
> How would I create "newvar" equal to 1 for C, D, and E and equal to 0 for A
> and B?