# RE: st: Marking Levels of Categorical Variable

From: "Martin Weiss" <[email protected]>
To: <[email protected]>
Subject: RE: st: Marking Levels of Categorical Variable
Date: Thu, 25 Sep 2008 09:44:28 +0200

```Following up on Friedrich's reply, it is easy enough to apply his idea to
non-string variables:

**********
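program define tenpercent
    version 10.1
    args varname
    * Denominator: observations with a non-missing value
    * (missing() also catches extended missing values such as .a)
    count if !missing(`varname')
    local n = r(N)
    gen new`varname' = 0 if !missing(`varname')
    levelsof `varname', local(levels)

    * Flag each level whose share of non-missing observations exceeds 10%
    foreach l of local levels {
        count if `varname' == `l'
        replace new`varname' = 1 if `varname' == `l' & r(N)/`n' > 0.1
    }
end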
**********

HTH
Martin

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Friedrich Huebler
Sent: Thursday, September 25, 2008 4:09 AM
To: [email protected]
Subject: Re: st: Marking Levels of Categorical Variable

Is your variable in string or numeric format? The following example
assumes string format. Missing values are excluded from the analysis.

count if var!=""
local n = r(N)
gen newvar = 0 if var!=""
levelsof var, local(levels)
foreach l of local levels {
count if var=="`l'"
replace newvar = 1 if var=="`l'" & r(N)/`n' > 0.1
}

Friedrich

On Wed, Sep 24, 2008 at 9:14 PM,  <[email protected]> wrote:
> I have a categorical variable with 30 levels. How do I create a variable
> that is equal to 1 if a category of the variable shows up more than 10% of
> the time.
>
> For example:
> var  Percent
> A      5
> B      5
> C      10
> D      20
> E      60
> How would I create "newvar" equal to 1 for C, D, and E and equal to 0 for A
> and B?
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
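For comparison, here is a minimal loop-free sketch of the same idea. It is not taken from either reply: it uses -bysort- with _N in place of the -levelsof- loop, and it should work for string and numeric variables alike because missing() accepts both. The toy data are hypothetical, built only to reproduce the percentages in the original question (20 observations: A 5%, B 5%, C 10%, D 20%, E 60%).

```stata
* Hypothetical data matching the shares in the question
clear
input str1 var byte freq
"A" 1
"B" 1
"C" 2
"D" 4
"E" 12
end
expand freq
drop freq

* Denominator: number of non-missing observations
count if !missing(var)
local n = r(N)

* Within each level, _N is that level's frequency
bysort var: gen byte newvar = (_N/`n' > 0.1) if !missing(var)

tabulate var newvar
```

With the strict inequality used throughout this thread, a level sitting at exactly 10% (C in this toy data) is coded 0. If such a level should be flagged as well, `>=` in place of `>` would presumably be the change to make.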