# Re: st: stacking unique values of several variables under one new variable

From Nick Cox
To statalist@hsphsun2.harvard.edu
Subject Re: st: stacking unique values of several variables under one new variable
Date Mon, 25 Feb 2013 08:44:37 +0000

```
For "unique" read "distinct".

My code is very similar to Maarten's but I will post it nevertheless.

If it's as simple as your example implies then you can do this:

. gen long obs = _n

. split technology , p(,)
variables created as string:
technology1  technology2

. local k = r(nvars)

. expand `k'
(4 observations created)

. forval j = 1/`k' {
2.     bysort obs : replace technology = technology`j'[1] if _n == `j'
3. }
(2 real changes made)
(4 real changes made)

. drop if missing(technology)
(2 observations deleted)

. replace technology = trim(technology)
(2 real changes made)

. drop technology?

. duplicates drop technology, force

Duplicates in terms of technology

(1 observation deleted)

. list

     +-------------------+
     |  technology   obs |
     |-------------------|
  1. | Monoclonals     1 |
  2. |    Vaccines     2 |
  3. |    Adjuvant     3 |
  4. |     Vaccine     3 |
  5. |  Combinchem     4 |
     +-------------------+

Here's the code in one piece:

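* Record which original observation each value came from
gen long obs = _n
* Split on commas into technology1, technology2, ...
split technology , p(,)
local k = r(nvars)
* Make one copy of each observation per new variable
expand `k'
* Within each original observation, copy j receives the j-th piece
forval j = 1/`k' {
    bysort obs : replace technology = technology`j'[1] if _n == `j'
}
* Copies with no j-th piece are empty, so drop them
drop if missing(technology)
* Remove any spaces left around the pieces
replace technology = trim(technology)
drop technology?
* Keep one observation per distinct value
duplicates drop technology, force
list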
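
* To try the code above, the example data can be typed in first.
* This is only a sketch: it assumes the data are exactly the four
* values shown in the question, held in a string variable named
* technology (str30 is an arbitrary width chosen for illustration).
clear
input str30 technology
"Monoclonals"
"Vaccines"
"Adjuvant, Vaccine"
"Combinchem, Monoclonals"
end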

Notes: Treating "Vaccines" and "Vaccine" (and any similar variants) as
the same value will need extra code.
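
* A minimal sketch of such extra code, assuming the only variant to
* reconcile is the singular "Vaccine". Other variants would each need
* their own rule. Run this before -duplicates drop-, so that the
* recoded values are then recognised as duplicates.
replace technology = "Vaccines" if technology == "Vaccine"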

Maarten's code assumes that the separator is always ", ". I don't
assume that a space always follows the comma, so I have to trim
spaces afterwards.
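
* A small illustration of the point about separators, using two
* made-up values (an assumption, not the poster's data): one with a
* space after the comma and one without. Splitting on "," alone
* handles both, but the first leaves a leading space to trim.
clear
input str30 technology
"Adjuvant, Vaccine"
"Combinchem,Monoclonals"
end
split technology , p(,)
foreach v of varlist technology? {
    replace `v' = trim(`v')
}
list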

Nick

On Mon, Feb 25, 2013 at 6:15 AM, James Bernard <jamesstatalist@gmail.com> wrote:

> I have been struggling with the following. I would appreciate your help.
>
> I have a variable ("Technology") that indicates the type(s) of technology
> for each record. I want to aggregate the unique values of this
> variable under one new variable, say, called "Type":
>
>
> Technology
> -------------------------
> Monoclonals
> Vaccines
> Adjuvant, Vaccine
> Combinchem, Monoclonals
>
> Now, I want to create a variable that stores unique values:
>
> Type
> -----------
> Monoclonals
> Vaccines
> Adjuvant,
> Combinchem
>
