
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org, is already up and running.



Re: st: stacking unique values of several variables under one new variable


From   James Bernard <jamesstatalist@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: stacking unique values of several variables under one new variable
Date   Mon, 25 Feb 2013 21:20:48 +0800

thanks a lot

helpful as usual

On Mon, Feb 25, 2013 at 4:44 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> For "unique" read "distinct".
>
> My code is very similar to Maarten's but I will post it nevertheless.
>
> If it's as simple as your example implies then you can do this:
>
> . gen long obs = _n
>
> . split technology , p(,)
> variables created as string:
> technology1  technology2
>
> . local k = r(nvars)
>
> . expand `k'
> (4 observations created)
>
> . forval j = 1/`k' {
>   2.     bysort obs : replace technology = technology`j'[1] if _n == `j'
>   3. }
> (2 real changes made)
> (4 real changes made)
>
> . drop if missing(technology)
> (2 observations deleted)
>
> . replace technology = trim(technology)
> (2 real changes made)
>
> . drop technology?
>
> . duplicates drop technology, force
>
> Duplicates in terms of technology
>
> (1 observation deleted)
>
> . list
>
>      +-------------------+
>      |  technology   obs |
>      |-------------------|
>   1. | Monoclonals     1 |
>   2. |    Vaccines     2 |
>   3. |    Adjuvant     3 |
>   4. |     Vaccine     3 |
>   5. |  Combinchem     4 |
>      +-------------------+
>
> Here's the code in one block:
>
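> * Tag each original record so that its copies can be grouped later
> gen long obs = _n
> * Split on commas into technology1, technology2, ...
> split technology , p(,)
> local k = r(nvars)
> * Give every record k observations, one per possible component
> expand `k'
> * Within each record, the j-th observation takes the j-th component
> forval j = 1/`k' {
>     bysort obs : replace technology = technology`j'[1] if _n == `j'
> }
> * Drop observations with no component, then strip stray spaces
> drop if missing(technology)
> replace technology = trim(technology)
> * Remove the helper variables (technology? matches technology1 to technology9)
> drop technology?
> * Keep one observation per distinct value
> duplicates drop technology, force
> list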
>
> Notes: Treating "Vaccines" and "Vaccine" (and any similar variants)
> as the same value will need extra code.
>
> Maarten's code assumes that the separator is always ", ". I don't
> assume that there is always a space, so I am obliged to trim spaces
> afterwards.
>
> Nick
>
> On Mon, Feb 25, 2013 at 6:15 AM, James Bernard <jamesstatalist@gmail.com> wrote:
>
>> I have been struggling with the following. I would appreciate your help.
>>
>> I have a variable ("Technology") that indicates the type(s) of technology
>> for each record. I want to aggregate the unique values of this
>> variable under one new variable, say, called "Type":
>>
>>
>> Technology
>> -------------------------
>> Monoclonals
>> Vaccines
>> Adjuvant, Vaccine
>> Combinchem, Monoclonals
>>
>> Now, I want to create a variable that stores the unique values:
>>
>> Type
>> -----------
>> Monoclonals
>> Vaccines
>> Adjuvant
>> Combinchem
>>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
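
For comparison, here is a minimal sketch of a reshape-based variant of the approach quoted above. It is not from the thread. The names part, which and techtype are hypothetical, the input block simply re-creates the four example records from the question, and the rule that "Vaccine" should be recoded to "Vaccines" is an assumption about what the original poster wants, not something the data can tell you.

```stata
* Hypothetical example data: the four records from the question
clear
input str30 technology
"Monoclonals"
"Vaccines"
"Adjuvant, Vaccine"
"Combinchem, Monoclonals"
end

* Identifier for each original record, needed by reshape
gen long obs = _n

* Split on commas into part1, part2, ... (stub name is hypothetical)
split technology, parse(,) generate(part)
drop technology

* One observation per record and component
reshape long part, i(obs) j(which)

* Strip stray spaces, then drop records' empty components
replace part = trim(part)
drop if missing(part)

* Assumption: "Vaccine" and "Vaccines" denote the same technology
replace part = "Vaccines" if part == "Vaccine"

* Keep one observation per distinct value
rename part techtype
duplicates drop techtype, force
list techtype
```

With the example data this should leave the four values the original poster listed (Monoclonals, Vaccines, Adjuvant, Combinchem). Any further spelling variants would each need their own replace line, or some more general rule, before the duplicates are dropped.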

