Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Assigning new values to group variables

From: Florian Seliger
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Assigning new values to group variables
Date: Wed, 11 May 2011 14:41:49 -0700

```
Dear Robert,

It took me a while to understand the logic behind your code, but it seems to work perfectly.

Thank you very much!

On 09.05.2011 at 08:23, Robert Picard wrote:

> There are many issues here but I assume that you want to preserve the
> relationship found in each observation. The following example creates
> a variable called rel_id that identifies each relationship. Your main
> issue of having consistent Group values is done by converting the data
> to long form. Then I create a new variable called gid that identifies
> groups of companies based on the relationships stated in the initial
> dataset. This requires a program of mine called -group_id-, available
> from SSC. Just in case you needed it, I convert back to wide form.
>
> Hope this helps,
>
> Robert
>
> * --------------------- begin example ---------------------
> clear
> input Group1 str10 Var1 Group2 str10 Var2
> 1 companyABC 1 companyABD
> 1 companyABC . .
> 2 companyABD . .
> 3 companyABE . .
> 4 companyABF 2 companyCCC
> 5 companyACF 3 companyDDD
> 6 companyACG . .
> end
>
> * Assign a unique identifier to each observation
> * These identify a relationship
>
> gen rel_id = _n
>
> * Reshape to long form; drop obs with no company
>
> reshape long Group Var, i(rel_id) j(j)
> drop if Var == "."
>
> * Disregard Group values if they are not Group1
>
> replace Group = . if j > 1
>
> * Each company should have the same Group value
>
> sort Var Group
> by Var: replace Group = Group[1]
>
> * Assign new Group values for companies that were
> * not part of Group1
>
> by Var: gen first = _n == 1
> sum Group, meanonly
> replace Group = r(max) + sum(first) if Group == .
> drop first
>
> * Group companies when they are part of the same
> * relationship. This requires -group_id-, available
> * from SSC. To install, type ssc install group_id
>
> gen gid = Group
> group_id gid, matchby(rel_id)
> sort gid Var
> list, sepby(gid) noobs
>
> * If desired, convert back to wide
>
> sort rel_id
> reshape wide Var Group gid, i(rel_id) j(j)
> list, noobs sep(0)
> * --------------------- end example -----------------------
>
> On Mon, May 9, 2011 at 7:35 AM, Florian Seliger <florian.seliger@gmx.de> wrote:
>> Dear Statalist,
>>
>> I have a dataset from a firm survey containing several thousand observations.
>>
>> There are six variables with company names (Var1-Var6), in which firms indicate the other firms they have relationships with.
>>
>> Similar companies may occur across Var1-Var6. These are grouped as indicated by the variables Group1-Group6.
>>
>> Var2-Var6 contain many missing values because many firms report a relationship with only a single firm.
>>
>> The variables Group1-Group6 use different numbering even when the company is the same in Var1 and Var2 (and Var3, ...); e.g., Group1 may take the value 2 whereas Group2 takes the value 1 for the same company. A further problem is that Var2-Var6 may contain companies that do not occur in Var1.
>>
>> Please see the example below for a few companies.
>>
>> Group1  Var1        Group2  Var2
>> 1       companyABC  1       companyABD
>> 1       companyABC  .       .
>> 2       companyABD  .       .
>> 3       companyABE  .       .
>> 4       companyABF  2       companyCCC
>> 5       companyACF  3       companyDDD
>> 6       companyACG  .       .
>>
>> In the end, all similar companies across Var1-Var6 should carry the same value as in Group1. In addition, companies that do not occur in Var1 should be assigned a new number. Please see the example below.
>>
>> Group1  Var1        Group2  Var2
>> 1       companyABC  1       .
>> 1       companyABC  1       .
>> 2       companyABD  2       companyABD
>> 3       companyABE  3       .
>> 4       companyABF  4       .
>> 5       companyACF  5       .
>> 6       compaynACG  6       .
>> 6       companyACG  6       .
>>
>> 9       .           9       companyCCC
>> 10      .           10      companyDDD
>> 11      .           11      companyCCD
>>
>> As I have not found the right approach in Stata for assigning new numbers when a company does not occur in Var1, I would like to ask whether you have any ideas.
>>
>>
>>
>> Thank you.
>>
>>
>>
>> Best,
>>
>> Florian
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>
>
```
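For readers who, like Florian, want to see the logic behind the grouping step: the sketch below imitates the kind of transitive linking that Robert's `group_id gid, matchby(rel_id)` call is used for, with built-in commands only. It is not the implementation of -group_id-. It is a rough stand-in, assuming that identifiers should be merged whenever they appear in the same relationship. The variable names `gid2` and `newid` and the local `changed` are hypothetical, and the long-form data are typed in by hand to resemble the state of Robert's example just before `gen gid = Group` (the values 13 and 14 are placeholders for the two companies that never occur in Var1).

```stata
* Toy long-form data: one row per (relationship, company)
clear
input rel_id Group str10 Var
1  1 companyABC
1  2 companyABD
2  1 companyABC
3  2 companyABD
4  3 companyABE
5  4 companyABF
5 13 companyCCC
6  5 companyACF
6 14 companyDDD
7  6 companyACG
end

* Start from the company-level Group value
gen gid2 = Group

* Repeat until no label changes: labels only ever decrease,
* so the loop stops after a finite number of passes
local changed = 1
while `changed' {
    * lowest current label within each relationship
    bysort rel_id (gid2): gen double newid = gid2[1]
    * spread the lowest candidate to every row that shared the old label
    bysort gid2 (newid): replace newid = newid[1]
    * count how many rows would receive a different label
    count if newid != gid2
    local changed = r(N)
    replace gid2 = newid
    drop newid
}

sort gid2 Var
list, sepby(gid2) noobs
```

On this toy data, companyABC and companyABD should end up sharing one `gid2` value because relationship 1 links them, and likewise companyABF with companyCCC and companyACF with companyDDD. For several thousand observations, -group_id- from SSC remains the more practical choice. The loop is only meant to make the idea visible.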