
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Assigning new values to group variables


From   Robert Picard <[email protected]>
To   [email protected]
Subject   Re: st: Assigning new values to group variables
Date   Mon, 9 May 2011 11:23:59 -0400

There are several issues here, but I assume that you want to preserve the
relationship recorded in each observation. The following example creates
a variable called rel_id that identifies each relationship. Your main
problem, getting consistent Group values, is handled by converting the
data to long form. I then create a new variable called gid that identifies
groups of companies linked by the relationships stated in the initial
dataset. This requires a program of mine called -group_id-, available
from SSC. In case you need it, I convert back to wide form at the end.

Hope this helps,

Robert

* --------------------- begin example ---------------------
clear
input Group1 str10 Var1 Group2 str10 Var2
1 companyABC 1 companyABD
1 companyABC . .
2 companyABD . .
3 companyABE . .
4 companyABF 2 companyCCC
5 companyACF 3 companyDDD
6 companyACG . .
6 companyACG 4 companyADK
7 companyADK . .
8 companyADL 5 companyCCD
8 companyADL . .
end

* Assign a unique identifier to each observation
* These identify a relationship

gen rel_id = _n

* Reshape to long form; drop obs with no company

reshape long Group Var, i(rel_id) j(j)
drop if Var == "."

* Disregard Group values if they are not Group1

replace Group = . if j > 1

* Each company should have the same Group value

sort Var Group
by Var: replace Group = Group[1]

* Assign new Group values for companies that were
* not part of Group1

by Var: gen first = _n == 1
sum Group, meanonly
replace Group = r(max) + sum(first) if Group == .
drop first

* Group companies when they are part of the same
* relationship. This requires -group_id-, available
* from SSC. To install, type: ssc install group_id

gen gid = Group
group_id gid, matchby(rel_id)
sort gid Var
list, sepby(gid) noobs
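
* Optional sanity checks, not part of the original example (a sketch
* that assumes the data are still in long form at this point):
* each company should have a single Group value, and no Group
* value should be shared by two different companies. -assert-
* stops with an error if either condition fails.

bysort Var (Group): assert Group[1] == Group[_N]
bysort Group (Var): assert Var[1] == Var[_N]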

* If desired, convert back to wide

sort rel_id
reshape wide Var Group gid, i(rel_id) j(j)
list, noobs sep(0)
* --------------------- end example -----------------------




On Mon, May 9, 2011 at 7:35 AM, Florian Seliger <[email protected]> wrote:
> Dear Statalist,
>
> I have a dataset from a firm survey containing several thousand observations.
>
> There are six variables with company names (Var1-Var6) in which firms indicate the other firms they have relationships with.
>
> The same company may occur more than once within Var1-Var6. Companies are grouped as indicated by the variables group1-group6.
>
> Var2-Var6 contain many missing values because many firms report a relationship with only one other firm.
>
> The variables group1-group6 have different numbers although the companies are the same in var1 and var2 (and var3…), e.g., group1 may take on value 2 whereas group2 takes on value 1 for the same company. The problem is that var2-var6 may also contain companies that do not occur in var1.
>
> Please see the example below for a few companies.
>
>
>
> Group1          Var1                       Group2          Var2
>
> 1                     companyABC            1                  companyABD
>
> 1                     companyABC            .                       .
>
> 2                     companyABD            .                       .
>
> 3                     companyABE            .                       .
>
> 4                     companyABF            2                  companyCCC
>
> 5                     companyACF            3                  companyDDD
>
> 6                     companyACG            .                       .
>
> 6                     companyACG            4                  companyADK
>
> 7                     companyADK            .                       .
>
> 8                     companyADL            5                  companyCCD
>
> 8                     companyADL            .                       .
>
>
>
> In the end, identical companies across Var1-Var6 should all have the same value as in group1. In addition, companies that do not occur in Var1 should be assigned a new number. Please see the example below.
>
>
>
>
>
> Group1          Var1                        Group2          Var2
>
> 1                     companyABC            1                     .
>
> 1                     companyABC            1                     .
>
> 2                     companyABD            2                   companyABD
>
> 3                     companyABE            3                     .
>
> 4                     companyABF            4                     .
>
> 5                     companyACF            5                     .
>
> 6                     companyACG            6                     .
>
> 6                     companyACG            6                     .
>
> 7                     companyADK            7                   companyADK
>
> 8                     companyADL            8                     .
>
> 8                     companyADL            8                     .
>
> 9                     .                     9                   companyCCC
>
> 10                   .                      10                  companyDDD
>
> 11                   .                      11                  companyCCD
>
>
>
> I have not found a way in Stata to assign new numbers to companies that do not occur in var1, so I would like to ask whether you have any ideas.
>
>
>
> Thank you.
>
>
>
> Best,
>
> Florian
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
