

From: Brandon Olszewski <olszewski.brandon@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: Assistance with manipulating a social network dataset?
Date: Tue, 11 Oct 2011 15:05:53 -0700

Hi Statalist:

I have a social network dataset, and I can’t figure out how to perform the proper manipulations. People in rows were asked if they know people listed in columns. In cells, “0” indicates two people don’t know each other, and “1” indicates otherwise. So what I have looks like this:

             Adam   Beth   Charlie
    Adam        1      1         0
    Beth        0      1         0
    Charlie     0      1         1

Note that while Adam claims to know Beth, Beth doesn’t claim the same, and while Beth says she doesn’t know Charlie, he says otherwise. For my purposes, I want to assume that if anyone says they know someone else, the pair is treated as a “1” both ways.

The software I want to use (Sonoma) wants the data in one of two formats. Here’s the wide option, which offers only one half of the matrix, with “1” coded in the diagonal and “.” coded in the bottom half, with max values for combinations in cells:

             Adam   Beth   Charlie
    Adam        1      1         0
    Beth        .      1         1
    Charlie     .      .         1

Question 1: How would I do this in Stata? I looked at -help mata-, but I don’t even know if that’s the right direction. Is it? If not, how might I do it? This option seems more difficult for me (given my familiarity with Stata’s functionality) than the “long option” below.

Here’s the long option, which seems more feasible for me, given my level of skill. Note that each combination is listed just once, again with maximum values:

    Adam      Adam      1
    Adam      Beth      1
    Adam      Charlie   0
    Beth      Beth      1
    Beth      Charlie   1
    Charlie   Charlie   1

Question 2: I can get the data to long format without a problem, but I end up with duplicates of combinations, as Adam is asked about Beth, and Beth is asked about Adam (i.e. a total of 9 observations, rather than the six above). How could I drop duplicate combinations, saving only the max value for each? While I am pretty familiar with the -duplicates- set of commands, I don’t know how to use them here, since combinations go both ways: Adam-Beth is a duplicate of Beth-Adam. I’ve also thought about substituting numbers for people (i.e. 1-2 & 2-1), but that doesn’t change my problem that I can’t figure out how to tell Stata to treat those as duplicates.

Thanks for any help.

Brandon

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
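The logic behind both questions can be illustrated with a small language-neutral sketch, written here in Python rather than Stata. It is not from the original post: the function names `symmetric_pairs` and `upper_triangle` are hypothetical, and the data are the three-person example above. The key step is to sort the two names within each pair so that Adam-Beth and Beth-Adam share one key, then keep the maximum value per key. In Stata, the same idea would presumably map to creating two sorted-name variables and taking the maximum within each pair (for example with -collapse (max)-), but that translation is an assumption and is not tested here.

```python
# Example data from the post: rows[i][j] == 1 means person i says they know person j.
names = ["Adam", "Beth", "Charlie"]
rows = [
    [1, 1, 0],  # Adam
    [0, 1, 0],  # Beth
    [0, 1, 1],  # Charlie
]


def symmetric_pairs(names, rows):
    """Long format: one entry per unordered pair, keeping the maximum report."""
    pairs = {}
    for i, ego in enumerate(names):
        for j, alter in enumerate(names):
            # Sorting the two names makes (Adam, Beth) and (Beth, Adam) the same key.
            key = tuple(sorted((ego, alter)))
            pairs[key] = max(pairs.get(key, 0), rows[i][j])
    return pairs


def upper_triangle(names, pairs):
    """Wide format: values on and above the diagonal, "." below it."""
    wide = []
    for i, row_name in enumerate(names):
        line = []
        for j, col_name in enumerate(names):
            if j < i:
                line.append(".")
            else:
                line.append(pairs[tuple(sorted((row_name, col_name)))])
        wide.append(line)
    return wide


pairs = symmetric_pairs(names, rows)
wide = upper_triangle(names, pairs)

# Print the long format, one unordered pair per line.
for (first, second), value in sorted(pairs.items()):
    print(first, second, value)

# Print the wide format, one row per person.
for name, line in zip(names, wide):
    print(name, *line)
```

Because the pair key is order-independent, the nine directed reports collapse to six undirected pairs, and the wide layout is just a different arrangement of the same six values.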

**Follow-Ups**: **Re: st: Assistance with manipulating a social network dataset?** *From:* Nick Cox <njcoxstata@gmail.com>
