st: RE: Re: Snowball sampling

From	"Ray Hawkins" <[email protected]>
To	<[email protected]>
Subject	st: RE: Re: Snowball sampling
Date	Tue, 16 Apr 2013 09:19:08 -0500

Hello,

I posted a similar question a while ago and now have a related one. My
network data look like the listing below (without 'keep'). To generate
'keep', I ran the code below. However, it covers only the first two steps
(up to giveid 11) of the whole chain starting from the seed:
giveid=17 -> 2 -> 11 -> 18 -> 32 -> 40 -> 16. That is, keep=1 only for
giveid=17, 2, 11. I would like to keep all 7 giveid that are connected.
The problem is that I don't know how many steps are needed to reach every
connected id, and even if I did, the code would become very complicated at
50 or 100 steps. Is there any way to keep all ids connected to a seed id?
If Stata cannot handle this problem, should I use social network software
instead? Thank you.

Ray Hawkins.


gen byte keep = giveid == 17 // seed id
qui levelsof recid if giveid == 17, local(keepers)
foreach keeper of local keepers {
    * step 1: keep all recid for the seed id
    replace keep = 1 if giveid == `keeper'
    qui levelsof recid if giveid == `keeper', local(keepers1)
    foreach keeper1 of local keepers1 {
        * step 2: keep those who received from a step-1 id and gave to others
        replace keep = 1 if giveid == `keeper1'
    }
}
keep if keep == 1


giveid	recid	keep
2	11	1
3	15	0
6	10	0
11	18	1
11	190	1
16	187	0
17	1	1
17	2	1
17	5	1
17	23	1
18	32	0
19	782	0
32	17	0
32	21	0
32	23	0
32	37	0
32	40	0
32	68	0
33	111	0
40	16	0
40	20	0
40	70	0
40	92	0
41	15	0
41	22	0
41	23	0
41	27	0
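
One way to reach every connected id without fixing the number of steps in
advance is to repeat the expansion until a pass marks no new rows. The
following is only a minimal sketch, assuming giveid and recid are numeric,
that links are followed in the giveid -> recid direction only, and that the
list of kept recid values fits in a local macro. The names seed, n_before
and n_after are illustrative and not part of the original code.

```stata
* Assumption: the seed id is 17, as in the example above
local seed 17
generate byte keep = giveid == `seed'

* Count the rows marked so far
local n_before = 0
quietly count if keep
local n_after = r(N)

* Repeat until a full pass marks no additional rows
while `n_after' > `n_before' {
    local n_before = `n_after'
    * recid values reached from the rows already kept
    quietly levelsof recid if keep, local(keepers)
    foreach keeper of local keepers {
        quietly replace keep = 1 if giveid == `keeper'
    }
    quietly count if keep
    local n_after = r(N)
}

keep if keep
```

Because the loop stops as soon as the count of kept rows stops growing,
cycles such as 32 -> 17 in the listing do not cause it to run forever. On
very large networks the repeated levelsof calls may be slow, so this is
meant as a starting point rather than an efficient solution.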


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Joseph Coveney
Sent: Friday, March 22, 2013 11:06 PM
To: [email protected]
Subject: st: Re: Snowball sampling

Ray Hawkins wrote:

I am working with social network data, but the dataset is too big, so I
would like to do snowball sampling. My data look like the following. Can
you help me figure out how to keep 'giveid' and the corresponding 'recid'
for a seed id? For example, giveid=17 is a seed id. So, I would like to
keep giveid=17 and all giveid (=recid for giveid=17) = 6, 2, 5, 23, 1, 11,
4, 16, 33, 27, 16 (if they exist, of course). Then, for another seed
id=32, for example, I would like to repeat the same process to reach a
certain data size. Thank you in advance.

--------------------------------------------------------------------------------

Not very efficient, but it should work and it has an easy-to-debug
one-to-one correspondence with the specifications:

* "keep giveid=17"
generate byte keep = giveid == 17

* "and all giveid (=recid for giveid=17)"
quietly levelsof recid if giveid == 17, local(keepers)
quietly foreach keeper of local keepers {
    replace keep = 1 if giveid == `keeper'
}

* "repeat the same process" "for another seed id=32"
quietly replace keep = 1 if giveid == 32
quietly levelsof recid if giveid == 32, local(keepers)
quietly foreach keeper of local keepers {
    replace keep = 1 if giveid == `keeper'
}

quietly keep if keep
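
With more than a couple of seeds, the per-seed blocks above could be folded
into a loop over the seed ids. This is a sketch only, assuming the seeds
are 17 and 32 as in the question; the loop variable name seed is
illustrative. It is meant to mark the same rows as the code above, one
step out from each seed.

```stata
* Start with nothing marked, then add each seed and its direct recipients
generate byte keep = 0
foreach seed of numlist 17 32 {
    * "keep giveid=<seed>"
    quietly replace keep = 1 if giveid == `seed'
    * "and all giveid (=recid for giveid=<seed>)"
    quietly levelsof recid if giveid == `seed', local(keepers)
    foreach keeper of local keepers {
        quietly replace keep = 1 if giveid == `keeper'
    }
}
keep if keep
```

Adding a seed then only means extending the numlist.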

Joseph Coveney

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
