Statalist The Stata Listserver



RE: st: reorganizing data


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: reorganizing data
Date   Wed, 6 Sep 2006 17:33:05 +0100

It can be done. 

Here is one solution. It is easiest if you know 
in advance roughly the maximum number of "others"
any person might have. 

Warning: untested code ahead. 

Guess this number, and then add some. Suppose 
you guess 20, and then add 10. You get 30:

forval i = 1/30 { 
	gen Other`i' = ""
}
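
If you would rather not guess, one safe (if generous) bound is the 
number of distinct persons minus one, as nobody can share a city with 
more people than that. What follows is a sketch, equally untested, 
that could be used in place of the loop above. The macro name 
maxothers is made up for illustration. 

levelsof Person, local(Persons)
* number of distinct persons, less the person in question
local maxothers : word count `Persons'
local maxothers = `maxothers' - 1
forval i = 1/`maxothers' { 
	gen Other`i' = ""
}

With many persons this may create far more variables than are 
needed, and any later clean-up would then start from `maxothers' 
rather than 30. 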

levelsof Person, local(Persons)
qui foreach P of local Persons {
	levelsof City if Person == "`P'", local(Cities)
	local which
	foreach C of local Cities {
		levelsof Person if Person != "`P'" & City == "`C'", ///
			local(work) clean
		local which : list which | work
	}
	noi di "`P': `which'"
	local nothers : word count `which' 
	tokenize `which' 
	forval i = 1/`nothers' { 
		replace Other`i' = "``i''" if Person == "`P'"
	}
}
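
To try the code on the toy data from the original question, something 
like the following could be run before any of the code above. It is a 
sketch under one assumption: that both variables are held as strings, 
since the loop compares Person with "`P'". 

clear
* assumption: City and Person are both stored as strings
input str1 City str1 Person
A 1
A 2
B 1
B 5
C 1
C 5
C 4
D 8
D 1
end

If the identifier is numeric, as Person_id appears to be in the 
original posting, -tostring Person_id, gen(Person)- is one way to 
get a string copy. 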

Then clean up any empty variables: 

	forval j = 30(-1)1 { 
		assert Other`j' == "" 
		drop Other`j' 
	}

This loop is designed to fail at the first 
Other? variable that is not all empty. It 
drops Other30, Other29, ... in turn, each only 
if it is all empty, and then stops with an error. 
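
If you would rather the loop stopped quietly than with an error, a 
variant along these lines may do. It is again an untested sketch, 
assuming the same 30 variables, and it uses -capture- and 
-continue, break-: 

	forval j = 30(-1)1 { 
		* capture suppresses the error; _rc is nonzero if the assertion fails
		capture assert Other`j' == "" 
		if _rc { 
			continue, break 
		}
		drop Other`j' 
	}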

Alternatively, -dropmiss- from STB-60 can be used. 

30 is just pulled out of the air. Your number will differ. 

Nick 
n.j.cox@durham.ac.uk 

Anna Lehman
 
> Going back to your suggestion,
> If the number of observations is large and the information 
> does not fit into 
> a string variable,
> is there any way I can still store the obtained information?
> For example, since  for person 1, the list is: 2, 5 4 and 8,
> Others would contain a string with "2 5 4 8". That is fine. 
> The problem is 
> that if the list has many numbers they won't fit into the 
> variable "Others".
> Can I store the different numbers (2,5, 4 and 8) in different 
> columns/variables (instead of creating the variable Others)? 
> This is the 
> only way I can think of dealing with a large number of 
> observations but I'm 
> not sure how to operationalize it... Any suggestions?
> Thanks for your help,
> Anna
> 
> >From: n j cox <n.j.cox@durham.ac.uk>
> >Reply-To: statalist@hsphsun2.harvard.edu
> >To: statalist@hsphsun2.harvard.edu
> >Subject: Re:st: reorganizing data
> >Date: Mon, 04 Sep 2006 15:25:01 +0100
> >
> >This should work with toy datasets. If your identifiers are
> >long, or your number of observations is large, the information
> >won't fit into a string variable, so the lines mentioning
> >"Others" should be deleted.
> >
> >gen Others = ""
> >levelsof Person, local(Persons)
> >qui foreach P of local Persons {
> >	levelsof City if Person == "`P'", local(Cities)
> >	local which
> >	foreach C of local Cities {
> >		levels Person if Person != "`P'" & City == "`C'", ///
> >			local(work) clean
> >		local which : list which | work
> >	}
> >	noi di "`P': `which'"
> >	replace Others = "`which'" if Person == "`P'"
> >}
> >
> >Nick
> >n.j.cox@durham.ac.uk
> >
> >Anna Lehman
> >
> >I have a dataset with the following structure:
> >
> >City   Person_id
> >A          1
> >A          2
> >B          1
> >B          5
> >C          1
> >C          5
> >C          4
> >D          8
> >D          1
> >
> >I would like to obtain the following:
> >for each and every person, a list with the people that have apartments in
> >the same city (independently of which city). For example, for person 1, this
> >list would be: 2, 5 4 and 8. And for person 5 the list would be: 1 .
> >
> >Can you think of a relatively easy way of accomplishing this?



