Statalist The Stata Listserver



Re: st: More efficient way of programming


From   Ulrich Kohler <kohler@wz-berlin.de>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: More efficient way of programming
Date   Tue, 6 Jun 2006 12:34:13 +0200

As is often the case, -reshape long- makes life easier. Try:

. reshape long d, i(id) j(station)
. by id (d), sort: gen min_dis_new = d[2]
. by id (d), sort: gen min_dis_id = station[2]

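which is also a neat way to produce "min_dis" itself: once the data are
sorted by d within id, the first row is the station's own distance of 0, so
the second row holds the nearest other station.  You might then use -reshape
wide- to get back to your original data structure, but I bet that you are
better off with long. See -help reshape- for details.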
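
To see the idea end to end, here is a minimal sketch on a made-up 4-station
distance matrix (the numbers and the 4-station size are assumptions for
illustration only, not your data). It is a slight variation on the commands
above: it drops each station's own zero distance before sorting and then
takes the first row, which should also behave sensibly if two different
stations happen to sit at distance 0 from each other.

```stata
* Toy data: 4 stations with a symmetric, made-up distance matrix
clear
input id d1 d2 d3 d4
1  0 23 21 53
2 23  0 32 41
3 21 32  0 52
4 53 41 52  0
end

* One row per (id, station) pair; d holds the distance
reshape long d, i(id) j(station)

* Drop each station's distance to itself so it cannot be picked
drop if station == id

* Within id, sort by distance (ties broken by station number);
* the first row is then the nearest other station
bysort id (d station): gen min_dis = d[1]
bysort id (d station): gen near_id = station[1]

* min_dis and near_id are constant within id, so any one row per id will do
bysort id: keep if _n == 1
keep id min_dis near_id
list, noobs
```

For this toy matrix the result is near_id = 3, 1, 1, 2 with min_dis = 21,
23, 21, 41 for ids 1 to 4. With 2500 stations the long data set has
2500 x 2500 rows before the -drop-, so make sure it fits in memory.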

many regards
Uli


Jitian Sheu wrote:
> Dear listers:
>
> I have a data set with the following structure:
>
> id  d1   d2    d3.....      d2500   min_dis
> 1   0    23   21          530      21
> 2   23   0
> 3
> 4
> 5
> ...
> (up to 2500)
>
> i.e. number of observations = 2500, and each one represents one station (id)
>    dX = the distance to station X, X = 1...2500
>    (since there are 2500 observations ==> I have 2500 distance variables)
>
>    min_dis = distance to the nearest other station.
>
>
> So, for each observation (station), I know the minimum distance to another
> station.
> Now, I want to know the id of its nearest station,
> i.e. I want to have another variable (say, called near_id) that holds,
> for each observation, the id number of its nearest station.
>
> For example (using the above data)
>
> id  d1   d2    d3.....      d2500   min_dis  ==> near_id
> 1   0    23    21           530     21       ==>     3
> 2   23   0     32           41      23       ==>     1
> 3   21   32    0            52      21       ==>     1
> 4
> 5
> ...
>
> For this purpose, I use the following programming code.
> Basically, I am doing this observation by observation:
>
> gen near_id=.
>
> forvalues i=1(1)2500{
>     forvalues j=1(1)2500{
>         replace near_id=`j' if id==`i' & d`j'==min_dis
>     }
> }
>
> Therefore, there are 2500 x 2500 = 6,250,000 -replace- commands in total.
> If each pass of the outer loop takes 2 seconds ==> in total, I need 5000
> seconds to finish the whole process, which is about 1.4 hours.
>
> Is there a more efficient way to do that?
>
> Many thanks.
>
> JT
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

-- 
kohler@wz-berlin.de
+49 (030) 25491-361