
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Ordering a varlist according to "parent" and "child" variables


From   A Loumiotis <[email protected]>
To   [email protected]
Subject   Re: st: Ordering a varlist according to "parent" and "child" variables
Date   Tue, 20 Dec 2011 17:52:26 +0200

@Nick
Thanks for your help.  I thought that this is tricky so that's why I
asked for help!!

@Matt
Thanks for the code!!! I have to get familiar with Mata.  However,
running the code on the last example that I gave does not produce the
ordering that I would like.  The new ordering should preserve, as far
as possible, both the original ordering of the varlist (v1-v11) and
the ordering of the children variables within each parent variable.

Working on this problem today, I developed the following code for the task.
It handles up to three generations of children; if there are more, I will
have to add nested loops to the code by hand. Perhaps I can use -while- to
automate this further, but I'm not sure how.

Antonis

local all v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11
local cofv4 v1 v2 v6
local cofv5 v3
local cofv6 v8 v11
local cofv7 v4 v10

foreach var of local all {
	local rvars:list all - var
	local i=0
	foreach rvar of local rvars {
		if `:list var in cof`rvar'' local i=`i'+1 // count the lists that name var as a child
	}
	if `i'==0 local nparlist "`nparlist' `var'"
}	
di "`nparlist'"

foreach var of local nparlist {
	local eocl
	if `:list sizeof cof`var''>=1 {
		forvalues i=1/`:list sizeof cof`var'' { // loop for 1st gen
			local varg`i': word `i' of `cof`var''
			local eocl`i' "`varg`i''"
			if `:list sizeof cof`varg`i'''>=1 {
				forvalues j=1/`:list sizeof cof`varg`i''' { // loop for 2nd gen.
					local var2g`j': word `j' of `cof`varg`i'''
					local eocl`i'`j' "`var2g`j''" //
					if `:list sizeof cof`var2g`j'''>=1 {
						forvalues k=1/`:list sizeof cof`var2g`j''' { // loop for 3rd gen
							local var3g`k': word `k' of `cof`var2g`j'''
							local eocl`i'`j'`k' "`var3g`k''"
							if `:list sizeof cof`var3g`k'''>=1 {
								di as error "The family of `var' has more than three generations. Consider adding more iteration(s)."
							}
							local eocl`i'`j' "`eocl`i'`j'' `eocl`i'`j'`k''"
						}
					}
					local eocl`i' "`eocl`i'' `eocl`i'`j''"
				}
			}
			local eocl "`eocl' `eocl`i''"
		}
		local expslist "`expslist' `var' `eocl'"
	}
	else local expslist "`expslist' `var'"
}
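
The nested loops above stop at three generations. One possible way to handle any depth with -while- is a depth-first walk that keeps a stack of pending variables in a local macro. This is only a sketch and not code from the thread: it reuses the all and cof* locals defined above, while the macro names children, roots, stack, and ordered are made up for illustration. It assumes each child has at most one parent.

```stata
* Sketch only: depth-first walk with an explicit stack, any number of generations.
* Reuses the all and cof* locals from the post; the other names are illustrative.
local all v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11
local cofv4 v1 v2 v6
local cofv5 v3
local cofv6 v8 v11
local cofv7 v4 v10

* Every variable that is listed as somebody's child
local children
foreach var of local all {
	local children `children' `cof`var''
}

* Roots (variables with no parent), kept in their original order
local roots : list all - children

* Pop the first name off the stack, append it to the result,
* then push its children on the front so that they come next
local stack `roots'
local ordered
while `: list sizeof stack' > 0 {
	gettoken var stack : stack
	if `: list var in ordered' {
		display as error "`var' reached twice: check for a variable listed under two parents"
		exit 198
	}
	local ordered `ordered' `var'
	local stack `cof`var'' `stack'
}
local ordered : list retokenize ordered
display "`ordered'"
* order `ordered'   // uncomment once the variables exist in memory
```

For the example above the walk visits v5 v3 v7 v4 v1 v2 v6 v8 v11 v10 v9, which is the ordering asked for earlier in the thread. Because children are pushed on the front of the stack in their listed order, both the original varlist order of the roots and the order of children within each parent are preserved.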

On Tue, Dec 20, 2011 at 4:48 AM, Matthew White
<[email protected]> wrote:
> Hi Antonis,
>
> The following code isn't pretty, but it might do the job. I'm sure
> there are more elegant ways (maybe using more Mata). Strategy: input
> as the dataset in memory all the relations; then create a local `tree'
> that describes the branches of the relationship trees; then load the
> dataset whose variables you want to sort and reorder them using
> `tree'.
>
> Best,
> Matt
>
> clear
> input str32 parent str32 child
> var9
> var4 var1
> var4 var2
> var4 var6
> var5 var3
> var6 var8
> var6 var11
> var7 var4
> var7 var10
> end
>
> capture program drop gettree
> program gettree, rclass
>        syntax anything(name=p), parent(varname) child(varname) [branch(str)]
>
>        local branch `branch' `p'
>        quietly levelsof `child' if `parent' == "`p'"
>        if `"`r(levels)'"' == "" return local tree `""`branch'""'
>        else {
>                foreach level in `r(levels)' {
>                        if `:list level in branch' {
>                                local branch `branch' `level'
>                                local dibranch : subinstr local branch " " "@"
>                                local dibranch : subinstr local dibranch " " ", which is the parent of ", all
>                                local dibranch : subinstr local dibranch "@" " is the parent of "
>                                display as err "circular structure: `dibranch'."
>                                exit 198
>                        }
>                        gettree `level', parent(`parent') child(`child') branch(`branch')
>                        local rtree `"`rtree' `r(tree)'"'
>                }
>                return local tree `"`rtree'"'
>        }
> end
>
> levelsof parent, local(parents)
> levelsof child, local(children)
> foreach parent of local parents {
>        if !`:list parent in children' {
>                gettree `parent', parent(parent) child(child)
>                local tree `"`tree' `r(tree)'"'
>        }
> }
>
> clear
> forvalues i = 1/11 {
>        generate var`i' = 0
> }
>
> mata:
> tokens = tokens(st_local("tree"))'
> st_local("tree", invtokens(J(1, rows(tokens), `"""') + collate(tokens,
> order((strlen(tokens) - strlen(subinstr(tokens, " ", "", .)),
> (1::rows(tokens))), (-1, -2)))' + J(1, rows(tokens), `"""')))
> end
>
> foreach branch of local tree {
>        gettoken progenitor progeny : branch
>        if "`progeny'" == "" order `progenitor'
>        else order `progeny', after(`progenitor')
> }
>
> On Mon, Dec 19, 2011 at 2:02 PM, Nick Cox <[email protected]> wrote:
>> Evidently you want code that will recognise not just parents and children, but several generations too. In your simple example I count at least four generations, as var7 has child var4 which has child var6 which has child var8.
>>
>> No doubt this is standard stuff for some tree- or network-processing software, but I'm not sure that it's possible to do what you want with a few lines of Stata (or even Mata).
>>
>> I tried recasting your macros as
>>
>> local all v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11
>> local c1 v4 v1 v2 v6
>> local c2 v5 v3
>> local c3 v6 v8 v11
>> local c4 v7 v4 v10
>>
>> which made the problem a bit clearer to me. One naïve algorithm is then
>>
>> 1. Parse each -c- macro into parent (first) and children (others)
>> 2. Zap the children.
>> 3. Put each -c- macro in the place of the parent.
>>
>> And this is, I think, equivalent to what Matthew White was doing.
>>
>> The problem is that this will only do what you want if you process the macros in the right order.
>>
>> The failure of one simple algorithm clearly does not rule out the existence of another, but I guess that this is trickier than you think.
>>
>> Nick
>> [email protected]
>>
>> A Loumiotis
>>
>> Thanks a lot Matt for your help.
>> It works for the particular example that I gave but if I modify the
>> example a bit I think it will not work.  Consider the following:
>>
>> var4 has three "child" variables var1 var2 var6
>> var5 has one "child" variable var3
>> var6 has two "child" variables var8 var11
>> var7 has two "child" variables var4 var10
>>
>> The ordered varlist should be as follows:
>> v5 v3 v7 v4 v1 v2 v6 v8 v11 v10 v9
>>
>> The method you suggested will produce the following:
>> v1 v2 v6 v8 v11 v5 v3 v7 v4 v10 v9
>>
>> Here is the output
>> . forvalues i=1/11 {
>>  2.         gen v`i'=.
>>  3. }
>> . local childofv4 v1 v2 v6
>> . local childofv5 v3
>> . local childofv6 v8 v11
>> . local childofv7 v4 v10
>> . ds
>> v1   v2   v3   v4   v5   v6   v7   v8   v9   v10  v11
>> . foreach var in `r(varlist)' {
>>  2.         if "`childof`var''" != "" order `childof`var'', after(`var')
>>  3. }
>> . ds
>> v1   v2   v6   v8   v11  v5   v3   v7   v4   v10  v9
>>
>>
>>
>> On Mon, Dec 19, 2011 at 6:28 PM, Matthew White
>> <[email protected]> wrote:
>>
>>> I think your setup so far seems fine, though note that the names of
>>> -local-s can be at most 31 characters, so since you're using 7 for
>>> "childof", you'll have trouble with variables whose names are more
>>> than 24 characters. How about a loop like this:
>>>
>>> ds
>>> foreach var in `r(varlist)' {
>>>    if "`childof`var''" != "" order `childof`var'', after(`var')
>>> }
>>>
>>> On Mon, Dec 19, 2011 at 11:15 AM, A Loumiotis
>>> <[email protected]> wrote:
>>
>>>> I'm working with survey data where the variables are related in the
>>>> following way.  There are "parent" and "child" variables where each
>>>> "child" variable can have at most one "parent" variable.  What I would
>>>> like to do is to find a general way to reorder the initial varlist in
>>>> a way that the "child" variables follow right after the "parent"
>>>> variables.
>>>>
>>>> Consider the following simplified example of a survey dataset with 11
>>>> variables var1-var11.
>>>>
>>>> var1 has three "child" variables var2-var4
>>>> var3 has two "child" variables var5-var6
>>>> var5 has two "child" variables var8-var9
>>>> var7 has one "child" variable var11
>>>>
>>>> What I would like to do is to find a (general) way to reorder the
>>>> initial varlist var1-var11 as follows:
>>>>
>>>> var1 var2 var3 var5 var8 var9 var6 var4 var7 var11 var10
>>>>
>>>> What I have done up to now is to create locals for each variable that
>>>> contains the child variables if any.  For the simplified example these
>>>> locals are defined as follows:
>>>>
>>>> local childofvar1 var2 var3 var4
>>>> local childofvar2
>>>> local childofvar3 var5 var6
>>>> local childofvar4
>>>> local childofvar5 var8 var9
>>>> local childofvar6
>>>> local childofvar7 var11
>>>> local childofvar8
>>>> local childofvar9
>>>> local childofvar10
>>>> local childofvar11
>>>>
>>>> I've tried to use loops to generate the ordered varlist but I'm not
>>>> successful.  Any help will be greatly appreciated!!!
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>
>
>
> --
> Matthew White
> Data Coordinator
> Innovations for Poverty Action
> 101 Whitney Avenue, New Haven, CT 06510 USA
> +1 434-305-9861
> www.poverty-action.org
>

