Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Ordering a varlist according to "parent" and "child" variables
From: Matthew White <[email protected]>
To: [email protected]
Subject: Re: st: Ordering a varlist according to "parent" and "child" variables
Date: Mon, 19 Dec 2011 21:48:52 -0500
Hi Antonis,
The following code isn't pretty, but it might do the job. I'm sure
there are more elegant ways (maybe using more Mata). Strategy: first,
input all the parent-child relations as the dataset in memory; then
create a local `tree' that describes the branches of the relationship
trees; finally, load the dataset whose variables you want to sort and
reorder them using `tree'.
Best,
Matt
clear
input str32 parent str32 child
var9
var4 var1
var4 var2
var4 var6
var5 var3
var6 var8
var6 var11
var7 var4
var7 var10
end
capture program drop gettree
* gettree: recursively collect every root-to-leaf branch below the named variable
program gettree, rclass
    syntax anything(name=p), parent(varname) child(varname) [branch(str)]
    * Extend the branch walked so far with the current variable
    local branch `branch' `p'
    quietly levelsof `child' if `parent' == "`p'"
    * No children means a leaf: return the completed branch in quotes
    if `"`r(levels)'"' == "" return local tree `""`branch'""'
    else {
        foreach level in `r(levels)' {
            * A child that is already on the branch means the relations are circular
            if `:list level in branch' {
                local branch `branch' `level'
                local dibranch : subinstr local branch " " "@"
                local dibranch : subinstr local dibranch " " ", which is the parent of ", all
                local dibranch : subinstr local dibranch "@" " is the parent of "
                display as err "circular structure: `dibranch'."
                exit 198
            }
            * Recurse into the child and accumulate the branches it returns
            gettree `level', parent(`parent') child(`child') branch(`branch')
            local rtree `"`rtree' `r(tree)'"'
        }
        return local tree `"`rtree'"'
    }
end
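* Roots are parents that never appear as a child; collect the branches below each
levelsof parent, local(parents)
levelsof child, local(children)
foreach parent of local parents {
    if !`:list parent in children' {
        gettree `parent', parent(parent) child(child)
        local tree `"`tree' `r(tree)'"'
    }
}
* Stand-in for the dataset whose variables are to be reordered
clear
forvalues i = 1/11 {
    generate var`i' = 0
}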
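mata:
// One branch per row
tokens = tokens(st_local("tree"))'
// Sort branches by their number of spaces, descending, then by original
// position, descending, and store them back in the local, each in double quotes
st_local("tree", invtokens(J(1, rows(tokens), `"""') + collate(tokens,
    order((strlen(tokens) - strlen(subinstr(tokens, " ", "", .)),
    (1::rows(tokens))), (-1, -2)))' + J(1, rows(tokens), `"""')))
end
* Place the rest of each branch right after its first variable
foreach branch of local tree {
    gettoken progenitor progeny : branch
    if "`progeny'" == "" order `progenitor'
    else order `progeny', after(`progenitor')
}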
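To check the result, it may help to look at the branches stored in `tree' and at the final variable order. This is only a minimal sketch, and it assumes it is run in the same do-file or session as the code above, so that the local `tree' still exists:

```stata
* Show the branches in the order in which the loop above processed them
foreach branch of local tree {
    display "`branch'"
}
* List the variable names in their new order
describe, simple
```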
On Mon, Dec 19, 2011 at 2:02 PM, Nick Cox <[email protected]> wrote:
> Evidently you want code that will recognise not just parents and children, but several generations too. In your simple example I count at least four generations, as var7 has child var4 which has child var6 which has child var8.
>
> No doubt this is standard stuff for some tree- or network-processing software, but I'm not sure that it's possible to do what you want with a few lines of Stata (or even Mata).
>
> I tried recasting your macros as
>
> local all v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11
> local c1 v4 v1 v2 v6
> local c2 v5 v3
> local c3 v6 v8 v11
> local c4 v7 v4 v10
>
> which made the problem a bit clearer to me. One naïve algorithm is then
>
> 1. Parse each -c- macro into parent (first) and children (others)
> 2. Zap the children.
> 3. Put each -c- macro in the place of the parent.
>
> And this is, I think, equivalent to what Matthew White was doing.
>
> The problem is that this will only do what you want if you process the macros in the right order.
>
> The failure of one simple algorithm clearly does not rule out the existence of another, but I guess that this is trickier than you think.
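The three steps above can be sketched with macro list operations. This is an illustration only, not code from the thread; the local seq is an assumption added here to make the processing order explicit:

```stata
* Illustrative sketch of the three steps; seq is a hypothetical helper local
local all v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11
local c1 v4 v1 v2 v6
local c2 v5 v3
local c3 v6 v8 v11
local c4 v7 v4 v10
local seq 1 2 3 4
foreach i of local seq {
    * Step 1: the first word is the parent, the rest are its children
    gettoken parent children : c`i'
    * Step 2: remove the children from the running list
    local all : list all - children
    * Step 3: replace the parent with the parent followed by its children
    local all : subinstr local all "`parent'" "`c`i''", word
}
display "`all'"
```

Traced by hand, seq 1 2 3 4 should give v1 v2 v6 v8 v11 v5 v3 v7 v4 v10 v9, whereas seq 4 1 3 2, which handles each parent before its own children are processed as parents, should give v5 v3 v7 v4 v1 v2 v6 v8 v11 v10 v9. That is the order dependence described above.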
>
> Nick
> [email protected]
>
> A Loumiotis
>
> Thanks a lot Matt for your help.
> It works for the particular example that I gave but if I modify the
> example a bit I think it will not work. Consider the following:
>
> var4 has three "child" variables var1 var2 var6
> var5 has one "child" variable var3
> var6 has two "child" variables var8 var11
> var7 has two "child" variables var4 var10
>
> The ordered varlist should be as follows:
> v5 v3 v7 v4 v1 v2 v6 v8 v11 v10 v9
>
> The method you suggested will produce the following:
> v1 v2 v6 v8 v11 v5 v3 v7 v4 v10 v9
>
> Here is the output
> . forvalues i=1/11 {
> 2. gen v`i'=.
> 3. }
> . local childofv4 v1 v2 v6
> . local childofv5 v3
> . local childofv6 v8 v11
> . local childofv7 v4 v10
> . ds
> v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11
> . foreach var in `r(varlist)' {
> 2. if "`childof`var''" != "" order `childof`var'', after(`var')
> 3. }
> . ds
> v1 v2 v6 v8 v11 v5 v3 v7 v4 v10 v9
>
>
>
> On Mon, Dec 19, 2011 at 6:28 PM, Matthew White
> <[email protected]> wrote:
>
>> I think your setup so far seems fine, though note that the names of
>> -local-s can be at most 31 characters, so since you're using 7 for
>> "childof", you'll have trouble with variables whose names are more
>> than 24 characters. How about a loop like this:
>>
>> ds
>> foreach var in `r(varlist)' {
>> if "`childof`var''" != "" order `childof`var'', after(`var')
>> }
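The 31-character limit on local names mentioned above could be checked before the childof locals are defined. A minimal sketch, assuming the dataset of interest is in memory:

```stata
* Flag variables whose childof macro name would exceed 31 characters
foreach var of varlist _all {
    if strlen("childof`var'") > 31 {
        display as err "`var': the name childof`var' is longer than 31 characters"
    }
}
```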
>>
>> On Mon, Dec 19, 2011 at 11:15 AM, A Loumiotis
>> <[email protected]> wrote:
>
>>> I'm working with survey data where the variables are related in the
>>> following way. There are "parent" and "child" variables where each
>>> "child" variable can have at most one "parent" variable. What I would
>>> like to do is to find a general way to reorder the initial varlist in
>>> a way that the "child" variables follow right after the "parent"
>>> variables.
>>>
>>> Consider the following simplified example of a survey dataset with 11
>>> variables var1-var11.
>>>
>>> var1 has three "child" variables var2-var4
>>> var3 has two "child" variables var5-var6
>>> var5 has two "child" variables var8-var9
>>> var7 has one "child" variable var11
>>>
>>> What I would like to do is to find a (general) way to reorder the
>>> initial varlist var1-var11 as follows:
>>>
>>> var1 var2 var3 var5 var8 var9 var6 var4 var7 var11 var10
>>>
>>> What I have done up to now is to create locals for each variable that
>>> contains the child variables if any. For the simplified example these
>>> locals are defined as follows:
>>>
>>> local childofvar1 var2 var3 var4
>>> local childofvar2
>>> local childofvar3 var5 var6
>>> local childofvar4
>>> local childofvar5 var8 var9
>>> local childofvar6
>>> local childofvar7 var11
>>> local childofvar8
>>> local childofvar9
>>> local childofvar10
>>> local childofvar11
>>>
>>> I've tried to use loops to generate the ordered varlist but I'm not
>>> successful. Any help will be greatly appreciated!!!
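The childof locals described above could also be turned into a two-column parent/child dataset, which is the layout of the -input- block at the top of this message. A rough sketch, assuming the locals are defined in the same do-file and the survey dataset is in memory; the names handle, relations and kid are hypothetical:

```stata
* Assumes the childof locals from the question are already defined
quietly ds
local vars `r(varlist)'
tempname handle
tempfile relations
postfile `handle' str32 parent str32 child using `relations'
foreach var of local vars {
    if "`childof`var''" == "" {
        * No children: record the variable alone, with an empty child
        post `handle' ("`var'") ("")
    }
    else {
        foreach kid of local childof`var' {
            post `handle' ("`var'") ("`kid'")
        }
    }
}
postclose `handle'
* This replaces the data in memory with the relations
use `relations', clear
list, noobs
```

Because -use- replaces the data in memory, the survey dataset would need to be saved first or reloaded afterwards.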
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
--
Matthew White
Data Coordinator
Innovations for Poverty Action
101 Whitney Avenue, New Haven, CT 06510 USA
+1 434-305-9861
www.poverty-action.org