Uli Kohler and Magdalena Luniak, <[email protected]> and
<[email protected]>, want to implement "list processing" in Mata.
I put "list processing" in quotes only because some of you may not have 
heard of the term.  In programming, a list is a set of elements, linked 
one to the other.  In the jargon of programming, Uli and Magdalena 
want to create a forward linked list.  This is a great *ADVANCED* programming
problem.
Uli and Magdalena included code and basically asked, "What's wrong?".
I have only glanced at their code, got the gist of it, and then set about
writing my own.  I apologize because that means more work for them, but 
less for me.  I hope that I will answer all of their questions but, if I 
do not, I ask them to point me in the right direction.
My comments appear under the headings, 
        Statement of problem
        Solution without explanation
        Result of execution
        Explanation
Statement of problem
--------------------
As an exercise, create a forward linked list of a set of n points.
Let vector seq contain real values.  The points to be linked are 
(seq[1], seq[1), (seq[2], seq[2]), ...
Solution without explanation
----------------------------
mata:
mata set matastrict on
struct listelem {
        real scalar                              a, b
        pointer(struct listelem scalar) scalar   next
}
struct listelem scalar trythis(real vector seq)
{
        real scalar                              i
        struct listelem scalar                   first
        pointer(struct listelem scalar) scalar   p
        first.a = first.b = seq[1]
        p = &first 
        for (i=2; i<=length(seq); i++) {
                p->next = &(listelem())
                p       = p->next
                p->a = p->b = seq[i]
        }
        return(first)
}
void displaylist(struct listelem scalar li)
{
        real scalar                              i
        pointer(struct listelem scalar) scalar   p
        p = &li
        for (i=0; p!=NULL; i++) {
                printf("%g.  (%g, %g)\n", i, p->a, p->b)
                p = p->next
        }
}
void testit()
{
        struct listelem scalar    first
        first = trythis((1,2,3,4))
        displaylist(first)
}
testit()
end
Result of execution
-------------------
        : testit()
        0.  (1, 1)
        1.  (2, 2)
        2.  (3, 3)
        3.  (4, 4)
Explanation
-----------
First, my solution is for demonstration purposes only.  It has all the 
elements to understand and manipulate linked lists, but it is not 
organized the way I would organize real code, where I would want easy-to-use
subroutines to perform common tasks for me.  I'll come back to that later.
My definition of the structure -listelem- is the same as U&M's:
        struct listelem {
                real scalar                              a, b
                pointer(struct listelem scalar) scalar   next
        }
Elements -a- and -b- contain the data, and element -next- points to
(links to) the next element.  -next- is NULL when there is no next element.
My program to define the list for a given vector -seq- is
        struct listelem scalar trythis(real vector seq)
        {
                real scalar                              i
                struct listelem scalar                   first
                pointer(struct listelem scalar) scalar   p
        
                first.a = first.b = seq[1]
                p = &first 
                for (i=2; i<=length(seq); i++) {
                        p->next = &(listelem())
                        p       = (p->next)
                        p->a = p->b = seq[i]
                }
        
        
                return(first)
        }
In this program, -first- is the head of the list.  The lines
                first.a = first.b = seq[1]
define it.  The next part defines the remaining members:
                p = &first 
                for (i=2; i<=length(seq); i++) {
                        p->next = &(listelem())
                        p       = p->next
                        p->a = p->b = seq[i]
                }
In this loop, p is a pointer to the last structure defined.  I obtain the next
structure by coding
                        p->next = &(listelem())
and then I reset p to point to that and define it, 
                        p       = p->next
                        p->a = p->b = seq[i]
The only tricky thing here is the line 
                        p->next = &(listelem())
When I first coded it, I omitted the parantheses, 
                        p->next = &listelem()
and my program did not work.  I was, for a time, convinced there was a bug 
in Mata.  There was no bug, however.  Without the parens, the line stores 
in p->next the address of the function listelem().  I wanted to record the 
address of the result recorded by the function.  
Understand the logic:
                set p to point to the last structure defined
                for (i=2; i<=length(seq); i++) {
                        set p->next to point to a new listelem structure
                        fill in the new listelem structure
                        set p = p->next
                }
The line 
                        p->next = &(listelem())
is not only tricky sytactically, the line 
                        set p->next to point to a new listelem structure
is tricky conceptually.  Every element of the list must be a unique structure
with its own location in memory.  It would not do to use a preexisting 
structure scalar for p->next, because in that case, we would be using the 
same structure over and over again.
That went by too fast.  Let's pretend I made that error.  Let's pretend 
my code read:
                p = &first 
                for (i=2; i<=length(seq); i++) {
                        p->next = &second
                        p       = p->next
                        p->a = p->b = seq[i]
                }
where second was declared as a structure listelem scalar.  Now let's 
trace the loop.  The first time, i==2.  We store the address of -second-
in p->next.  Say that is 0x50010.  We fill in the structure at 0x50010.
Now we move to i=3, and we are working on p=0x50010.  We fill in the address
of second in p->next.  That is 0x50010 (the same structure!).  We keep 
doing this, and we end up with a linked list that is the same structure
over and over again:
       first    ->  second  -> +
                |              |
                +--------------+
when what we wanted was 
       first -> second -> third -> fourth -> <nothing>
I hate things like 
                        p->next = &(listelem())
because I forget to do them.  
I already told you the above is not how I would have written this problem 
for production code.  What I would have done is introduce an easy and 
safe to use subroutine:
        pointer(struct listelem scalar) scalar void insert_into_list(
                       pointer(struct listelem scalar) scalar where, 
                       struct listelem scalar new)
        {
                struct listelem scalar     copy
                copy  = new 
                if (el != NULL) {
                        copy.next  = el->next
                        el->next   = ©
                }
                return(©)
        }
To use this routine, the user codes 
        loc = insert_into_list(where, newstruct)
where -where- is a pointer and -newstruct- is the new structure istself.
insert_into_list() first copies -new- into -copy-, a struct listlem scalar.
I know I warned against using structure scalars in pointer contexts, but
what I meant was to warn against using them repeatedly, such as in a loop.
Using them once is perfectly safe.  Here "once" means "once per invocation of
the function" because variables declared inside a function are guaranteed by
Mata to be new on each on each invocation.
Anyway, the point of the line 
                copy  = new 
is to guard against the CALLER of insert_into_list() using the same structure
over and over again, each time with different values.
The next neat thing about insert_into_list() is that -where- can be NULL
or a structure pointer.  I can use insert_into_list() to create the head 
of the list, or a subsequent element.
The final neat thing about insert_into_list() is that, if -where- is a 
structure pointer, it can be the last structure (add onto the tail of the 
list) or a middle structure (add into the middle of the list).
(By the way, I do not bother to code -copy.next = NULL- because I know that
Mata already filled in all of its SCALAR values with missing, and missing for
a pointer is NULL.)
copy_into_list() returns the point at which -newstruct- is inserted.
In some cases, the caller will want that, and in others, the caller won't.
In those cases, the caller can code 
                (void) insert_into_list(where, newstruct)
Now with this neat and safe routine, I can rewrite trythis():
        struct listelem scalar trythis(real vector seq)
        {
                real scalar                              i
                struct listelem scalar                   el 
                pointer(struct listelem scalar) scalar   head, p
                el.a = el.b = seq[1]
                head = insert_into_list(NULL, el)
                p = head
                for (i=2; i<=length(el); i++) {
                        el.a = el.b = sew[i]
                        p = insert_into_list(p, el)
                }
                return(*head)
        }
-- Bill
[email protected]
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/