



st: loops and strpos to test if an observation's string variable value is also in another string variable


From   dckersh <dckersh@email.unc.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: loops and strpos to test if an observation's string variable value is also in another string variable
Date   Thu, 24 Jun 2010 21:42:04 -0400


I could use some help with using loops and strpos() to check whether an
observation’s value of one string variable appears in another string variable.
I have statewide roster data for a number of different years. Within each
year, schools track students in classes in different ways. Some schools are
very detailed and record class rosters at numerous times throughout the
year, so a child will have several records that contain the same teacher,
course title, and course code but different sections, semesters, meeting
times, etc. The result is that the same "class" has different students at
different times (some students move, etc.). I am trying to find a way to
identify which students (and what overall proportion of students) were
in a class in each instance. To do this, I want to flag the
first-semester students who were also present in the second semester.

A fairly simple way to do this is to select the second-semester classes,
reshape the data to one record per class, capture the names of the students
in each class in one variable, merge that variable onto the first-semester
version of the class, and then search for the last name of each
first-semester student within the variable that holds the last names of the
second-semester students. Skipping the reshaping and merging, which I've
figured out, I theoretically get what I want with the following code on data
like that shown below:


gen in_sem_b = 0
replace in_sem_b = 1 if Child == "<NAME>" & strpos(class_sem2_names, "<NAME>") > 0
replace in_sem_b = 1 if Child == "Smith" & strpos(class_sem2_names, "Smith") > 0

*using data structured like this*
Class  Teacher   Subject  Semester  Child  in_sem_b  class_sem2_names
1      Mrs. Fox  Math     a         Smith  1         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Jones  1         SmithJonesFoxTolmieKershawBarker



I have seen some code in older listserv posts that substitutes a local
macro `X' within strpos() [strpos(class_sem2_names,`X')], but I am at a
loss as to what code I can use to take each observation's name from the
Child variable and search for it in that observation's class_sem2_names
variable (which will vary by class). Note that the same child may be in
other classes, so the child names are not unique.
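
A minimal sketch of the observation-by-observation search described above. It relies on the fact that generate evaluates strpos() separately for each observation, so both arguments can be string variables and no loop or local macro is involved. The variable names in_sem_b_rowwise and prop_in_sem_b are hypothetical, and the data are assumed to be in the merged first-semester layout with Class, Child, and class_sem2_names.

```stata
* Sketch only: flag each first-semester student whose name appears in
* that observation's merged second-semester name string.
* in_sem_b_rowwise is a hypothetical variable name.
gen byte in_sem_b_rowwise = strpos(class_sem2_names, Child) > 0

* Proportion of first-semester students also present in the second
* semester, by class (prop_in_sem_b is also a hypothetical name).
bysort Class: egen prop_in_sem_b = mean(in_sem_b_rowwise)
```

Because strpos() matches substrings, a name that is contained in a longer name (for example, Fox inside a longer surname beginning with Fox) would also be flagged when names are concatenated without a separator.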

I am open to all suggestions. Thanks for any assistance.

Warm Regards,

Dave Kershaw


HERE’S MORE DETAILED DATA TO HIGHLIGHT WHAT I’M DOING
*TRUNCATED RAW DATA - one class, two semesters
Class  Teacher   Subject  Semester  Child
1      Mrs. Fox  Math     a         Smith
1      Mrs. Fox  Math     a         Jones
1      Mrs. Fox  Math     a         Barker
1      Mrs. Fox  Math     a         Kershaw
1      Mrs. Fox  Math     a         Tanner
1      Mrs. Fox  Math     a         Tolmie
2      Mrs. Fox  Math     b         Smith
2      Mrs. Fox  Math     b         Jones
2      Mrs. Fox  Math     b         Fox
2      Mrs. Fox  Math     b         Tolmie
2      Mrs. Fox  Math     b         Kershaw
2      Mrs. Fox  Math     b         Barker
.
.
.
*Data for the first semester of a class only
Class  Teacher   Subject  Semester  Child
1      Mrs. Fox  Math     a         Smith
1      Mrs. Fox  Math     a         Jones
1      Mrs. Fox  Math     a         Barker
1      Mrs. Fox  Math     a         Kershaw
1      Mrs. Fox  Math     a         Tanner
1      Mrs. Fox  Math     a         Tolmie

*Data for the first semester of a class with names of 2nd merged, kids flagged.
Class  Teacher   Subject  Semester  Child    in_sem_b  class_sem2_names
1      Mrs. Fox  Math     a         Smith    1         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Jones    1         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Barker   1         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Kershaw  1         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Tanner   0         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Tolmie   1         SmithJonesFoxTolmieKershawBarker