



st: loops and strpos to test if an observations string variable value is also in another string variable


From   Dan Blanchette <dan.blanchette@duke.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: loops and strpos to test if an observations string variable value is also in another string variable
Date   Fri, 25 Jun 2010 09:32:46 -0400 (EDT)


The second argument of strpos() can be a variable name rather than a string literal or a local macro:

(input the last example dataset you posted)

. list if strpos(class_sem2_names,Child)

     +-----------------------------------------------------------------------------------------------+
     | Class    Teacher   Subject   Semester     Child   in_sem_b                   class_sem2_names |
     |-----------------------------------------------------------------------------------------------|
  1. |     1   Mrs. Fox      Math          a     Smith          1   SmithJonesFoxTolmieKershawBarker |
  2. |     1   Mrs. Fox      Math          a     Jones          1   SmithJonesFoxTolmieKershawBarker |
  3. |     1   Mrs. Fox      Math          a    Barker          1   SmithJonesFoxTolmieKershawBarker |
  4. |     1   Mrs. Fox      Math          a   Kershaw          1   SmithJonesFoxTolmieKershawBarker |
  6. |     1   Mrs. Fox      Math          a    Tolmie          1   SmithJonesFoxTolmieKershawBarker |
     +-----------------------------------------------------------------------------------------------+
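
Since strpos() is evaluated observation by observation, the same expression can also build the flag directly. The lines below are a minimal sketch, assuming the merged first-semester dataset shown above is in memory; the variable names flag_sem_b and prop_sem_b are hypothetical and not part of the posted data.

```stata
* Assumption: Child and class_sem2_names are string variables in memory.
* flag_sem_b is a hypothetical name: 1 if this row's Child appears
* anywhere in this row's class_sem2_names, 0 otherwise.
generate byte flag_sem_b = strpos(class_sem2_names, Child) > 0

* prop_sem_b is also a hypothetical name: the share of each class's
* first-semester students who are flagged.
egen prop_sem_b = mean(flag_sem_b), by(Class)

list Class Child flag_sem_b prop_sem_b
```

Note that this is a substring match, so a name that happens to be contained in a longer name in class_sem2_names would be flagged as well.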

If you wanted to search all obs using the value of Child at a certain observation like obs 4:

 list if strpos(class_sem2_names,Child[4])

works.

You could loop through all obs:

forvalues n= 1/`c(N)' {
 list if strpos(class_sem2_names,Child[`n'])
}
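
If the listings are too much output, the same loop can record a count instead. This is only a sketch under the same assumptions; n_matches is a hypothetical variable name.

```stata
* n_matches (hypothetical name) will hold, for each observation, the number
* of rows whose class_sem2_names contains that observation's Child value.
generate long n_matches = .
forvalues n = 1/`=_N' {
    quietly count if strpos(class_sem2_names, Child[`n']) > 0
    quietly replace n_matches = r(N) in `n'
}
```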

HTH,

Dan Blanchette
Research Associate
Center for Entrepreneurship and Innovation
Duke University's Fuqua School of Business
dan.blanchette@duke.edu

From	  dckersh <dckersh@email.unc.edu>
To	  <statalist@hsphsun2.harvard.edu>
Subject	  st: loops and strpos to test if an observations string variable value is also in another string variable
Date	  Thu, 24 Jun 2010 21:42:04 -0400

I could use some help on using loops and strpos() to see whether an
observation's string variable value is present in another string variable.
I have statewide roster data for a number of different years. Within each
year, schools track students in classes in different ways. Some schools are
very detailed and keep track of classrooms at numerous times throughout the
years. So you will have different records for a child that will contain the
same teacher, course title, course code, but different sections, semesters,
meeting times, etc. The result is a situation where the same "class" has
different students (some students move, etc.). I am trying to find a way to
identify which students (and the overall proportion of students that) were
in a class in each instance. To do this, I want to be able to flag the
first semester students who were also present in the second semester.

A fairly simple way to do this is to select the second semester classes,
reshape the data to one record per class, capture the names of students in
the class in one variable, merge those names (that variable) onto the first
semester version of that class, and then search for the last names of the
students in the first semester of the class within the variable that
captures the last names of the students in the class during the second
semester. Skipping the reshaping and merging which I've figured out, I
theoretically get what I want with the following code on data similar to
below:


gen in_sem_b = 0
replace in_sem_b = 1 if Child=="<NAME>" & strpos(class_sem2_names,"<NAME>")>0
replace in_sem_b = 1 if Child=="Smith" & strpos(class_sem2_names,"Smith")>0

*using data structured like this*
Class  Teacher   Subject  Semester  Child  in_sem_b  class_sem2_names
1      Mrs. Fox  Math     a         Smith  1         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Jones  1         SmithJonesFoxTolmieKershawBarker



I have seen some code in older listserv posts that substitutes a relative
`X' variable within strpos [strpos(class_sem2_names,`X' )] but I am at a
loss as to what code I can use to simultaneously take each observation's
name from the child variable and to search for it in the class_sem2_names
variables (which will vary by class). Note that the same child may be in
other classes, so the child names are not unique.

I am open to all suggestions. Thanks for any assistance.

Warm Regards,

Dave Kershaw


HERE'S MORE DETAILED DATA TO HIGHLIGHT WHAT I'M DOING
*TRUNCATED RAW DATA - one class, two semesters
Class  Teacher   Subject  Semester  Child
1      Mrs. Fox  Math     a         Smith
1      Mrs. Fox  Math     a         Jones
1      Mrs. Fox  Math     a         Barker
1      Mrs. Fox  Math     a         Kershaw
1      Mrs. Fox  Math     a         Tanner
1      Mrs. Fox  Math     a         Tolmie
2      Mrs. Fox  Math     b         Smith
2      Mrs. Fox  Math     b         Jones
2      Mrs. Fox  Math     b         Fox
2      Mrs. Fox  Math     b         Tolmie
2      Mrs. Fox  Math     b         Kershaw
2      Mrs. Fox  Math     b         Barker
.
.
.
*Data for the first semester of a class only
Class  Teacher   Subject  Semester  Child
1      Mrs. Fox  Math     a         Smith
1      Mrs. Fox  Math     a         Jones
1      Mrs. Fox  Math     a         Barker
1      Mrs. Fox  Math     a         Kershaw
1      Mrs. Fox  Math     a         Tanner
1      Mrs. Fox  Math     a         Tolmie

*Data for the first semester of a class with names of 2nd merged, kids flagged.
Class  Teacher   Subject  Semester  Child    in_sem_b  class_sem2_names
1      Mrs. Fox  Math     a         Smith    1         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Jones    1         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Barker   1         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Kershaw  1         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Tanner   0         SmithJonesFoxTolmieKershawBarker
1      Mrs. Fox  Math     a         Tolmie   1         SmithJonesFoxTolmieKershawBarker

