Statalist



RE: st: RE: Extract elements from string variables


From   <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: Extract elements from string variables
Date   Wed, 8 Jul 2009 18:45:29 -0400

Thanks Howie, 

That is a pretty good approximation of what the data set looks like; however, the names I am using often take different forms:
Mr John Smith, Dr. Seuss, etc.  

So I don't believe that I can use -word- directly, since it splits on spaces and each name contains a varying number of words.
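
One workaround that might still let -word- apply is to protect the spaces inside each name before turning the commas into separators. This is only a rough sketch: it assumes the combined string is called names (as in Howie's illustration), that duplicate holds the position wanted, and that no name already contains an underscore.

```stata
* Assumed names: names (comma-separated list), duplicate (position 1, 2, 3, ...)
* Step 1: replace the spaces inside each name with a placeholder
gen tmp = subinstr(names, " ", "_", .)
* Step 2: turn the commas into spaces so that word() sees one "word" per person
replace tmp = subinstr(tmp, ",", " ", .)
* Step 3: pick the person at position duplicate and restore the spaces
gen advisor = trim(subinstr(word(tmp, duplicate), "_", " ", .))
drop tmp
```

With "Mr John Smith, Dr. Seuss" and duplicate equal to 2, this should leave "Dr. Seuss" in advisor. Any name that genuinely contains an underscore would be altered, so a different placeholder would be needed in that case.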

Also, I've tried using -split-, but it generates new variables. I can use those to reach my goal, but the code is then tedious and rather ugly, i.e.

gen advisor = advisorindividual1 if duplicate == 1

and so on for each position, all the way through the maximum number of names, and then dropping the variables advisorindividual1 through advisorindividual15.

Does anyone know if it would be possible to do something along the lines of: 
gen advisor = advisorindividual`i' if duplicate == `i'? 
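
To make the idea concrete, the loop I have in mind would look roughly like the sketch below. It assumes that -split- has already created advisorindividual1 through advisorindividual15 as string variables and that duplicate numbers the expanded copies 1, 2, 3, and so on; the upper limit of 15 is simply the maximum in my data.

```stata
* Assumes advisorindividual1-advisorindividual15 exist (created by -split-)
* and that duplicate holds the position of the name wanted in each copy
gen advisor = ""
forvalues i = 1/15 {
    * copy the i-th piece into advisor for the i-th duplicate only
    replace advisor = trim(advisorindividual`i') if duplicate == `i'
}
* the pieces are no longer needed once advisor is filled in
drop advisorindividual1-advisorindividual15
```

The trim() call is there because pieces split on a comma usually keep the blank that followed the comma.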

Thanks again, 
Conor

----------------------------------------
> From: [email protected]
> To: [email protected]
> Date: Wed, 8 Jul 2009 14:57:10 -0400
> Subject: st: RE: Extract elements from string variables
>
> Conor,
>
> An illustration of your data might help. For now, I assume that your data started out something like this, where origob is the original observation number for each group of names, newid is the new observation number, and X and Y are some other variables:
>
> origob names X Y
> 1 "Alex Charles Henry" 123 456
> 2 "Mary Liz" 789 324
>
> And now looks something like this:
>
> origob newid names X Y
> 1 1 "Alex Charles Henry" 123 456
> 1 2 "Alex Charles Henry" 123 456
> 1 3 "Alex Charles Henry" 123 456
> 2 1 "Mary Liz" 789 324
> 2 2 "Mary Liz" 789 324
>
> And you want this:
> origob newid names name X Y
> 1 1 "Alex Charles Henry" "Alex" 123 456
> 1 2 "Alex Charles Henry" "Charles" 123 456
> 1 3 "Alex Charles Henry" "Henry" 123 456
> 2 1 "Mary Liz" "Mary" 789 324
> 2 2 "Mary Liz" "Liz" 789 324
>
>
> If this is right, you can go from the second stage to the third stage by doing the following:
>
> bysort origob: gen name = word(names, _n)
>
> Hope this helps.
> Howie
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
> Sent: Wednesday, July 08, 2009 2:37 PM
> To: [email protected]
> Subject: st: Extract elements from string variables
>
>
> In the data set I have a string variable with the names of several people. However, I want to look at the relationship across each individual, so I am trying to separate the names.
>
> What I have done: expanded the observations based on how many individuals are in each observation's string variable. Labeled the duplicate observations 1, 2, 3, etc.
>
> What I need now is a command that takes the number in my duplicate counter and extracts the name at the corresponding position in the string variable, where the names are separated by commas.
>
> Any insight would be helpful.
>
> Thanks,
> Conor
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/



