Statalist



RE: st: RE: Extract elements from string variables


From   Howard Lempel <HLempel@brookings.edu>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: Extract elements from string variables
Date   Wed, 8 Jul 2009 19:07:06 -0400

This is probably not the most efficient way to solve your problem, but one method follows.  Assume that you have used -split- already and your duplicate variable is equivalent to my newid from my previous example.

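* -replace- needs advisor to exist already; skip this line if it does
gen advisor = ""
sum duplicate, meanonly
* forvalues needs a number as the upper limit, hence the macro quotes around r(max)
forval i = 1/`r(max)' {
	replace advisor = advisorindividual`i' if duplicate==`i'
	}
* ? matches exactly one character; with 10 or more pieces, drop advisorindividual?? as well
drop advisorindividual?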
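
The approach above can be tried end to end on made-up data. This is only a sketch: the variable names advisorindividual and duplicate follow the thread, but the toy names, the origob and n_names variables, and the use of -expand- to create the duplicate observations are assumptions about how the data were built.

```stata
clear
input str40 advisorindividual
"Mr John Smith, Dr. Seuss"
"Mary Liz"
end

* hypothetical id for the original observations
gen long origob = _n

* one new variable per comma-separated name: advisorindividual1, advisorindividual2, ...
split advisorindividual, parse(",")

* number of names = number of commas + 1
gen n_names = 1 + length(advisorindividual) - length(subinstr(advisorindividual, ",", "", .))

* one copy of each observation per name, numbered 1, 2, ... within origob
expand n_names
bysort origob: gen duplicate = _n

* pick the piece whose position matches duplicate
gen advisor = ""
summarize duplicate, meanonly
* the upper limit must be a number, so r(max) is expanded with macro quotes
forvalues i = 1/`r(max)' {
	replace advisor = trim(advisorindividual`i') if duplicate == `i'
}

* ? matches exactly one character, so the original advisorindividual is kept
drop advisorindividual? n_names
list origob duplicate advisor, sepby(origob)
```

The trim() call is there because -split- leaves the space that follows each comma at the start of the next piece.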

Hope this helps.
Howie

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of sontag@sas.upenn.edu
Sent: Wednesday, July 08, 2009 6:53 PM
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: RE: Extract elements from string variables

Yes the names are separated by commas

----------------------------------------
> From: HLempel@brookings.edu
> To: statalist@hsphsun2.harvard.edu
> Date: Wed, 8 Jul 2009 18:49:46 -0400
> Subject: RE: st: RE: Extract elements from string variables
>
> No problem.
>
> Are your names separated by commas, then?
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of sontag@sas.upenn.edu
> Sent: Wednesday, July 08, 2009 6:45 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: RE: st: RE: Extract elements from string variables
>
> Thanks Howie,
>
> That is a pretty good estimate of what the data set looks like; however, the names I am using are often of different forms:
> Mr John Smith, Dr. Seuss, etc.
>
> So I don't believe that I can use -word-.
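
For names that contain spaces but are separated by commas, word() can still be made to work by swapping the delimiters first. A minimal sketch, assuming the underscore never occurs in the names; the variable work and the toy data are hypothetical, while advisorindividual and duplicate follow the thread.

```stata
clear
input str40 advisorindividual duplicate
"Mr John Smith, Dr. Seuss" 1
"Mr John Smith, Dr. Seuss" 2
"Mary Liz" 1
end

* protect the spaces inside each name with a placeholder (assumed absent from the data)
gen work = subinstr(advisorindividual, " ", "_", .)

* turn the comma separators into spaces so that word() sees one "word" per name
replace work = subinstr(work, ",", " ", .)

* take the piece numbered by duplicate, then restore the spaces and trim the edges
gen advisor = trim(subinstr(word(work, duplicate), "_", " ", .))

drop work
list
```

This avoids creating one variable per name, at the cost of depending on a placeholder character that must not appear in the data.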
>
> Also, I've tried using -split-, but it generates new variables. I can use them to reach my goal, but the code is then tedious and kind of disgusting, i.e.
>
> gen advisor = advisorindividual1 if duplicate == 1
>
> and so on, all the way through the maximum number of names, and then drop the variables advisorindividual1 through advisorindividual15.
>
> Does anyone know if it would be possible to do something along the lines of:
> gen advisor = advisorindividual`i' if duplicate == `i'?
>
> Thanks again,
> Conor
>
> ----------------------------------------
>> From: HLempel@brookings.edu
>> To: statalist@hsphsun2.harvard.edu
>> Date: Wed, 8 Jul 2009 14:57:10 -0400
>> Subject: st: RE: Extract elements from string variables
>>
>> Conor,
>>
>> An illustration of your data might help. For now, I assume that your data started out something like this, where origob is the original observation number for each group of names, newid is the new observation number, and X and Y are some other variables:
>>
>> origob names X Y
>> 1 "Alex Charles Henry" 123 456
>> 2 "Mary Liz" 789 324
>>
>> And now looks something like this:
>>
>> origob newid names X Y
>> 1 1 "Alex Charles Henry" 123 456
>> 1 2 "Alex Charles Henry" 123 456
>> 1 3 "Alex Charles Henry" 123 456
>> 2 1 "Mary Liz" 789 324
>> 2 2 "Mary Liz" 789 324
>>
>> And you want this:
>> origob newid names name X Y
>> 1 1 "Alex Charles Henry" "Alex" 123 456
>> 1 2 "Alex Charles Henry" "Charles" 123 456
>> 1 3 "Alex Charles Henry" "Henry" 123 456
>> 2 1 "Mary Liz" "Mary" 789 324
>> 2 2 "Mary Liz" "Liz" 789 324
>>
>>
>> If this is right, you can go from the second stage to the third stage by doing the following:
>>
>> bysort origob: gen name = word(names, _n)
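
The three stages above can be reproduced with the illustrative data from this message. A minimal sketch: the use of -expand- with wordcount() to get from the first stage to the second is an assumption, since the thread does not say how the observations were duplicated.

```stata
clear
input origob str30 names X Y
1 "Alex Charles Henry" 123 456
2 "Mary Liz" 789 324
end

* second stage: one copy of each observation per word in names
expand wordcount(names)
bysort origob: gen newid = _n

* third stage: word(s, n) returns the n-th space-separated word of s
bysort origob: gen name = word(names, _n)

list, sepby(origob)
```

Because word() splits on spaces only, this works for single-word names like these but not for names that themselves contain spaces.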
>>
>> Hope this helps.
>> Howie
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of sontag@sas.upenn.edu
>> Sent: Wednesday, July 08, 2009 2:37 PM
>> To: statalist@hsphsun2.harvard.edu
>> Subject: st: Extract elements from string variables
>>
>>
>> In the data set I have a string variable with the names of several people. However, I want to look at the relationship across each individual, so I am trying to separate the names.
>>
>> What I have done: expanded the observations based on how many individuals are in each observation's string variable. Labeled the duplicate observations 1, 2, 3, etc.
>>
>> What I need now is a command that takes the number in my duplicates variable and extracts the name at that position in the string variable, where the names are separated by commas.
>>
>> Any insight would be helpful.
>>
>> Thanks,
>> Conor
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/


