



Re: st: Replacing grouped string values with longest string value in the group


From   Maarten buis <maartenbuis@yahoo.co.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Replacing grouped string values with longest string value in the group
Date   Thu, 19 Aug 2010 17:48:25 +0000 (GMT)

--- On Thu, 19/8/10, McDermaid, Cameron wrote:
> I want to replace string values in a group to the longest
> string value from the same variable within the group.
> 
> I have a dataset with a number of symptoms that come from
> different hospitals. Because of differences in coding
> practices and errors, there can be differences in values
> (e.g. spelling).  I will not know how many symptom values
> will be present in a given data set and the symptom values
> are always strings. I've grouped the variables by soundex
> code and want to use the longest Symptom value to replace
> the shorter ones in the group. 
> 
> The intent is to write an ado file that anyone can run with
> minimum interfacing to generate frequencies of Symptoms
> after they've been processed as above.

No looping is necessary; the -by- prefix is very useful for
situations like these:

*----------------------- begin example ---------------------
drop _all
input str30 ptom              str4 slike  slength  slongest
"ALTEREDLEVELOFCONSCIOUSNESS" A436         27       1
"ALTERED CONSCIOUSNESS"       A436         20       0
"ALTERED CONSCIOUSNESS"       A436         20       0
"BLURREDVISION"               B463         13       1
"CONVULSIONS"                 C514         11       1
"DIZZY"                       D200         5        1
"DIZZINESS/VERTIGO"           D252         17       1
"DIZZY/VERTIGO"               D252         13       0
end

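* within each soundex group (slike), sort so that the row flagged
* slongest==1 comes last, then copy its ptom to every row in the group
bys slike (slongest) : gen longest = ptom[_N]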
list ptom longest
*---------------------- end example ------------------------
(For more on examples I sent to the Statalist see: 
http://www.maartenbuis.nl/example_faq )
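
The example above relies on slongest already being in the data. If it
is not, a similar result can probably be had by sorting on the string
length directly. The following is only a sketch under that assumption:
len is a hypothetical helper variable that is not in the original data,
and adding ptom to the sort order is just one way to make ties in length
resolve the same way on every run.

```stata
*----------------------- begin example ---------------------
* sketch only: len is a hypothetical helper variable
drop _all
input str30 ptom              str4 slike
"ALTEREDLEVELOFCONSCIOUSNESS" A436
"ALTERED CONSCIOUSNESS"       A436
"BLURREDVISION"               B463
"DIZZINESS/VERTIGO"           D252
"DIZZY/VERTIGO"               D252
end

* number of characters in each symptom string
gen len = length(ptom)

* within each soundex group, sort by length (ties broken by ptom),
* so the longest string is in the last row, _N
bysort slike (len ptom) : gen longest = ptom[_N]

list slike ptom longest, sepby(slike)
*---------------------- end example ------------------------
```

Variables listed in parentheses after -bysort- are used for sorting
only, not for defining the groups, which is why ptom[_N] refers to the
last row within each slike group.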

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------


      


