
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: RE: Dropping parts of strings (i.e. Baldwin County becomes Baldwin)


From   Nick Cox <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: RE: Dropping parts of strings (i.e. Baldwin County becomes Baldwin)
Date   Thu, 6 Mar 2014 18:55:41 +0000

Syntax, style and strategy are all involved here.

I just want to emphasise building up code that does things in steps,
so that each bit of the code

1. solves part of the problem

2. is something you understand

That may (should) sound blindingly obvious...

Nick
[email protected]
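
As one way to read the advice above, Andrew Maurer's one-line expression (quoted further down) could be split into steps, each of which can be listed and checked before moving on. This is only a sketch: the example data and the intermediate variable names plus, step2 and step3 are assumptions made for illustration and are not from the thread.

```stata
* Hypothetical example data; only "oldname" and "newname" appear in the thread
clear
input str40 oldname
"__united_kingdom__+___county"
"__baldwin__+___county"
end

* Step 1: position of the first "+" (strpos() returns 0 if there is none)
gen plus = strpos(oldname, "+")

* Step 2: keep what precedes the "+"; keep the whole string if no "+" was found
gen step2 = cond(plus > 0, substr(oldname, 1, plus - 1), oldname)

* Step 3: replace every underscore with a space
gen step3 = subinstr(step2, "_", " ", .)

* Step 4: trim leading and trailing spaces
gen newname = trim(step3)

list oldname step2 step3 newname
```

Once each step has been inspected, the intermediate variables can be dropped with: drop plus step2 step3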


On 6 March 2014 18:40, Radwin, David <[email protected]> wrote:

> Would something simpler like this work? Note the compound quotation marks.
>
> foreach x in `"County"' `"Borough"' `"Township"' `"Census Area"' {
>         replace county = trim(subinstr(county, "`x'", "", .))
>         }
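
One thing to check with the loop above: subinstr() with "." as its last argument removes every occurrence of each word, wherever it appears in the name. If that is a concern, a possible variant is to strip the type only when it ends the string, for example with regexr(). The sketch below is an assumption-laden illustration, not something proposed in the thread; the data are made up for the example, and "Township Hills County" is a hypothetical name chosen to show the difference.

```stata
* Hypothetical example data (not from the thread)
clear
input str40 county
"Baldwin County"
"Yukon-Koyukuk Census Area"
"Township Hills County"
end

* Remove a trailing census type only; the "$" anchors the match at the end
replace county = trim(regexr(county, " (County|Borough|Township|Census Area)$", ""))

list county
```

Here the hypothetical "Township Hills County" keeps "Township" and becomes "Township Hills", whereas the loop would also remove the leading "Township".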

Andrew Maurer wrote:

>> No, that's why I gave united_kingdom as an example. It:
>> - removes everything including and after the first + sign observed
>> - replaces all remaining underscores with spaces
>> - trims the leading and trailing spaces
>>
>> "__united_kingdom__+___county" becomes "united kingdom"
>>
>> There would be problems if you had a country with a + sign in the name,
>> but I don't expect you'll have that scenario.


Cody Cook wrote:

>> Thanks for the response! I think this route is promising, but (just by
>> looking at the code, I haven't run it) it looks like it may have trouble
>> with counties that have two words in their name, or no? Is there a way to
>> account for this?

Andrew Maurer <[email protected]> wrote:

>> > Maybe not the most efficient way, but this works (replaces internal underscores with spaces):
>> >
>> > local name __united_kingdom__+___county
>> > di trim(subinstr(substr(`"`name'"',1,strpos(`"`name'"',"+")-1),"_"," ",.))
>> >
>> > or, if your variable is named "oldname":
>> > gen newname = trim(subinstr(substr(oldname,1,strpos(oldname,"+")-1),"_"," ",.))
>> >
>> > (if you want internal underscores, then gen newname2 = subinstr(newname," ","_",.))

Cody Cook wrote:

>> > I have a ton of census data that I'm trying to merge with some other data. The other data only has county names, not FIPS.
>> >
>> > For the census data, it is formatted as __name___ + ___ census type ___ where census type is either "County", "Borough", "Census Area" and maybe a few more. This is all in one string. I want to only keep the part that identifies the specific county.
>> >
>> > Basically, is there a way to say "if county includes "XXXX" replace without "XXXX""

