



RE: st: RE: using encode to order string distances


From   Tim Evans <Tim.Evans@wmciu.nhs.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: using encode to order string distances
Date   Tue, 25 Oct 2011 09:53:53 +0100

Thanks Nick,

I'll give that a try.

As I only had "100+" to deal with I used the following:

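* Encode everything except "100+", which would otherwise sort third
encode nearestcentre if nearestcentre != "100+", generate(nearestcentre2)
* Observations left missing are the "100+" cases (assuming no empty strings); give them code 12
replace nearestcentre2 = 12 if nearestcentre2 == .
* Add the text for code 12 to the value label created by encode
label define nearestcentre2 12 "100+", add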

I know it's not particularly pretty, but it did work when I needed it. I will take a look at what you suggest.
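
For reference, one alternative that avoids patching the codes afterwards is to define the value label in the desired order first and pass it to -encode- through its label() option; -encode- then uses the existing mappings instead of assigning codes alphabetically. A minimal sketch, assuming the categories are exactly the ten-unit bands mentioned in this thread plus "100+" (the label name distlbl and the new variable nearestcentre3 are made up for illustration):

```stata
* Assumption: these eleven strings are the only values of nearestcentre.
* Adjust the list to match the real data.
label define distlbl 1 "0-9" 2 "10-19" 3 "20-29" 4 "30-39" 5 "40-49" ///
    6 "50-59" 7 "60-69" 8 "70-79" 9 "80-89" 10 "90-99" 11 "100+"

* Use the predefined mapping; noextend stops -encode- from adding
* any string that is not already in distlbl.
encode nearestcentre, generate(nearestcentre3) label(distlbl) noextend
```

With noextend, -encode- refuses to run if nearestcentre contains a string that is not in distlbl, so a missed category shows up as an error rather than being appended to the label. The /// line continuation works in a do-file, not at the command line.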

Best wishes

Tim
-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: 25 October 2011 09:41
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: using encode to order string distances

An alternative is to start by using regex (regular expression) functions.

-moss- (SSC) is a convenience tool wrapped around Stata's regex
technology. Its use here is perhaps overkill:

. l

     +-------+
     |  var1 |
     |-------|
  1. |   0-9 |
  2. | 10-19 |
  3. | 20-29 |
  4. |   30+ |
     +-------+

. moss var1, match("([0-9]+)") regex

. l

     +----------------------------------------------------+
     |  var1   _count   _match1   _pos1   _match2   _pos2 |
     |----------------------------------------------------|
  1. |   0-9        2         0       1         9       3 |
  2. | 10-19        2        10       1        19       4 |
  3. | 20-29        2        20       1        29       4 |
  4. |   30+        1        30       1                 . |
     +----------------------------------------------------+
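
The extracted lower bound can then drive the ordering. A minimal sketch, assuming the data are as listed above; it uses Stata's built-in regexm() and regexs() functions rather than -moss-, followed by the -egen, group()- and -labmask- steps from earlier in the thread. The variable names lower and grp are made up for illustration.

```stata
* Pull out the leading run of digits as a number.
* regexs(1) returns the first parenthesised subexpression matched by regexm().
gen lower = real(regexs(1)) if regexm(var1, "^([0-9]+)")

* Number the distinct lower bounds 1, 2, ... in increasing order.
egen grp = group(lower)

* Attach the original strings as value labels (labmask is user-written;
* -search labmask- points to download locations).
labmask grp, values(var1)
```

If -moss- has already been run as above, real(_match1) should serve in place of the regexm() line, on the assumption that _match1 is stored as a string.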

Nick

On Mon, Oct 24, 2011 at 6:28 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> You need to parse your strings and order on the numeric equivalent of the lower value. That will be sufficient because the upper values increase too, except that the open upper limit is implicit.
>
> Here is one way to do it.
>
> Suppose -sdist- is string variable with distances.
>
> gen pos = max(strpos(sdist, "-"), strpos(sdist, "+"))
>
> We look for "-" or "+".
>
> gen n1 = real(substr(sdist, 1, pos-1))
> egen group = group(n1)
> labmask group, values(sdist)
>
> where -search labmask- will point to download locations.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Tim Evans
>
> I have distances stored in strings like:
>
> 0-9
> 10-19
> 20-29
> 30-39
>
>
> up to
> 90-99
> 100+
>
> I want to encode these so that the current values are kept as labels and the underlying numeric codes run from 1 to 12.
>
> When I encode and generate a new variable, I end up with 100+ as the 3rd value rather than the 12th.
>
> How can I force encode to make 100+ the 12th value and not the 3rd?
>
> I'm using
>
> encode nearestcentre, generate(NRCENT)
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




