Re: st: how can I group airport markets

From	Nick Cox <[email protected]>
To	[email protected]
Subject	Re: st: how can I group airport markets
Date	Wed, 20 Apr 2011 08:47:57 +0100

This was last asked on 14 April, which shows that consulting the
archives is a good idea.  See

http://www.stata.com/statalist/archive/2011-04/msg00767.html

and the resulting thread.

I will assume string variables. The trick is to realise that LAX MIA
and MIA LAX reduce to the same pair once the two codes within each
observation are put in alphabetical order, so how do we do that? The
functions -min()- and -max()- don't take string arguments, so we turn
instead to -cond()-.

gen first = cond(origin > destination, destination, origin)
gen second = cond(origin < destination, destination, origin)

Note that there is no difficulty about applying > and < to strings --
the expression ("b" > "a") evaluates as true (1), for example.
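
A quick way to convince yourself is to display a few such comparisons.
This is only a small sketch added for illustration; the point to note
is that strings are compared by character code, so upper case sorts
before lower case and mixed-case airport codes would not compare as
you might expect.

```stata
* each expression evaluates to 1 (true) or 0 (false)

* displays 1
display ("b" > "a")

* displays 1
display ("LAX" < "MIA")

* displays 0, because upper-case letters sort before lower-case ones
display ("B" > "a")
```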

Then you are home and dry with

egen group = group(first second), label
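
Putting the steps together on a toy dataset may make this concrete.
What follows is a minimal sketch, assuming string variables named
origin and destination as in the question; the five observations are
made up to mirror the example there.

```stata
clear
input str3 origin str3 destination
LAX MIA
LAX MIA
MIA LAX
LAS LAX
LAX LAS
end

* put the two codes within each observation in alphabetical order
gen first = cond(origin > destination, destination, origin)
gen second = cond(origin < destination, destination, origin)

* one identifier per unordered pair, labelled with the pair itself
egen group = group(first second), label

sort group
list origin destination first second group, sepby(group)
```

Here LAS-LAX and LAX-LAS should fall into one group and LAX-MIA and
MIA-LAX into another, regardless of direction.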

The thread started with a reference to

http://www.ats.ucla.edu/stat/stata/faq/dyad_ids.htm

but after reading it I still prefer the method above, which was
earlier documented at

SJ-8-4  dm0043  . Tip 71: The problem of split identity, or how to group dyads
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q4/08   SJ 8(4):588--591                                 (no commands)
        tip on how to handle dyadic identifiers

If you have numeric variables with value labels, you can still use
exactly the same technique, but the grouping will not necessarily be
in alphabetical order: that depends on the numeric values underlying
your value labels.
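
For the numeric case, one detail is worth spelling out: variables
created by -generate- do not inherit value labels, so the labels need
to be attached again if the -label- option of -egen- is to show
airport names rather than numbers. A sketch, assuming that origin and
destination share a single value label, here given the hypothetical
name airport:

```stata
* Assumption: origin and destination are numeric and both use a
* value label called airport (a made-up name for this sketch)
gen first = cond(origin > destination, destination, origin)
gen second = cond(origin < destination, destination, origin)

* new variables carry no value labels, so attach them again
label values first second airport

egen group = group(first second), label
```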

Also, it is possible that this method will uncover some typos in your
data, so you would need to fix those and re-do the grouping.
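
What such a check looks like depends on your data. The following is
one possible sketch for a do-file, assuming that valid codes are
exactly three upper-case letters; that rule is my assumption, not
something stated in the question.

```stata
* Assumption: valid airport codes are three upper-case letters

* remove stray spaces and standardise case before grouping
replace origin = upper(trim(origin))
replace destination = upper(trim(destination))

* list observations whose codes do not match that pattern
list origin destination if !regexm(origin, "^[A-Z][A-Z][A-Z]$") | ///
    !regexm(destination, "^[A-Z][A-Z][A-Z]$")

* groups with very few observations may also point to mistyped codes
tabulate group, sort
```

After any corrections, drop first, second and group and create them
again.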

Nick

P.S. Does "PhD in Economics" mean you have one, or you hope to get one?

On Wed, Apr 20, 2011 at 6:18 AM,  <[email protected]> wrote:
> I have a large dataset on flights with information about origins and
> destinations.
> For example: origin - destination
>             LAX    -    MIA
>             LAX    -    MIA
>             MIA    -    LAX
>             MIA    -    LAX
>             MIA    -    LAX
>             LAS    -    LAX
>             LAX    -    LAS
>             ...         ...
> How can I group LAX-MIA and MIA-LAX into the same market? If I use
> egen market = group (origin destination), they will be grouped into 2
> different markets. Another way is to sort the data into the same origins
> to the same destinations, but I don't know how to do that either.
> For example: origin - destination
>             LAX    -    MIA
>             LAX    -    MIA
>             LAX    -    MIA
>             LAX    -    MIA
>             LAX    -    MIA
>             LAX    -    LAS
>             LAX    -    LAS
>             ...         ...
> Thanks!
>
>
> Dan Luo
> PhD in Economics
> University of California, Irvine
