
Re: st: Re: Stack trick by Nicholas Cox


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: Stack trick by Nicholas Cox
Date   Mon, 4 Mar 2013 09:50:33 +0000

The reference is to a Speaking Stata column in the _Stata Journal_.
It is accessible to all at

http://stata-journal.com/sjpdf.html?articlenum=gr0004

The example uses Stata's auto data and can be replicated by

sysuse auto

bysort foreign mpg: gen foreign2 = ///
    cond(foreign==1, 1-0.1*(_n-1)/7, foreign+0.1*(_n-1)/7)

The results can be inspected. Here is enough to give a flavour of what
is produced:

. l foreign mpg foreign2

     +---------------------------+
     |  foreign   mpg   foreign2 |
     |---------------------------|
  1. | Domestic    12          0 |
  2. | Domestic    12   .0142857 |
  3. | Domestic    14          0 |
  4. | Domestic    14   .0142857 |
  5. | Domestic    14   .0285714 |
     |---------------------------|
  6. | Domestic    14   .0428571 |
  7. | Domestic    14   .0571429 |
  8. | Domestic    15          0 |
  9. | Domestic    15   .0142857 |

<snip>

 53. |  Foreign    14          1 |
 54. |  Foreign    17          1 |
 55. |  Foreign    17   .9857143 |
     |---------------------------|
 56. |  Foreign    18          1 |
 57. |  Foreign    18   .9857143 |
 58. |  Foreign    21          1 |
 59. |  Foreign    21   .9857143 |
 60. |  Foreign    23          1 |
     |---------------------------|
 61. |  Foreign    23   .9857143 |
 62. |  Foreign    23   .9714286 |
 63. |  Foreign    24          1 |
 64. |  Foreign    25          1 |
 65. |  Foreign    25   .9857143 |
     |---------------------------|
 66. |  Foreign    25   .9714286 |
 67. |  Foreign    25   .9571428 |
 68. |  Foreign    26          1 |
 69. |  Foreign    28          1 |
 70. |  Foreign    30          1 |
     |---------------------------|
 71. |  Foreign    31          1 |
 72. |  Foreign    35          1 |
 73. |  Foreign    35   .9857143 |
 74. |  Foreign    41          1 |
     +---------------------------+

The idea is to get a y coordinate at which to plot each pair of values
in a scatter plot of -foreign- versus -mpg-. The context is that we
are plotting a logit fit from -logit foreign mpg- and adding the raw
data at the top and bottom of the plot as what are now often called
rugs.
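
A plot of this kind could be drawn roughly as follows. This is a minimal sketch, not necessarily the commands used in the column; the variable name -pr- and the marker symbol are assumptions made for illustration.

```stata
sysuse auto, clear

* stacked y coordinates, as defined above
bysort foreign mpg: gen foreign2 = ///
    cond(foreign == 1, 1 - 0.1*(_n-1)/7, foreign + 0.1*(_n-1)/7)

* fitted probabilities from the logit model
logit foreign mpg
predict pr

* fitted curve, with the raw data stacked near y = 0 and y = 1
twoway (line pr mpg, sort) (scatter foreign2 mpg, ms(Oh)), ///
    ytitle("Pr(foreign)") legend(off)
```

Plotting -foreign2- rather than -foreign- is what keeps tied points from being superimposed.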

Consider the last observation, which is the only observation with
foreign = 1 (Foreign), mpg = 41. We can just plot it as y = 1, x = 41.

The previous two observations tie at foreign = 1, mpg = 35. If we
plotted them, the marker symbols would just be superimposed.

So we stack them vertically. One can be plotted at y = 1, x = 35, but
the other must be nudged downwards from y = 1.

A similar decision applies for values with foreign = 0. Pairs that
occur once only can be plotted at y = 0, x = mpg value, but ties must
be separated to be discernible.

The general rule for this dataset -- chosen after experiment -- was

cond(foreign==1, 1-0.1*(_n-1)/7, foreign+0.1*(_n-1)/7)

meaning

for foreign = 1, use y = 1 if _n == 1, 1 - 0.1/7 if _n == 2, and so on.

for foreign = 0, use y = 0 if _n == 1, 0 + 0.1/7 if _n == 2, and so on.
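
The arithmetic can be checked by hand against the listing above. As a small sketch, -display- evaluates the offsets for the first few values of _n:

```stata
* offsets 0.1*(_n-1)/7 for _n = 1, 2, 3 (used as is when foreign = 0)
display 0.1*(1-1)/7
display 0.1*(2-1)/7
display 0.1*(3-1)/7

* for foreign = 1 the offset is subtracted from 1, here for _n = 2
display 1 - 0.1*(2-1)/7
```

Up to display precision, these should agree with the values 0, .0142857, .0285714 and .9857143 shown in the listing.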

The -cond()- function handles both cases at once. Try -search cond, sj-
for access to a 2005 tutorial by David Kantor and myself if needed.

What is _n here? It is crucial that the observation number _n is
counted _within_ distinct groups of -foreign mpg-. Try -search by, sj-
for access to a 2002 tutorial if needed.
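
To see how _n behaves under -by:-, a short sketch along these lines may help. The variable names -seq- and -size- are made up for illustration.

```stata
sysuse auto, clear

* under by:, _n restarts at 1 in each foreign-mpg group
* and _N is the number of observations in that group
bysort foreign mpg: gen seq = _n
bysort foreign mpg: gen size = _N

list foreign mpg seq size in 1/9, sepby(foreign mpg)
```

Which of several tied observations gets which value of _n is arbitrary here, but that does not matter for the plot, as the tied points are identical in -foreign- and -mpg-.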

7 is just a choice that works well in this dataset, or so I thought.

There is no use of options in this code.

Nick

On Sun, Mar 3, 2013 at 8:10 PM, Michael Stewart
<michaelstewartresearch@gmail.com> wrote:

> I am a novice trying to learn Stata graphics. I read Speaking Stata
> by Nick, vol 4, number 2, pages 190-215, on graphing categorical
> and compositional data.
> Nick writes a conditional statement on page 193. I could not understand
> the second option.
> His cond statement is bysort foreign mpg: gen foreign2 = cond(foreign==1,
> 1-0.1*(_n-1)/7, foreign+0.1*(_n-1)/7)
> I cannot understand what 1-0.1*(_n-1)/7 and
> foreign+0.1*(_n-1)/7 compute, or their purpose. I tried to read the
> article but am still at a loss.