
st: -linkplot- available on SSC


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: -linkplot- available on SSC
Date   Wed, 9 Jul 2003 15:47:10 +0100

Thanks to Kit Baum, a new package -linkplot- has been 
added to SSC. This requires Stata 8. 

To install, type 

. ssc inst linkplot 

in an up-to-date net-aware version of Stata 8. 

-linkplot- draws linked (i.e. connected) scatter plots. 

How does this differ from what is already available through 
-connect()- options? In principle it does not; in practice 
it saves some fiddly work. 

Let's dive straight into an example. Box, Hunter and Hunter 
(1978, p.100) gave data for 10 boys on the wear of shoes
made using materials A and B. The data are also analysed by 
Wild and Seber (2000, p.446). The units are not specified. 

One natural data structure would be something like this:

         A          B         id
      13.2       14.0          1
       8.2        8.8          2
      10.9       11.2          3
      14.3       14.2          4
      10.7       11.8          5
       6.6        6.4          6
       9.5        9.8          7
      10.8       11.3          8
       8.8        9.3          9
      13.3       13.6         10
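
To follow along, the listing above could be typed in along 
these lines. This is just a sketch using -input-; the variable 
names A, B and id are those of the listing. 

```stata
* enter the 10 pairs exactly as listed above
clear
input A B id
13.2 14.0  1
 8.2  8.8  2
10.9 11.2  3
14.3 14.2  4
10.7 11.8  5
 6.6  6.4  6
 9.5  9.8  7
10.8 11.3  8
 8.8  9.3  9
13.3 13.6 10
end
list
```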

Broadly speaking, variations within boys (same boy, 
different shoes) are smaller than variations between 
boys, but of more interest. (The design assigns materials
randomly to left and right feet, to avoid "left shoeness"
or "right shoeness", etc.) Graphically, therefore, we 
need ways of showing the data that let us appreciate 
the fine structure. Some possibilities are provided 
by -pairplot- on SSC, and -linkplot- provides others. 

This data structure permits some Stata graphs, 
but inhibits others.  A scatter plot such as 

. scatter A B 

may be useful, but does not allow easy decoding of the
difference, say A - B, which is here, and elsewhere with 
paired data, likely to be of central interest. 

Similarly, it is difficult to read off ratios such as 
A / B. If A and B are plotted versus id, or vice versa, 
the resulting graphs suffer from the arbitrariness of id. 
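
One workaround with this structure is to compute the difference 
and ratio directly and plot those. What follows is only a 
sketch: the names diff, avg and ratio are my own for 
illustration, and plotting difference against average is just 
one of several possible choices. 

```stata
* assumes the wide structure above: variables A, B and id
generate diff  = A - B
generate avg   = (A + B) / 2
generate ratio = A / B

* difference versus average, with a reference line at zero
scatter diff avg, yline(0) yla(, ang(h))
```

Points below the reference line are boys for whom material B 
showed the higher wear value. 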

Other possibilities are available after a -reshape-:

. rename A wearA
. rename B wearB
. reshape long wear, string i(id) j(j)
. encode j, gen(material)

            id  material       wear
  1.         1         A       13.2
  2.         1         B         14
  3.         2         A        8.2
  4.         2         B        8.8
  5.         3         A       10.9
  6.         3         B       11.2
  7.         4         A       14.3
  8.         4         B       14.2
  9.         5         A       10.7
 10.         5         B       11.8
 11.         6         A        6.6
 12.         6         B        6.4
 13.         7         A        9.5
 14.         7         B        9.8
 15.         8         A       10.8
 16.         8         B       11.3
 17.         9         A        8.8
 18.         9         B        9.3
 19.        10         A       13.3
 20.        10         B       13.6

Now we can plot -wear- and -material- on 
different axes. (-material- was produced 
by -encode-, so is numeric underneath 
its value labels.) 

But with this data structure, any 
connections will typically not be all 
vertical or all horizontal. As it happens, 
you can use -connect()- for virtually any kind of 
connection, so long as 

	the data have been put 
	in the right sort order, and (for 
	some problems) missing values have 
	been inserted wherever you do not 
	want points to be connected, 

but that's a fairly awkward "so long as", which is
why -linkplot- codifies the nitty-gritty. 
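
As an illustration of the sort-order route by hand (a sketch 
only, and not the code inside -linkplot-): with the long 
structure above, sorting on id and then material means that 
material rises within each boy and falls back at the start of 
the next. -connect(L)- joins successive points only while the 
x variable is increasing, so each boy's pair is linked and 
nothing else is. 

```stata
* assumes the long structure above: id, material (1 = A, 2 = B), wear
sort id material

* connect(L): draw a line to the next point only if x increases
twoway connected wear material, connect(L) ///
    xla(1 2, valuelabel) xsc(r(0.5 2.5)) yla(, ang(h))
```

The /// continuation works in a do-file; interactively, type 
the command on one line. 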

Some possibilities are 

. linkplot material wear, link(id) yla(1 2, valuelabel) 
	ysc(r(0.5 2.5)) yla(, ang(h))
. linkplot wear material, link(id) xla(1 2, valuelabel) 
	xsc(r(0.5 2.5)) yla(, ang(h))

The general idea is that you need to specify a -link()- variable
defining groups to be linked. Usually this will be 
some sort of identifier variable, so the idea has 
panel data applications. 

Some of the tricks for getting data in the right sort 
order are discussed in rather dusty old FAQs at 

http://www.stata.com/support/faqs/graphics/connect.html

http://www.stata.com/support/faqs/graphics/vplplot.html

although Stata 8 adds a nicer way to do it all, 
through -cmissing()-, which is in fact the main 
trick within -linkplot-. 
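
A rough sketch of the -cmissing()- idea done by hand, which 
may well differ in detail from what -linkplot- does 
internally: add one extra observation with missing -wear- at 
the end of each boy's pair, then ask for lines to be broken 
at missing values. The -preserve- and -restore- are my own 
addition, so that the padded observations do not stay in the 
data. 

```stata
* assumes the long structure above: id, material (1 = A, 2 = B), wear
preserve

* duplicate each boy's B observation, then blank out the copy
expand 2 if material == 2
bysort id material: replace wear = . if _n == 2

* missing sorts last, so each boy's pair is followed by a gap
sort id material wear

* cmissing(n): do not connect across missing values
twoway connected wear material, cmissing(n) ///
    xla(1 2, valuelabel) xsc(r(0.5 2.5)) yla(, ang(h))

restore
```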

More technicalities are covered in the help file. 

Vince Wiggins provided encouraging noises 
as I worked my way towards this. 

Box, G.E.P., W.G. Hunter and J.S. Hunter. 1978. 
Statistics for experimenters: an introduction to design, 
data analysis, and model building. New York: John Wiley.

Wild, C.J. and G.A.F. Seber. 2000. 
Chance encounters: a first course in data analysis 
and inference. New York: John Wiley.

Nick 
n.j.cox@durham.ac.uk 