
st: RE: -twoway scatter- different colors for different observations

From   "Kieran McCaul"
Subject   st: RE: -twoway scatter- different colors for different observations
Date   Mon, 27 Apr 2009 06:58:42 +0800


How about:

local f0 = "red"
local f1 = "green"
twoway (scatter mpg rep78 [fweight=N] if foreign==0, msymbol(Oh) mcolor(`f0')) ///
       (scatter mpg rep78 [fweight=N] if foreign==1, msymbol(Oh) mcolor(`f1')) ///
       , legend(off)
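
Another possibility, offered only as a sketch: since the complaint is that the symbols are re-scaled for each -scatter- statement, one could split mpg into one variable per group with -separate- and plot both in a single -scatter- call. The variable names mpg0 and mpg1 are what -separate- should generate from the 0/1 values of foreign. I am assuming, without having confirmed it, that a single -scatter- call puts both groups on a common symbol scale, so compare the result against junk1.pdf before relying on it.

```stata
* Sketch under the assumptions stated above; rebuilds the collapsed data
* from the original post so that it runs on its own.
clear all
sysuse auto
drop if rep78==.
collapse (mean) mpg (count) N=price, by(foreign rep78)

* Create mpg0 (foreign==0) and mpg1 (foreign==1).
separate mpg, by(foreign)

* One -scatter- call with two y-variables; styles are listed per variable.
twoway scatter mpg0 mpg1 rep78 [fweight=N], ///
       msymbol(Oh Oh) mcolor(red green) legend(off)
```

The color list stays short here because it has one entry per group rather than one per observation.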

Kieran McCaul MPH PhD
WA Centre for Health & Ageing (M573)
University of Western Australia
Level 6, Ainslie House
48 Murray St
Perth 6000
Phone: (08) 9224-2701
Fax: (08) 9224 8009
Epidemiology is so beautiful and provides such an important perspective
on human life and death, 
but an incredible amount of rubbish is published.  Richard Peto (2007) 

-----Original Message-----
On Behalf Of Jacob Wegelin
Sent: Monday, 27 April 2009 6:30 AM
Subject: st: -twoway scatter- different colors for different observations

The following syntax specifies a plot marker or -mlabel- that can take
on a different value at each plotting point (at each observation),
according to the value of the variable specified in the -mlabel()- option:

clear all
sysuse auto
scatter price mpg, mlabel(rep78) m(i) mlabposition(3)

But I would like to define a variable that specifies that certain
observations be plotted green and others red, or alternatively that
certain observations be plotted with color -none- and others with a
visible color. How does one do this?

To motivate this (with an artificial example constructed to mimic a
real example): The following code plots average mpg by rep78, averaged
separately for foreign and domestic autos. Crucially, the area of the
plotting symbol is proportional to the sample size. I would like to
distinguish visually between domestic and foreign autos, though. Two
*separate* -scatter- statements (the second -twoway- command below)
don't give the desired result, because the plotting symbols are
re-scaled for each -scatter- statement. You can see this by switching
rapidly between the two exported graphs, junk1.pdf and junk2.pdf. And
an attempt to define a string variable -mycolor- which takes on values
"red" and "green", and then to specify -mcolor(mycolor)- analogously
to the -mlabel(rep78)- statement above, returns an error.

clear all
set more on
sysuse auto
drop if rep78==.
sort foreign rep78
collapse (mean) mpg (count) N=price, by(foreign rep78)
* A larger value for N will make the problem easier to see.
replace N=50 in 6
set scheme lean1
twoway (scatter mpg rep78 [fweight=N], msymbol(Oh))
graph export junk1.pdf, replace
twoway ///
   (scatter mpg rep78 [fweight=N] if foreign==0, msymbol(Oh) mcolor(red)) ///
   (scatter mpg rep78 [fweight=N] if foreign==1, msymbol(Oh) mcolor(green)) ///
   , legend(off)
graph export junk2.pdf, replace
/* The following returns an error */
gen mycolor=""
replace mycolor="red" if foreign==0
replace mycolor="green" if foreign==1
tabulate foreign mycolor
twoway (scatter mpg rep78 [fweight=N], msymbol(Oh) mcolor(mycolor))

The [G] GRAPHICS manual under -marker_options- says that one could
define color by specifying a list of elements, as

-mcolor( red green red)-

but this would be clumsy and error-prone. There must be a way to use
the values of a variable, as in the -mlabel(rep78)- example?

Thanks for any insights

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032