help marker_label_options
-------------------------------------------------------------------------------
Title
[G] marker_label_options -- Options for specifying marker labels
Syntax
marker_label_options           Description
---------------------------------------------------------------------
mlabel(varname)                specify marker label variable
mlabstyle(markerlabelstyle)    overall style of label
mlabposition(clockposstyle)    where to locate the label
mlabvposition(varname)         where to locate the label, by observation
mlabgap(relativesize)          gap between marker and label
mlabangle(anglestyle)          angle of label
mlabtextstyle(textstyle)       overall style of text
mlabsize(textsizestyle)        size of label
mlabcolor(colorstyle)          color of label
---------------------------------------------------------------------
All options are rightmost; see repeated options.
Sometimes -- such as when used with scatter -- lists are allowed inside
the arguments. A list is a sequence of the elements separated by
spaces. Shorthands are allowed to make specifying the list easier;
see [G] stylelists. When lists are allowed, option mlabel() allows a
varlist in place of a varname.
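As a minimal sketch of the list form, assuming the auto data shipped with Stata and two y variables picked purely for illustration (mpg and trunk are our choice, not part of the text above), the lists below should give the first plot its labels at 3 o'clock and the second at 9 o'clock:

```stata
sysuse auto, clear
* Two y variables produce two plots; each list element applies to one plot.
* mlabel() takes a varlist here, and mlabposition() takes a list of positions.
scatter mpg trunk weight in 1/10, mlabel(make make) mlabposition(3 9)
```

The two y variables are on different scales, so this graph is useful only for seeing how the lists are matched to the plots.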
Description
Marker labels are labels that appear next to (or in place of) markers.
Markers are the ink used to mark where points are on a plot.
Options
mlabel(varname) specifies the variable (usually a string variable) that
provides, observation by observation, the marker label text. For
instance, you might have
. sysuse auto
. list mpg weight make in 1/4
     +------------------------------+
     | mpg   weight   make          |
     |------------------------------|
  1. |  22    2,930   AMC Concord   |
  2. |  17    3,350   AMC Pacer     |
  3. |  22    2,640   AMC Spirit    |
  4. |  20    3,250   Buick Century |
     +------------------------------+
Typing
. scatter mpg weight, mlabel(make)
would draw a scatter of mpg versus weight and label each point in the
scatter according to its make. (We recommend that you include "in
1/10" on the above command. Marker labels work well only when there
are few data.)
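The recommendation above can be sketched as follows, assuming the auto data shipped with Stata; the only change from the command shown is the "in 1/10" qualifier:

```stata
sysuse auto, clear
* Restrict the plot to the first 10 observations so the labels stay legible
scatter mpg weight in 1/10, mlabel(make)
```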
mlabstyle(markerlabelstyle) specifies the overall look of marker labels,
including their position, their size, their text style, etc. The
other options documented below allow you to change each attribute of
the marker label, but mlabstyle() is the starting point.
You need not specify mlabstyle() just because there is something you
want to change about the look of a marker label; in fact, most people
seldom specify the mlabstyle() option. You specify mlabstyle() when
another style exists that is exactly what you desire or when another
style would allow you to specify fewer changes to obtain what you
want.
mlabposition(clockposstyle) and mlabvposition(varname) specify where the
label is to be located relative to the point. mlabposition() and
mlabvposition() are alternatives; the first specifies a constant
position for all points and the second specifies a variable that
contains clockposstyle (a number 0--12) for each point. If both
options are specified, mlabvposition() takes precedence.
If neither option is specified, the default is mlabposition(3) (3
o'clock) -- meaning to the right of the point.
mlabposition(12) means above the point, mlabposition(1) means above
and to the right of the point, and so on. mlabposition(0) means that
the label is to be put directly on top of the point (in which case
remember to also specify the msymbol(i) option so that the marker
does not also display; see [G] marker_options).
mlabvposition(varname) specifies a numeric variable containing values
0--12, which are used, observation by observation, to locate the
labels relative to the points.
See [G] clockposstyle for more information on specifying
clockposstyle.
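A short sketch of two constant clock positions, assuming the auto data shipped with Stata and the "in 1/10" restriction suggested earlier; the choice of positions is ours, for illustration only:

```stata
sysuse auto, clear
* Labels above each point (12 o'clock)
scatter mpg weight in 1/10, mlabel(make) mlabposition(12)
* Labels centered on the points, with the marker itself hidden
scatter mpg weight in 1/10, mlabel(make) mlabposition(0) msymbol(i)
```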
mlabgap(relativesize) specifies how much space should be put between the
marker and the label. See [G] relativesize.
mlabangle(anglestyle) specifies the angle of text. The default is
usually mlabangle(horizontal). See [G] anglestyle.
mlabtextstyle(textstyle) specifies the overall look of text of the marker
labels, which here means their size and color. When you see [G]
textstyle, you will find that a textstyle defines much more, but all
those other things are ignored for marker labels. In any case, the
mlabsize() and mlabcolor() options documented below allow you to
change the size and color, but mlabtextstyle() is the starting
point.
As with mlabstyle(), you need not specify mlabtextstyle() just
because there is something you want to change. You specify
mlabtextstyle() when another style exists that is exactly what you
desire or when another style would allow you to specify fewer changes
to obtain what you want.
mlabsize(textsizestyle) specifies the size of the text. See [G]
textsizestyle.
mlabcolor(colorstyle) specifies the color of the text. See [G]
colorstyle.
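The individual attribute options can be combined in one command. The following is a sketch only, assuming the auto data shipped with Stata; the particular gap, angle, size, and color values are arbitrary choices for illustration:

```stata
sysuse auto, clear
* Widen the gap, tilt the text 45 degrees, and use small navy labels
scatter mpg weight in 1/10, mlabel(make) mlabgap(2) mlabangle(45) mlabsize(small) mlabcolor(navy)
```

Each option overrides one attribute of the default marker label style, so any of them can be dropped without affecting the others.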
Remarks
Remarks are presented under the following headings:
Typical use
Eliminating overprinting and overruns
Advanced use
Using marker labels in place of markers
Typical use
Markers are the ink used to mark where points are on a plot, and marker
labels optionally appear beside the markers to identify the points. For
instance, if you were plotting country data, marker labels would allow
you to have "Argentina", "Bolivia", ..., appear next to each point.
Marker labels visually work well when there are few data.
To obtain marker labels, you specify the mlabel(varname) option, such as
mlabel(country). varname is the name of a variable that, observation by
observation, specifies the text with which the point is to be labeled.
varname may be a string or numeric variable, but usually it is a string.
For instance, consider the North and Central American subset of the
life-expectancy-by-country data:
. sysuse lifeexp
. list country lexp gnppc if region==2
     +------------------------------------+
     |             country   lexp   gnppc |
     |------------------------------------|
 45. |              Canada     79   19170 |
 46. |                Cuba     76       . |
 47. |  Dominican Republic     71    1770 |
 48. |         El Salvador     69    1850 |
 49. |           Guatemala     64    1640 |
     |------------------------------------|
 50. |               Haiti     54     410 |
 51. |            Honduras     69     740 |
 52. |             Jamaica     75    1740 |
 53. |              Mexico     72    3840 |
 54. |           Nicaragua     68    1896 |
     |------------------------------------|
 55. |              Panama     74    2990 |
 56. |         Puerto Rico     76       . |
 57. | Trinidad and Tobago     73    4520 |
 58. |       United States     77   29240 |
     +------------------------------------+
We might graph these data and use labels to indicate the country by
typing
. scatter lexp gnppc if region==2, mlabel(country)
Eliminating overprinting and overruns
In the graph, the label "United States" runs off the right edge and the
labels for Honduras and El Salvador are overprinted. Problems like that
invariably occur when using marker labels. The mlabposition() option
lets you specify where the labels appear, and we might try
. scatter lexp gnppc if region==2, mlabel(country) mlabpos(9)
to move the labels to the 9 o'clock position, meaning to the left of the
point. Here, however, that will introduce more problems than it will
solve. You could try other clock positions around the point, but we
could not find one that was satisfactory.
If our only problem were with "United States" running off the right, an
adequate solution might be to widen the x axis so that there would be
room for the label "United States" to fit:
. scatter lexp gnppc if region==2, mlabel(country)
          xscale(range(35000))
That would solve one problem but will leave us with the overprinting
problem. The way to solve that problem is to move the Honduras label to
the left of its point, and the way to do that is to specify the
option mlabvposition(varname) rather than mlabposition(clockposstyle).
We will create a new variable pos stating where we want each label:
. generate pos = 3
. replace pos = 9 if country=="Honduras"
. scatter lexp gnppc if region==2, mlabel(country) mlabv(pos)
          xscale(range(35000))
We are near a solution: the Honduras label now runs off the left edge of
the graph, but we know how to fix that. You may be tempted to solve this
problem just as we solved the problem with the United States label:
expand the range, say, to range(-500 35000). That would be a fine
solution.
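For reference, the range-based alternative just described might look like the sketch below. It rebuilds the pos variable from scratch so that it can be run on its own, and it spells out mlabvposition() in full; the range values are the ones suggested in the text:

```stata
sysuse lifeexp, clear
* 3 o'clock for every label except Honduras, which goes to 9 o'clock
generate pos = 3
replace pos = 9 if country == "Honduras"
* Extend the x axis in both directions to leave room for the labels
scatter lexp gnppc if region==2, mlabel(country) mlabvposition(pos) xscale(range(-500 35000))
```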
Here, however, we will increase the margin between the left edge of the
plot area and the y axis by adding the option plotregion(margin(l+9));
see [G] region_options. plotregion(margin(l+9)) says to increase the
margin on the left by 9%, and this is really the "right" way to handle
margin problems:
. scatter lexp gnppc if region==2, mlabel(country) mlabv(pos)
          xscale(range(35000))
          plotregion(margin(l+9))
The overall result is adequate. Were we producing this graph for
publication, we would move the label for United States to the left of its
point, just as we did with Honduras, rather than widening the x axis.
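A sketch of that publication variant, under the assumption that moving both labels to 9 o'clock and keeping the wider left margin is enough; whether any other labels then collide has not been checked here:

```stata
sysuse lifeexp, clear
generate pos = 3
* Put both problem labels to the left of their points
replace pos = 9 if inlist(country, "Honduras", "United States")
* No xscale(range()) is needed once United States no longer runs off the right
scatter lexp gnppc if region==2, mlabel(country) mlabvposition(pos) plotregion(margin(l+9))
```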
Advanced use
Let us now consider properly graphing the life-expectancy data and
graphing more of it. This time, we will include South America as well
as North and Central America, and we will graph the data on a log(GNP)
scale.
. sysuse lifeexp, clear
. keep if region==2 | region==3 (note 1)
. replace gnppc = gnppc / 1000
. label var gnppc "GNP per capita (thousands of dollars)" (note 2)
. generate lgnp = log(gnppc)
. qui reg lexp lgnp
. predict hat
. label var hat "Linear prediction" (note 3)
. replace country = "Trinidad" if country=="Trinidad and Tobago"
. replace country = "Para" if country == "Paraguay" (note 4)
. generate pos = 3
. replace pos = 9 if lexp > hat (note 5)
. replace pos = 3 if country == "Colombia"
. replace pos = 3 if country == "Para"
. replace pos = 3 if country == "Trinidad"
. replace pos = 9 if country == "United States" (note 6)
. twoway (scatter lexp gnppc, mlabel(country) mlabv(pos))
         (line hat gnppc, sort)
         , xscale(log) xlabel(.5 5 10 15 20 25 30, grid)
           legend(off)
           title("Life expectancy vs. GNP per capita")
           subtitle("North, Central, and South America")
           note("Data source: World Bank, 1998")
           ytitle("Life expectancy at birth (years)")
Notes:
1. In these data, region 2 is North and Central America, and region
3 is South America.
2. We divide gnppc by 1,000 to keep the x axis labels from running
into each other.
3. We add a linear regression prediction. We cannot use graph
twoway lfit because we want the predictions to be based on a
regression of log(GNP), not GNP.
4. The first time we graphed the results, we discovered that there
was no way we could make the names of these two countries fit on
our graph, so we shortened them.
5. We are going to place the marker labels to the left of the marker
when life expectancy is above the regression line and to the
right of the marker otherwise.
6. To keep labels from overprinting, we need to override rule (5)
for a few countries.
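Rule (5) can also be written in one step. The sketch below is an alternative formulation, not the commands used above: it uses cond() to assign the position and then tabulates the result so that you can see how many labels fall on each side before applying the overrides in note 6.

```stata
sysuse lifeexp, clear
keep if region==2 | region==3
replace gnppc = gnppc / 1000
generate lgnp = log(gnppc)
quietly regress lexp lgnp
predict hat
* 9 o'clock when life expectancy is above the fitted line, 3 o'clock otherwise
generate pos = cond(lexp > hat, 9, 3)
* Countries with missing gnppc have a missing prediction and stay at 3 o'clock
tabulate pos
```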
Also see [G] scale_option for another rendition of this graph. In that
rendition, we specify one more option -- scale(1.1) -- to increase the
size of the text and markers by 10%.
Using marker labels in place of markers
In addition to specifying where the marker label goes relative to the
marker, you can specify that the marker label be used instead of the
marker. mlabposition(0) means that the label is to be centered where the
marker would appear. To suppress the display of the marker as well,
specify option msymbol(i); see [G] marker_options.
Using the labels in place of the points tends to work well in analysis
graphs where our interest is often in identifying the outliers. Below we
graph the entire lifeexp.dta data:
. scatter lexp gnppc, xscale(log) mlab(country) m(i)
In the above graph, we also specified xscale(log) to convert the x axis
to a log scale. A log x scale is more appropriate for these data, but
had we used it earlier, the overprinting problem with Honduras and El
Salvador would have disappeared, and we wanted to show how to handle the
problem.
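If you want the labels centered exactly where the markers would have been, rather than at the default 3 o'clock position, a variant of the command with mlabposition(0) added might look like this sketch (option names are spelled out in full):

```stata
sysuse lifeexp, clear
* Hide the markers and center each country name on its point
scatter lexp gnppc, xscale(log) mlabel(country) mlabposition(0) msymbol(i)
```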
Also see
Manual: [G] marker_label_options
Help: [G] graph twoway scatter; [G] markerlabelstyle, [G]
clockposstyle, [G] relativesize, [G] anglestyle, [G] textstyle,
[G] textsizestyle, [G] colorstyle