
From: khigbee@stata.com
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Profile Plots of Cluster Solution - How to?
Date: Sat, 23 Nov 2002 09:59:54 -0600

Tim Victor <tvictor@dolphin.upenn.edu> asks:

> This should be easy but I've been working at it for three days now and
> can't find the solution. What I am trying to do is simply plot the
> profiles for 5 cluster solution. All I want to do is plot each cluster
> mean (and error bar) for each attribute in the same graph. Oddly, doing
> this in SAS is only a few lines after transposing the data:
>
> symbol i=std1mjt;
> proc gplot data=plotme;
> plot value * attribute = cluster / haxis=axis1 vaxis=axis2 frame;
> run;
>
> Any suggestions? Thanks.

As Nick Cox might say -- what is SAS?

There is not currently (to my knowledge) a single command or two in Stata
that will produce what you want. However, it can be done. Let me outline
the steps. These steps could be combined into an ado program if you were
doing this kind of thing a lot.

Step 1 -- obtain the needed data (the means and std deviations or std
          errors) in a layout that can be used in Step 2.

Step 2 -- use -serrbar- (or for more fine control use -graph- and -gph-)
          to produce the graph.

I will illustrate with the auto data and I will be plotting means and
error bars that are +/- 1.96 * std. deviation. If you want std. errors,
then alter the code below. Step 0 is to obtain a five group cluster solution.

Step 0:

    use auto, clear
    keep head trunk turn disp
    replace disp = disp/20
    cluster completelink head trunk turn disp, name(mycl)
    cluster gen my5 = group(5)

The variable my5 indicates the five groups. We can view the data we will
want to obtain (the means and std. dev. of the four variables by the five
groups) with:

    bysort my5 : summarize head trunk turn disp

Step 1:

There are probably better ways, but here is one way that I thought of to
produce the desired dataset to be used in graphing.

    preserve
    foreach var in head trunk turn disp {
        statsby "summarize `var'" mean = (r(mean)) sd = (r(sd)) /*
            */ , by(my5) clear
        gen str2 name = substr("`var'",1,2)
        save mytmp`var' , replace
        restore, preserve
    }
    use mytmphead , clear
    foreach var in trunk turn disp {
        append using mytmp`var'
    }
    sort name my5
    egen namecl = group(name my5) , label
    list
    save mynew , replace

-statsby- gives us what we want for a single variable. We need the
results for each of the variables in the cluster analysis, so we loop over
the variables and create little datasets that we later -append- together.

Step 2:

I will present four alternatives.

Alternative 1

    serrbar mean sd namecl , scale(1.96) xlab(1/20) ylab

Alternative 2

    sort my5 name
    serrbar mean sd namecl , scale(1.96) xlab(1/20) ylab c(LII)

Alternative 3

    encode name, gen(name2)
    sort my5 name2
    serrbar mean sd name2, scale(1.96) xlab ylab c(LII)

Alternative 4

    gen name3 = name2 + my5/10
    serrbar mean sd name3, scale(1.96) xlab ylab c(LII)

Step 3:

    restore

After producing the graph we -restore- back to the original data.

I prefer Alternative 4, but more labeling etc. would be nice. To get
better control of this, you might need to use -graph- and -gph-.

Ken Higbee    khigbee@stata.com
StataCorp     1-800-STATAPC

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
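
For readers on a later Stata release (8 or newer), Steps 1 to 3 above could probably be condensed with -collapse-, -reshape-, and -twoway-. What follows is an untested sketch under that assumption, not part of the original post. It assumes the Step 0 data with my5 are in memory; the variables lo and hi and the mean_/sd_ prefixes are names introduced here for illustration, and name holds the full variable name rather than its first two letters.

```stata
* Sketch only: assumes Stata 8+ and that Step 0 has been run.
preserve

* One row per cluster, holding the mean and sd of each variable
collapse (mean) mean_head=head mean_trunk=trunk mean_turn=turn mean_disp=disp ///
         (sd)   sd_head=head   sd_trunk=trunk   sd_turn=turn   sd_disp=disp,  ///
         by(my5)

* Go long: one row per cluster-by-variable combination
reshape long mean_ sd_, i(my5) j(name) string
rename mean_ mean
rename sd_ sd

* Same x-axis layout as Alternative 4: variable index plus a cluster offset
encode name, gen(name2)
gen name3 = name2 + my5/10

* Error bars of +/- 1.96 * std. deviation, as in the post
gen lo = mean - 1.96*sd
gen hi = mean + 1.96*sd

* connect(L) joins points only while x increases, so each cluster
* gets its own profile line when the data are sorted this way
sort my5 name2
twoway (rcap lo hi name3) (connected mean name3, connect(L)), ///
    xlabel(1 "disp" 2 "head" 3 "trunk" 4 "turn") legend(off)

restore
```

A cluster with a single observation has a missing sd, so it would appear as a point without an error bar. To plot std. errors instead, -collapse- also offers a (semean) statistic that could replace (sd).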

