



Re: st: extract values from kdensity graphic


From   "Seed, Paul" <paul.seed@kcl.ac.uk>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: extract values from kdensity graphic
Date   Thu, 3 May 2012 17:24:21 +0100

Dear Statalist, 

As Nick points out, this is becoming quite a complex problem.
I actually would not use -kdensity-, as it does 
not capture the essential features of Mike's original data set.

A simpler approach is to look at the differences between successive values, 
and declare a new group whenever the gap is large (for a suitable value
of "large").  This can be quite easily done in version 8.


***** Begin example **********

* Enter Mike's data set
set more off
clear
input sampling_event size
1 94.74
2 94.89
3 94.95
4 94.97
5 95
6 95.05
7 95.08
8 96.11
9 96.22
10 96.24
11 96.27
12 96.27
13 96.27
14 96.32
15 96.34
16 97.19
17 97.26
18 97.26
19 97.32
20 97.34
21 97.39
22 98.41
23 100.62
24 100.69
25 100.69
26 100.76
27 100.76
28 100.76
29 100.84
30 100.91
end
list
twoway (scatter size sampling_event)

* Identify groups
sort size
* step is missing for the first (smallest) observation
gen step = size - size[_n-1]

* Use -stem- to quickly assess the step sizes
stem step
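* In the example, steps are all <= 0.15 or >= 0.85
* I declare a new group for any step > 0.5
* I could change this depending on the data set

* The first step is missing; Stata treats missing as larger than any
* number, so the first observation starts group 1
gen group = step > 0.5
replace group = sum(group)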

* Check groups are well defined
bys group : su size

* Graph the various groups in different colours
graph twoway (connected size sampling_event if group == 1) ///
	(connected size sampling_event if group == 2) ///
	(connected size sampling_event if group == 3) ///
	(connected size sampling_event if group == 4) ///
	(connected size sampling_event if group == 5) 
* That looks good	
	
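* Now try out -kdensity-; pick up the plotted values in x and d
* Note: x holds 30 evenly spaced grid points, not the observed sizes,
* so the group labels line up with x only approximately
kdensity size , w(0.1) n(30) gen(x d)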

graph twoway (connected d x if group == 1) ///
	(connected d x if group == 2) ///
	(connected d x if group == 3) ///
	(connected d x if group == 4) ///
	(connected d x if group == 5) 
* kdensity just does not seem to capture the groups I see in the simple scatter plot.
 

********** End example **************
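
The graph commands in the example hard-code five groups. If a different data set or threshold gave a different number of groups, the overlay could be built in a loop instead. The lines below are only a sketch of that idea, not part of the original example: they assume the variable -group- has already been created as above, and they rely on -levelsof-, which I believe postdates version 8. The macro names levels and plots are illustrative.

```stata
* Sketch only: assumes -group- exists as created in the example above
* and that -levelsof- is available in the Stata version in use
levelsof group, local(levels)

* Build one (connected ...) plot per group value
local plots
foreach g of local levels {
    local plots `plots' (connected size sampling_event if group == `g')
}

* Overlay all groups, however many were found
graph twoway `plots'
```

The same loop could build the -kdensity- overlay by swapping "size sampling_event" for "d x".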

Paul T Seed, Senior Lecturer in Medical Statistics, 

Division of Women's Health, King's College London
Women's Health Academic Centre KHP
020 7188 3642, 
 paul.seed@kcl.ac.uk, 
http://www.kcl.ac.uk/medicine/research/divisions/wh/about/people/seedp.aspx

Please do not send unencrypted un-anonymised data to this address.



