
From: Sven-Oliver Spieß <mail@svenoliverspiess.net>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: drop 'em OR it depends
Date: Sun, 20 Jul 2008 20:29:23 +0200

Tim,

"It depends" is usually a safe answer. You might want to keep them, for example, if you run several analyses with different variables and have reasons not to restrict them all to an identical sample. Or if you want to impute the missing values, of course. Or simply because the data are easier to handle when every subject has the same number of observations. After all, the missings were in the data all along.

Other than that, I don't own the book you mention and can only assume the same is true for at least some other members of Statalist, too. Also, the URL in your post contains two typos. So unfortunately I can't provide a more specific answer.

Generally speaking, in many cases Stata simply "ignores" missing values in analyses, and therefore they do not affect the results (see below). To better understand your specific problem, it would be helpful if you could provide more details, such as which analysis in particular is performed in section 9.6, and an excerpt of the relevant lines from your log file.

Best,
Sven-Oliver

-------example: summary statistics reshaped vs. original online data---------

. use "C:\downloads\fevwide.dta", clear
(Repeated measurements of FEV for three groups, coded wide)

. reshape long fev, i(id)
(note: j = 0 3 6 9 12 15 18 21 24 27 30 33 36 39 42 45 48)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                       57   ->     969
Number of variables                  19   ->       4
j variable (17 values)                    ->   _j
xij variables:
                    fev0 fev3 ... fev48   ->   fev
-----------------------------------------------------------------------------

. rename _j month

. d, s

Contains data
  obs:           969    Repeated measurements of FEV for three groups, coded wide
 vars:             4
 size:        20,349 (99.9% of memory free)
Sorted by:  id  month
     Note:  dataset has changed since last saved

. sum fev

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         fev |       663    42.59765    18.51655      10.12     110.81

. bysort grp: sum fev

--------------------------------------------------------------------------------
-> grp = 1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         fev |       459    47.64026    18.58769      14.28     110.81

--------------------------------------------------------------------------------
-> grp = 2

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         fev |       146    28.86562    9.250675      10.12      65.02

--------------------------------------------------------------------------------
-> grp = 3

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         fev |        58    37.25828    16.47463      16.59       81.8

. use "C:\downloads\fevlong.dta", clear
(Repeated measurements of FEV for three groups, coded long)

. d, s

Contains data from C:\downloads\fevlong.dta
  obs:           663    Repeated measurements of FEV for three groups, coded long
 vars:             4    20 Apr 2002 21:43
 size:        14,586 (99.9% of memory free)
Sorted by:  id  month

*** #obs in long data set = #non-missing in reshaped wide data set!

. sum fev

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         fev |       663    42.59765    18.51655      10.12     110.81

. bysort grp: sum fev

--------------------------------------------------------------------------------
-> grp = 1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         fev |       459    47.64026    18.58769      14.28     110.81

--------------------------------------------------------------------------------
-> grp = 2

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         fev |       146    28.86562    9.250675      10.12      65.02

--------------------------------------------------------------------------------
-> grp = 3

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         fev |        58    37.25828    16.47463      16.59       81.8

*** ==> statistics are identical whether or not the missings are dropped!

-------end example---------

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Tim
> Sent: Sunday, 20 July 2008 06:57
> To: statalist@hsphsun2.harvard.edu
> Subject: st: drop 'em OR it depends
>
> New semester starts in about a week.
> One thing I had difficulty with last semester was getting the data
> provided into the form needed for the analysis. I could get reshape to
> work, but had to look it up every time, and it still took several
> attempts every time.
> So I've been looking again at Hills and De Stavola, "A short
> introduction to Stata for biostatistics", chapter 9. (files at net from
> http://ww.stata.com.data/hs/; net get book)
> In section 9.5 they cover reshape.
> In section 9.6 they cover _N and _n.
> The examples in section 9.6 use the fevlong dataset. When I tried using
> fevwide reshaped to long, I did not get the results in the book. Only
> after dropping missing observations did it work.
>
> So my question is, should dropping missing obs be normal practice after
> reshaping from wide to long, or does it depend on what I want to do
> with the long dataset?
> And if I dont' drop 'em always, when do I keep them?
>
> Tim
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
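
One possible reason the section 9.6 results differ, sketched here without access to the book: _N counts rows whether or not fev is missing, so any per-subject count built on _N changes once the missing rows are dropped, even though summarize is unaffected. The sketch assumes the same file path as in the log above. The variable names nobs and nvalid are illustrative only, and j(month) stands in for the rename step shown in the log.

```stata
* Sketch only: path taken from the log above; nobs and nvalid are made-up names
use "C:\downloads\fevwide.dta", clear
reshape long fev, i(id) j(month)

* _N counts every row for the subject, including rows where fev is missing
bysort id (month): generate nobs = _N

* count() counts only the non-missing values of fev for the subject
egen nvalid = count(fev), by(id)

* summarize skips missing values, so it is the same before and after the drop
summarize fev

* Dropping the missing rows gives the same number of rows as fevlong.dta
drop if missing(fev)
bysort id (month): replace nobs = _N

* After the drop, the row count per subject equals the non-missing count
assert nobs == nvalid
```

Before the drop, nobs is 17 for every subject (969 rows / 57 subjects), while nvalid varies by subject. If the book's examples rely on _N or _n within id, that difference alone would explain why they only match after dropping.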

References: st: drop 'em OR it depends (From: Tim <lists@timbp.com>)
