Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <njcoxstata@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: identifying highest number of consecutive variables where answer is consistent across observation
Date: Thu, 20 Feb 2014 18:34:26 +0000

Joe Canner has developed a good strategy for looking at this. Here is another.

Suppose we -reshape long-, something like

    gen id = _n
    reshape long var, i(id) j(question)
    tsset id question

Then we can treat the blocks of observations as panel data. With

    ssc inst tsspell
    tsspell var

With this syntax for -tsspell- a "spell" is automatically a sequence of identical values. The existence of spells 15 or longer will be summarized by

    egen fifteen_or_more = total((_seq >= 15) / _end), by(id)

where division by the indicator variable -_end- (1 on end of spell, 0 otherwise) ensures that we look only at the ends of spells.

If needed, we can then -reshape- back. On the other hand, it is quite likely that some questions of similar kind are more easily answered with this data structure.

Nick
njcoxstata@gmail.com

On 20 February 2014 17:04, Alison El Ayadi <aelayadi@gmail.com> wrote:
> I am doing some data cleaning on survey data and am looking to
> identify observations where there are 15 or more of the same answers
> in a row (across the variables in current order). All of the
> variables are string. Does anyone have an easy automated way to do
> this? I'm thinking that it could be done by generating a variable
> that provided the maximum number of same responses in a row, but have
> no idea how to code this. Variables are q1 - q94, and all string.
>
> Any suggestions on efficiently writing this code would be greatly appreciated.
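For readers who want to try the reshape idea from the reply on a small example, here is a minimal sketch. It is not the -tsspell- code from the reply: it swaps in a hand-rolled run counter using -by:- and subscripting, so nothing needs to be installed. The toy data (two respondents, six string questions), the names -run-, -maxrun- and -flag-, and the threshold of 4 are all assumptions made for illustration. With the real data the stub would still be q (for q1 - q94) and the threshold 15.

```stata
* Hypothetical toy data: 2 respondents, 6 string questions (q1-q6)
clear
input str1 q1 str1 q2 str1 q3 str1 q4 str1 q5 str1 q6
"a" "a" "a" "b" "b" "a"
"c" "d" "d" "d" "d" "e"
end

* One observation per respondent-question, as in the reply
gen long id = _n
reshape long q, i(id) j(question)

* run = length so far of the current run of identical answers.
* -replace- works through observations in order, so run[_n-1] is
* already updated when it is used.
* Note: empty strings compare equal, so a run of blank answers
* counts as a run here.
bysort id (question): gen int run = 1
by id: replace run = run[_n-1] + 1 if _n > 1 & q == q[_n-1]

* Longest run per respondent, and a flag (assumed threshold 4; 15 in the thread)
egen int maxrun = max(run), by(id)
gen byte flag = maxrun >= 4

* Back to one observation per respondent
drop run
reshape wide q, i(id) j(question)
list id maxrun flag
```

Because -maxrun- holds the longest run itself, the same variable answers the question for any other threshold without recomputing.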

**Follow-Ups**: **Re: st: identifying highest number of consecutive variables where answer is consistent across observation** *From:* Nick Cox <njcoxstata@gmail.com>

**References**: **st: identifying highest number of consecutive variables where answer is consistent across observation** *From:* Alison El Ayadi <aelayadi@gmail.com>
