

From: Nick Cox <njcoxstata@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Need help for calculation across observations within variable
Date: Tue, 21 May 2013 14:40:14 +0100

What's efficiency here? If it's machine time, in principle you should not use -egen-. In practice, it would take a big dataset or many repetitions to notice the slow-down, likely to be less than the time taken to write alternative code.

On (1), whether there is a difference: If it's not machine time, but conciseness or simplicity of code, consider

    bysort pt_name (year) : gen different = year[_N] != year[1]

except that a large group of Stata users might not agree on how transparent that is. This particular question is also an FAQ:

http://www.stata.com/support/faqs/data-management/listing-observations-in-group/

On (2), the number of distinct values, there is a detailed discussion in

SJ-8-4  dm0042  . . . . . . . . . . . .  Speaking Stata: Distinct observations
        (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
        Q4/08   SJ 8(4):557--568
        shows how to answer questions about distinct observations
        from first principles; provides a convenience command

Your solution is a good one. Here is another

    egen tag = tag(pt_name year)
    egen max = total(tag), by(pt_name)

Am I being consistent about -egen-? This is how I resolve it:

1. Interactively, I will often use -egen- if an -egen- solution springs to mind.

2. In a program, I know I should rewrite -egen- calls to the extent that a program is needed for serious or repeated use.
Nick
njcoxstata@gmail.com

On 21 May 2013 14:17, Michael Stewart <michaelstewartresearch@gmail.com> wrote:
> HI,
>
> I am looking to see if anyone could an efficient code than what I have
> been using for a particular issues that I am dealing with
>
> My Need
>
> 1)Create a variable which shows if the "year" is same or different by pat_name
> 2)Create a variable which shows number of distinct years ,per patient
>
> My dataset structure is as follows
>
> pt_name   year(string variable)
> 111       2009
> 111       2009
> 111       2009
> 111       2011
> 222       2009
> 222       2009
> 222       2010
>
> My code is two step one
> Step-1: bysort pt_name(year): gen flag=_n==_N
> Step-2:egen max=total(flag),by(pt_name)
>
> Please let me know if there is an more efficient one step code
>
>
> --
> Thank you ,
> Yours Sincerely,
> Mike.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
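Editorial sketch, not part of the original exchange: for anyone wanting to try the suggestions in the reply, the following is a minimal self-contained example on the sample data from the question. It assumes pt_name and year are both held as strings (the question states this only for year). The variable names first, ndistinct and ndistinct2 are made up for illustration. The -egen- free count is one possible rewrite along the lines the reply hints at, not necessarily the one its author had in mind. Note that the flag is built within each pt_name and year combination, which is what a count of distinct years needs.

```stata
* Sample data from the question (assumption: both variables are strings)
clear
input str3 pt_name str4 year
"111" "2009"
"111" "2009"
"111" "2009"
"111" "2011"
"222" "2009"
"222" "2009"
"222" "2010"
end

* (1) Indicator: 1 if a patient has more than one distinct year, else 0.
* With year sorted within pt_name, the first and last values differ
* exactly when the values are not all the same.
bysort pt_name (year) : gen byte different = year[_N] != year[1]

* (2) Number of distinct years per patient without -egen-:
* flag the first observation of each pt_name-year combination,
* take a running sum of the flags, then copy the group total to every row.
bysort pt_name year : gen byte first = _n == 1
by pt_name : gen ndistinct = sum(first)
by pt_name : replace ndistinct = ndistinct[_N]

* Cross-check against the -egen- solution given in the reply
egen tag = tag(pt_name year)
egen ndistinct2 = total(tag), by(pt_name)
assert ndistinct == ndistinct2

list, sepby(pt_name)
```

On this sample both patients have two distinct years, so different is 1 and ndistinct is 2 in every row. The -assert- stops with an error if the two counts ever disagree, which makes it a cheap check when adapting the code to real data.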

