Statalist The Stata Listserver

st: RE: Re: first program

From   "Nick Cox"
Subject   st: RE: Re: first program
Date   Thu, 1 Jun 2006 15:52:18 +0100

Tweaking Rodrigo's program, which improves a lot
on the original: 

1. This will fail if data are not sorted as desired. 

2. Using -egen- is over the top here. You can and 
should use _N directly. 

3. As you just want the maximum, use -summarize, meanonly-.

4. This will fail if the data are not set up
as panel data. 
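
Points 1 to 3 can be sketched on a made-up panel. The variable names id and year and the values below are assumptions for illustration only, not data from this thread. Note also that -egen, count()- counts nonmissing values, so it agrees with _N only when the identifier is never missing.

```stata
* Toy unbalanced panel, deliberately not sorted by id
clear
input id year
2 2001
1 2000
3 2000
1 2001
2 2000
1 2002
3 2002
3 2001
end

* Point 1: -by id:- alone would stop with a "not sorted" error;
* -bysort- sorts first
* Point 2: under -by-, _N is the number of observations in each group
bysort id: gen long count = _N

* Point 3: -meanonly- skips the variance calculation but still leaves r(max)
summarize count, meanonly

* Keep only the identifiers observed the maximum number of times
keep if count == r(max)
list, sepby(id)
```

Here id 2 has only two observations, so it is dropped and the six observations for ids 1 and 3 remain.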

Now find my bugs... 

program balance, sort 
     version 8.2
     syntax [if] [in]
     marksample touse 
     tempvar count
     qui {
         * panel identifier as recorded by -tsset-
         local ivar: char _dta[iis]
         if "`ivar'" == "" { 
             di as err "no identifier set"
             exit 498 
         }
         * number of observations used for each identifier
         bysort `touse' `ivar': gen long `count' = _N
         sum `count' if `touse', meanonly 
         keep if `count' == r(max) & `touse' 
     }
end

(Also, contrary to this, -syntax id- is just illegal
-syntax- syntax.) 
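
If the identifier were to be supplied by the user rather than read from the -tsset- settings, one legal way to declare it is -syntax varname-, which leaves the name in the local macro `varlist'. What follows is a minimal sketch under stated assumptions, not a tested replacement for anything in this thread: the program name balance2, the restriction to a numeric identifier, and the toy data are all made up for illustration. The -capture program drop- line is there so that the do-file can be rerun after editing, which is the nearest thing Stata has to "recompiling".

```stata
* Drop any earlier definition so that this do-file can be rerun
capture program drop balance2

program balance2, sortpreserve
     version 8.2
     * exactly one existing numeric variable; -syntax- puts it in `varlist'
     syntax varname(numeric) [if] [in]
     * `touse' is 1 for observations satisfying if/in with nonmissing id
     marksample touse
     tempvar count
     quietly {
         bysort `touse' `varlist': gen long `count' = _N
         summarize `count' if `touse', meanonly
         keep if `count' == r(max) & `touse'
     }
end

* Toy unbalanced panel (hypothetical data): id 2 has one year fewer
clear
input id year
1 2000
1 2001
1 2002
2 2000
2 2001
3 2000
3 2001
3 2002
end

balance2 id
tabulate id
```

Typing -balance2- with no variable name, or with a name that is not a variable in the dataset, stops with an error from -syntax- before any data are touched.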


Rodrigo A. Alfaro
> If you are planning to run this program on only one dataset (one that uses
> id as the cross-sectional identifier), you don't need to put it in the
> syntax. When you put id in the syntax you are creating a temporary variable
> called id that will be used further on; for that reason you have to invoke
> it using `id' instead of id.
> You are not using [if] and [in]; putting these in the syntax line just
> allows users to specify them. This means that users can condition the
> result, but your program still needs to apply the restrictions... read
> the help for marksample to see the formal use of [if] and [in].
> You have to define a temporary variable count to guard against a variable
> of that name already existing in the dataset. Again, you have to invoke it
> using `count' instead of count. After keep you will lose information, so
> be careful with that.
> Finally, I don't understand what you mean by "compile". You just load the
> program with -do balance-, or, going further, you can save this program
> with the extension .ado in your personal folder and it will be your own
> command.
> Rodrigo.
> PS: This is my version of your program:
> program balance
>     version 8.2
>     syntax [if] [in]
>     tempvar count
>     qui     {
>         local ivar: char _dta[iis]
>         by `ivar': egen `count'=count(`ivar') `if' `in'
>         sum `count'
>         local max=r(max)
>         keep if `count'==`max'
>     }
> end
> I dropped rclass, because I don't need to save any value after reducing
> the panel. Also, I deleted id from the syntax because you can use
> char _dta[iis], which tells you which cross-sectional variable was defined
> using -tsset-.
> Note that local macros, as well as [if] and [in], are invoked using `'.
Dirk Nachbar 

> I am trying to write my first Stata program and was wondering if someone
> could go through it and tell me if it's correct, how I should refer to id
> and what rclass means (just copied that).
> Another thing, I compiled it once and then wanted to recompile it. How do
> I do that?
> /*
> program to balance an unbalanced panel, keep only those individuals with
> the max duration
> */
> program balance, rclass
> version 8.2
> syntax id [if] [in]
> qui     {
>     sort id
>     by id: egen count=count(id)
>     sum count
>     local max=r(max)
>     keep if count==max
>     }
> end
