Statalist The Stata Listserver

st: [Re: How to program a loop to calculate the value of an observation]

From   Andreas Reinstaller <>
Subject   st: [Re: How to program a loop to calculate the value of an observation]
Date   Sat, 03 Feb 2007 16:27:09 +0100

I have found a solution that does not strike me as particularly elegant, but it works. I simply subscript the observations of each variable explicitly (varname[i]), where NoEnt_nsc1, TO_nsc1, TO_Nace, NoEmpl_nsc1 and LET are the variables I use:

In a simple do file I have this code
--------------------------------------------------begin calc ----------------------------------
local size = _N

/* Ave_HHI nsc1 */

sort country time SecCode sizeclass
generate ntx = _n
generate newntx = .

/* loop over all observations */
forvalues i = 1(1)`size' {

    local nof = NoEnt_nsc1[`i']
    if `nof'==. local nof = 0
    local szs = 0

    if `nof' > 0 {
        if `nof' == 1 {
            local szs = (NoEnt_nsc1[`i']*(100*(TO_nsc1[`i']/NoEnt_nsc1[`i'])/TO_Nace[`i'] )^2)
        }
        else {
            /* sum the squared shares of the proxied firms in this size class */
            forvalues j = 1(1)`nof' {
                local sz = ((100*((TO_nsc1[`i']/NoEmpl_nsc1[`i']*LET[`i'])+(2*`j'*(TO_nsc1[`i'] -(NoEnt_nsc1[`i']*(TO_nsc1[`i']/NoEmpl_nsc1[`i']*LET[`i'])))/(NoEnt_nsc1[`i']*(NoEnt_nsc1[`i']-1))))/ TO_Nace[`i'])^2)
                local szs = `szs' + `sz'
            }
        }
    }

    /* store the result in observation i */
    qui by country time SecCode sizeclass: replace newntx = `szs' if ntx==`i'
}

by country time SecCode: egen Ave_HHI_nsc1=sum(newntx)

---------------------------------------------------------------- end calc --------------------------------------------

For 12000 observations it takes about 15 minutes to run on a dual-core Pentium at 2.8 GHz under Windows XP in Stata 8.2.

If somebody has ideas about how to improve the speed, I should be grateful.
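
One possible speed-up, offered as a sketch rather than a tested replacement: the inner loop over j adds up (a + 2*j*d)^2 for j = 1, ..., n, and that sum has the closed form n*a^2 + 2*a*d*n*(n+1) + (2/3)*d^2*n*(n+1)*(2*n+1), so both loops can be replaced by a few generate/replace statements. The toy data and the names hhi_a, hhi_d, hhi_n and newntx_cf are hypothetical, and the sketch assumes NoEnt_nsc1 is a non-negative integer or missing.

```stata
* Toy data (hypothetical values), one row per size class
clear
input NoEnt_nsc1 NoEmpl_nsc1 TO_nsc1 TO_Nace LET
0   0   0   1000   1
1   8   50  1000   1
4   60  300 1000  10
3   400 600 1000  50
.   .   .   1000 250
end

* a: turnover proxied for the smallest firm; d: step term inside the loop
generate double hhi_n = NoEnt_nsc1
generate double hhi_a = TO_nsc1 / NoEmpl_nsc1 * LET
generate double hhi_d = (TO_nsc1 - hhi_n*hhi_a) / (hhi_n*(hhi_n - 1))

* Closed form of sum_{j=1}^{n} (a + 2*j*d)^2, built term by term
generate double t1 = hhi_n * hhi_a^2
generate double t2 = 2 * hhi_a * hhi_d * hhi_n * (hhi_n + 1)
generate double t3 = (2/3) * hhi_d^2 * hhi_n * (hhi_n + 1) * (2*hhi_n + 1)

* 0 when there are no firms (or the count is missing), as in the loop version
generate double newntx_cf = 0
replace newntx_cf = (100 * TO_nsc1 / TO_Nace)^2 if hhi_n == 1
replace newntx_cf = (100 / TO_Nace)^2 * (t1 + t2 + t3) if hhi_n > 1 & hhi_n < .

drop t1 t2 t3
list NoEnt_nsc1 newntx_cf
```

On the real data, newntx_cf would take the place of newntx in the final by country time SecCode: egen ... = sum() step. A smaller change that keeps the loop is to write replace newntx = `szs' in `i' instead of if ntx==`i': the in range addresses observation i directly, whereas the if qualifier is evaluated for every observation on each pass.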


-------- Original Message --------
Subject: How to program a loop to calculate the value of an observation
Date: Sat, 03 Feb 2007 12:25:02 +0100
From: Andreas Reinstaller <>

Dear Stata-community

I have a multi-country panel of industrial sector data with observations on, for instance, the number of firms, turnover, or persons employed, grouped into size classes (say, firms with fewer than 10 employees, between 10 and 50, and so forth). With these data I am trying to calculate a concentration index. Although the data are sectoral, the index makes assumptions about the size distribution of firms within each size class in order to calculate concentration (for the initiated, it is a Schmalensee concentration index, as used for instance in the OECD Structural and Demographic Business Statistics). More specifically, the index uses the number of firms in each size class, together with information on turnover, employment, etc. in that class, to proxy the size of each firm in that class. The line beginning local sz=((100* in the code fragment below shows how this is done.
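
Read off the local sz line in this thread's code, the calculation can be written as follows. This is a transcription of the code, not an independent statement of the Schmalensee index; n_i, E_i, S_i, S and M_i stand for the program arguments `ni', `Ei', `Si', `S' and `EMi', while a_i, d_i, s_{ij} and H_i are labels introduced here for readability.

```latex
\[
a_i = \frac{S_i}{E_i}\, M_i, \qquad
d_i = \frac{S_i - n_i a_i}{n_i\,(n_i - 1)}, \qquad
s_{ij} = 100 \cdot \frac{a_i + 2\, j\, d_i}{S}, \qquad
H_i = \sum_{j} s_{ij}^{2} \quad (n_i > 1),
\]
\[
H_i = \left( 100 \cdot \frac{S_i}{S} \right)^{2} \quad (n_i = 1).
\]
```

The two code fragments in this thread use different ranges for j: the while loop runs from 0 to n_i, the forvalues loop from 1 to n_i. As a matter of arithmetic, the proxied firm sizes a_i + 2 j d_i add up to S_i when j runs from 0 to n_i - 1, so the range may be worth checking against the intended definition.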

My first, clearly unsuccessful, attempt at computing the index looked as follows. I report it only so that you can grasp what I want to do; the program did not work:

Calling it with:

by country year sectorcode sizeclass: ave_hhi NoFirms SizeclassEmploym SizeclassTurnover SectoralTurnover EmplThreshold_sizeclass

it should do

program define ave_hhi, byable(recall)
    syntax varlist(min=5 max=5 numeric) /*, newvarname(string) */
    marksample touse
    tokenize `varlist'
    local ni `"`1'"'
    local Ei `"`2'"'
    local Si `"`3'"'
    local S `"`4'"'
    local EMi `"`5'"'

    if `touse' {
        local szs = 0
        if (`ni'!=0) {
            if (`ni'==1) {
                local szs=(`ni'*(100*(`Si'/`ni')/`S' )^2)
            }
            else {
                local j = 0
                while `j' <= `ni' {
                    local sz = 0
                    local sz = ((100*((`Si'/`Ei'*`EMi')+(2*`j'*(`Si' -(`ni'*(`Si'/`Ei'*`EMi')))/(`ni'*(`ni'-1))))/`S')^2)
                    local szs = `szs' + `sz'
                    local ++j
                }
            }
            /* quietly gen `newvarname'=`szs'*/
            replace newntx=`szs'
        }
        else {
            /* quietly gen `newvarname'=0*/
            replace newntx=.
        }
    }
end

This, of course, will not work, as the Stata FAQ on the if qualifier versus the if command in programs explains. The if commands and the while loop are misused here. The program evaluates if (`ni'!=0) only once, keeps the result and, because the condition is not fulfilled for the first observation in my data set, replaces every newntx with a dot. The same problem would arise later when the program reached the while loop condition. The FAQ suggests using if as a qualifier on generate or replace rather than as a command in its own right, but since my calculation has to run through the while loop to obtain the value I want to assign, this is not an option. I could do it for the if (`ni'==1) condition, which does not involve the loop and lets me assign the value directly, but that does not really help.
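
The difference between the two uses of if can be reproduced on a three-observation toy data set. This is only an illustrative sketch; the variable names n, flag_cmd and flag_qual are hypothetical.

```stata
clear
input n
0
1
3
end

* if COMMAND: the expression is evaluated once. A bare variable name
* refers to its value in the first observation (n[1], which is 0 here),
* so the else branch runs and replace touches every observation.
generate byte flag_cmd = .
if n != 0 {
    replace flag_cmd = 1
}
else {
    replace flag_cmd = 0
}

* if QUALIFIER: the expression is evaluated separately for each observation.
generate byte flag_qual = 0
replace flag_qual = 1 if n != 0

list
```

With these data flag_cmd is 0 in all three observations, whereas flag_qual is 0, 1, 1.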

Being used to programming in Matlab and other languages, my next idea was to define a function that returns exactly the value calculated in the loop, so that I could use the return value in a replace or generate command with the if qualifier for all cases with `ni'>1, much like the functions that can be used with egen. However, browsing through the Statalist archive, I see that Stata does not allow users to define such functions.

Finally, I thought of circumventing the problem by writing a small program that turns the variables I need into matrices and does the calculations with Stata's matrix commands, a little as if I were using Matlab. However, as my variables have more than 60000 observations, I run into a matsize problem and cannot pursue this avenue either.

Now: does anybody see an easy way to get what I want, so that I don't have to export my data to Matlab and do this simple calculation there?

Any help is very much appreciated!


Dr. Andreas Reinstaller

Department of Economics
Institute for International Economics and
Development (VW7)
Vienna University of Economics and Business
Augasse 2-6 1090
Vienna Austria

Tel: +43 1 313 36 5254
Fax: +43 1 313 36 9209

