Re: st: Going through each observation of a variable

From: Derya Karaci
Subject: Re: st: Going through each observation of a variable
Date: Sat, 8 Jun 2013 11:52:19 -0700 (PDT)

Hi David, 

The data are organized as follows: I simply copy-pasted these prices into the dataset as additional variables. The variables Price1 and Price2 have 500 observations each, with each row representing a price vector. os1 and os2 are the expenditure shares for each individual and have 80,000 observations.

I am computing Y for each individual as the expenditure share of good 1 (os1) multiplied by the price of good 1 (P1), plus the same product for good 2. With a single price vector this is straightforward: I could just write 'gen Y = os1*P1 + os2*P2'. But I have 500 different price vectors. I would like to generate Y 500 times, once per price vector, and take the average across the 500.

The program I posted chooses randomly from these price vectors. But I don't want randomness at this stage. I would like to compute Y for each price vector one by one. This is what I meant by replication.
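
One way this could be written in Stata is sketched below. It is only a sketch under assumptions: that Price1 and Price2 hold the 500 price vectors in observations 1 to 500 of the same dataset, that os1 and os2 are filled in for every individual, and that no variables named Y_1, Y_2, ... exist yet. The names Y_k, wmean and wsd follow the earlier description in this thread; the local macro K is introduced here for illustration.

```stata
* Sketch only. Assumes Price1 and Price2 hold the K price vectors in
* observations 1..K, and os1 and os2 are present for every individual.
* K is the number of price vectors (set it to 3 for a small test).
local K = 500

* Explicit subscripting: Price1[5] is the value of Price1 in observation 5,
* a single number, so each Y_k is evaluated for all individuals at the
* k-th price vector.
forvalues k = 1/`K' {
    generate double Y_`k' = os1*Price1[`k'] + os2*Price2[`k']
}

* Horizontal summaries across the K replications, one value per individual.
egen double wmean = rowmean(Y_1-Y_`K')
egen double wsd   = rowsd(Y_1-Y_`K')
```

Because Y is linear in the prices, wmean should equal os1 times the mean of Price1 plus os2 times the mean of Price2 (provided no prices are missing), which gives a quick check on the loop. The 500 new variables are only needed if the individual replications or the row standard deviation are wanted.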

Here is an example with 3 price vectors and 10 individuals to show what I am trying to do:

Thanks again, greatly appreciated! 


----- Original Message -----
From: David Kantor
Sent: Friday, June 7, 2013 11:38 AM
Subject: Re: st: Going through each observation of a variable

Hello Derya

At 12:47 PM 6/7/2013, you wrote:
>David, thanks a lot for your answer. You are right that my question 
>was not clear. The data set has about 80,000 individuals; only the 
>variables that represent the price scenarios have 
>500 observations. I would like to go through each of the 500 prices 
>one by one, and evaluate the expression for each individual, and 
>create a new variable for each replication (Y_`k' where k=1,..,500). 
>Each Y_`k' will have 80,000 observations. Then I would like to 
>summarize horizontally across replications, not across individuals. 
>At the end, each individual will have the mean and standard 
>deviation of the 500 different price effects. Mean (wmean) and 
>standard deviation (wsd) will also be variables with 80,000 observations.
>The code below tried to follow 
> But 
>actually I don't want to randomly select from 500, I want the 
>program to go through each observation one by one, create a new 
>variable for each replication, and take a horizontal mean and sd 
>across replications.
>I hope this is something feasible to do...

Do you mean that you have 80,000 observations, each of which has 500 
price1 variables and 500 price2 variables (and similarly for os1 and os2)?
If so, how are they named?
Or are they in long shape?
In any case, a long-shaped data structure is usually easier to work 
with. (That would be 40,000,000 observations. A lot of data either way!)
Show us the data structure: what are the variables? How many 
observations? What is a "replication"? Are they sets of variables 
(wide) or observations (long)?
Let us know, and then we will have more to work with to help you.
