Statalist The Stata Listserver



Re: st: data management


From   n j cox <[email protected]>
To   [email protected]
Subject   Re: st: data management
Date   Mon, 02 Apr 2007 12:52:40 +0100

Carole J. Wilson posted a solution that solves the problem.
I agree with her analysis: you need to loop over observations.

Here is a slightly simpler solution along the same lines.

One way to tackle this is just to -count-. -count- is one of the most
neglected Stata commands, but see

SJ-7-1 pr0029 . . . . . . . . . . . . Speaking Stata: Making it count
Q1/07 SJ 7(1):117--130 (no commands)
discusses count used with a loop over observations or variables

People who want to count often create a variable with 1s and 0s and then
-summarize-. The sum of the values is, naturally, the same as the number
of 1s. If you do do this, make sure you do it directly: use
-summarize, meanonly- and pick up -r(sum)-.

Using -egen, total()- is a similar method, but about 100 times as much
code for Stata to interpret. (-viewsource egen.ado- and -viewsource
_gtotal.ado- to see this.)

-summarize- for counting can be a good method, but -count- is often better.
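
To make the comparison concrete, here is a minimal sketch of the three approaches side by side. It assumes the auto dataset shipped with Stata, and the variable names is_foreign and n_foreign are made up for illustration; none of this comes from the original thread.

```stata
* Sketch only: dataset and variable names are assumptions for illustration
sysuse auto, clear

* 1. indicator variable, then -summarize, meanonly- and r(sum)
gen byte is_foreign = foreign == 1
summarize is_foreign, meanonly
display r(sum)

* 2. -egen, total()- puts the same number in every observation
egen n_foreign = total(foreign == 1)
display n_foreign[1]

* 3. -count- needs no new variable and leaves the result in r(N)
count if foreign == 1
display r(N)
```

All three displays should show the same number; -count- gets there without creating anything in the dataset.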

In the first example, Sim wants, I think,

count if ID != 1 & size == 6 & ///
    market_exit > market_entry[1] & ///
    market_entry < market_entry[1]

You can automate that:

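* one variable to hold the count for each observation
gen exists_size_6 = 0

* loop over observations, counting the matches for each in turn
qui forval i = 1/`=_N' {
    count if ID != ID[`i'] & size == 6 & ///
        market_exit > market_entry[`i'] & ///
        market_entry < market_entry[`i']
    * -count- leaves its result in r(N)
    replace exists_size_6 = r(N) in `i'
}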

So, the basic idea is just

generate a count variable
loop over observations {
    count how many "match" this observation
    replace count variable with count in this observation
}
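
The example table in the question seems to count any overlap in market presence, and to count a size-6 subject itself (IDs 8 and 9 show 2, not 1). Under that reading, which is an assumption about what is wanted, a self-contained sketch might look like this. The variable names n_size6 and expected are hypothetical; expected just holds the New_Variable column from the question so that the result can be checked.

```stata
* Sketch under an assumed reading of the question: any overlap counts,
* and a size-6 subject is included in its own count
clear
input ID market_entry market_exit size expected
1 10  98 1 1
2 15  80 1 0
3 30 110 1 2
4 41  97 1 1
5 76  85 1 1
6 77 218 2 2
7 85 220 3 2
8 82 118 6 2
9 99 218 6 2
end

gen n_size6 = 0

quietly forvalues i = 1/`=_N' {
    // size-6 subjects that entered before this one left
    // and left after this one entered
    count if size == 6 & ///
        market_entry < market_exit[`i'] & ///
        market_exit > market_entry[`i']
    replace n_size6 = r(N) in `i'
}

list, noobs

* stops with an error if any count differs from the table in the question
assert n_size6 == expected
```

The structure is the same as above: only the -if- condition changes. Adding ID != ID[`i'] to the condition would exclude the subject itself, at the cost of no longer matching the table for IDs 8 and 9.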

Nick
[email protected]

Sim Oertel

I have a question regarding the definition of a new variable in a data
set. I would like to define this new variable so that it tells me how
many subjects with a specific characteristic exist at a specific time.
For example, consider a simplified data set of nine subjects with
information on their market entry, market exit, and size. In this case,
I would like to know how many subjects of size 6 were in the market at
the same time as each subject.

Example:
ID   Market-Entry   Market-Exit   Size   New_Variable
 1        10             98         1         1
 2        15             80         1         0
 3        30            110         1         2
 4        41             97         1         1
 5        76             85         1         1
 6        77            218         2         2
 7        85            220         3         2
 8        82            118         6         2
 9        99            218         6         2


For ID1, for example, subject 8 entered at time 82, while ID1's
market exit is 98. The other subject with size 6 is ID9, but ID9
entered the market after the market exit of ID1. So the variable should
hold the value 1 (one subject with size 6 was present in the market at
the same time as ID1). For ID2, ID8 and ID9 entered after the market
exit of ID2, so the new variable should hold the value 0. For ID3, the
new variable should hold the value 2, because both subjects with size 6
were present in the market at the same time as ID3, and so on.





