
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: RE: reshaping to wide format, and need to create a "j" variable


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   st: RE: reshaping to wide format, and need to create a "j" variable
Date   Tue, 25 Oct 2011 22:15:15 +0100

I think this is backwards in two senses. For most purposes the long structure you already have is the better data structure for later analyses, so your problem is the other way round: to -reshape long- the other files. Also, it is not at all clear that you will want to -merge- these; how would that be done, and on which variables? It sounds much more likely that you will want to -append-.

All that said, what you need for a -reshape wide- is 

bysort id (placement) : gen j = _n 

whereas what you ask for is 

bysort id (placement) : gen j = _N

which won't do the job at all. Your anonymous colleagues who think that -foreach- is required are assigned to suffer this tutorial: 

SJ-2-1  pr0004  . . . . . . . . . . Speaking Stata:  How to move step by: step
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q1/02   SJ 2(1):86--102                                  (no commands)
        explains the use of the by varlist : construct to tackle
        a variety of problems with group structure, ranging from
        simple calculations for each of several groups to more
        advanced manipulations that use the built-in _n and _N
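
To make the difference between _n and _N concrete, here is a minimal sketch using the example data from the question. The variable names age, order, and n are assumptions for illustration, and the date column is left out because its format (day.month or month.day) is not stated. Note also that placement is not unique within id in the example (id 1 has placement 1 twice), so this sketch sorts within id on a hypothetical helper holding the original row order rather than on placement; with real data a proper date variable would be the natural choice.

```stata
* Sketch only: variable names and the helper variable order are assumptions
clear
input id placement age
1 1 14
1 2 15
1 3 15
1 1 15
2 1 12
3 1 13
3 2 14
4 1 10
4 2 10
4 3 11
4 2 11
4 3 12
4 1 12
5 1 13
5 2 14
end

* remember the original (chronological) row order
gen long order = _n

* _n is the observation number within each id: 1, 2, 3, ...
bysort id (order) : gen j = _n

* _N is the number of observations in each id: the same value on every row
by id : gen n = _N

list id placement j n, sepby(id)

* j identifies rows uniquely within id, so it can serve as j() in -reshape wide-
drop n order
reshape wide placement age, i(id) j(j)
```

The variable n is what the question literally asks for, but it takes the same value on every row of an id and so cannot distinguish the rows, which is why -reshape wide- needs j instead.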

But my main point is that I don't follow your diagnosis. 

Nick 
[email protected] 

Kendra Lewis

I have data that tracks foster children's transitions and placements.
The data is in long format, such that children have the same ID
variable over time, but their ID variable appears in several different
rows. Each time their ID variable appears indicates a move or
transition. Here is an example dataset:

id    placement    date of placement    age at placement
1     1            6.7.2009             14
1     2            8.2.2010             15
1     3            2.3.2011             15
1     1            3.4.2011             15
2     1            5.4.2009             12
3     1            4.6.2009             13
3     2            7.8.2010             14
4     1            4.5.2009             10
4     2            6.7.2009             10
4     3            5.2.2010             11
4     2            7.8.2010             11
4     3            9.9.2010             12
4     1            1.4.2011             12
5     1            7.8.2009             13
5     2            6.4.2010             14

So, id #1 has had 4 placements, #2 has only had 1, #3 has had 2, and so
on. The data needs to be put into wide format to merge in other
datasets of a similar format. I know the necessary command will be
reshape wide stub, i(id) j(??)
where stub will be the varlist, as there is no common stub in the
dataset. What I need is a "j" variable that indicates a count of the
number of times the id number appears. For example, the count for id
#1 is 4, for id #2 is 1, for id #3 is 2, for id #4 is 6, and for id #5
is 2. Then I can use this for the reshape command to be my "j"
variable.
Does anyone have any suggestions? I've spoken to a few people and we
think it may be some sort of "foreach" loop command but we are not
sure.



