
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: AW: AW: RE: AW: RE: Transposing datasets


From   "Martin Weiss" <[email protected]>
To   <[email protected]>
Subject   st: AW: AW: RE: AW: RE: Transposing datasets
Date   Mon, 2 Aug 2010 15:28:58 +0200


Nick himself advocated "first principles" in
http://www.stata.com/statalist/archive/2010-05/msg01165.html, btw...



HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Martin Weiss
Sent: Monday, 2 August 2010 15:20
To: [email protected]
Subject: st: AW: RE: AW: RE: Transposing datasets



"( Also, I learned something about using subinstr() in the rename command
from your post, thanks )"


Cheers! I love to work from first principles whenever possible, so my use of
-subinstr()- was not intended to detract from the appeal of NJC's -findit
renvars-...
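
For readers who have not seen the earlier post, here is a minimal sketch of what
calling -subinstr()- inside -rename- can look like. It is not the code from that
post: the toy data, the wide variable names, and the replacement prefix "m" are
all assumptions made for illustration.

```stata
clear
* hypothetical wide data, as -reshape wide- might leave it
input long gvkey double(mcap_sum30jun2005 mcap_sum31jul2005)
212782 4946.9 5042.1
end

* swap the long stub for a short prefix; the new name must still
* start with a letter, so the prefix cannot simply be deleted
foreach v of varlist mcap_sum* {
    rename `v' `=subinstr("`v'", "mcap_sum", "m", 1)'
}
describe
```

The `=exp' construct evaluates the expression before -rename- sees the command
line, so no intermediate local macro is needed.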



HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Eric Booth
Sent: Monday, 2 August 2010 15:11
To: [email protected]
Subject: st: RE: AW: RE: Transposing datasets


I misread in the OP what gvkey was...you're right, there's no need for "id".
My post was delayed--I had sent my email before yours came through-- so I
hadn't intended for mine to be some kind of comment/alternate to your post,
as yours was clearly better.   ( Also, I learned something about using
subinstr() in the rename command from your post, thanks )

~ Eric
________________________________________
From: [email protected]
[[email protected]] on behalf of Martin Weiss
[[email protected]]
Sent: Monday, August 02, 2010 8:02 AM
To: [email protected]
Subject: st: AW: RE: Transposing datasets


Why do we need another "id" variable? In Eric's code, it is created via

*************
g id = _n
*************

Is "gvkey" not supposed to be the "id" variable?
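
A minimal sketch of that point, assuming gvkey and the date jointly identify each
row and that datadate is a numeric daily date in the real data (the names rawdate
and datestr are made up for this example):

```stata
clear
input long gvkey str9 rawdate double mcap_sum
212782 "30jun2005" 4946.9
212782 "31jul2005" 5042.1
212782 "31aug2005" 5145
end

* assumption: mimic a numeric daily date as it might exist in the real data
generate datadate = date(rawdate, "DMY")
format datadate %td
drop rawdate

* string j() values; the stub mcap_sum supplies the leading letter
generate str9 datestr = string(datadate, "%td")
drop datadate

* gvkey already identifies the firm, so it can serve as i() directly
reshape wide mcap_sum, i(gvkey) j(datestr) string
list
```

With gvkey as i(), -reshape- puts one firm on one line straight away, so no
follow-up step is needed to merge rows.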


HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Eric Booth
Sent: Monday, 2 August 2010 14:51
To: [email protected]
Subject: st: RE: Transposing datasets


You can't name a new variable with a number as the first character (e.g.,
"30jun2005").  So, -tostring- your datadate variable first and let the
mcap_sum stub supply the leading letter:

*********!
clear
inp gvkey   str20(datadate)        mcap_sum
212782  30jun2005       4946.9
212782  31jul2005       5042.1
212782  31aug2005       5145
212782  30sep2005       5302.5
212782  31oct2005       5253.5
212782  30nov2005       5642.7
212782  31dec2005       6230
end
**set up data**
g datadate2 = date(datadate, "DMY")
format datadate2 %td
drop datadate
rename datadate2 datadate


//1. make date a string var//
tostring datadate, force replace u


//2.  reshape wide using datadate//
g id = _n
reshape wide mcap_sum, i(id) j(datadate) string
ds mcap_*
//3. move all obs for a gvkey to one line//
foreach v in `r(varlist)' {
        bys gvkey: egen `v'2 = max(`v')
        drop `v'
        rename `v'2 `v'
}
by gvkey: g o = _n == 1
keep if o==1
drop o

*********!
 ~  Eric
________________________________________
From: [email protected]
[[email protected]] on behalf of Kaspar Dardas
[[email protected]]
Sent: Monday, August 02, 2010 6:37 AM
To: [email protected]
Subject: st: Transposing datasets

Hi guys,

I have a dataset with about 32000 observations, which is in long
format (see structure below). gvkey is the identifier for a firm
(about 600 different firms), datadate is the month-end date between
2002 and 2010, which of course repeats across firms in the dataset
(again, long format), and mcap_sum is my observation, which is
different for each month and gvkey.

gvkey   datadate        mcap_sum
212782  30jun2005       4946.9
212782  31jul2005       5042.1
212782  31aug2005       5145
212782  30sep2005       5302.5
212782  31oct2005       5253.5
212782  30nov2005       5642.7
212782  31dec2005       6230
etc...

Well, I would like to transpose my dataset so it shows each month as a
variable and the observations are mcap_sums. My attempts with -reshape-
failed miserably (-xpose- won't work because I still want to keep
mcap_sum as an observation). Does anybody have a suggestion to solve
this quickly?

gvkey    31dec2005   30nov2005   31oct2005
212782  6230              5642.7           5253.5   ...........


Best,

Kaspar
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC