
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: reshaping key-value pair data


From   Nick Cox <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: reshaping key-value pair data
Date   Wed, 2 Oct 2013 01:02:45 +0100

I guess you have a question: Is it possible ... [not "It is possible..."]

Here is one idea:

. clear

. input item str5 key str6 value

          item        key      value
  1. 1 color blue
  2. 1 color red
  3. 1 size XL
  4. 2 color orange
  5. 2 size S
  6. end

. bysort item key (value): gen which = value[1]

. bysort item key : replace which = which[_n-1] + "@" + value if _n > 1
(1 real change made)

. by item key : replace which = which[_N]
(1 real change made)

. by item key : keep if _n == _N
(1 observation deleted)

. drop value

. reshape wide which, j(key) string i(item)
(note: j = color size)

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                        4   ->       2
Number of variables                   3   ->       3
j variable (2 values)               key   ->   (dropped)
xij variables:
                                  which   ->   whichcolor whichsize
-----------------------------------------------------------------------------

. split whichcolor, p(@)
variables created as string:
whichcolor1  whichcolor2

. renpfix which

. drop color

. list

     +-------------------------------+
     | item   size   color1   color2 |
     |-------------------------------|
  1. |    1     XL     blue      red |
  2. |    2      S   orange          |
     +-------------------------------+
Nick
[email protected]
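
A variation on the idea above, offered only as a sketch and not taken from the post itself: number the values within each item-key pair, fold that number into the key, and let -reshape- do the rest, so no concatenating and splitting is needed. The variable names seq and newkey are made up for illustration, and the rule that a key gets a numeric suffix only if it repeats for at least one item is an assumption chosen to reproduce the requested layout.

```stata
clear
input item str5 key str6 value
1 color blue
1 color red
1 size XL
2 color orange
2 size S
end

* number the values within each item-key pair, in alphabetical order
bysort item key (value): gen seq = _n

* a key that never repeats keeps its bare name (size);
* a key that repeats for any item gets a numeric suffix (color1, color2, ...)
bysort key (seq): gen newkey = key + cond(seq[_N] > 1, string(seq), "")

* one observation per item, one variable per (suffixed) key
drop key seq
reshape wide value, i(item) j(newkey) string

* strip the "value" stub from the new variable names
renpfix value

list
```

With the example data this should leave the variables item, color1, color2 and size, with color2 empty for item 2. Because the suffix comes from a running count, the same code should cope with three or more values per key without changes.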


On 2 October 2013 00:31, Dimitriy V. Masterov <[email protected]> wrote:
> I have some data in an awkward key-value pair format:
>
> item key value
> 1 color blue
> 1 color red
> 1 size XL
> 2 color orange
> 2 size S
>
> It is possible to reshape this data into something like this:
>
> item color1 color2 size
> 1 blue red XL
> 2 orange S
>
> The order for the values should be alphabetical, so blue before red.
>
> I tried the following:
>
> gen color = value if key=="color"
> gen size = value if key=="size"
>
> sort item key value
> collapse (firstnm) color1=color (lastnm) color2=color (firstnm) size, by(item)
>
> This mostly works, but it won't work for more than 2 values per key
> and orange appears twice for item 2.
>
> DVM
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

