Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.

From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: survey answers imported from google, checkbox type
Date: Mon, 22 Apr 2013 08:48:14 +0100

What's wrong with the original data structure? If you want wide, but
something a bit different, it is likely to be easiest to return to
that and then change.

Nick
njcoxstata@gmail.com

On 22 April 2013 05:58, Steven Young <youngazn@gmail.com> wrote:
> Ok I may have figured out how to .... re-arrange them.
>
> I used sort, by var: gen newj = 1 if _n == 1
> replace newj = sum(newj)
>
> Now I run into the problem of, trying to reshape back into Wide, it
> says that newj is not unique within id; there are multiple
> observations at the same newj within id. How can I persuade it to
> ignore that and just reshape it?
>
> I think maybe this solution may not work, because I created a newj
> data value of 4, and so when it tries to remake the columns based on
> newj, it will not know how to properly take that into account.
>
> I guess the same problem exists, because the original id does not have
> the 4th "option" of "none"...
>
> On Sun, Apr 21, 2013 at 6:40 PM, Steven Young <youngazn@gmail.com> wrote:
>> I did read them, and I just used reshape, but it is still somewhat lacking...
>>
>> After I used split, I wrote "gen id = _n" and reshaped based on the
>> stub from the split
>>
>> Now I have:
>>
>> id   _j   stub
>> 1    1    boy girl
>> 1    2    boy boy
>> 1    3    girl girl
>> 2    1    boy girl
>> 2    2    girl girl
>> 2    3
>> 3    1    boy boy
>> 3    2
>> 3    3
>> 4    1    girl girl
>> 4    2
>> 4    3
>>
>> I managed to assign a numerical value through gen and recode to each
>> stub's data, and relabeled it as well.
>>
>> Two last questions:
>>
>> There's an option for "none". That means someone did not pick "boy
>> girl", "boy boy" or "girl girl". As it is, right now that option would
>> show up under _j = 1.
>> id   _j   stub
>> 5    1    none
>> 5    2
>> 5    3
>>
>> What's the best way to take this into consideration/introduce this var
>> so that all id's have that 4th option??
>>
>> How do I now re-sort/reshape this back into wide so that it shows up as:
>>      Var1       Var2       Var3        Var4
>> 1    boy girl   boy boy    girl girl
>> 2    boy girl              girl girl
>> 3               boy boy
>> 4                          girl girl
>> 5                                      None
>>
>> On Sun, Apr 21, 2013 at 5:26 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> Force is on your mind. Better to think of persuasion. Specific answers below.
>>>
>>> Nick
>>> njcoxstata@gmail.com
>>>
>>>
>>> On 21 April 2013 23:15, Steven Young <youngazn@gmail.com> wrote:
>>>> Thanks Nick for your reply.
>>>>
>>>> I used tabsplit from tab_chi (SSC). However it just lists the
>>>> tabulation. Is there a way to force it to create new variables based
>>>> on the splitting?
>>>
>>> -tabsplit- restructures the dataset temporarily to do what it does.
>>> The tabulation uses -tabulate-, but the original data structure is
>>> restored. That's deliberate. If you want something else, you are free
>>> to clone the program and rewrite it accordingly. But this is the same
>>> question as the next really, so see below.
>>>
>>>> I read through the Stata support, and I liked split. I can use it to
>>>> break the compound strings at the "," (comma).
>>>>
>>>> One thing I'm running into now, is that for instance in the original Var:
>>>> 1 "boy girl, boy boy, girl girl"
>>>> 2 "boy girl, girl girl"
>>>> 3 "boy boy"
>>>> 4 "girl girl"
>>>>
>>>> When using split, it of course makes 3 new vars called Var1, Var2, Var3.
>>>> It also splits the data in the order it sees it.
>>>>
>>>>      Var1        Var2        Var3
>>>> 1    boy girl    boy boy     girl girl
>>>> 2    boy girl    girl girl
>>>> 3    boy boy
>>>> 4    girl girl
>>>>
>>>> Is there a way to force split to appropriately split them so that they
>>>> are under the same var name?
>>>
>>> You want a -stack- or -reshape-. Advice at length was given in the
>>> references I gave earlier, so I have to guess you haven't read them.
>>>
>>> Nick
>>>
>>>> On Thu, Apr 18, 2013 at 1:40 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>> See (for example)
>>>>>
>>>>> -tabsplit- in -tab_chi- (SSC)
>>>>>
>>>>> FAQ     . . . . . . . . . . . . . . . . . . . Dealing with multiple responses
>>>>>         . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox and U. Kohler
>>>>>         4/05    How do I deal with multiple responses?
>>>>>                 http://www.stata.com/support/faqs/data/multresp.html
>>>>>
>>>>> SJ-5-1  st0082  . . . . . . . . . . . . . . . Tabulation of multiple responses
>>>>>         (help _mrsvmat, mrgraph, mrtab if installed) . . . . . . . . B. Jann
>>>>>         Q1/05   SJ 5(1):92--122
>>>>>         introduces new commands for the computation of one- and
>>>>>         two-way tables of multiple responses
>>>>>
>>>>> SJ-3-1  pr0008  Speaking Stata: On structure & shape: the case of mult. resp.
>>>>>         . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox & U. Kohler
>>>>>         Q1/03   SJ 3(1):81--99                                 (no commands)
>>>>>         discussion of data manipulations for multiple response data
>>>
>>>>> Nick
>>>>> njcoxstata@gmail.com
>>>>>
>>>>> On 18 April 2013 08:29, Steven Young <youngazn@gmail.com> wrote:
>>>>>
>>>>>> So I have a survey with answers imported from Google.
>>>>>>
>>>>>> One of the questions asks "Which have you heard of" and lists 4 items below
>>>>>> in a checkbox fashion (tick all that you know).
>>>>>>
>>>>>> Google aggregated the data into one cell, so a person (each row) may answer
>>>>>> "a, b, d", a second may answer "a, b, c" and a third may answer "a, d".
>>>>>> Unfortunately each of these answers are quite long... not as short as a, b,
>>>>>> c, d. I also cannot change how Google "aggregates" this data into one cell.
>>>>>>
>>>>>> Now the issue I have is that when it's imported to stata, it will list in
>>>>>> one cell, each of the selected items, separated by comma.
>>>>>>
>>>>>> How do I go about making a "do" file that will go through this and find out
>>>>>> what each person answered, ie make sub-columns of answer choice a, b, c, d,
>>>>>> and then assigning a value of 1 to each column that the person answered?
>>>>>>
>>>>>> For instance if Joe answered a, b, d, then his answer columns will be 1, 1,
>>>>>> 0, 1.
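
For readers landing on this thread, the original question (one 0/1 column per checkbox choice) can be sketched directly from the single imported string variable, without going through -split- and -reshape-. This is only a minimal illustration, not code posted by anyone in the thread. It assumes the responses sit in a string variable called answer, that choices are separated by exactly a comma and a space, and that the four choices are the ones used as examples above. The names answer and pick1 to pick4 are made up for the example.

```stata
* Hypothetical example data: one string variable holding the ticked choices
clear
input str40 answer
"boy girl, boy boy, girl girl"
"boy girl, girl girl"
"boy boy"
"girl girl"
"none"
end

* One 0/1 indicator per choice. Padding both the response and the choice
* with the delimiter means only whole choices match, not substrings of
* longer choices. Assumes the separator is exactly ", " (comma, space).
local i = 0
foreach choice in "boy girl" "boy boy" "girl girl" "none" {
    local ++i
    gen byte pick`i' = strpos(", " + answer + ",", ", `choice',") > 0
    label variable pick`i' "`choice'"
}

list, noobs
```

With this layout every respondent has all four columns, including the one for "none", so no choice depends on the order in which it was ticked. If the real answer texts are long, it may be easier to keep them in variable labels, as here, and use short variable names.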

