Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: survey answers imported from google, checkbox type
From: Steven Young <[email protected]>
To: [email protected]
Subject: Re: st: survey answers imported from google, checkbox type
Date: Sun, 21 Apr 2013 18:40:11 -0700

I did read them, and I have now used reshape, but the result is still not quite what I need.
After using split, I ran "gen id = _n" and reshaped to long on the
stub that split created.
Now I have:
id      _j          stub
1       1           boy girl
1       2           boy boy
1       3           girl girl
2       1           boy girl
2       2           girl girl
2       3
3       1           boy boy
3       2
3       3
4       1           girl girl
4       2
4       3
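
For readers following along, the steps described above (split, then "gen id = _n", then reshape) can be sketched roughly as below. This is only a sketch under assumptions: the variable name "answer" and the stub name "stub" are made up for illustration, and the four rows are the example responses quoted later in this thread, not the real survey data.

```stata
* Hypothetical example data: one comma-separated string per respondent
clear
input str40 answer
"boy girl, boy boy, girl girl"
"boy girl, girl girl"
"boy boy"
"girl girl"
end

gen id = _n

* Break the compound string at each comma into stub1, stub2, stub3
split answer, parse(,) gen(stub)

* One row per respondent and position; _j holds the position (1, 2, 3)
reshape long stub, i(id) j(_j)

* split keeps the blank that followed each comma, so trim it off
replace stub = trim(stub)

list id _j stub, sepby(id) noobs
```

Note that _j here records only the order in which an answer appeared in the cell, not which answer it was, which is why the same answer can land in different positions for different ids.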
I managed to assign a numerical value through gen and recode to each
stub's data, and relabeled it as well.
Two last questions:
There's an option for "none". That means someone did not pick "boy
girl", "boy boy" or "girl girl". As it is, right now that option would
show up under _j = 1.
id     _j     stub
5      1      none
5      2
5      3
What's the best way to take this into account, or to introduce this
variable, so that all ids have that 4th option?
How do I now re-sort/reshape this back into wide so that it shows up as:
      Var1       Var2      Var3        Var4
1     boy girl   boy boy   girl girl
2     boy girl             girl girl
3                boy boy
4                          girl girl
5                                      None
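
One possible way to get the layout above, sketched here under assumptions rather than offered as the definitive answer: give each distinct answer a fixed column number, then reshape wide on that number instead of on _j. The variable names "which" and "picked" are hypothetical, and the mapping assumes that "boy girl", "boy boy", "girl girl" and "none" are the only answers that occur. The data typed in below are the long-format rows shown earlier in this message.

```stata
* Long-format data as shown above (blank rows kept to mirror the listing)
clear
input byte id byte _j str12 stub
1 1 "boy girl"
1 2 "boy boy"
1 3 "girl girl"
2 1 "boy girl"
2 2 "girl girl"
2 3 ""
3 1 "boy boy"
4 1 "girl girl"
5 1 "none"
end

* Rows with no answer carry no information once the data are wide again
drop if stub == ""

* Fixed column for each answer (assumed to be the complete set of options)
gen byte which = .
replace which = 1 if stub == "boy girl"
replace which = 2 if stub == "boy boy"
replace which = 3 if stub == "girl girl"
replace which = 4 if stub == "none"
assert !missing(which)

* _j varies within id, so it must go before reshaping on the new index
drop _j
reshape wide stub, i(id) j(which)

* Optional 0/1 indicators, one per answer, as asked in the first post
forvalues k = 1/4 {
    gen byte picked`k' = stub`k' != ""
}

list, noobs
```

Because "none" gets its own column number, every id ends up with stub4 whether or not it was picked, and the assert stops the do-file if an answer turns up that the mapping does not cover.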
On Sun, Apr 21, 2013 at 5:26 PM, Nick Cox <[email protected]> wrote:
> Force is on your mind. Better to think of persuasion. Specific answers below.
>
> Nick
> [email protected]
>
>
> On 21 April 2013 23:15, Steven Young <[email protected]> wrote:
>> Thanks Nick for your reply.
>>
>> I used tabsplit from tab_chi (SSC). However it just lists the
>> tabulation. Is there a way to force it to create new variables based
>> on the splitting?
>
> -tabsplit- restructures the dataset temporarily to do what it does.
> The tabulation uses -tabulate-, but the original data structure is
> restored. That's deliberate. If you want something else, you are free
> to clone the program and rewrite it accordingly. But this is the same
> question as the next really, so see below.
>
>> I read through the Stata support, and I liked split. I can use it to
>> break the compound strings at the "," (comma).
>>
>> One thing I'm running into now, is that for instance in the original Var:
>> 1    "boy girl, boy boy, girl girl"
>> 2    "boy girl, girl girl"
>> 3    "boy boy"
>> 4    "girl girl"
>>
>> When using split, it of course makes 3 new vars called Var1, Var2, Var3.
>> It also splits the data in the order it sees it.
>>
>>     Var1       Var2       Var3
>> 1  boy girl  boy boy   girl girl
>> 2  boy girl  girl girl
>> 3  boy boy
>> 4  girl girl
>>
>> Is there a way to force split to appropriately split them so that they
>> are under the same var name?
>
> You want a -stack- or -reshape-. Advice at length was given in the
> references I gave earlier, so I have to guess you haven't read them.
>
> Nick
>
>> On Thu, Apr 18, 2013 at 1:40 AM, Nick Cox <[email protected]> wrote:
>>> See (for example)
>>>
>>> -tabsplit- in -tab_chi- (SSC)
>>>
>>> FAQ     . . . . . . . . . . . . . . . . . . .  Dealing with multiple responses
>>>         . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox and U. Kohler
>>>         4/05    How do I deal with multiple responses?
>>>                 http://www.stata.com/support/faqs/data/multresp.html
>>>
>>> SJ-5-1  st0082  . . . . . . . . . . . . . . . Tabulation of multiple responses
>>>         (help _mrsvmat, mrgraph, mrtab if installed)  . . . . . . . .  B. Jann
>>>         Q1/05   SJ 5(1):92--122
>>>         introduces new commands for the computation of one- and
>>>         two-way tables of multiple responses
>>>
>>> SJ-3-1  pr0008   Speaking Stata: On structure & shape: the case of mult. resp.
>>>         . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox & U. Kohler
>>>         Q1/03   SJ 3(1):81--99                                   (no commands)
>>>         discussion of data manipulations for multiple response data
>>>
>>> Nick
>>> [email protected]
>>>
>>> On 18 April 2013 08:29, Steven Young <[email protected]> wrote:
>>>
>>>> So I have a survey with answers imported from Google.
>>>>
>>>> One of the questions asks "Which have you heard of" and lists 4 items below
>>>> in a checkbox fashion (tick all that you know).
>>>>
>>>> Google aggregated the data into one cell, so a person (each row) may answer
>>>> "a, b, d", a second may answer "a, b, c" and a third may answer "a, d".
>>>> Unfortunately each of these answers is quite long... not as short as a, b,
>>>> c, d. I also cannot change how Google "aggregates" this data into one cell.
>>>>
>>>> Now the issue I have is that when it's imported to Stata, it lists in
>>>> one cell each of the selected items, separated by commas.
>>>>
>>>> How do I go about making a "do" file that will go through this and find out
>>>> what each person answered, i.e., make sub-columns for answer choices a, b, c, d,
>>>> and then assign a value of 1 to each column that the person answered?
>>>>
>>>> For instance, if Joe answered a, b, d, then his answer columns will be 1, 1,
>>>> 0, 1.
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/