
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: choosing how to collapse very large datasets


From   Austin Nichols <[email protected]>
To   [email protected]
Subject   Re: st: choosing how to collapse very large datasets
Date   Thu, 21 Oct 2010 23:34:08 -0400

Hind Sbihi <[email protected]>:
You should also have a participant id, right?  Try e.g.

g bysec=floor(time)
collapse hr-rating, by(id songid bysec) fast
egen g=group(id songid)
su g, mean
loc maxg=r(max)
foreach v of var hr-skintemp {
g mean_`v'=.
g trend_`v'=.
g var_`v'=.
qui forv i=1/`maxg' {
 qui reg `v' bysec if g==`i'
 replace trend_`v'=_b[bysec] if g==`i'
 replace mean_`v'=_b[_cons]+22.5*_b[bysec] if g==`i'
 replace var_`v'=e(rmse) if g==`i'
 }
}
xtreg rating mean* trend* var*, i(id)
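
A possible alternative to the explicit loop, sketched for a single variable: -statsby- can run the same per-group regression and save the slope, intercept, and root MSE to a file that is then merged back. This is only a sketch under assumptions: the collapsed data (one row per id, songid, bysec) are in memory, hr stands in for any of the physiological variables, and the names hrstats, cons_hr, trend_hr, and rmse_hr are arbitrary choices, not from the code above.

```stata
* Sketch only; run from a do-file (/// is a do-file line continuation).
* Assumes one row per id, songid, bysec is already in memory.
tempfile hrstats
statsby trend_hr=_b[bysec] cons_hr=_b[_cons] rmse_hr=e(rmse), ///
    by(id songid) saving(`hrstats', replace): regress hr bysec
* Attach the per-group results to every row of the matching id-song group.
merge m:1 id songid using `hrstats', nogenerate
* Fitted value at bysec = 22.5, mirroring the mean_ variable in the loop.
generate mean_hr = cons_hr + 22.5*trend_hr
```

With saving(), -statsby- leaves the data in memory untouched, so rating and the other variables remain available for the -xtreg- step.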

On Thu, Oct 21, 2010 at 11:14 PM, Hind Sbihi <[email protected]> wrote:
> Hello stata users
>
>  The data I have collected has physiological measurements (variables in col 3 to 7) collected at 256Hz while study participants listen to a song and give the song a rating (last column).
>  Because of the chosen frequency we generated 256 observations per second.
> Every study participant (n=50) listens to a 45-second excerpt of each of 37 songs.
>  The volume of the data set is simply overwhelming at this stage and I am considering different options for starting at least to visualize the data (e.g. rating vs. physiologic responses) before doing any analysis.
>
>  My question is: how can I aggregate the data?
>  -collapse- seems to be the appropriate command, but I am wondering which arguments should go in the command.
>  Below is a snapshot of what the data looks like for the first song for one participant.
>
>       time  songid         hr    hraccel        scr       dscr        emg   resprate   skintemp  rating
>          0       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>   .0039063       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>   .0078125       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>    .011719       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>    .015625       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>    .019531       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>    .023438       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>    .027344       1  000063.73  -00000.87  000001.72  -00000.00  000003.48  000050.44  000028.15       4
>     .03125       1  000063.73  -00000.87  000001.72  -00000.00  000003.48  000050.44  000028.15       4
>    .035156       1  000063.73  -00000.87  000001.72  -00000.00  000003.48  000050.43  000028.15       4
>
>  Many thanks in advance for your suggestions.
>
>  Hind
