Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: st: shifting existing values to subsequent variables in a row


From   "Buzz Burhans" <[email protected]>
To   <[email protected]>
Subject   RE: st: shifting existing values to subsequent variables in a row
Date   Sun, 8 May 2011 10:00:49 -0600

Thanks Nick.

Reshape is one of the strategies I was thinking of.

I thought though perhaps there might be an easier  or faster way involving
exporting the ID variable and the block of affected variables to a Mata
matrix, then renaming the columns of the matrix such that the new col name
is the previous column  name that was one column offset to the right, then
saving the Mata matrix with the renamed column names as a .dta file, and
then merging it back in to the original file.  Or possibly this could be
done by interacting Mata and Stata without the need to create the .dta file
to be merged back in.  I have not previously used Mata though, and am not
sure how to accomplish either of these in Mata, at least without spending
what seems to require a lot of time to learn enough Mata, which I can't see
happening soon because of other time commitments. 

Ultimately, I will need to reshape this data anyway so your suggestion is
quite appropriate, but it seems easier to do some of the data inspection and
cleaning (such as dropping some observations that are not usable for other
reasons) before I did the reshape.

Thanks for the suggestion.

Buzz
 

Buzz Burhans, Ph.D.

Dairy-Tech Group
So. Albany, VT / Twin Falls ID

Cell: 208-320-0829
ID Fax: 208-735-1289
VT Fax: 802-755-6842

Email: [email protected]
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Sunday, May 08, 2011 1:31 AM
To: [email protected]
Subject: Re: st: shifting existing values to subsequent variables in a row

I'd -reshape- such data to long. Then use -by:- and subscripting.

Nick

On Sat, May 7, 2011 at 8:14 PM, Buzz Burhans <[email protected]> wrote:
> I have been given a several datasets with ~50K observations and 197
> variables.
>
> Of the 197 variables, 180 are obtained from 20 measurement occasions where
9
> measurements were made. After the initial 17 variables describing the
> subject (in my case individual cows), there are a series of 9 measurements
> from  the first measurement occasion, repeated in the same order for each
> subsequent measurement occasion
>
> Inspecting the data, it is clear that in a small but substantial
proportion
> of the observations/cases the last datum is not present for a given
> measurement occasion, with the consequence that the value provided for all
> the subsequent variables is shifted left by one variable. It is quite
clear
> from the content of the data in the remaining variables in a  row ( a
> mixture of text and
> numbers) that the values are offset one column to the left, i.e. one
> variable to the left.
>
> I can fairly easily identify the "bad" observations with an -if-
statement.
> Such observations  occur randomly and not usually consecutively.  The
> problem can occur in
> multiple subsets of the data, as it appears to have occurred with the
final
> datum on different measurement occasions (but not every one, only in some
> instances), so whatever procedure I come up with will need to be reusable
> for application to data that is shifted after a later measurement
occasion.
> I am trying to come up with an easy way to shift all the remaining
> values back to the correct variable.  The 9 variables on each occasion all
> have different names, though within a
> measurement occasion the prefix is the same for all 9 measurements
> (variables) attached to that occasion (i.e. day1this  day1that day1other
> day1more...)
>
> The data look something like this for 2 observations where  the first is
> ok, the second with a absent  missing value and subsequent values shifted
> right::
>
>  var1 - var25  day1No8    day1last    day2first day2second
>     day2third.....
> ___________|________|________|_______|___________|___________....
>
> Ok values....|  100              Hot               25
>  120                 UP
> Bad values..|  100              25               120
>     UP
>
> What I need is:
>
> var1 - var25  day1No8      day1last    day2first    day2second
>    day2third.....
> ___________|________|________|_______|___________|___________....
>
> Ok values....|  100              Hot               25
>     120                 UP
> shift values.|  100                .                   25
>  120                 UP
>
> I would appreciate any ideas on how to accomplish this efficiently when
> applied to multiple observations.
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/ 
________________________________________
No virus found in this message.
Checked by AVG - www.avg.com
Version: 10.0.1325 / Virus Database: 1500/3623 - Release Date: 05/07/11


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index