Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: shifting existing values to subsequent variables in a row


From   "Buzz Burhans" <buzzb3@earthlink.net>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: shifting existing values to subsequent variables in a row
Date   Sat, 7 May 2011 13:14:40 -0600

I have been given a several datasets with ~50K observations and 197
variables.

Of the 197 variables, 180 are obtained from 20 measurement occasions where 9
measurements were made. After the initial 17 variables describing the
subject (in my case individual cows), there are a series of 9 measurements
from  the first measurement occasion, repeated in the same order for each
subsequent measurement occasion

Inspecting the data, it is clear that in a small but substantial proportion
of the observations/cases the last datum is not present for a given
measurement occasion, with the consequence that the value provided for all
the subsequent variables is shifted left by one variable. It is quite clear
from the content of the data in the remaining variables in a  row ( a
mixture of text and
numbers) that the values are offset one column to the left, i.e. one
variable to the left.

I can fairly easily identify the "bad" observations with an -if- statement.
Such observations  occur randomly and not usually consecutively.  The
problem can occur in
multiple subsets of the data, as it appears to have occurred with the final
datum on different measurement occasions (but not every one, only in some
instances), so whatever procedure I come up with will need to be reusable
for application to data that is shifted after a later measurement occasion.
I am trying to come up with an easy way to shift all the remaining
values back to the correct variable.  The 9 variables on each occasion all
have different names, though within a
measurement occasion the prefix is the same for all 9 measurements
(variables) attached to that occasion (i.e. day1this  day1that day1other
day1more...)

The data look something like this for 2 observations where  the first is
ok, the second with a absent  missing value and subsequent values shifted
right::

 var1 - var25  day1No8    day1last    day2first day2second
    day2third.....
___________|________|________|_______|___________|___________....

Ok values....|  100              Hot               25            
 120                 UP
Bad values..|  100              25               120          
    UP          

What I need is:

var1 - var25  day1No8      day1last    day2first    day2second
   day2third.....
___________|________|________|_______|___________|___________....

Ok values....|  100              Hot               25         
    120                 UP
shift values.|  100                .                   25            
 120                 UP

I would appreciate any ideas on how to accomplish this efficiently when
applied to multiple observations.

Buzz


Buzz Burhans, Ph.D.

Dairy-Tech Group
So. Albany, VT / Twin Falls ID

Cell: 208-320-0829
ID Fax: 208-735-1289
VT Fax: 802-755-6842

Email: buzzb3@earthlink.net


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index