
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


st: shifting existing values to subsequent variables in a row

From   "Buzz Burhans" <>
To   <>
Subject   st: shifting existing values to subsequent variables in a row
Date   Sat, 7 May 2011 13:14:40 -0600

I have been given several datasets with ~50K observations and 197 variables.

Of the 197 variables, 180 come from 20 measurement occasions with 9
measurements made on each. After the initial 17 variables describing the
subject (in my case individual cows), there is a series of 9 measurements
from the first measurement occasion, repeated in the same order for each
subsequent measurement occasion.

Inspecting the data, it is clear that in a small but non-trivial proportion
of the observations the last datum is not present for a given
measurement occasion, with the consequence that the values for all
subsequent variables are shifted left by one variable. It is quite clear
from the content of the remaining variables in a row (a mixture of text and
numbers) that the values are offset one column, i.e. one variable, to the
left.

I can fairly easily identify the "bad" observations with an -if- statement.
Such observations occur randomly and not usually consecutively. The problem
can occur in multiple subsets of the data, because the final datum is
missing on different measurement occasions (not on every one, only in some
instances), so whatever procedure I come up with will need to be reusable
for data that are shifted after a later measurement occasion.

I am trying to come up with an easy way to shift all the remaining
values back to the correct variable. The 9 variables on each occasion all
have different names, though within a measurement occasion the prefix is
the same for all 9 variables attached to that occasion (e.g. day1this,
day1that, day1other).

The data look something like this for 2 observations, where the first is
OK and the second has a missing value with the subsequent values shifted
left:

var1 - var25    day1No8   day1last   day2first   day2second

Ok values....|  100       Hot        25          120          UP
Bad values...|  100       25         120

What I need is:

var1 - var25    day1No8   day1last   day2first   day2second

Ok values....|  100       Hot        25          120          UP
shift values.|  100       .          25          120          UP
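
To make the request concrete, here is a rough, untested sketch of the kind
of shift I have in mind. The names in it are assumptions for illustration
only: -bad- is a hypothetical 0/1 indicator marking the shifted
observations, and day20last is a hypothetical name for the last variable in
the dataset. It also assumes every variable from the gap onward can be held
as a string, since the columns mix text and numbers.

```stata
* Sketch only: bad and day20last are hypothetical names, not from my data.
local start day1last                  // variable whose value is missing
unab vars : `start'-day20last         // every variable from the gap onward

* Convert numeric variables to string so values can move between columns.
* tostring leaves a variable untouched if it cannot convert it reversibly,
* in which case the -replace- below would stop with a type mismatch.
tostring `vars', replace

* Work from the last variable back to the second, so that each value is
* copied one column to the right before it is overwritten.
local n : word count `vars'
forvalues i = `n'(-1)2 {
    local j = `i' - 1
    local to   : word `i' of `vars'
    local from : word `j' of `vars'
    replace `to' = `from' if bad
}

* The first variable in the range is the datum that was never recorded.
local first : word 1 of `vars'
replace `first' = "" if bad
```

Changing -start- and the definition of -bad- would, I think, let the same
lines be rerun for a shift that begins after a later measurement occasion,
and -destring- could afterwards restore the columns that should be numeric.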

I would appreciate any ideas on how to accomplish this efficiently when
applied to multiple observations.


Buzz Burhans, Ph.D.

Dairy-Tech Group
So. Albany, VT / Twin Falls ID

Cell: 208-320-0829
ID Fax: 208-735-1289
VT Fax: 802-755-6842

