Re: st: backfill missing data

From   Steve Samuels
Subject   Re: st: backfill missing data
Date   Tue, 24 Aug 2010 12:45:56 -0400

David Torres:

As the data originated in different questionnaires, at one time they
were in some kind of long form. Perhaps it would be easier if you went
back to that format to solve your problem, with the aid of explicit
[_n] subscripting ("help subscripting"). The concatenation and
renaming appear to have complicated matters unnecessarily.
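
A minimal sketch of that long-format approach, assuming one row per
respondent, job number, and round, with numeric start and finish dates.
The variable names job, year, start, finish, and negyear are
hypothetical, and the toy rows reuse the example values for pubid 1 and
2. It only backfills within the same pubid and job number; cases where
a later job has to move to a different job number (as with pubid 3 in
the example) would need additional logic that is not shown here.

```stata
* Toy data in long form: one row per respondent, job number, and round.
* All variable names other than pubid are assumptions for illustration.
clear
input pubid job year start finish
1 1 1998 13901 14200
1 1 1999     .     .
1 1 2000 14247 14590
2 1 1998 13890 14198
2 1 1999     .     .
2 1 2000 14310 14525
end

* Order rounds from latest to earliest within respondent and job.
generate negyear = -year

* Fill each missing value from the row above it, which after this sort
* is the next later round. replace works through the rows in order, so
* values cascade back across consecutive missing rounds.
bysort pubid job (negyear): replace start  = start[_n-1]  if missing(start)
bysort pubid job (negyear): replace finish = finish[_n-1] if missing(finish)

* Restore chronological order and inspect the result.
drop negyear
sort pubid job year
list, sepby(pubid)
```

In the first row of each group, [_n-1] refers to an observation that
does not exist and so evaluates to missing, which means the latest
round is never overwritten with anything but its own missing value.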

Steven Samuels
18 Cantine's Island
Saugerties NY 12477
Voice: 845-246-0774
Fax: 206-202-4783

On Tue, Aug 24, 2010 at 12:14 PM, David Kantor wrote:
> At 11:55 AM 8/24/2010, David Torres wrote:
>> I'm working with longitudinal data (12 rounds of info collected so
>> far) and need to backfill information for respondents who were not
>> interviewed in a given year subsequent to round 1.  Information on my
>> variables of interest, when not collected in a round due to
>> noninterview, can be gathered in the next round in which respondents
>> are interviewed.  I'd like to carry that information back so that it
>> fills in the missing cells in the year and job number to which it
>> should apply.
>> I've concatenated unformatted date variables for each year and job
>> number so that start and finish dates for a job are carried back
>> together.  Every pair of numbers, then, including the space in
>> between, represents a start and finish date.  All dates here, though
>> for example purposes only, are year specific.  An example of what I
>> have, then, is:
>> pubid  stfin1_1998  stfin2_1998  stfin1_1999  stfin2_1999  stfin1_2000  stfin2_2000
>> 1      13901 14200  14100 14200                            14247 14590
>> 2      13890 14198                                         14310 14525
>> 3                                                          14000 14208  14311 14915
>> 4                                13883 14650  14351 14600  14635 14900
>> For pubid 1, the values in stfin1_2000 would be copied to stfin1_1999
>> as it applies to that year.  The same goes for pubid 2.  In pubid 3,
>> stfin1_2000 should be copied to stfin1_1998 as it applies to that
>> year; stfin2_2000 should be copied to stfin1_1999 since it applies to
>> that year.  In pubid 4, stfin1_1999 should be copied to stfin1_1998.
>> I only mean to copy follow-up year information to cells for which
>> current year information is missing, or ". ."
>> Is there an easy way to do this across several years and job numbers
>> at the same time?  Perhaps using a foreach command?
> I recommend reshaping to long, though it may be complicated by the need for
> the stfin1_ and stfin2_ variables to form a single series. You may need to do
> something clever to make that happen.
> Follow that by a use of carryforward. See -ssc desc carryforward-.
> You may want to go backward as well as forward (or maybe backward only). The
> help for carryforward explains that.
> Finally, if you prefer, reshape it back to the way it was. Though, it may be
> better to let it stay in long form.
> --David