Dear Joseph,

Thank you for your response. I think the wording of my problem was not quite accurate,
so I'll try to be clearer.
The 6-month "padding" exercise is not my main objective.

Since I am working with a multi-record-per-case observational data set, some of the explanatory variables are time-varying and therefore carry different information from year to year. This is why I need to split a record at every year boundary when a job spans more than one year.

More explicitly, I need to split by calendar year throughout the job duration, i.e., from the start date of a job to the end of that calendar year, then every year until the year in which the job ends.
An important twist in my data is that we also have information on exposure level, which has its own start and end dates.
My unit of observation is the combination of a job (specified by a start and end date), hearing loss variables (gathered at each hearing test) and exposure level (also specified by a start and end date).
As you can probably infer, some records will show redundant job dates, either because there are several tests during the job or because there are different exposure dates (exposure can vary within a job, for example when something has been changed in the workplace).
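
To make the intended split concrete, here is a minimal sketch written in Python rather than Stata, purely to illustrate the result I am after. The function name `split_by_calendar_year` and the example dates are hypothetical and are not taken from my data; the sketch assumes each record's interval is closed at both ends.

```python
from datetime import date


def split_by_calendar_year(start, end):
    """Split the closed interval [start, end] into one piece per calendar year."""
    if end < start:
        raise ValueError("end date precedes start date")
    pieces = []
    for year in range(start.year, end.year + 1):
        # Clip each calendar year to the boundaries of the original interval.
        piece_start = max(start, date(year, 1, 1))
        piece_end = min(end, date(year, 12, 31))
        pieces.append((piece_start, piece_end))
    return pieces


# Hypothetical job running from 15 March 1990 to 10 June 1992.
for piece_start, piece_end in split_by_calendar_year(date(1990, 3, 15), date(1992, 6, 10)):
    print(piece_start.isoformat(), piece_end.isoformat())
```

Each (start, end) pair would become its own record, so that the year-specific values of the time-varying variables can be attached to it.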

I have been trying to approach this using the recommendation that both you and Maarten have kindly offered, that is, to declare the data as survival data with -stset- and then use -stsplit-.
Unfortunately, even when I "faked" that my sample was survival data, I have so far failed to split the durations of records that span more than a year. One reason is that I could not figure out how to vary the splitting intervals, as each subject (n=13,000) has multiple jobs with different durations.

I hope this makes my question clearer.

Hind Sbihi
School of Occupational and Environmental Health.
University of British Columbia

> Hind Sbihi wrote:
>
> The dataset I am working with comprises repeated observations for several
> thousand subjects. Each observation consists of a job/exposure
> level/hearing test combination. The data are in long format.
> I am trying to truncate these units of observation by year.
>
>
> My main objective is to expand these observations and obtain something like
> this output
>
>
> I would be grateful if you would let me know how to deal with this problem.
>
>
> What you want seems to be padding the dataset with empty six-month date
> intervals, which seems like an unusual Main Objective, even if it's for
> display or "reporting".
>
> If this dataset padding activity per se is not the ultimate objective of
> your analysis, then my suggestion would be to pause, step back and
> re-confirm what it is that you're trying to get out of your data.
>
> Perhaps your ultimate objective is better approached with time-series
> setting of your data (-tsset-) and related commands, or, as Maarten
> suggested:  declaring the dataset to be survival data (-stset-) and then
> taking advantage of one or more subsequent -st- commands.
>
> Stata's data-management commands related to time-series and survival
> datasets include convenient "house-keeping" commands that might be useful in
> tasks related to time-series or survival analysis.
>
> Joseph Coveney
