
# Re: st: identifying the unique values across variables and creating new var equal to them

| | |
|---|---|
| From | Michael Barker |
| To | statalist@hsphsun2.harvard.edu |
| Subject | Re: st: identifying the unique values across variables and creating new var equal to them |
| Date | Thu, 28 Feb 2013 11:08:35 -0500 |

```
I agree with Nick: reshape is the way to go for this problem, then
look for duplicates by couple id and birth date. Something like:

* reshape to kid level
reshape long ch par_ch , i(id) j(kid)
* reshape to kid-parent level
reshape long @ch , i(id kid) j(partner) string
* look for non-duplicate birthdays within couples
duplicates tag id ch if !missing(ch), gen(dup)
gen ex = dup==0

Then you might have to reshape or collapse back to child or couple
level, depending on your analysis plan.

Mike

On Wed, Feb 27, 2013 at 7:30 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>
> A quick look at this leads me to the suggestion that you need a
> different data structure to answer this. It is usually much easier to
> check for consistency within a group of observations than to check
> within a set of variables.
>
> Another detail that leads to the same conclusion is that you want to
> identify children, not couples.
>
> The only incantation I offer is -reshape-.
>
> Nick
>
> On Wed, Feb 27, 2013 at 12:19 PM, Ivanova, K.O. <K.O.Ivanova@uva.nl> wrote:
> > Dear all,
> >
> > I have a dataset where each row corresponds to a couple. Each partner within a couple was asked to report the birthdates of his / her children (not specified whether own or step). I need to find a way to identify the cases when one partner reported a birthdate which the other partner did not (this is the only way for me to identify children born outside the current union).
> >
> > This is my data structure (the values are fictional; in my data the values are coded as century month codes; the variables starting with "par_" are based on the answers of the partner of the respondent):
> >
> > id   ch1   ch2   ch3   par_ch1   par_ch2   par_ch3
> > 1    3     9     12    7         9         12
> > 2    2     5     3     .         .         .
> > 3    9     11    7     7         4         11
> >
> > And I need to find a way, based on the data above, to create the following new variables (with their corresponding, for this example, values):
> >
> > id   ex1   ex2   ex3   par_ex1   par_ex2   par_ex3
> > 1    3     .     .     7         .         .
> > 2    2     5     3     .         .         .
> > 3    9     .     .     .         4         .
> >
> > Basically, I need to find a way to check the value of each birthdate against the rest of the birthdates reported *within* a couple. If that birthdate is reported by the other partner as well, the child is common and thus of no interest to me. When the birthdate is unique within the couple, that child is a child only of the partner that reported it, and the new variable should take the value of that birthdate.
> >
> > I hope the example is clear and would appreciate any help I can get.
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

```
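For anyone who wants to try the reshape approach end to end, here is a minimal sketch run on the fictional example from the question. It is not the code from the thread. It renames the variables first so that both partners' reports share an explicit stub, which avoids relying on `@ch` matching an empty prefix, and it counts how many partners reported each birthdate rather than how many rows carry it, so that a date listed twice by the same partner is not mistaken for a common child. The names `ch_own`, `ch_par`, `reporter`, `first`, `nrep` and `ex` are illustrative assumptions, and the grouped `rename` syntax assumes Stata 12 or later.

```stata
clear
input id ch1 ch2 ch3 par_ch1 par_ch2 par_ch3
1 3 9 12 7 9 12
2 2 5 3 . . .
3 9 11 7 7 4 11
end

* give both partners' variables a common stub (hypothetical names)
rename (ch1 ch2 ch3 par_ch1 par_ch2 par_ch3) ///
       (ch_own1 ch_own2 ch_own3 ch_par1 ch_par2 ch_par3)

* reshape to kid level: one row per couple and child slot
reshape long ch_own ch_par, i(id) j(kid)

* reshape to kid-reporter level: reporter is "own" or "par"
reshape long ch_, i(id kid) j(reporter) string
rename ch_ ch

* count how many distinct partners reported each birthdate within a couple
bysort id ch reporter: gen byte first = (_n == 1)
bysort id ch: egen nrep = total(first)

* keep the birthdate only when exactly one partner reported it
gen ex = ch if nrep == 1 & !missing(ch)

sort id reporter kid
list id reporter kid ch ex, sepby(id)
```

On the example data this should leave `ex` non-missing for the same birthdates as in the requested output (3 and 7 for couple 1, all three dates for couple 2, 9 and 4 for couple 3). A further `reshape wide` would be needed to get back to one row per couple.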