
From: David Kantor <dkantor@jhu.edu>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st:How to do analysis if the same variable exists in one dataset and is missing or contains no observation in another database?

Date: Fri, 08 Aug 2003 10:15:45 -0400

You seem to have changed the structure. You now have 8 vars: a, b, ..., h.

And now var5 depends on e through h, rather than a through d.

Let me assume that you want to use all the existing vars in varlist1 -- for both collecting into a set of independent vars, and for creating an additional composite variable. (I will call the composite variable compvar rather than var5.)

I see no need to create a set of new variables; that will only waste space, which seems to be scarce in this particular problem. Instead, just form a new varlist that consists of only those that are actual variables (and not completely missing). Thus it is a subset of varlist1.

Here's my suggestion, borrowing some ideas from what Nick wrote (which I would not have thought of myself):

local varlist1 "a b c d e f g h" // or whatever you might have
local varlist2 ""                // start from an empty list

foreach x of local varlist1 {
    capture confirm var `x'              // does the variable exist?
    if _rc==0 {
        capture assert mi(`x')           // is it missing on every observation?
        if _rc==0 {
            drop `x'
        }
        else {
            local varlist2 "`varlist2' `x'"
        }
    }
}

if trim("`varlist2'") ~= "" {
    egen compvar = eqany(`varlist2'), v(1)
    regress depvar `varlist2' compvar
}
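To try the loop without the real files, here is a small self-contained sketch on made-up data. Everything about the toy dataset is an assumption for illustration only. Two details differ from the code above and are my additions, not part of the original suggestion: the `exact` option of `confirm variable` (available in recent Stata releases) stops a short name such as d from matching depvar by abbreviation, and `anymatch()` is the name that later releases of egen use for `eqany()`.

```stata
* Toy data (hypothetical): a and e are usable, b is entirely missing,
* and c, d, f, g, h do not exist at all.
clear
set obs 6
generate depvar = _n
generate a = mod(_n, 2)
generate b = .
generate e = inlist(_n, 2, 3)

local varlist1 "a b c d e f g h"
local varlist2

foreach x of local varlist1 {
    * exact: require the full name, so that d does not match depvar
    capture confirm variable `x', exact
    if _rc == 0 {
        capture assert missing(`x')
        if _rc == 0 {
            drop `x'
        }
        else {
            local varlist2 `varlist2' `x'
        }
    }
}

display "usable variables: `varlist2'"

if "`varlist2'" != "" {
    * anymatch() is the newer name of eqany(); use eqany() on older releases
    egen compvar = anymatch(`varlist2'), values(1)
    list depvar `varlist2' compvar
}
```

On this toy data the list should end up holding only a and e, and b should be gone from the dataset. Without `exact`, the check for d would probably succeed by matching depvar, which is worth keeping in mind when the real variable names are short.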

----

One little point to remember: this picks up any variable that is not completely missing. Thus, for example, if you have a million observations, and a variable is nonmissing on only one of them, it will be included. But the regression will be limited to only those observations that are nonmissing on all variables.
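A quick way to see how much this matters is to count the nonmissing observations per variable and the observations that are complete on everything. The sketch below uses made-up data in which e is nonmissing on a single observation; the dataset and the variable name nmiss are hypothetical, and `rowmiss()` is the egen function name in later Stata releases (older releases called it `rmiss()`).

```stata
* Toy data (hypothetical): e is nonmissing on one observation only.
clear
set obs 10
generate depvar = _n
generate a = mod(_n, 2)
generate e = 1 in 1
local varlist2 "a e"

* Nonmissing count for each candidate regressor
foreach x of local varlist2 {
    quietly count if !missing(`x')
    display "`x': " r(N) " nonmissing observations"
}

* Observations that regress could actually use:
* nonmissing on depvar and on every regressor
egen nmiss = rowmiss(depvar `varlist2')
quietly count if nmiss == 0
display "complete observations: " r(N)
```

If the complete-observation count is much smaller than expected, it may be worth requiring a minimum number of nonmissing values before adding a variable to the list, rather than only excluding variables that are missing everywhere.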

It's not clear whether you want compvar to be 0 or missing when none of the other vars equals 1. If it should be missing, then you also want to do...

replace compvar = . if compvar == 0

And I note that your regression is now limited to cases where at least one of the other independent variables is == 1. But it is puzzling why you want compvar included among the independent vars. Perhaps you meant...

regress depvar `varlist2' if compvar==1

Or perhaps I misinterpreted your structure.
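To make the point about compvar concrete, here is a small sketch on made-up data (the dataset is hypothetical, and `anymatch()` is the later name of `eqany()`). Once the zeros are set to missing, compvar equals 1 on every observation where it is defined, so as a regressor it carries no information beyond restricting the sample.

```stata
* Toy data (hypothetical): e and f are 0/1 indicators.
clear
set obs 8
generate depvar = _n
generate e = inlist(_n, 1, 2, 5)
generate f = inlist(_n, 2, 3)

egen compvar = anymatch(e f), values(1)
tabulate compvar, missing

* After this step compvar is either 1 or missing
replace compvar = . if compvar == 0
summarize compvar
display "nonmissing: " r(N) "   min: " r(min) "   max: " r(max)
```

Because compvar is then constant within the estimation sample, regress would be expected to omit it as collinear with the constant, which is why the `if compvar==1` form may be closer to what was intended.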

----

Good luck with this.

-- David

At 07:29 PM 8/7/2003 -0400, you wrote:

Thanks Nick and David for the help.

David, I mean the exact correspondence: var1 for a, var2 for b, etc.

If a variable is absent, it would be preferable not to create it, since the program is huge.

For example, if a does not exist or contains no observations, it would be preferable not to create var1.

Even if we created new variables with missing values, they would be irrelevant for my regressions.

For var5, if at least one of the variables exists, then I want to use it to create var5. If not, it will not be created.

Nick, all the code is there. My intent is simple.

Let me rewrite my whole program below. My dependent variable is depvar (which is common to all files).

local varlist1 "a b c d e f g h"

foreach x of local varlist1 {

capture confirm var `x'

if _rc==0 {

capture assert mi(`x')

if _rc==0 {

drop `x'

}

else {

g var1=a

g var2=b

g var3=c

g var4=d

g var5=.

replace var5=1 if e==1| f==1| g==1| h==1

}

}

regress depvar var1 var2 var3 var4 var5 /*if one of them exists*/

The first suggestion of Nick seems good, but since I have a lot of variables to create, it will be very difficult to rewrite the code for each of them.

I will try his second suggestion.

Best regards.

Amadou DIALLO,

AFTHD, The World Bank.

David Kantor
Institute for Policy Studies
Johns Hopkins University
dkantor@jhu.edu
410-516-5404

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
