.-
help for ^hh^                                                  (STB-43: dm58)
.-

Analysis of Household Data in Individual (long) Format
------------------------------------------------------

    ^hhset^ hrespnr sex [weight] [^, p^artner^(^varname^) v^erify]

    ^hhtab^ varlist [^if^ exp] [^in^ range] [^, an^y ^al^l tabulate_options]

    ^hhcount^ [=] exp [^if^ exp] [^in^ range] [^, an^y ^al^l]

    ^hhcorr^ varlist [^if^ exp] [^in^ range] [^, an^y ^al^l ^l^abel]

    ^hhi^ [^if^ exp] [^in^ range] [^, ad^d ^an^y ^al^l ^p^reserv ^nok^eep ^nol^abel]
         ^:^ any_cmd

    ^hh_is^

    ^hh_slct^ [^if^ exp] [^in^ range]^, g^en^(^varname^)^ [^an^y ^al^l]


Description
-----------

The ^hh^ package provides facilities to analyze 2-level datasets of
individuals nested within households in ^long-format^ (i.e., the data of
husbands and wives are provided in separate records), without going through
the tedious and slow process of ^reshape^-ing the data between long and wide
format.

^hhset^ specifies the household identification variable and a variable
specifying the sex of the respondents. Neither variable should contain
missing values. ^hh^ expects the coding (1=Husband, 2=Wife). ^hhset^
generates an "internal variable" HH_idrop that marks the records that do not
belong to a heterosexual couple. The ^hhset^ settings are stored with the
data as characteristics of ^_dta^ named ^HH_^name, and thus have to be
supplied only once.

If the data on a household contain records on other people besides the
"main couple" (typically children, elderly parents, etc.), a ^partner^
variable should be specified that identifies the members of the "main"
couple by value 1 and all other household members by value 0. The ^hh^
package requires that a household contains 0 or 2 partners.

^hhset^ typed without arguments displays the names of the key variables.

^hhtab^ displays a husband-by-wife tabulation of the variables in varlist.
Any ^tabulate^ option (e.g., ^cell nofreq^) may be specified. A one-variable
tab is displayed for variables that do not vary within households.

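As an illustration (not part of the ^hh^ package), the requirements that
^hhset^ places on the data can be checked by hand before calling it. The
sketch below uses only built-in Stata commands and assumes that hrespnr, sex,
and partner are numeric variables named as in this help file; npartner is a
hypothetical scratch variable.

    . ^assert hrespnr != . & sex != .^        (no missing key values)
    . ^assert sex == 1 | sex == 2^            (coding 1=Husband, 2=Wife)
    . ^sort hrespnr^
    . ^by hrespnr: gen npartner = sum(partner == 1)^
    . ^by hrespnr: replace npartner = npartner[_N]^
                                            (number of partners in each hh)
    . ^assert npartner == 0 | npartner == 2^  (0 or 2 partners per hh)
    . ^drop npartner^

If one of the ^assert^ commands fails, the offending records should be
inspected before ^hhset^ is used.
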
^hhcount^ displays whether an expression is true for the husband and for the
wife. A 2x2 table is shown if the logical values for husbands and wives may
differ; otherwise a 2x1 table is shown. In contrast with ^count^, the
expression should ^NOT^ be preceded by an ^if^ keyword, and the leading ^=^
is optional. Note that the ^if^ and ^in^ clauses may be used to select a
subset of the data (see below for details).

^hhcorr^ lists the husband-wife (intra-household) correlations and marginal
summary statistics for husbands and wives for the variables in varlist.

^hhi^ facilitates mixed-level statistical analyses in which individual and
household characteristics are combined. ^hhi^ is a pre-command like ^xi:^ or
^by:^ that operates on any Stata command with the regular syntax

    ^command^ varlist [^if^ exp] [^in^ range] [weight] [^,^ options]

Using ^hhi^, it is possible to include in the varlist prefixed variables that
are constructed from the individual-level data:

    ^h.^varname     husband's value of varname
    ^w.^varname     wife's value of varname
    ^m.^varname     mean of varname for husband and wife
    ^d.^varname     husband-wife difference for varname
    ^a.^varname     absolute difference between husband and wife in varname

If a variable is specified without a prefix, ^hhi^ verifies that it is indeed
constant within each household.

^hhi^ names the new variables by replacing the first character of the
variable name with the function prefix or, if option ^add^ is specified, by
prepending the prefix to the full name. ^hhi^ may yield incorrect results if
the data contain variables with embedded periods in the variable names.

You are strongly advised to provide ^if^/^in^ restrictions of the sample via
^hhi^, and not via any_cmd. If your command includes weights, beware that
weights should be constant within households.


Utilities for hh-programmers (if any)

^hh_is^ verifies that the dataset in memory is indeed in "HH (household)
format". Otherwise it displays an error message.

^hh_slct^ performs couple/household selection from the individual selection
via ^if^/^in^.
The selected observations are marked by the variable named in the required
option ^gen^.


General options
---------------

^any^ specifies that households should be included in which AT LEAST ONE of
    the partners is selected by the ^if^ and/or ^in^ clauses.

^all^ specifies that households should be included in which BOTH partners are
    selected by the ^if^ and/or ^in^ clauses.


Options of ^hhset^
----------------

^verify^ specifies that the user is requested to verify the appropriateness
    of the value labels of the sex identifier.


Options of ^hhcorr^
----------------

^label^ specifies that variable labels are included in the display table
    instead of the means and std. deviations of the variables for husbands
    and wives.


Options of ^hhi^
--------------

^add^ specifies that the names of the newly generated variables are formed by
    prefixing the function prefix to the name of the variable without
    dropping the first character of the variable name.

^nolabel^ suppresses the display of the table describing the constructed
    variables.

^preserv^ specifies that the data are saved to disk prior to modification.

^nokeep^ specifies that the constructed variables are not kept in the data.
    When the variables are kept, the standard Stata post-estimation commands
    (e.g., ^predict^) can be used.

    Note: you can drop the constructed variables manually using the command
    ^drop $HHI_VARS^, where ^HHI_VARS^ is a global macro that contains the
    names of the variables constructed by ^hhi^.

    Warning: The documentation in the STB on the defaults of the options
    label and keep is incorrect.


Examples
--------

    . ^hhset hrespnr sex^       (hhset will determine heterosexual couples)
    . ^hhtab edu region^        (2-way tabulate of edu of h. and w.;
                               region is constant within hh, so a 1-way
                               tab is shown)
    . ^hhcorr income age^       (h-w correlations for income and for age)
    . ^hhcount income > 3000^   (2x2 table whether h/w have incomes over 3000)

Combination of ^if^ clause and ^any^,^all^ options

    . ^hhcorr income if region==1^     (h-w corr of income in region 1)
    . ^hhcorr income if age>45^        (error, "age>45" not constant in hh)
    . ^hhcorr income if age>45, any^   (h-w corr income if h or w of age>45)
    . ^hhcorr income if age>45, all^   (h-w corr income if h and w of age>45)

^hhi^ facilitates powerful "multilevel" analyses without reshaping data. For
instance, to perform a logistic regression of whether or not the couple is
married (0/1) on the mean household income (half the total) and the
religions of husband and wife (assumed to be interval variables), we can
issue the command

    . ^hhi: logistic married m.income h.religion w.religion^

The command ^hhi^ will verify that the variable married is constant within
each household, i.e., that both partners agree that they are married to one
another, something that could not be taken for granted in our data.

The prefix command ^hhi^ may of course also be used for other tasks. E.g.

    . ^hhi: graph m.age w.age, border xlab ylab^
    . ^hhi: spearman m.edu w.edu^


Acknowledgments
---------------

The ^hh^ package was designed to facilitate the analysis of the household
data "Households in the Netherlands 1995" (HIN95), a survey held among
approximately 1500 couples and approximately 300 singles as part of the
PIONIER program "The Management of Matches" (NWO grant PGS 50-370 to
Raub/Weesie).

Clearly, the ^hh^ package may also be used for the analysis of data on
"asymmetric" matches (data on ordered pairs) other than households, such as
buyer-supplier relationships. Unfortunately, the husband and wife labels have
been hard-coded into the package. Note that truly symmetric relations (e.g.,
homosexual/lesbian couples, research alliances between firms) cannot be
adequately represented because a meaningful analogue of the sex variable may
not exist.


Author
------

    Jeroen Weesie
    Utrecht University
    Netherlands
    weesie@@weesie.fsw.ruu.nl


Also See
--------

    STB:  STB-43 dm58
 Manual:  [R] ^reshape^
On-line:  help on @reshape@, @st@, @xt@