


st: Re: Weighting on Sub-samples of Complex Survey Data and Specifying Correlation for PA Models

From   Ryan McCann <>
To   <>
Subject   st: Re: Weighting on Sub-samples of Complex Survey Data and Specifying Correlation for PA Models
Date   Thu, 6 May 2010 15:36:31 -0400


Thank you for your response.  I have a couple of follow-up questions.  In
addition, I don't think I was clear about the definition of the
Credit Card variable.

I am using CreditCard = the balance on the business's credit card in a given
month (e.g., a small business will use a credit card to purchase a computer,
buy a plane ticket for a business meeting, etc.).  The idea is that credit
cards act as a way of covering financing gaps that are created by cautious
lending practices in traditional bank lending.  I believe you understood
the credit card variable to be the amount or number of credit card
transactions that consumers generated while buying a small business's
products.  Given this, do you still believe there is an endogeneity problem?

On your last paragraph, about regressing lnRev on X:  I am regressing
lnRev on lnCreditCard, so I was simply interpreting the coefficient as an
elasticity.  Am I missing something?
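
For reference, the elasticity reading of a log-log specification can be written out in one line. The symbols \alpha, \beta and \varepsilon are notation introduced here for illustration, not taken from the earlier message, and the sketch ignores the other covariates.

```latex
\ln \mathrm{Rev} = \alpha + \beta \ln \mathrm{CreditCard} + \varepsilon
\quad\Longrightarrow\quad
\beta = \frac{\partial \ln \mathrm{Rev}}{\partial \ln \mathrm{CreditCard}}
      = \frac{\partial \mathrm{Rev} / \mathrm{Rev}}
             {\partial \mathrm{CreditCard} / \mathrm{CreditCard}}
```

Under this reading, a 1% increase in the credit card balance is associated with roughly a \beta% change in revenue.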

Regarding the controls for error correlation:  xtreg, pa automatically
assumes vce(robust) and clusters on the panel identifier specified when the
panel is set (in this instance, xtset firm_id year).  I've added dummies for
state, and about two-thirds of them are significant.  There are about 1200
observations over about 600 firms, so the average panel for an individual
firm is approx. 2 years.  I'm still wondering whether the "exchangeable"
correlation specification is the most apt given my prior findings.
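
To make the question concrete, here is a minimal sketch of the comparison I have in mind. It assumes lnRev, lnCreditCard, firm_id and year are named as above and that state is a numeric variable (a hypothetical name for the state dummies' source). It requests vce(robust) explicitly so the variance estimator does not depend on whatever the default is.

```stata
* Assumed variable names: lnRev, lnCreditCard, state (numeric), firm_id, year
xtset firm_id year

* Population-averaged model with an exchangeable working correlation
* and the robust (sandwich) variance requested explicitly
xtreg lnRev lnCreditCard i.state, pa corr(exchangeable) vce(robust)
estimates store pa_exch

* Same model under an independence working correlation, for comparison
xtreg lnRev lnCreditCard i.state, pa corr(independent) vce(robust)
estimates store pa_ind

* Coefficients and standard errors side by side
estimates table pa_exch pa_ind, se
```

With panels of about 2 years on average, I would expect the two working-correlation structures to give similar point estimates, but that is something to check rather than assume.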

Regarding your weighting schemes:  I understand the first, simply taking the
mean of a dummy within a group of cohorts.  I want to make sure I
understand the second:  would I run a logit of INSAMPLE on the actual
variables I was using to define my cohort groups (over the entire sample),
and then predict the probability from the variable values within each group?
The baseline year of the survey would probably offer the largest complete
sample (fewest non-responses), but there may still be significant gaps.  I
will give this some thought though.
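
In case it helps pin down what I am asking, here is a rough sketch of how I read the two schemes. The names insample, x1, x2 and x3 are hypothetical stand-ins for the INSAMPLE dummy and the variables that define the cohort groups, and taking the inverse of the probability as the adjustment factor is my assumption about the intended use.

```stata
* Hypothetical names: insample (1 = firm is in the analysis sub-sample),
* x1 x2 x3 (the variables that define the cohort groups)

* Scheme 1: mean of the dummy within each cohort cell
egen cohort = group(x1 x2 x3)
bysort cohort: egen p_cell = mean(insample)

* Scheme 2: logit over the entire sample, then predicted probabilities
logit insample x1 x2 x3
predict p_logit, pr

* Assumed use: inverse-probability adjustment for firms in the sub-sample
generate w_cell  = 1/p_cell  if insample == 1
generate w_logit = 1/p_logit if insample == 1
```

If x1, x2 and x3 are all categorical and entered fully interacted, the two schemes should give the same probabilities. The logit only differs when it smooths across cells.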

Thanks again for your comments and I look forward to your thoughts on this.

Ryan McCann
Senior Analyst
Keybridge Research LLC
Office: 202.965.9487 | Mobile: 774.521.8874

