Aha! It's called compositional data, and there's a way to deal with it.

thanks,
Jeph

On 3/13/2012 12:01 PM, Austin Nichols wrote:

Jeph Herrin <stata@spandrel.net>:

findit dirifit

On Tue, Mar 13, 2012 at 11:43 AM, Jeph Herrin <stata@spandrel.net> wrote:

This is not really a Stata question, but a modelling question, though hopefully there is a Stata command that will provide a solution.

I have a dataset which contains reported totals and subtotals from approximately 5k respondents. For example, 5k firms are asked about their revenues from various sources, which are then summed to create total revenues. (Specifically, it is an electronic form which forces respondents to provide subtotals that equal their total.) However, some respondents only report their totals, and for these I am trying to estimate the subtotals.

A basic approach would be to simply calculate the ratio for each subcategory (e.g., sales revenue/total revenue) for existing data and use this to estimate the missing values. However, because I have a lot more information on the respondents - for instance, number of employees, assets, etc. - I think I can model the subtotals more accurately using these variables. The problem is, the subtotals estimated this way do not sum to the total.

Concretely, let's say there are three variables -x1-, -x2-, -x3-, and that their total t = x1 + x2 + x3. What I'd like is something like multivariable regression

    mv x1 x2 x3 = indvar1 + indvar2 + indvar3

with the constraint that x1 + x2 + x3 = t.

Does this problem have a name and/or a standard solution?

thanks,
Jeph
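
For readers who want to see the constraint x1 + x2 + x3 = t handled concretely, here is a minimal sketch in Python (not Stata). It is not what -dirifit- does: -dirifit- fits a Dirichlet model to the shares, whereas this sketch swaps in a simpler compositional-data approach, ordinary least squares on additive log-ratios of the shares. The function names `fit_alr` and `predict_parts`, the simulated data, and the choice of the last subtotal as the reference part are all assumptions made for illustration, and the method as written assumes every observed subtotal is strictly positive.

```python
import numpy as np

def fit_alr(X, parts):
    """OLS of additive log-ratios of the shares on covariates X.

    parts: (n, K) strictly positive subtotals; the last column is the
    reference part (an arbitrary choice made for this sketch).
    """
    shares = parts / parts.sum(axis=1, keepdims=True)
    Y = np.log(shares[:, :-1] / shares[:, [-1]])   # (n, K-1) log-ratios
    A = np.column_stack([np.ones(len(X)), X])      # add an intercept
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef

def predict_parts(coef, X, totals):
    """Predict shares from X, then scale them by the reported totals."""
    A = np.column_stack([np.ones(len(X)), X])
    Z = np.column_stack([A @ coef, np.zeros(len(X))])  # reference log-ratio is 0
    Z -= Z.max(axis=1, keepdims=True)                  # numerical stability
    shares = np.exp(Z) / np.exp(Z).sum(axis=1, keepdims=True)
    return shares * totals[:, None]

# Simulated data (purely illustrative): 3 covariates, 3 subtotals.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
true_coef = rng.normal(scale=0.5, size=(4, 2))
logratio = np.column_stack([np.ones(n), X]) @ true_coef
logratio += rng.normal(scale=0.2, size=(n, 2))
z = np.column_stack([logratio, np.zeros(n)])
shares = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
t = rng.lognormal(mean=10.0, sigma=1.0, size=n)
parts = shares * t[:, None]

# Treat the first 150 respondents as complete and the last 50 as totals-only.
coef = fit_alr(X[:150], parts[:150])
t_missing = t[150:]
est = predict_parts(coef, X[150:], t_missing)
```

Because the predicted shares are positive and sum to one by construction, the estimated subtotals in `est` add up to each respondent's reported total without imposing a separate constraint on the regression.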

