# Re: st: Weighted Euclidean distances with panel data

From: Austin Nichols
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Weighted Euclidean distances with panel data
Date: Wed, 9 Sep 2009 10:55:03 -0400

James Cross <jcross@tcd.ie>:
This would be fairly easy to program in Mata, but you can also
-reshape- to wide format and calculate the distances using -generate-
if you know how you are going to treat missing values.  Once you can
specify the actual calculations for your example, it will be easier to
specify a method.
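
To make "specify the actual calculations" concrete, here is a minimal
sketch of one possible specification, written in Python rather than
Stata or Mata purely to pin down the arithmetic. It assumes the distance
for each proposal and actor is sqrt(sum over issues of w * (position -
ref point)^2), that a hypothetical salience weight w is stored on each
row (the posted example shows no such column, so it is set to 1 here),
and that any issue with a missing position, reference point, or weight
is skipped. Skipping is only one way to treat missing values.

```python
import math

# Rows: (proposal, issue, actor, position, ref_point, salience).
# None marks a missing value ("n/a" in the post). The salience column is
# hypothetical and set to 1.0, since the posted data does not show it.
rows = [
    ("04163", 1, 1, 0, 0, 1.0),
    ("04163", 2, 1, 0, 0, 1.0),
    ("00032", 1, 1, 100, 0, 1.0),
    ("00032", 2, 1, 50, None, 1.0),
    ("00032", 3, 1, 100, 100, 1.0),
    ("00032", 4, 1, 100, 0, 1.0),
]


def weighted_distances(rows):
    """Return {(proposal, actor): weighted Euclidean distance}.

    Issues with a missing position, reference point, or weight are
    skipped (an assumption). A group with no usable issue gets None.
    """
    sums = {}
    for proposal, issue, actor, pos, ref, w in rows:
        key = (proposal, actor)
        sums.setdefault(key, None)  # register the group even if unusable
        if pos is None or ref is None or w is None:
            continue
        sums[key] = (sums[key] or 0.0) + w * (pos - ref) ** 2
    return {k: (None if s is None else math.sqrt(s)) for k, s in sums.items()}


distances = weighted_distances(rows)
```

For proposal 00032, actor 1, the per-issue differences are
[100 n/a 0 100], as in the post, so with unit weights the sum of squares
is 20000. In Stata the same within-group sum could be built in long
format with -generate- and -egen-, without creating separate vectors.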

On Wed, Sep 9, 2009 at 9:12 AM, James Cross <jcross@tcd.ie> wrote:
> Hi all,
>
> I have a large panel dataset which contains information on different
> actor positions on different issues (dimensions) within different
> legislative proposals. Each actor position has a saliency score
> associated with the position by which I hope to weight the importance
> of the issue/dimension to that actor.  In essence, I am trying to
> calculate the weighted Euclidean distances between each actor's
> position and a reference point.
>
> In order to do this I first need to create submatrices of the dataset,
> structured as row vectors, that contain the distances between the two
> points of interest for each actor for each issue/dimension for each
> proposal. That is, I should end up with a row vector for each actor of
> distances between the actor's position and the reference point. The
> number of elements in this row vector is determined by the number of
> issues in each proposal in the panel data. While I can do this for
> each observation individually, I am wondering if it is possible to get
> Stata to do it automatically to save me the effort.
>
> The resulting vector will then need to be multiplied by its transpose
> and a diagonal matrix of issue salience for each actor but that cannot
> be done until I have created the individual actor distance vectors.
> There is also an issue with missing data in that sometimes the
> reference point will be missing and some actors will not have
> positions on all of the issues/dimensions.
>
> The data is structured as follows:
>
>
> proposal  issue  actor  position  ref point
> 04163         1      1         0          0
> 04163         2      1         0          0
> 04163         1      2         0          0
> 04163         2      2         0          0
> 00032         1      1       100          0
> 00032         2      1        50        n/a
> 00032         3      1       100        100
> 00032         4      1       100          0
> 00032         1      2       100          0
> 00032         2      2         0        n/a
> 00032         3      2         0        100
> 00032         4      2         0          0
> 00032         1      3        40          0
> 00032         2      3       100        n/a
> 00032         3      3       100        100
> 00032         4      3       100          0
>
> The resulting vector would look like this for proposal 00032 actor 1:
> [100 n/a 0 100], and for proposal 04163 actor 1: [0 0].
>
> I am not sure if this is even possible in Stata or, if it is, how much
> programming is involved.
>
> Any suggestions welcome.