Re: st: Clustering by school year

From	David Kantor <[email protected]>
To	[email protected]
Subject	Re: st: Clustering by school year
Date	Sat, 23 Oct 2010 22:39:16 -0400

At 10:03 PM 10/23/2010, Jose A wrote:

Would clustering by school year be as simple as generating a variable school_year = school identifier * year, and then using this new variable as the cluster?

Just from a practical standpoint, this could work, provided that school_identifier is numeric (preferably an integer). But you also need to ensure that the values you get constitute a one-to-one mapping from pairs of school_identifier and year to the resulting number. That is, no two distinct pairs of school_identifier and year should map to the same value. Say that you have school_identifiers 200 and 201, and your years are 2000 and 2010. You would have,

2000 * 201 = 402000
2010 * 200 = 402000
-- thus, a many-to-one mapping.

You need to inspect your set of years and school_identifiers to see if something like this would happen.
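
One way to do that inspection is sketched below. It assumes the analysis data are in memory and that the variables are named school_identifier and year, as in the question; school_year is just the illustrative name for the product.

```stata
* Sketch: check whether school_identifier * year identifies each
* (school_identifier, year) pair uniquely. Variable names are assumed.
preserve
gen double school_year = school_identifier * year

* Reduce the data to one row per distinct (school_identifier, year) pair;
* school_year is constant within a pair, so it does not add rows.
contract school_identifier year school_year

* isid reports an error if two different pairs share a school_year value
capture noisily isid school_year
restore
```

If -isid- complains, the product is a many-to-one mapping for your data and should not be used as the cluster variable.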

If this situation arises, then you need some other scheme. You could extract the unique pairs of school_identifier and year that occur in the data. (Or, if you need to be more general, obtain the sets of years and school_identifiers separately and form the cross-product; see -help cross-.) With this set, -gen long clusterid = _n-; save it, and later merge your analysis file to this file.
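
The lookup-file scheme might look roughly like this. The file names analysis and cluster_lookup are hypothetical, and the -merge- syntax assumes Stata 11 or later.

```stata
* Sketch of the lookup-file approach; file names are hypothetical.
use analysis, clear
keep school_identifier year
duplicates drop                      // one row per pair that occurs
sort school_identifier year
gen long clusterid = _n              // sequential id, unique by construction
save cluster_lookup, replace

* Attach the id to every observation in the analysis file
use analysis, clear
merge m:1 school_identifier year using cluster_lookup, assert(match) nogenerate
```

A shorter route to the same kind of id, not part of the scheme described above, is -egen clusterid = group(school_identifier year)-, which numbers the distinct pairs directly without a separate file.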


HTH
--David

----
P.S., there is another numeric-based solution: either,
 k1 * school_identifier + year
or
 k2 * year + school_identifier

where k1 or k2 are strategically chosen constants:
 k1 > max(year)
 k2 > max(school_identifier)

One possibility is
 10000 * school_identifier + year
----
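
A minimal sketch of the constant-multiplier version, assuming year is a non-negative integer below 10000 and that the variable names match the question. It uses double storage because the default float type cannot represent large integers exactly, which could reintroduce collisions.

```stata
* Sketch: build the id as 10000 * school_identifier + year.
* Assumes year is a non-negative integer below 10000 (checked below).
summarize year, meanonly
assert r(min) >= 0 & r(max) < 10000
assert year == int(year)
assert school_identifier == int(school_identifier)

* double holds these integer values exactly; float may not
gen double school_year = 10000 * school_identifier + year

* Verify: within each id, school and year must be constant
bysort school_year: assert school_identifier == school_identifier[1] & year == year[1]
```

The resulting variable can then be given to the cluster option, for example vce(cluster school_year).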
