st: Selecting a sample to compromise between significant size and geographical dispersion

From	Partho Sarkar <[email protected]>
To	[email protected]
Subject	st: Selecting a sample to compromise between significant size and geographical dispersion
Date	Thu, 15 Sep 2011 12:55:30 +0530

I wonder if any Statalister has some ideas/insight to share on the
following "fuzzy" problem.

Broadly speaking, I want to select a sample from a very large
population to achieve a "good" compromise between excluding
"insignificant" units, and ensuring "reasonable" diversity.  I have a
hierarchical dataset on prices of some commodities from markets across
the country.  (The geographical levels are
national-state-district-market; markets are the primary units.)  I
want to consider the prices only from "significant" markets, i.e., for
each commodity, markets which have trading volumes at least equal to
the median volume (say).  BUT, I also want to ensure as complete a
geographical coverage as possible.
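
As a rough illustration of the "significant markets" rule just described,
the Python sketch below keeps, for each commodity, only those markets whose
trading volume is at least that commodity's median.  The record layout, the
market names and the volumes are all made up for illustration; nothing here
comes from the actual dataset.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (commodity, state, district, market, volume)
records = [
    ("wheat", "Punjab", "Ludhiana", "M1", 900),
    ("wheat", "Punjab", "Amritsar", "M2", 700),
    ("wheat", "Bihar", "Patna", "M3", 100),
    ("wheat", "Kerala", "Kochi", "M4", 50),
    ("rice", "Bihar", "Patna", "M3", 400),
    ("rice", "Kerala", "Kochi", "M4", 300),
    ("rice", "Punjab", "Ludhiana", "M1", 20),
]


def significant_markets(records, cutoff_fn=median):
    """Keep, per commodity, the markets with volume >= cutoff_fn(volumes)."""
    by_commodity = defaultdict(list)
    for rec in records:
        by_commodity[rec[0]].append(rec)

    kept = {}
    for commodity, recs in by_commodity.items():
        cutoff = cutoff_fn([r[4] for r in recs])
        kept[commodity] = [r for r in recs if r[4] >= cutoff]
    return kept


selected = significant_markets(records)
for commodity, recs in selected.items():
    states = sorted({r[1] for r in recs})
    print(commodity, [r[3] for r in recs], "states covered:", states)
```

In this toy data the median rule keeps only Punjab markets for wheat, which
is exactly the kind of geographical concentration I would like to avoid.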

Ideally, I would have a set of parameters to control the
"significance" (as defined above) and the "dispersion" (geographical)
of markets for each commodity, and a method to optimally select the
"best" parameters.  E.g., if I were to try to do this manually, I
might first set the median trading volume as a cut-off; this would
result in a certain selection of markets, with an associated
geographical pattern. (What could be a meaningful way to measure the
degree of dispersion?) If on inspection I found that the cut-off
resulted in "too much" geographical concentration, I would reduce the
cut-off, and so iterate till I got a "good" compromise.
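
The manual iteration described above could be sketched along the following
lines in Python, for a single commodity.  This is only a sketch under
assumptions: "dispersion" is measured here as the share of states with at
least one selected market (just one possible choice, not an established
measure), and the starting quantile, step size, coverage target and data
are all hypothetical.

```python
def quantile(values, q):
    """Empirical quantile, linear interpolation between order statistics."""
    xs = sorted(values)
    if len(xs) == 1:
        return xs[0]
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def state_coverage(chosen, all_markets):
    """Share of states (among those with any market) that are represented."""
    all_states = {m["state"] for m in all_markets}
    return len({m["state"] for m in chosen}) / len(all_states)


def choose_cutoff(markets, start_q=0.5, step=0.1, target=0.75):
    """Lower the volume quantile until state coverage reaches the target."""
    q = start_q
    while True:
        cutoff = quantile([m["volume"] for m in markets], q)
        chosen = [m for m in markets if m["volume"] >= cutoff]
        cov = state_coverage(chosen, markets)
        if cov >= target or q <= 0:
            return q, cutoff, chosen, cov
        q = max(0.0, round(q - step, 10))  # relax the cut-off and retry


# Hypothetical markets for one commodity
markets = [
    {"market": "M1", "state": "Punjab", "volume": 900},
    {"market": "M2", "state": "Punjab", "volume": 700},
    {"market": "M3", "state": "Haryana", "volume": 500},
    {"market": "M4", "state": "Bihar", "volume": 100},
    {"market": "M5", "state": "Kerala", "volume": 50},
]

q, cutoff, chosen, cov = choose_cutoff(markets)
print("quantile used:", q, "state coverage:", cov)
print("selected:", [m["market"] for m in chosen])
```

A stopping rule based on a fixed coverage target is of course arbitrary;
what I am really after is a principled way to choose that trade-off.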

I imagine this sort of consideration comes up fairly commonly in some
areas, and there might be established methods/programs to handle this,
whether in Stata or otherwise (I am familiar with Matlab & R).  Any
ideas?

Thanks & Regards
Partha S. Sarkar
Consultant Econometrician
Indicus Analytics Pvt. Ltd (www.indicus.net)
New Delhi, India