Statalist



st: Re: Stata, data processing, databases, and consultants


From   "Michael Blasnik" <[email protected]>
To   <[email protected]>
Subject   st: Re: Stata, data processing, databases, and consultants
Date   Fri, 07 Sep 2007 09:36:23 -0400

...
I have worked on several projects where I use Stata to do everything -- get information from multiple data sources with different file types and formats, clean up the data, automatically generate reports on potential data errors/anomalies, create a .dta "database", run statistical analyses, and create formatted tabular and graphical output (written as a set of linked html files). I've even set up these systems to be run by people with no Stata knowledge where everything works through a simple dialog box.
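The "reports on potential data errors/anomalies" step could look roughly like the sketch below. It is only an illustration: the variables cow_id and milk_kg, the toy records, and the 0-100 plausibility range are all hypothetical and do not come from any real project in this thread.

```stata
* Sketch only: variable names, data, and thresholds are hypothetical
clear
input cow_id milk_kg
101 32.5
102 -4
103 41.0
103 39.2
104 .
end

* Flag records that look wrong: duplicated IDs, missing or implausible yields
duplicates tag cow_id, generate(dup_id)
generate byte bad_yield = missing(milk_kg) | milk_kg < 0 | milk_kg > 100
generate byte flagged = dup_id > 0 | bad_yield

* List the suspect records for review
list cow_id milk_kg dup_id bad_yield if flagged, noobs

* Count them so that a calling do-file can decide whether to stop
count if flagged
```

In a real system the listing would probably go to a log or an exported file rather than the Results window, and the checks would be tailored to each data source.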

You may also want to look at this earlier NASUG presentation by Ed Bassin: http://ideas.repec.org/p/boc/asug04/15.html

If you are comfortable with Stata and don't need any real database-specific features (like those mentioned by Phil Schumm), then I think Stata can work well for this type of job. The main downside is that fewer people are familiar with Stata, which makes code maintenance a potential issue -- what do they do if you leave?

Michael Blasnik

----- Original Message -----
From: "Buzz Burhans" <[email protected]>
To: <[email protected]>
Sent: Thursday, September 06, 2007 10:15 PM
Subject: st: Stata, data processing, databases, and consultants



Dear statalisters,

I'd appreciate the opinions of those who have experience using / programming
Stata in applications that interconnect with local databases where data are
accessed, processed, and reports generated by other people who are not the
programmer or creator of the system.

I am affiliated with a small group of consultants to the dairy industry. We
collect, aggregate, and assess data about dairy herd performance. The data
are varied in type, and include individual animal health and performance
records which are typically updated monthly, pen level data aggregated from
the individual animal data, pen level data collected during herd visits on
handwritten forms, monthly herd level data, diet data captured from other
software, and interactive forms that allow updating cost and price
information.  The individual animal data are exported from several different
herd management software programs, and require a fair amount of manipulation
to standardize and process into our reports.

Recently I have used Stata to accumulate and process the data.  Using
Stata's -insheet- and -odbc- capabilities, I combine two databases (MS
Access and Alpha 5), an Excel spreadsheet used as a data repository, and
other Excel spreadsheets for reporting, whose ranges are populated from text
files that I generate in Stata do-files.  The system mostly runs from a
custom User menu in Stata. It has become unwieldy, though it works.

My questions:

1. The current form of this reporting system is centered on using Stata to
process the data because I can accomplish enough Stata programming to
automate this -- but I am by no means an accomplished Stata programmer.  It is
likely that the whole project should be handed over to a database consultant
and reconstructed as a single unified database.  However, for several
reasons I would prefer to keep it linked to Stata. Is anyone using Stata for
data processing and report generating like this, where the data repositories
are small local databases? Is it foolish not to convert such a system
entirely to a database program (and move it all away from Stata)?

2. On the other hand, I really like Stata's data management capabilities.
We already process pretty much completely in Stata using *.do files; I am
tempted to also move the database portions of this project to Stata *.dta
files (as well as the processing).  Stata's -mmerge-, -merge-, -append-, and
-joinby- commands make this attractive to me. Has anyone used *.dta files as
a database, or is this foolish and would it be much better to use real
databases for the data repository?
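To make the idea concrete, here is a minimal sketch of .dta files playing the role of database tables, with -append- stacking monthly extracts and -merge- acting as a lookup join. Everything in it is hypothetical (the cow_id, pen, month, and milk_kg variables and the toy records), tempfiles stand in for the stored .dta files, and the -merge m:1- syntax assumes Stata 11 or later; older releases use a different -merge- syntax.

```stata
* Sketch only: file contents and variable names are hypothetical
* A small lookup "table": which pen each cow is in
clear
input cow_id pen
101 1
102 1
103 2
end
tempfile cows
save `cows'

* January test-day records
clear
input cow_id month milk_kg
101 1 32.5
102 1 28.0
103 1 41.0
end
tempfile jan
save `jan'

* February records, then stack January underneath with -append-
clear
input cow_id month milk_kg
101 2 33.1
103 2 40.2
end
append using `jan'

* Attach the pen for each record from the lookup table
merge m:1 cow_id using `cows', keep(match master) nogenerate

* Pen-level aggregate by month
collapse (mean) mean_milk = milk_kg (count) n_records = milk_kg, by(pen month)
sort pen month
list, noobs
```

What this does not give you is what a real database provides: concurrent multi-user access, enforced keys and constraints, and transactions. Any such checks would have to be written into the do-files.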

On the questions I raise above, I would welcome opinions from list members,
posted to the list.

3. Is there anyone out there on the list who is a consultant who might be
interested in working on either completely reconstructing what we have now,
or renovating it to be more efficient?  It has gotten to be pretty unwieldy,
and my colleagues and I would like to "recreate" a system that is more
user-friendly and does some additional data processing beyond what we are doing
now.  My bias would be for keeping the data processing in Stata, but we are
really looking for whatever might be the best. We'd like to explore this
with someone who does consulting of this nature, especially a consultant who
uses Stata and probably has experience generating reports using Microsoft
products. I am interested in possibly reporting in LaTeX as well, but at
least initially have a bias for MS platforms for reporting.

I am the creator of what we have now, and would like to keep it linked to
Stata partly because I am serviceably competent at doing what I need to
accomplish in Stata.  My programming is far from efficient, however, and the
update we are considering is more extensive than I really can commit the
time to.  I'd like to be able to continue to work with the project in the
future, but it may be best for us to hand it off for revision to someone who
does data processing for a living.  If there are consultants on Statalist
who might be interested in discussions about the work and contracting to do
such work please contact me off the list. We are exploring possibilities at
this point, and would like to hear from anyone who might be interested in
the work.

Buzz

Buzz Burhans, Ph.D.

Dairy-Tech Group
Twin Falls, ID
Phone: 208-320-0829
Fax: 208-735-1289