Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: alternate data management strategies for importing Excel matrices
From: Steven Nakoneshny <[email protected]>
To: "[email protected]" <[email protected]>
Subject: st: alternate data management strategies for importing Excel matrices
Date: Mon, 16 Dec 2013 16:05:50 -0700
Dear Statalist,
A colleague provided me with an Excel file with two tabs, each containing a matrix of de-identified IDs. I wish to convert these matrices into a single variable of unique IDs, as I will shortly need to -merge- them with patient data. My initial attempts used -reshape-, but I couldn't get past the r(498) error "variable _j contains all missing values".
However, I was able to achieve my desired end result by looping over individual columns in the spreadsheet and appending the results together. Here is my (successful) code:
— code begins —
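* start with an empty dataset holding only the target string variable
tempfile blank
generate accnum = ""
save `blank'
clear
foreach sheet in "large tumor TMA" "#2 TMA" {
    foreach x in B C D E F G H I J K L M N O P Q R S T U V W {
        * read one spreadsheet column (rows 4-21) at a time
        import excel using "foo", sheet("`sheet'") cellrange(`x'4:`x'21)
        keep `x'
        duplicates drop `x', force
        rename `x' accnum
        * add this column's IDs to the running collection
        append using `blank'
        save `blank', replace
        clear
    }
}
use `blank'
duplicates drop accnum, force
* remove control tissue labels and empty cells
drop if inlist(accnum, "tonsil", "placenta", "pancreas", "no core", "liver", "kidney", "")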
— code ends —
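For comparison, the same result might be reachable with one import per sheet instead of one per column. The sketch below is untested against the real workbook and rests on assumptions that are not part of the working code above: that the -allstring- option of -import excel- is available (so every column arrives as a string), and that -stack- can pile all imported columns into a single variable. The tempfile name ids and the local macro first are illustrative only.

```stata
* Sketch only: file name, sheet names, and cell range are copied from the code above;
* the use of -allstring- and -stack- is an assumption, not a tested solution.
tempfile ids
local first 1
foreach sheet in "large tumor TMA" "#2 TMA" {
    * read the whole B4:W21 block in one pass, forcing every cell to string
    import excel using "foo", sheet("`sheet'") cellrange(B4:W21) allstring clear
    * pile every imported column into one variable; -stack- also creates _stack
    stack _all, into(accnum) clear
    drop _stack
    * from the second sheet onward, add the IDs collected so far
    if !`first' {
        append using `ids'
    }
    save `ids', replace
    local first 0
}
* accnum is the only variable left, so this drops repeated IDs
duplicates drop
drop if inlist(accnum, "tonsil", "placenta", "pancreas", "no core", "liver", "kidney", "")
```

Importing with -allstring- is meant to avoid a string/numeric type conflict when a column happens to hold only numbers; whether that matters depends on the actual contents of the workbook.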
As with anything else in Stata, I see this as a learning opportunity, so I would welcome suggestions of other commands by which I could arrive at the same result.
Steve