
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: alternate data management strategies for importing Excel matrices


From   Steven Nakoneshny <scnakone@ucalgary.ca>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: alternate data management strategies for importing Excel matrices
Date   Mon, 16 Dec 2013 16:05:50 -0700

Dear Statalist,

A colleague provided me with an Excel file with two tabs, each containing a matrix of de-identified IDs. I wish to convert these matrices into a single variable of unique IDs, as I will need to -merge- them with patient data shortly. My initial attempts used -reshape-, but I couldn't get past the r(498) error "variable _j contains all missing values".

However, I was able to achieve my desired end result by looping over individual columns in the spreadsheet and appending the results together. Here is my (successful) code:

— code begins — 

tempfile blank
g accnum=""
save `blank'
clear

foreach sheet in "large tumor TMA" "#2 TMA" {
	foreach x in B C D E F G H I J K L M N O P Q R S T U V W {
		import excel using "foo", sheet("`sheet'") cellrange(`x'4:`x'21)
		keep `x'
		duplicates drop `x',force
		rename `x' accnum
		append using `blank'
		save `blank',replace
		clear
	}	
}	

use `blank'
duplicates drop accnum,force
drop if inlist(accnum, "tonsil", "placenta", "pancreas", "no core", "liver", "kidney", "")

— code ends —
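
For comparison, here is an untested sketch of how the -reshape- route might be made to work. The r(498) message usually appears when the variable names carry no numeric suffix for -reshape long- to use as j, which is the case for the column-letter names (B, C, ...) that -import excel- assigns. The sketch assumes the same range B4:W21 on both sheets; the names acc, row, col, n and the tempfile ids are placeholders introduced only for illustration.

```stata
tempfile ids
local n = 0                          // number of sheets processed so far

foreach sheet in "large tumor TMA" "#2 TMA" {
	* read the whole block at once; allstring keeps every ID as a string
	import excel using "foo", sheet("`sheet'") cellrange(B4:W21) allstring clear

	* give the columns a common stub with a numeric suffix: acc1, acc2, ...
	local j = 0
	foreach v of varlist _all {
		local ++j
		rename `v' acc`j'
	}

	* one observation per spreadsheet cell
	generate long row = _n
	reshape long acc, i(row) j(col)
	keep acc
	rename acc accnum

	* accumulate the sheets in a tempfile
	if `n' > 0 {
		append using `ids'
	}
	save `ids', replace
	local ++n
}

* the data in memory now hold the cells from both sheets
duplicates drop accnum, force
drop if inlist(accnum, "tonsil", "placenta", "pancreas", "no core", "liver", "kidney", "")
```

This reads each sheet once rather than once per column, at the cost of the renaming step.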

As with anything else in Stata, I thought this could be a great learning opportunity if anybody could suggest other commands that would arrive at the same result.

Steve
