
Re: st: reading data with missing obs

From   Michael Hanson
Subject   Re: st: reading data with missing obs
Date   Sun, 08 Feb 2009 14:37:28 -0500

On Feb 8, 2009, at 3:14 AM, Glen Waddell wrote:

I am trying to read in data from multiple xml files (although this issue is not specific to this format, I believe) in which some of the sheets within some of the files have variable names in the first row and nothing else. That is, all variables are missing in some of the sheets within some of the files.

In reading in these files, Stata only appears to read in the first variable... not continuing to subsequent columns. If I had but a few files I would brute force this by filling the empty cells with some unique character, read them in and then drop the obs accordingly. However, I have many of these sheets to be read in, so automating it would be quite valuable.

I recently had a similar situation in which columns of data (from an Excel spreadsheet saved in xml format) started with missing values and thus were not read in correctly by the basic -xmluse- command. After some experimentation, I found that I had to read the data with the "allstring" option of -xmluse- and then -destring- the desired series after Stata imported them as string variables, in order for missing values to be captured correctly. For example:

xmluse [file.xml], doctype(excel) cells([b2:z300]) allstring firstrow missing nocompress
destring [varlist], ignore([stray_text]) replace

Note that you will need to replace the items in [square brackets] with those appropriate to your problem. Also note that this solution is specific to xml formatted data, even if the problem (as you suggest) is not.
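
Since the original question involves many files, the same two commands could be wrapped in a loop. What follows is only a rough sketch under assumptions that are not part of the original problem: that all the xml files sit in the current working directory, that every imported column should be passed to -destring-, and that saving each result as a .dta file of the same name is acceptable. It has not been checked against sheets that contain variable names and nothing else.

```stata
* Sketch only: assumes the xml files are in the current working directory.
* Collect the names of all xml files there.
local files : dir "." files "*.xml"

foreach f of local files {
    * Same approach as above: import every column as a string.
    xmluse "`f'", doctype(excel) allstring firstrow missing nocompress clear

    * Convert columns holding only numbers back to numeric;
    * columns with non-numeric text are left as strings.
    destring, replace

    * Hypothetical naming scheme: save alongside the source file as .dta.
    local stub = subinstr("`f'", ".xml", "", .)
    save "`stub'.dta", replace
}
```

If a file holds several sheets, the sheet() option of -xmluse- can be added to pick one by name, and the cells() option can be restored if only part of a sheet is wanted.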

Hope this helps,
