Statalist The Stata Listserver


Re: st: binary format type str question

From   David Kantor
Subject   Re: st: binary format type str question
Date   Tue, 13 Mar 2007 09:29:16 -0400

I'm not sure how you are determining the end of the data segment. I have always understood that string data (like all data) is allocated in fixed-length chunks. So I expect 641 bytes per record (98 + 136 + 102 + 105 + 102 + 98, if I added correctly), for a total of 641 x 51 = 32691 bytes.
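To make the arithmetic concrete, here is a minimal Python sketch. It assumes, as in the quoted message, that the six type codes are the str widths in bytes and that there are 51 observations; the variable names are only illustrative.

```python
# Declared str widths (bytes per variable) and observation count,
# taken from the quoted message; names here are illustrative only.
str_widths = [98, 136, 102, 105, 102, 98]
n_obs = 51

# With fixed-length allocation, every record occupies the full declared width.
record_bytes = sum(str_widths)
data_bytes = record_bytes * n_obs

# The quoted message reports only 1071 bytes remaining.
remaining_bytes = 1071
bytes_per_datum = remaining_bytes / (len(str_widths) * n_obs)

print(record_bytes, data_bytes, bytes_per_datum)
```

The gap between the expected data-segment size and the 1071 bytes actually counted suggests the start or end of the data segment has been misplaced, rather than that the strings are stored with variable length.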

If your str98 value is only, say, 70 characters long, then character 71 is a null character (zero). Characters 72 - 98 are irrelevant, but probably filled with nulls as well. I would not interpret consecutive nulls as signifying empty values (for other variables).
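A short Python sketch of reading one record under that layout follows. The function names are hypothetical, and two things are assumptions rather than facts from the documentation: that the bytes can be decoded as Latin-1, and that a value filling its whole field carries no terminating null.

```python
def read_fixed_str(record: bytes, offset: int, width: int) -> str:
    """Return the string stored in a fixed-width field of a record."""
    field = record[offset:offset + width]
    # Keep everything before the first null; whatever follows is padding
    # and is ignored, whether or not it is also nulls.
    text = field.split(b"\x00", 1)[0]
    return text.decode("latin-1")  # encoding is an assumption


def split_record(record: bytes, widths: list) -> list:
    """Split one record into string values using the declared widths only."""
    values = []
    offset = 0
    for width in widths:
        values.append(read_fixed_str(record, offset, width))
        offset += width  # always advance by the full width, never by a null
    return values


# Toy record with widths 5, 3 and 4: "ab" padded, an empty value, "wxyz".
toy = b"ab\x00\x00\x00" + b"\x00\x00\x00" + b"wxyz"
print(split_record(toy, [5, 3, 4]))
```

The point of the design is that the position of the next variable comes from the declared widths, so a run of consecutive nulls is simply padding inside one field and says nothing about the other variables.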

One can imagine other ways of storing string data, but Stata uses fixed-length, fixed-position allocation.

Again, I would look into how you decided where the end of the data segment is.
Also see - help dta-. (I am looking at the Stata 8 on-screen help. It says in one place that the allocated length of each var label is 33; later it says 81. I think 81 is correct.)


At 05:48 AM 3/13/2007, you wrote:

Thanks David. That's what I think I'm doing and it works for data_label and time_stamp, but it doesn't seem to work for the str types.

Here's an example. There are 6 variables with types 98, 136, 102, 105, 102, and 98. I read that as 6 str types with maximum lengths 98 bytes, 136 bytes, etc. There are 51 observations. But the remaining number of bytes is 1071. This means there are 3.5 bytes per datum. There aren't enough bytes to go around if I assume fixed lengths! On the other hand, if I try to start another variable as soon as I hit a zero, I find there are multiple zeros in a row, which would seem to indicate no data for some variables. Hmm. Clearly I'm missing something.
