Statalist The Stata Listserver



RE: st: Re: Compress and saveold


From   "Gauri Khanna" <[email protected]>
To   [email protected]
Subject   RE: st: Re: Compress and saveold
Date   Sun, 21 Jan 2007 08:19:00 +0000

Dear Maarten and Michael,

Thank you so much for your answers. I have now learnt that the 80 character limit is a matter of Stata/SE versus Intercooled Stata, not of Stata 8 versus Stata 9.

But most of all, I thank you for the options that you sent. Hacking off the extra characters seems most appropriate at the moment, since these variables are more like comments, but I certainly value the other solutions.

I tried hacking off the string and compressing it afterwards, and it works!

Thank you again.

Regards,

Gauri



From: "Michael Blasnik" <[email protected]>
Reply-To: [email protected]
To: <[email protected]>
Subject: st: Re: Compress and saveold
Date: Sat, 20 Jan 2007 14:47:49 -0500

There are a couple of issues.

1) The 80 character limit is not a v8 vs. v9 issue but a Stata/SE vs. regular (intercooled) Stata issue.

2) I'm not sure why you are surprised that -compress- didn't affect the length of strings that actually use all of their length. You don't seem to understand what -compress- does. It will change the storage type of a variable to the type that requires the least space/memory without losing any of the information. A 244 character string cannot fit into a str80 variable if it actually has more than 80 characters in it. You can try to -trim()- the variables, but it sounds like you've already found that it won't help.
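
As a rough illustration of point 2, here is a minimal sketch on made-up data (the variable names note_short and note_long are hypothetical, not from the original dataset). Both variables are declared as str244, but -compress- can only shrink each one down to the length of its longest value. It assumes a Stata flavor that allows strings longer than 80 characters, such as Stata/SE:

```stata
* Hypothetical toy data, for illustration only
clear
set obs 2

* Declared as str244, but the longest value has only 5 characters
generate str244 note_short = "abc" in 1
replace note_short = "abcde" in 2

* Declared as str244, and one value really uses 100 characters
generate str244 note_long = "" in 1
forvalues i = 1/10 {
    replace note_long = note_long + "0123456789" in 1
}
replace note_long = "y" in 2

* compress picks the smallest storage type that loses no information
compress
describe

* note_short should now be str5; note_long can only go down to str100
local t_short : type note_short
local t_long : type note_long
display "note_short is `t_short', note_long is `t_long'"
```

So a variable that genuinely holds more than 80 characters in some observation stays longer than str80, whatever -compress- does.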

To deal with this problem, you may want to look first at what information you actually need that is in those long strings. Perhaps you can find another way of holding the information besides long text strings. I see three basic alternatives:

a) Just keep the first 80 characters and discard the rest:

replace mystring=substr(mystring,1,80)
compress mystring

b) Find ways to shorten the strings without losing information. For example, if there are long substrings that you could abbreviate, you could use a series of -subinstr()- calls to make the strings shorter:

replace mystring=subinstr(mystring,"Incorporated","Inc",.)
....
compress mystring

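c) Break the string into a series of shorter strings:

* substr(s,n1,n2) takes a start position n1 and a length n2
gen mystring1=substr(mystring,1,80)
gen mystring2=substr(mystring,81,80)
gen mystring3=substr(mystring,161,80)
* a 244-character string has 4 more characters after position 240
gen mystring4=substr(mystring,241,4)
drop mystring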

The course you take should depend on what's in the strings that you need.
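
To help choose between the alternatives, it may be worth counting how many observations would actually lose text at 80 characters, and checking that a split loses nothing before dropping the original. The following is only a sketch on made-up data: the 100-character test value and the choice of four pieces (enough for a 244-character string) are assumptions for illustration. It relies on -substr()- taking a start position and a length, and on a Stata flavor that allows strings longer than 80 characters:

```stata
* Hypothetical toy data: one observation holding a 100-character string
clear
set obs 1
generate str244 mystring = ""
forvalues i = 1/10 {
    replace mystring = mystring + "0123456789"
}

* How many observations would lose text if cut at 80 characters?
count if length(mystring) > 80

* Split into 80-character pieces; 4 pieces cover up to 320 characters
forvalues i = 1/4 {
    local start = (`i' - 1) * 80 + 1
    generate str80 mystring`i' = substr(mystring, `start', 80)
}

* Check that the pieces reassemble to the original before dropping it
assert mystring == mystring1 + mystring2 + mystring3 + mystring4
drop mystring

* Shrink pieces that turned out short or empty
compress mystring1 mystring2 mystring3 mystring4
```

If the count is small, alternative (a) may cost very little. If it is large, splitting as in (c) keeps everything at the price of more variables.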

Michael Blasnik


----- Original Message ----- From: "Gauri Khanna" <[email protected]>
To: <[email protected]>
Sent: Saturday, January 20, 2007 2:20 PM
Subject: st: Compress and saveold



Dear List members,

I sent a stata dataset created in version 9.2 to a stata 8 user who ran into the 80 characters limit problem for string variables. As you know, Stata 8 only supports 80 characters on string variables whereas the dataset I created in version 9.2 has a couple of string variables that are in excess of 80.

1. So I tried to compress these variables. But when I look at them again using the -describe- command I find no difference in their length? Does that mean that these variables are not compressed or cannot be compressed?

Here is the output I get on the 10 variables that I compress:
<snip>

These are exactly the same length as before compressing!

I also eyeballed the data to see if there are leading and trailing blanks but unfortunately I do have some observations that fill in the entire string length. So I cannot use -trim-, -ltrim-, and -substr()-.

2. I have used the command -saveold GM- where GM is the name of my dataset but am not sure if that will work. Do any of you know about -saveold- solving the problem?

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




