Jorge Eduardo Pérez Pérez

[email protected]

Re: st: Problem with -infix- with if qualifiers and strings?

Tue, 12 Nov 2013 21:51:37 -0500

again for checking my mistakes. I focused on the "cannot be read"
message and overlooked other possible issues.

Sergiy: It is quite possible that this data was generated by CSPro.
However, I don't have access to the CSPro dictionaries; I only have
the text files, so I cannot use this command. Thank you anyway.

--------------------------------------------
Jorge Eduardo Pérez Pérez
Graduate Student
Department of Economics
Brown University

On Tue, Nov 12, 2013 at 7:46 PM, Sergiy Radyakin <[email protected]> wrote:
> I agree with Robert that this might be the reason for Stata's observed
> behavior. But looking at the larger picture of what Jorge is doing (and
> specifically reading fixed ASCII files with records of multiple types),
> it reminds me of reading a CSPro file. If this is indeed the case, Jorge
> might benefit from a specialized command to import a CSPro file to Stata
> that would automatically parse the CSPro dictionary and import records
> of an individual type, or dump all records as separate Stata files. The
> description of -usecspro- is available here:
> http://ideas.repec.org/p/boc/usug13/17.html
> or here (same)
> http://www.stata.com/meeting/uk13/abstracts/materials/uk13_radyakin.pdf
>
> and Jorge can contact me directly for the program files.
>
> Best, Sergiy Radyakin
>
> On Tue, Nov 12, 2013 at 6:17 PM, Robert Picard <[email protected]> wrote:
>> Here's how your code runs on my Mac using Stata 12.1. As you can see,
>> I don't lose an observation. I'm pretty sure that the last line of
>> your version of "test.txt" is missing a return character, which causes
>> Stata to skip the last line. As to -infix-'s behavior, it must read in
>> values for -type- and -number- before it can try to evaluate -if
>> type==20-. That generates the errors you see. One solution is to read
>> the second number as a string and then convert it to float later, see
>> example below.
>>
>> Robert
>>
>> --------------- test.txt ------------------------
>> 10ABC
>> 20321
>> 10ZYX
>> 20654
>> -------------------------------------------------
>>
>> . infix type 1-2 str text 3-5 if type==10 using test.txt
>> (2 observations read)
>>
>> . list
>>
>>      +-------------+
>>      | type   text |
>>      |-------------|
>>   1. |   10    ABC |
>>   2. |   10    ZYX |
>>      +-------------+
>>
>> .
>> . clear
>>
>> . infix type 1-2 number 3-5 if type==20 using test.txt
>> 'ABC' cannot be read as a number for number[1]
>> 'ZYX' cannot be read as a number for number[2]
>> (2 observations read)
>>
>> . list
>>
>>      +---------------+
>>      | type   number |
>>      |---------------|
>>   1. |   20      321 |
>>   2. |   20      654 |
>>      +---------------+
>>
>> .
>> . clear
>>
>> . infix type 1-2 str snumber 3-5 if type==20 using test.txt
>> (2 observations read)
>>
>> . gen number = real(snumber)
>>
>> . list
>>
>>      +-------------------------+
>>      | type   snumber   number |
>>      |-------------------------|
>>   1. |   20       321      321 |
>>   2. |   20       654      654 |
>>      +-------------------------+
>>
>> On Tue, Nov 12, 2013 at 5:13 PM, Jorge Eduardo Pérez Pérez
>> <[email protected]> wrote:
>>> Dear Statalist.
>>>
>>> I have noticed weird behavior regarding -infix- and data where the
>>> variable type may change per line. -infix- is not reading the dataset
>>> properly.
>>>
>>> Here's an example. My dataset is:
>>>
>>> 10ABC
>>> 20321
>>> 10ZYX
>>> 20654
>>>
>>> If I try to read the lines that start with 10 and the remainder as a
>>> string, everything works:
>>>
>>> . clear
>>>
>>> . infix type 1-2 str text 3-5 if type==10 using test.txt
>>> (2 observations read)
>>>
>>> . list
>>>
>>>      +-------------+
>>>      | type   text |
>>>      |-------------|
>>>   1. |   10    ABC |
>>>   2. |   10    ZYX |
>>>      +-------------+
>>>
>>> But if I try to read the lines that start with 20, where the remainder
>>> is a number, Stata seems to be trying to read the lines that start
>>> with 10 as well, producing a "cannot be read" message and, worse,
>>> dropping observations from my data!
>>>
>>> . clear
>>>
>>> . infix type 1-2 number 3-5 if type==20 using test.txt
>>> 'ABC' cannot be read as a number for number[1]
>>> 'ZYX' cannot be read as a number for number[2]
>>> (1 observations read)
>>>
>>> . list
>>>
>>>      +---------------+
>>>      | type   number |
>>>      |---------------|
>>>   1. |   20      321 |
>>>      +---------------+
>>>
>>> I have replicated this in both Stata 12.1 on Windows 7 and 13.1 on
>>> Windows 8. Can someone replicate to see if this is a bug, or am I
>>> missing something? In the meantime I will read my variables as strings
>>> and destring afterwards.
>>>
>>> Thanks!
>>>
>>> --------------------------------------------
>>> Jorge Eduardo Pérez Pérez
>>> Graduate Student
>>> Department of Economics
>>> Brown University
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/
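
Following Robert's explanation that -infix- has to read every field before it can evaluate the -if- qualifier, one way to sidestep the "cannot be read" messages is to read columns 3-5 as a string for every record and split by record type afterwards. The lines below are only a minimal sketch under assumptions: they use the same test.txt layout as in the thread, the variable names field, text, and number are illustrative, and the file's last line is assumed to end with a return character.

```stata
* Sketch: read every record once, keeping columns 3-5 as a string
clear
infix type 1-2 str field 3-5 using test.txt

* Split the shared field by record type (variable names are illustrative)
generate text = field if type == 10
generate number = real(field) if type == 20

list
```

Reading the file once and splitting afterwards keeps both record types in memory, so the type 10 and type 20 records can then be saved to separate files with -keep if- if that is more convenient.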

