Re: st: reading a txt file that loops


From   Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: reading a txt file that loops
Date   Sat, 16 Apr 2011 08:41:06 -0700

You could use the Mata text processing machinery to rearrange the file:

*--------------------------------------------------------sears-data-in.txt
FIPS        1990       1980       1970       1960
00000  248709873  226545805  203211926  179323175 United States

18000    5544159    5490224    5193669    4662498 Indiana
18001      31095      29619      26871      24643 Adams County
18003     300836     294335     280455     232196 Allen County
18005      63657      65088      57022      48198 Bartholomew County

FIPS        1950       1940       1930       1920
00000  151325798  132164569   12320262  106021537 United States

18000    3934224    3427796    3238503    2930390 Indiana
18001      22393      21254      19957      20503 Adams County
18003     183722     155084     146743     114303 Allen County
18005      36108      28276      24864      23887 Bartholomew County
*--------------------------------------------------------sears-data-in.txt

*--------------------------------------------------------sears.do
clear*
clear mata

mata
filein = fopen("E:\programming\stata\Examples\sears-data-in.txt", "r")
fileout = fopen("E:\programming\stata\Examples\sears-data-out.txt", "rw")

while((line=_fget(filein))!=J(0, 0, ""))
{
		// if FIPS, discard line and update common lines
		if(regexm(strtrim(line), "^FIPS")) {
				commonline  = strtrim(fget(filein))
				blankline = fget(filein)
				commonline2 = strtrim(fget(filein))
		}
		// else write commonlines and line to output
		else if(strtrim(line)!="")
		{
			writeline = commonline + " " + commonline2 + " " + strtrim(line)
			fput(fileout, writeline)
		}
		// else nothing to do
		else
		{
			printf("%s.\n", "Nothing to do")
		}
}
// close both file handles
fclose(filein)
fclose(fileout)
end
*--------------------------------------------------------sears.do
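
A note on rerunning sears.do, offered as a sketch rather than a fix: as
far as I know, fopen() with "rw" positions an existing file at its
beginning without emptying it, so text from an earlier, longer run could
survive past the end of the new output. Assuming you want a fresh file on
every run, one option is to erase any earlier copy with unlink() and then
open with "w", which expects the file not to exist yet. The path below is
a placeholder, not the one used in the do-file.

```stata
mata
// placeholder path (an assumption); substitute your own output file
outname = "sears-data-out.txt"

// erase any copy left by an earlier run; unlink() is silent if none exists
unlink(outname)

// "w" opens for writing only and expects the file not to exist yet
fileout = fopen(outname, "w")
fput(fileout, "first line of the fresh file")
fclose(fileout)
end
```

In sears.do this would amount to an unlink() call just before the second
fopen(), with "rw" changed to "w".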

The rearranged file (as I understand the requirement) is written to
"sears-data-out.txt". Note that you may run into trouble reading
strings with embedded spaces, such as "United States", into Stata
unless they are enclosed in quotes, so the output may need further
processing.
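
On the embedded-spaces point, here is a minimal sketch of one way to do
that further processing. quotename() is a hypothetical helper, not part
of sears.do. It assumes a single record with exactly five
space-separated fields (the FIPS code and four counts) before the place
name, and it wraps the rest of the line in double quotes. Names that
themselves contain quote characters would need more care.

```stata
// drop any earlier definition so the sketch can be rerun
capture mata: mata drop quotename()

mata
// Hypothetical helper: quote the trailing place name of one record.
// Assumes five space-separated fields (FIPS code plus four counts)
// come before the name; shorter lines are returned trimmed but unchanged.
string scalar quotename(string scalar line)
{
	string rowvector t

	t = tokens(strtrim(line))
	if (cols(t) < 6) return(strtrim(line))
	return(invtokens(t[|1 \ 5|]) + " " + char(34) +
		invtokens(t[|6 \ cols(t)|]) + char(34))
}

// display the quoted form of one record from the example data
quotename("18001      31095      29619      26871      24643 Adams County")
end
```

Note that tokens() and invtokens() collapse the runs of spaces between
fields to single spaces, which should not matter for space-delimited
input. The helper works on one record at a time, so in the do-file it
would have to be applied to each of the three pieces before they are
joined into writeline.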

T

On Sat, Apr 16, 2011 at 5:35 AM, Sears Generic <searsgeneral@indy.rr.com> wrote:
>
> Are there any shortcuts to reading a data file that has the following format
> other than to reorganize the data before importing?  The data file is for
> population by year by geographic location (e.g. United States, Indiana, then
> 3 counties in Indiana).  "FIPS" is a unique identifier for each county.  The
> problem is that the text file loops (i.e. only provides 4 decades of data
> before starting over) on a new line.  In the example below I've reduced the
> issue to the United States, Indiana, and 3 counties, but the full dataset
> has every county for every state so the looping does not recur in a
> consistent way.  Any suggestions would be appreciated.
>
>
> FIPS        1990       1980       1970       1960
> 00000  248709873  226545805  203211926  179323175 United States
>
> 18000    5544159    5490224    5193669    4662498 Indiana
> 18001      31095      29619      26871      24643 Adams County
> 18003     300836     294335     280455     232196 Allen County
> 18005      63657      65088      57022      48198 Bartholomew County
>
> FIPS        1950       1940       1930       1920
> 00000  151325798  132164569   12320262  106021537 United States
>
> 18000    3934224    3427796    3238503    2930390 Indiana
> 18001      22393      21254      19957      20503 Adams County
> 18003     183722     155084     146743     114303 Allen County
> 18005      36108      28276      24864      23887 Bartholomew County
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/



--
To every ω-consistent recursive class κ of formulae there correspond
recursive class signs r, such that neither v Gen r nor Neg(v Gen r)
belongs to Flg(κ) (where v is the free variable of r).

