Highlights
Convert a PDF file to a plain text file
Do you need to convert a PDF to plain text? Perhaps you need to extract content that can be easily parsed for data mining or processed efficiently by large language models. Or perhaps you need the content of your PDF in a smaller file that can be loaded and processed quickly.
With the new pdf2txt command, you can now convert your PDF to plain text format. This feature is a part of StataNow™.
If you want to convert a Word document to plain text, try the new docx2txt command.
Below, we use the putpdf suite of commands to create a PDF with a table of descriptive statistics and a table of regression results. We use data from the Second National Health and Nutrition Examination Survey (NHANES II) (McDowell et al. 1981) to analyze blood pressure, weight, and body mass index. We run the following commands to create our PDF:
. webuse nhanes2l, clear
. putpdf begin
. putpdf paragraph
. putpdf text ("We analyze data from the Second National Health and")
. putpdf text (" Nutrition Examination Survey."), linebreak(1)
. quietly: dtable bpsystol weight bmi, by(diabetes)
>     title("Table 1. Descriptive statistics")
>     column(by(, halign(right)) total(, halign(right)))
. putpdf collect
. regress bpsystol age weight
. putpdf table bweight = etable,
>     title("Table 2. Linear regression of systolic blood pressure")
. putpdf save bpreport, replace
And now we convert bpreport.pdf to a plain text file by typing
. pdf2txt bpreport.pdf
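To check the result from within Stata, one option is to confirm that the text file exists and then list its contents. The lines below are a minimal sketch, not part of the original example. They assume that pdf2txt, called without further options, wrote its output to bpreport.txt in the current working directory (the same base name as the PDF); if your file was saved elsewhere or under another name, adjust the filename accordingly.

```stata
* Assumption: the converted file is bpreport.txt in the current working directory.
* confirm file exits with an error if the file cannot be found.
confirm file "bpreport.txt"

* type lists the contents of a text file in the Results window.
type "bpreport.txt"
```

Both confirm file and type are general-purpose Stata commands, so the same check works for any plain text file.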
Here is our plain text file:
Reference
McDowell, A., A. Engel, J. T. Massey, and K. Maurer. 1981. “Plan and operation of the Second National Health and Nutrition Examination Survey, 1976–1980.” In Vital and Health Statistics, ser. 1, no. 15. Hyattsville, MD: National Center for Health Statistics.
Read more about pdf2txt in [RPT] pdf2txt in the Stata Reporting Reference Manual.
Learn more about Stata's reporting features.
View all the new features in Stata 19 and, in particular, what's new in reporting.
© Copyright 1996–2026 StataCorp LLC. All rights reserved.
