Import data from Parquet files

Highlights

Import Parquet files
Import subsets of columns and rows

Apache Parquet is a columnar data file format: it stores data by column and supports several compression methods for efficient storage. You can now use import parquet to import data from a Parquet file into Stata. import parquet reads the Parquet file into an Apache Arrow table and then converts the table to a Stata dataset. Most Parquet data types and compression methods are supported. This feature is part of StataNow™.
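To make "organizes data by columns" concrete, here is a minimal Python sketch of the idea. It is not how Parquet or import parquet is implemented; the helper names to_columns and select_columns are hypothetical, and the three records are toy values rather than data read from iris.parquet.

```python
# Toy records in row-oriented form (illustrative values, not read from iris.parquet).
rows = [
    {"sepal.length": 5.1, "sepal.width": 3.5, "variety": "Setosa"},
    {"sepal.length": 7.0, "sepal.width": 3.2, "variety": "Versicolor"},
    {"sepal.length": 6.3, "sepal.width": 3.3, "variety": "Virginica"},
]


def to_columns(records):
    """Regroup row-oriented records into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def select_columns(columns, names):
    """Keep only the requested columns; the other columns are never touched."""
    return {name: columns[name] for name in names}


columns = to_columns(rows)
subset = select_columns(columns, ["sepal.length", "variety"])
print(subset)
```

Because each column is stored as its own run of values, picking a few columns does not require scanning the rest, which is one reason a columnar format suits column-subset imports.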

Let's see it work

We first look at the information contained in the Parquet file iris.parquet.

. import parquet using iris.parquet, describe
Contains data from C:/StataNow19/iris.parquet

 Observations:        150
    Variables:        5

Column                           Type

sepal.length                     double
sepal.width                      double
petal.length                     double
petal.width                      double
variety                          utf8

Now we import the Parquet data into Stata.

. import parquet using iris.parquet
(5 vars, 150 obs)

We can also import a subset of columns. For example, below we import only the columns sepal.length, sepal.width, and variety.

. import parquet sepal.length sepal.width variety using iris.parquet, clear
(3 vars, 150 obs)

We can also import a subset of rows from the Parquet file. Below, we import only the last 100 rows:

. import parquet sepal.length sepal.width variety using iris.parquet, rowrange(-100:L) clear
(3 vars, 100 obs)
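The arithmetic behind that row count can be sketched in Python. This is only an illustration under stated assumptions: the helper resolve_row_range is hypothetical, and it assumes that a negative start counts back from the last row and that L stands for the last row, which is consistent with the 100 observations reported above but is not taken from the command's documentation.

```python
def resolve_row_range(start, end, n_rows):
    """Hypothetical helper: map a (start, end) pair to 1-based inclusive row numbers.

    Assumptions (not taken from the Stata manual): a negative start counts
    back from the last row, and the string "L" stands for the last row.
    """
    first = n_rows + start + 1 if start < 0 else start
    last = n_rows if end == "L" else end
    if not 1 <= first <= last <= n_rows:
        raise ValueError("row range falls outside the data")
    return first, last


# iris.parquet has 150 rows, per the describe output shown earlier.
first, last = resolve_row_range(-100, "L", 150)
print(first, last, last - first + 1)  # prints: 51 150 100
```

Under these assumptions, the range starts at row 51 and ends at row 150, giving the 100 observations shown in the output.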

Tell me more

Read more about the import parquet command in [D] import parquet in the Stata Data Management Reference Manual.
