How to label variables

Labeling the variables in a dataset is one of the most basic data management tasks. Even when the variable names are clear and useful, we often want to add a more descriptive label that will be used in tables and graphs. Let's open and describe an example dataset from the Stata website to see how.

. use https://www.stata.com/users/youtube/rawdata.dta, clear

. describe

Contains data from https://www.stata.com/users/youtube/rawdata.dta
 Observations:         1,268                  Fictitious data based on the
                                                National Health and Nutrition
                                                Examination Survey
    Variables:            10                  6 Jul 2016 11:17
                                              (_dta has notes)
Variable      Storage   Display    Value
    name         type    format    label      Variable label
id              str6    %9s                   Identification Number
age             byte    %9.0g
sex             byte    %9.0g                 Sex
race            str5    %9s                   Race
height          float   %9.0g                 height (cm)
weight          float   %9.0g                 weight (kg)
sbp             int     %9.0g                 Systolic blood pressure (mm/Hg)
dbp             int     %9.0g                 Diastolic blood pressure (mm/Hg)
chol            str3    %9s                   serum cholesterol (mg/dL)
dob             str18   %18s
Sorted by: id

The column "Variable label" shows labels for all the variables except age and dob. Most textbooks will advise you to include the units of measurement in a variable label, and age is measured in years. Let's add a label to the variable age.
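In a dataset with many variables, scanning the describe output by eye can be tedious. As a minimal sketch, the ds command with the not(varlabel) option lists the variables that have no variable label. With the example data, we would expect it to list age and dob.

```stata
* List the variables that do not yet have a variable label
ds, not(varlabel)
```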

. label var age "Age (years)"

The variable dob contains the date of birth for each observation. Let's label it too.

. label var dob "Date of Birth"
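Here label var is an abbreviation of label variable. The sketch below shows the same two commands spelled out in full, as they might appear in a do-file, followed by a describe restricted to the two variables we just labeled. Typing label variable with a variable name and no label text removes an existing label, which is handy if you make a mistake.

```stata
* Unabbreviated equivalents of the two commands above
label variable age "Age (years)"
label variable dob "Date of Birth"

* Check only the two variables we just labeled
describe age dob
```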

Let's describe our data again to verify that we have labeled the variables correctly.

. describe

Contains data from https://www.stata.com/users/youtube/rawdata.dta
 Observations:         1,268                  Fictitious data based on the
                                                National Health and Nutrition
                                                Examination Survey
    Variables:            10                  6 Jul 2016 11:17
                                              (_dta has notes)
Variable      Storage   Display    Value
    name         type    format    label      Variable label
id              str6    %9s                   Identification Number
age             byte    %9.0g                 Age (years)
sex             byte    %9.0g                 Sex
race            str5    %9s                   Race
height          float   %9.0g                 height (cm)
weight          float   %9.0g                 weight (kg)
sbp             int     %9.0g                 Systolic blood pressure (mm/Hg)
dbp             int     %9.0g                 Diastolic blood pressure (mm/Hg)
chol            str3    %9s                   serum cholesterol (mg/dL)
dob             str18   %18s                  Date of Birth
Sorted by: id
     Note: Dataset has changed since last saved.

Now we can save our dataset.

. save mydata
file mydata.dta saved
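If mydata.dta already exists in the working directory, for example because you are rerunning these commands, save mydata stops with an error rather than overwriting the file. A short sketch of the usual remedy is to add the replace option:

```stata
* Overwrite mydata.dta if it already exists
save mydata, replace
```

Use replace with care: it overwrites the existing file on disk without asking.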

You can watch a demonstration of these commands by clicking on the link to the YouTube video below. You can read more about these commands by clicking on the links to the Stata manual entries below.

See it in action

Watch Data management: How to label variables.

Tell me more

Read more in the Stata Data Management Reference Manual; see [D] describe, [D] label, and [D] save.