How do I create a data dictionary?

A data dictionary is metadata that describes a dataset's properties. It is also called the dataset schema.

When you ingest or virtualize datasets on the platform, the data dictionary is harvested automatically and displayed.

For metadata-only datasets, you can add a data dictionary manually. If you later connect the dataset to data, the manually provided schema is updated with the harvested one. If the properties are the same, nothing changes.
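The update rule above can be pictured with a short Python sketch. This is only an illustration, not the platform's implementation: the function name reconcile_schema, the dictionary layout (property key mapped to a type string), and the example keys are all assumptions made for this example.

```python
def reconcile_schema(manual, harvested):
    """Sketch of what happens when a metadata-only dataset is connected to data.

    Both arguments are hypothetical dicts mapping a property key to a type string.
    Returns the resulting schema and a flag saying whether anything changed.
    """
    if manual == harvested:
        # Same properties: the manually provided schema stays as it is.
        return manual, False
    # Otherwise the harvested schema takes over (assumed behavior).
    return dict(harvested), True


manual = {"part_id": "string", "weight": "number"}
harvested = {"part_id": "string", "weight": "number", "batch": "string"}

schema, changed = reconcile_schema(manual, harvested)
same_schema, same_changed = reconcile_schema(manual, dict(manual))
```

In this sketch, connecting data that carries an extra "batch" property produces a changed schema, while connecting data with identical properties leaves the manual schema untouched.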

Even when the schema is harvested automatically, you can add property definitions to help users understand what each property represents.

To add or modify a data dictionary, scroll to the bottom of the edit/create dataset popover. In the Data section, you'll see two options: connect data or add a data dictionary. To add a data dictionary, select the "edit" button to open the data dictionary builder.

Once the builder is open, you can add columns, or properties, to the dataset as needed. 

Columns require a name and a key. A name is the human-readable name for the column, whereas a key is the property name provided by the dataset itself. If, for example, you have ingested a dataset with unintelligible column names (as is the case with many manufacturing datasets), you can provide a column "name" that is human-readable without overriding the property key. 

This is also where you can add property descriptions and glossary terms. 
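As a rough mental model, each column in the builder can be thought of as a small record. The Python sketch below is an assumption for illustration only: the class name Column, its field names, and the sample key "TMP_SNS_01" are hypothetical and do not reflect the platform's actual storage format or API.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    key: str                 # property name provided by the dataset itself
    name: str                # human-readable name shown to users
    description: str = ""    # optional property description
    glossary_terms: list[str] = field(default_factory=list)  # optional glossary terms


# Hypothetical column from a manufacturing dataset with a cryptic key.
columns = [
    Column(
        key="TMP_SNS_01",
        name="Sensor temperature",
        description="Temperature reported by the first sensor.",
        glossary_terms=["temperature"],
    ),
]


def display_names(cols):
    """Map each property key to its human-readable name."""
    return {c.key: c.name for c in cols}


labels = display_names(columns)
```

The point of the sketch is that the name is a label layered on top of the key: changing the name never alters the key that the dataset itself provides.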

Once you have added all the columns you need, select done to return to the edit dataset screen. Then select save dataset to apply your changes.