What are the different types of data catalog?

Connecting to data is what makes a data catalog tick. There are many ways to do this, and different approaches support different needs and strategic initiatives. Broadly, there are two ways of interpreting what a data catalog should do.

A "metadata" catalog

This is a catalog that indexes dataset metadata (information about the data). A metadata catalog may include samples of data but will not provide full access to the underlying dataset. 

Metadata is a broad term, but it generally falls into three categories:

  1. Descriptive metadata. Everything that helps you interpret the contents of a dataset. Examples: title, description, classification, business glossary
  2. Structural metadata. Everything that relates to the composition of the dataset. Examples: schema, property types, size
  3. Administrative metadata. Everything about how the dataset is controlled. Examples: license cost, warehouse region, owner, point of contact
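
To make the three categories concrete, here is a minimal sketch of how a single catalog entry might be modeled in Python. The class names, field names, and sample values are all hypothetical, chosen to mirror the examples in the list above; they are not taken from any particular catalog product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DescriptiveMetadata:
    """Helps a reader interpret the contents of the dataset."""
    title: str
    description: str
    classification: str
    glossary_terms: List[str] = field(default_factory=list)


@dataclass
class StructuralMetadata:
    """Describes how the dataset is composed."""
    schema: Dict[str, str]  # column name -> property type
    size_bytes: int


@dataclass
class AdministrativeMetadata:
    """Describes how the dataset is controlled."""
    owner: str
    point_of_contact: str
    warehouse_region: str
    license_cost: float


@dataclass
class CatalogEntry:
    """One dataset as a metadata catalog might index it (hypothetical shape)."""
    descriptive: DescriptiveMetadata
    structural: StructuralMetadata
    administrative: AdministrativeMetadata


# Illustrative values only; none of this describes a real dataset.
entry = CatalogEntry(
    descriptive=DescriptiveMetadata(
        title="Transit stops",
        description="Bus and rail stop locations",
        classification="public",
        glossary_terms=["stop", "route"],
    ),
    structural=StructuralMetadata(
        schema={"stop_id": "integer", "name": "string"},
        size_bytes=2048,
    ),
    administrative=AdministrativeMetadata(
        owner="Data Engineering",
        point_of_contact="data-team@example.com",
        warehouse_region="ca-central",
        license_cost=0.0,
    ),
)
```

Note that the entry describes the dataset without containing any of its rows, which is what keeps it on the metadata side of the line.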

A data catalog

This is a software solution that indexes dataset metadata and also provides access to the underlying data itself. It is important to distinguish between data catalogs that provide access to data and those that do not, as they provide different strategic benefits. If your goals include increasing the adoption of and access to data, you will want to use technology that can share and connect to data.
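
The distinction can be sketched in a few lines of Python. This is an illustration under assumed names, not a description of how any real catalog is built: `MetadataCatalog`, `DataCatalog`, `attach_data`, and `fetch` are hypothetical, and the in-memory dictionaries stand in for real storage and connectors.

```python
from typing import Any, Dict, List


class MetadataCatalog:
    """Indexes metadata only; it has no way to return the underlying data."""

    def __init__(self) -> None:
        self._metadata: Dict[str, Dict[str, Any]] = {}

    def register(self, name: str, metadata: Dict[str, Any]) -> None:
        self._metadata[name] = metadata

    def search(self, keyword: str) -> List[str]:
        """Return names of datasets whose title or description mentions keyword."""
        needle = keyword.lower()
        return sorted(
            name
            for name, meta in self._metadata.items()
            if needle in str(meta.get("title", "")).lower()
            or needle in str(meta.get("description", "")).lower()
        )


class DataCatalog(MetadataCatalog):
    """Indexes metadata and also provides access to the data itself."""

    def __init__(self) -> None:
        super().__init__()
        self._rows: Dict[str, List[Dict[str, Any]]] = {}

    def attach_data(self, name: str, rows: List[Dict[str, Any]]) -> None:
        # Only datasets that are already indexed can have data attached.
        if name not in self._metadata:
            raise KeyError(f"unknown dataset: {name}")
        self._rows[name] = rows

    def fetch(self, name: str) -> List[Dict[str, Any]]:
        """Return the underlying rows, the step a metadata catalog lacks."""
        return self._rows[name]


catalog = DataCatalog()
catalog.register(
    "transit_stops",
    {"title": "Transit stops", "description": "Bus and rail stop locations"},
)
catalog.attach_data("transit_stops", [{"stop_id": 1, "name": "Main St"}])
```

Both classes answer "what data exists?" through `search`, but only the second can answer "give me the data" through `fetch`.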

The ThinkData Catalog Platform can operate as either a metadata catalog or a data catalog, depending on an organization’s needs.