What do these column headings mean?

Columns describe the rows: each row represents one entry, and the values in the columns describe the attributes of that entry.

When we look at a table of data, the first row (or rows) should indicate what the columns of data contain. In the example below, it's fairly easy to guess that Gender, Population Group, Current Institution and Faculty are all attributes of people.

But what about this example, taken from an annual report published by the Department of Transport?

You can probably figure out what the column Geo_type is by reading down a few entries, and it seems obvious that Pr_code relates to province. The rest of the table, however, is meaningless without something to interpret the variable names in the columns.

Good data should come with "metadata" attached, either within the spreadsheet or as a separate file. Metadata is data about the data, and can include an explanation of column headings, publication dates, publisher names and the methodology used for collection. It should also tell you the licensing conditions for the dataset and whether or not you can use it.
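One way to put metadata to work is to keep the column explanations as a small "data dictionary" and check a table's headings against it. The sketch below is only an illustration: the column names Geo_type and Pr_code come from the example above, but their descriptions, the Tot_km column and the sample rows are all invented for the purpose of the example, and the function names are hypothetical.

```python
import csv
import io

# Hypothetical data dictionary: in practice this comes from the publisher's
# metadata. The descriptions here are assumptions, not official definitions.
DATA_DICTIONARY = {
    "Geo_type": "Type of geographic area (assumed meaning)",
    "Pr_code": "Province code (assumed meaning)",
}

# Invented sample table. "Tot_km" is a made-up column with no metadata entry.
SAMPLE_CSV = "Geo_type,Pr_code,Tot_km\nUrban,WC,120\nRural,EC,340\n"


def describe_columns(csv_text, dictionary):
    """Map each column heading to its description, or None if undocumented."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # the first row holds the column headings
    return {name: dictionary.get(name) for name in header}


def undocumented_columns(csv_text, dictionary):
    """List the column headings that the data dictionary does not explain."""
    described = describe_columns(csv_text, dictionary)
    return [name for name, text in described.items() if text is None]


descriptions = describe_columns(SAMPLE_CSV, DATA_DICTIONARY)
missing = undocumented_columns(SAMPLE_CSV, DATA_DICTIONARY)

for name, text in descriptions.items():
    print(f"{name}: {text or 'NO METADATA FOUND'}")
```

Any heading that ends up in `missing` is a column you should not interpret until you have found out what it means.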

Here's an example of metadata taken from a World Bank dataset.

You can see this for yourself at this link. Open it up and click on the button marked "Details". This will show you the first part of the metadata, along with more links to get more information.

Not all data comes with metadata attached. Be very wary, though, of any data whose meaning or origin you can't be sure of.
