Wazimap profile curation handbook

Zero-values vs missing data

Last updated 3 years ago

It is important that users can tell whether a value in an indicator is zero or missing.

Presenting a gap in the data is often just as helpful or important as presenting the data itself. Presenting a gap as if it were zero would often misrepresent the facts.

On the other hand, expressing every possible zero in a dataset can be very inefficient. The simplest way to express zeros is to include a row for every possible combination of attributes for each geographic area represented in the dataset. That would result in very large datasets in which the value (Count column) of most rows would often be zero.
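To make the size problem concrete, here is a minimal sketch of what "a row for every possible combination" means. The geography codes, attribute names and values are hypothetical, and only the `Count` column name comes from this page; the other column names are assumptions for illustration.

```python
from itertools import product

# Hypothetical geographies and attribute values, for illustration only.
geographies = ["GEO1", "GEO2", "GEO3"]
attributes = {
    "Language": ["Afrikaans", "English", "Tshivenda", "isiXhosa"],
    "Race": ["Black African", "Coloured", "Indian or Asian", "White"],
    "Age group": ["0-14", "15-35", "36-64", "65+"],
}


def dense_row_count(geographies, attributes):
    """Number of rows needed if every combination gets its own row."""
    count = len(geographies)
    for values in attributes.values():
        count *= len(values)
    return count


def dense_rows(geographies, attributes, observed):
    """Yield one row per combination, writing 0 where nothing was observed."""
    names = list(attributes)
    for combo in product(geographies, *attributes.values()):
        row = {"Geography": combo[0]}
        row.update(zip(names, combo[1:]))
        row["Count"] = observed.get(combo, 0)
        yield row


# A single observed combination (made-up value).
observed = {("GEO1", "Tshivenda", "Black African", "15-35"): 35}
rows = list(dense_rows(geographies, attributes, observed))
zero_rows = sum(1 for row in rows if row["Count"] == 0)
print(dense_row_count(geographies, attributes), len(rows), zero_rows)
```

Even with three geographies and three small attributes, almost every row in the dense form is a zero, which is the overhead a minimally-sized dataset avoids.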

Wazimap tries to support minimally-sized datasets by making some assumptions about the data, while also trying to support the presentation of missing data.

Current behaviour

Cases presented as missing data

Wazimap presents a value for a geographic area as "missing" when there are no rows of data for that geographic area in the dataset backing an indicator. As soon as there is at least one row for a geography in an indicator, all subindicator and filter combinations are presented as zero instead of missing.

For example, on a choropleth plotting hospital beds per 1,000 people in countries in Africa, it would be wrong to plot countries with missing data as having zero beds.
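The rule can be sketched in a few lines of Python. This is not Wazimap's actual implementation, only an illustration of the behaviour described here; the `choropleth_value` helper, the `Geography` column name, the geography codes and the values are all hypothetical.

```python
# Hypothetical rows: "Count" holds beds per 1,000 people.
# The geography codes and values are made up for illustration.
rows = [
    {"Geography": "A", "Count": 1.4},
    {"Geography": "B", "Count": 0.0},  # an explicit zero row
]


def choropleth_value(rows, geography):
    """Return the plotted value, or None when the geography has no rows at all."""
    matching = [row["Count"] for row in rows if row["Geography"] == geography]
    if not matching:
        return None  # missing: shade the area grey rather than plotting zero
    return sum(matching)


for code in ["A", "B", "C"]:
    value = choropleth_value(rows, code)
    print(code, "missing" if value is None else value)
```

Geography `B` has an explicit zero and is plotted as zero, while `C` has no rows and is treated as missing.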

Cases not explicitly presented as zero or missing

No data available for a given subindicator for the selected geography

When a subindicator does not occur in the data for a given geographic area, it will be excluded from the chart.

For example, when years are missing from an indicator on misspending, those years are not shown for that geography.

This behaviour is important where a subindicator group can have a very large variety of values, of which only a small number apply to a specific geographic area. For example, election results showing the votes received by each party should only show the parties that contested that geographic area. If parties that did not contest that area were included, the chart would include hundreds of irrelevant items.
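A minimal sketch of this exclusion, assuming the dataset is a list of rows with a `Geography` column, one attribute column and a `Count` column. The `chart_items` helper, ward codes, party names and vote counts are hypothetical and only illustrate the behaviour described above.

```python
# Hypothetical election rows; ward codes, party names and counts are made up.
rows = [
    {"Geography": "WARD1", "Party": "Party A", "Count": 1200},
    {"Geography": "WARD1", "Party": "Party B", "Count": 0},  # explicit zero row
    {"Geography": "WARD2", "Party": "Party A", "Count": 800},
    {"Geography": "WARD2", "Party": "Party C", "Count": 950},
]


def chart_items(rows, geography, group):
    """Total Count per subindicator, only for subindicators with rows here."""
    totals = {}
    for row in rows:
        if row["Geography"] != geography:
            continue
        totals[row[group]] = totals.get(row[group], 0) + row["Count"]
    return totals


print(chart_items(rows, "WARD1", "Party"))
print(chart_items(rows, "WARD2", "Party"))
```

In this sketch `Party C` never appears in the chart for `WARD1` because it has no row there, while `Party B` appears with a zero because the zero was explicitly included in the dataset.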

Some rows available for a subindicator for the selected geography, but not for every combination of filters

When a dataset does not contain explicit zero rows for a certain combination of subindicators, and a filter is applied that excludes the available data, the subindicators without data are excluded from the chart.
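The effect of a filter can be sketched the same way. Again this is an illustration under assumptions, not Wazimap's code: the `chart_items` helper, the geography code, the column names and the counts are hypothetical.

```python
# Hypothetical rows for one geography; all values are made up.
rows = [
    {"Geography": "GEO1", "Language": "Tshivenda", "Race": "Black African", "Count": 35},
    {"Geography": "GEO1", "Language": "Afrikaans", "Race": "White", "Count": 100},
    {"Geography": "GEO1", "Language": "Afrikaans", "Race": "Black African", "Count": 50},
]


def chart_items(rows, geography, group, filters=None):
    """Total Count per subindicator, using only rows that match every filter."""
    filters = filters or {}
    totals = {}
    for row in rows:
        if row["Geography"] != geography:
            continue
        if any(row[column] != value for column, value in filters.items()):
            continue
        totals[row[group]] = totals.get(row[group], 0) + row["Count"]
    return totals


print(chart_items(rows, "GEO1", "Language"))
print(chart_items(rows, "GEO1", "Language", {"Race": "White"}))
```

Without a filter both languages are charted. With the `Race: White` filter, Tshivenda drops out of the chart because no row matches that combination, rather than being drawn as a zero bar.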

Cases presented as zero

When a row in the dataset has the Count value of zero, it is of course presented as zero.

In the data mapper, when there is at least one data row for a given geographic area in an indicator, every combination of filters is presented as zero even if there is no row in the dataset for that combination of attribute values.

In the example below, the number of white Tshivenda speakers is explicitly presented as zero, even though there is no such row in the dataset. The data mapper assumes that the data for a geography is complete if there is some row for that geography, perhaps for a different subindicator, or for the selected subindicator but for a different combination of filters.
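The data mapper's assumption can be sketched as follows. The `data_mapper_value` helper, the geography codes, the column names and the `Race` value on the row are hypothetical; the sketch only mirrors the behaviour described above and is not taken from Wazimap's code.

```python
# Hypothetical rows; only GEO1 has any data in this indicator.
rows = [
    {"Geography": "GEO1", "Language": "Tshivenda", "Race": "Black African", "Count": 35},
]


def data_mapper_value(rows, geography, selection):
    """None when the geography has no rows; otherwise the matching total.

    If the geography has at least one row, a selection that matches no
    rows yields 0, because the data is assumed to be complete.
    """
    geo_rows = [row for row in rows if row["Geography"] == geography]
    if not geo_rows:
        return None  # presented as missing
    return sum(
        row["Count"]
        for row in geo_rows
        if all(row[column] == value for column, value in selection.items())
    )


print(data_mapper_value(rows, "GEO1", {"Language": "Tshivenda"}))
print(data_mapper_value(rows, "GEO1", {"Language": "Tshivenda", "Race": "White"}))
print(data_mapper_value(rows, "GEO2", {"Language": "Tshivenda"}))
```

The second call returns zero even though no such row exists, and only the geography with no rows at all comes back as missing.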

Not supported yet

Because of the assumptions Wazimap makes about your data, as described above, we don't currently support the following. If there is demand, we can consider adding support, perhaps by making the behaviour configurable per dataset or indicator. We can potentially also help you shape your datasets and indicators to achieve your objectives within the above behaviour.

Explicit missing data in rich data view

The rich data view currently just hides rows without data for a given subindicator or filter selection. It does not support showing a label on a chart axis with a blank space to make the lack of data for that indicator visually explicit.

Partial data in the data mapper

The data mapper currently does not support partial data for a geography. If an indicator has at least one row for a geography, all combinations of subindicator and filters for which no value is available will show zero for that geography.

In these cases, it can be helpful to indicate in the description of your dataset that the data may not be complete, and when it was last updated.

As a workaround, you can show gaps in an entire subindicator by splitting a dataset into a separate dataset and indicator per subindicator. An indicator with no values available for a geography will then present that geography as blank (grey).
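One way to prepare such a split before uploading is sketched below. The `split_by_subindicator` and `geographies_with_data` helpers, the column names, the geography codes and the counts are hypothetical; how you then upload each resulting dataset and create its indicator is not covered by this sketch.

```python
# Hypothetical rows: GEO2 has no data for the year 2018-19.
rows = [
    {"Geography": "GEO1", "Year": "2017-18", "Count": 10},
    {"Geography": "GEO1", "Year": "2018-19", "Count": 12},
    {"Geography": "GEO2", "Year": "2017-18", "Count": 7},
]


def split_by_subindicator(rows, group):
    """Return one list of rows per subindicator, with the group column dropped."""
    datasets = {}
    for row in rows:
        remaining = {key: value for key, value in row.items() if key != group}
        datasets.setdefault(row[group], []).append(remaining)
    return datasets


def geographies_with_data(rows):
    """Geographies that have at least one row, so are not presented as missing."""
    return {row["Geography"] for row in rows}


datasets = split_by_subindicator(rows, "Year")
for year, year_rows in datasets.items():
    print(year, sorted(geographies_with_data(year_rows)))
```

In the combined dataset `GEO2` would show zero for 2018-19, because it has a row for another year. In the split 2018-19 dataset it has no rows at all, so it would be presented as missing.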

Figure: Countries that did not occur in the dataset shown in grey.
Figure: 2017-18 and 2018-19 available for Emthanjeni.
Figure: 2017-18 and 2018-19 not available for Renosterberg.
Figure: Excludes e.g. the IFP, who did not participate in this ward. Zeros were explicitly included in the dataset.
Figure: Stellenbosch has a small number of Tshivenda speakers between 15 and 35.
Figure: When Race: White is selected, Tshivenda is not shown because there was no data for that combination of attribute values in the dataset.
Figure: 35 Tshivenda speakers in Stellenbosch.
Figure: Zero shown as the number of white Tshivenda speakers, based on the assumption that the dataset is complete because there are some Tshivenda speakers in Stellenbosch, just none whose Race subindicator is White.