Welcome to the Profile Curation Handbook, the administrative guide for the all-new Wazimap NG!
Wazimap NG (Next Generation) provides easy access to different kinds of data through a mapped and open-source information system, designed to help non-technical users explore data, both meaningfully and in context. In other words, a Geographic Information System (GIS) tool “for the rest of us”.
This Profile Curation Handbook contains documentation on how to manage and upload datasets, as well as create and manage profiles and their associated indicators. There are two administration roles:
Data Administrator: responsible for sourcing, shaping, and uploading datasets and point collections to Wazimap NG.
Profile Administrator: responsible for defining and managing profile indicators and overall site content. In short, responsible for how data looks on Wazimap NG.
Both these roles have a function in Wazimap NG’s three views. It is, however, possible for these roles to be performed by the same person. The three views are Point Mapper, Data Mapper, and Rich Data. The table below shows the respective logos for each view of Wazimap NG.
Point Mapper
Data Mapper
Rich Data
The rest of this Profile Curation Handbook is structured according to these three views of Wazimap NG, and the various roles that Data and Profile Administrators play for each.
Take a look at our new Wazimap NG website here.
Experiencing any issues? Contact Wazimap Support here.
Point Mapper allows for a number of locations (called point collections) to be added and viewed on a map using Wazimap NG.
Figure 1, below, is an example of such point mapping, and shows the location of Water Treatment Works within the City of Cape Town Metro of South Africa.
The first step towards creating such a map is sourcing and understanding your data, and knowing what you are trying to display using your data. This is often the hardest part, and requires collaboration between Data and Profile Administrators. It is also important to know a dataset’s Terms of Use or Licence, and what this allows you to use the data for.
Wazimap NG uses .csv files to display point collections, and these .csv files require a particular shaping (formatting) to be properly understood by the Wazimap NG platform. This is the first step towards creating your map.
Data is uploaded to Wazimap NG using Django (which can be accessed here). Once you are logged in, scroll down to the POINTS section, and click on the +Add button next to Collections (see Figure 5, below).
On the following page, do the following (refer to Figure 6, below):
Select (from the dropdown list) the Profile which should be associated with your point collection;
Give your point collection a Name;
Select the appropriate Permission type (Public or Private);
Import your point collection by clicking Choose file and selecting the .csv file you exported earlier;
Add the Source information; and
Scroll down and click Save in the bottom right hand corner.
Datasets can be set as Public or Private under Permission type. Public datasets, and variables derived from them, can be used on any profile in Wazimap NG. This enables reuse of valuable datasets without the need for each use case to source and upload the data again. Private datasets, and variables derived from them, can only be used on the profile they are assigned to under Profile.
---
Your point collection (dataset) should now be added to Django. To display your point collection on Wazimap NG, you must first create a Theme, and then a Profile Collection (from your point collection), which will be nested under your Theme. Themes and Profile Collections are vital in displaying data using the Point Mapper on Wazimap NG.
The easiest tool to create or edit .csv files is Google Sheets: it can open both .csv and .xlsx files, and can export .csv files for upload to Wazimap NG. Any point collection requires the following three fields (as highlighted in red in Figure 2, below):
name;
longitude; and
latitude.
As shown in Figure 2, above, additional fields can be included as attributes for each point in the collection. In the Point Mapper view, these additional attributes are displayed in the More Info section of each point’s tooltip (refer to Navigating Point Mapper).
The name field (column) of your point collection cannot be formatted as a number. It may be necessary to format the entire column to plain text by selecting the entire column and clicking Format > Number > Plain text (see Figure 3, below).
Since the attributes under the name column are displayed on the tooltip in Wazimap NG, it is important to label your points appropriately (refer again to Navigating Point Mapper).
Additionally, column headings cannot start with a number (e.g., 2019 HDI), contain periods (e.g., SDG 3.1), or contain square brackets (e.g., HDI over the Years [Average]).
No cell within the first row below the column headers can be left blank either. If no attribute exists in these cells, simply add "no data". A blank cell will result in that attribute not being displayed for any points (even if they have a value).
Once your point collection is complete (has the three required fields, is formatted correctly, and contains no blank cells in the first row), it can be exported as a .csv file. To do this, click File > Download > Comma-separated values (.csv) (see Figure 4, below).
Take note of where this .csv file downloads to — you will need to locate it for upload to Wazimap NG later.
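The formatting rules above can be checked before upload with a short script. The following is a minimal sketch, not part of Wazimap NG: the function name check_point_csv is hypothetical, the sample rows are made up, and matching the required field names case-insensitively is an assumption.

```python
import csv
import io

REQUIRED_FIELDS = ("name", "longitude", "latitude")


def check_point_csv(text):
    """Return a list of human-readable problems found in a point collection CSV."""
    problems = []
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    header = [h.strip() for h in rows[0]]

    # The three required fields must be present
    # (compared case-insensitively here, which is an assumption).
    lowered = [h.lower() for h in header]
    for field in REQUIRED_FIELDS:
        if field not in lowered:
            problems.append(f"missing required field: {field}")

    # Headings may not start with a number or contain periods or square brackets.
    for h in header:
        if h[:1].isdigit():
            problems.append(f"heading starts with a number: {h}")
        if "." in h:
            problems.append(f"heading contains a period: {h}")
        if "[" in h or "]" in h:
            problems.append(f"heading contains square brackets: {h}")

    # No cell in the first row below the headings may be blank.
    if len(rows) > 1:
        first = rows[1] + [""] * (len(header) - len(rows[1]))
        for h, cell in zip(header, first):
            if not cell.strip():
                problems.append(f"blank cell in first data row under: {h}")
    return problems


sample = "name,longitude,latitude,Type\nExample Works,18.51,-33.95,Treatment\n"
print(check_point_csv(sample))  # an empty list means no problems were found
```

Running the check on the exported .csv file before uploading it gives a quick list of headings or cells to fix in Google Sheets.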
To create a Theme, scroll down to the POINTS section again, and this time click on the +Add button next to Themes (see Figure 7, below).
On the following page, do the following (refer to Figure 8, below):
Select (from the dropdown list) the Profile which should be associated with your Theme;
Give your Theme an appropriate Name (this will be displayed on Wazimap NG);
Select a preferred Icon; and
Scroll down and click Save in the bottom right hand corner.
On Wazimap NG, the theme colour is configurable. Simply add the hex code of your choice. The next step is to create the Profile Collection (from your point collection) that will be nested under your newly created Theme.
Long-term maintenance of a point collection often involves the following tasks:
Removing points that are no longer relevant;
Adding points that are not already in the database; and
Updating data about points already in the database (e.g., opening hours, services provided, correcting/improving a description).
To do this, you need to be able to compare your updated data to the data you have already uploaded to a profile collection in Wazimap NG. The easiest way to do this is to use unique identifiers.
An identifier is a value which uniquely and consistently identifies an object — in this case, a point in your point collection.
It is important to include some kind of identifier in your point data to facilitate updates to the data. Users down the line will rely on your identifiers being unique and consistent to be able to incorporate updates should they download a copy of your point data.
If your data does not have an official identifier that is consistent over time, and unique per point, it will become difficult to check for any duplicates or whether newly-provided points are already in your database (e.g., by name, address, and so on). Think about what your users will be able to provide you, and include that in what is shown to them as well.
You can use the following formula in Google Sheets to create a UUID (Copy this into the appropriate cell):
Make sure to copy and paste as text as well (not the formula), into a new column, and use the text version of it going forward. You don't want it to calculate a new random UUID for existing points.
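If the point data is prepared with a script rather than a spreadsheet, the same idea can be sketched in Python with the standard uuid module. This is an illustration only: the column name uuid and the function name assign_identifiers are assumptions, not something Wazimap NG requires.

```python
import uuid


def assign_identifiers(rows, id_field="uuid"):
    """Give every row a UUID, keeping any identifier a row already has."""
    for row in rows:
        # Only generate a value when the identifier is missing or empty,
        # so existing points keep the identifier they were first given.
        if not row.get(id_field):
            row[id_field] = str(uuid.uuid4())
    return rows


points = [{"name": "Point A", "uuid": "keep-me"}, {"name": "Point B"}]
assign_identifiers(points)
```

Because existing identifiers are left alone, running the function again on the same rows does not change them, which mirrors the advice to keep the pasted text version rather than the live formula.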
Google Sheets has a built-in function to check for and remove duplicates. In Google Sheets, select the data of interest, and click on Data > Data clean-up > Remove Duplicates (see Figure 20, below). In the box that appears (see Figure 21, below), be sure to select the option Data has header row if it indeed does.
If you are using Excel, see conditionally format duplicate values. If you are using Workbench, follow the steps below:
Add a deduplication step.
Select a column whose values probably ought to be unique.
Look for rows where the duplicate number is greater than 1.
If you have a consistent UUID (unique identifier), then:
If you have messy or dirty data with different capitalisation and potential spelling mistakes, try using OpenRefine with CSV reconciliation.
Before a dataset can be uploaded, the data needs to be cleaned and shaped into the correct format. As a rule, the more disaggregated the dataset, the better, as this allows a single dataset to be (re)used for multiple indicators and also allows for multivariable analysis.
The system accepts files in csv, xls and xlsx formats. The file must follow a specific structure and always contain the following fields:
Inside the Geography column, valid values are:
country code (ZA)
province codes (GAU, LIM, WC, etc.)
Municipal Demarcation Board codes (CTP, WC024)
Ward IDs (10204020 for Stellenbosch Ward 20 in 2016 demarcation)
or the lower level numerical geography code (e.g. 160001, 175005, etc.)
In between the Geography and the Count columns are the fields. These could be Age, Race, Education level, etc. See the example below:
| Geography | Age | Race | Child Ever Born | Count |
| --- | --- | --- | --- | --- |
| ZA | 16 | Black African | never given birth | 1 |
| ZA | 16 | Coloured | never given birth | 2 |
| ZA | 19 | Black African | Unspecified | 5 |
Column name requirements:
Must be unique when all column names are converted to lower case
Must start with a letter
Can contain letters, numbers, and spaces
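These requirements can be checked with a few lines of Python before uploading. The sketch below is hypothetical (the function name check_column_names is not part of Wazimap NG) and it assumes that "letters" means unaccented A to Z and that no other characters are allowed.

```python
import re

# Starts with a letter, followed by letters, digits, or spaces only
# (treating any other character as invalid is an assumption).
COLUMN_NAME = re.compile(r"[A-Za-z][A-Za-z0-9 ]*")


def check_column_names(names):
    """Return a list of problems with the given dataset column names."""
    problems = []
    seen = {}
    for name in names:
        if not COLUMN_NAME.fullmatch(name):
            problems.append(f"invalid column name: {name!r}")
        key = name.lower()
        if key in seen:
            problems.append(f"duplicate when lower-cased: {seen[key]!r} and {name!r}")
        else:
            seen[key] = name
    return problems


print(check_column_names(["Geography", "Age", "Race", "Child Ever Born", "Count"]))
```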
Once the dataset has been sourced and shaped, it is ready to upload.
Log into the backend administration section of the website, navigate to Datasets, and click Add. Give the dataset file a meaningful name and select the applicable geographical boundaries and capture source information (this will be displayed to users to help them understand where the data came from). Proceed with uploading the file from your machine.
Uploading the file kicks off a background task to process the file.
The system will alert you once this is complete. You may also check on the status of the job by viewing the queue Django Q > queued tasks.
Once the file has been processed, you can proceed with creating indicators.
Datasets can be marked Public or Private.
Public datasets, and variables derived from them, can be used on any profile in Wazimap. This enables reuse of valuable datasets without the need for each case to source and upload the data.
Private datasets and variables derived from them can only be used on the profile they belong to.
Qualitative datasets can be uploaded and used to create qualitative indicators. A qualitative dataset must describe the relevant geography and provide the content. This content can be plain text or HTML; see the example below for how the dataset should be laid out.
| Geography | Content |
| --- | --- |
| EC | This is qualitative data |
| WC | <p>This is qualitative data</p> |
In the content type dropdown menu select "Qualitative". Then continue to create a variable in the same way as you would for a quantitative dataset.
A universe refers to the population against which the variable is being applied. The universe can also be left blank, in which case the variable applies to everyone.
To create universes, you will need to write a bit of JSON.
The structure of this is:
See below for an example of youth age range universe.
This universe, when applied to a variable, would include all people aged 15 to 35.
A few notes:
It doesn't need to be an array - a single value will also work (e.g. {"Gender": "Female"})
It is case sensitive, so remember to match the case in the file
You can have multiple filters
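To make the notes above concrete, here is a hedged Python sketch of how such a filter could be built and applied to dataset rows. The shape used (a column name mapped to one allowed value or a list of allowed values) is inferred from the {"Gender": "Female"} example in the notes; writing ages as strings, and the names youth_universe and in_universe, are assumptions for illustration only.

```python
import json

# Assumed shape: column name -> allowed value, or list of allowed values.
youth_universe = {"Age": [str(age) for age in range(15, 36)]}


def in_universe(row, universe):
    """Return True if the row passes every filter in the universe."""
    for field, allowed in universe.items():
        values = allowed if isinstance(allowed, list) else [allowed]
        if row.get(field) not in values:  # exact, case-sensitive match
            return False
    return True  # an empty universe lets every row through


# The JSON text that would be pasted into the universe field.
youth_json = json.dumps(youth_universe)
```

An empty universe matches every row, which corresponds to leaving the universe blank, and a universe with several keys only matches rows that satisfy all of them.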
It is possible to upload additional points to an existing point collection without having to replace the entire point collection. To do so, first prepare a .csv file with the new points to be added, shaped in the same way as the original point collection.
Next, in Django, scroll down to the POINTS section, and this time click on Collections (see Figure 15, below).
The page that opens will contain all point collections for all Wazimap NG profiles. There are two ways to locate an existing point collection (see Figure 16, below):
If you know the point collection’s name, simply enter it in the Search bar (top left), and click Search; or
Filter point collections by the associated Profile (top right), and locate the desired point collection in the list.
Once located, click on the point collection’s name. The same page that was used when the point collection was first added will open. Next, import your new points by clicking Choose file and selecting the .csv file with the new points to be added. Then, hit Save. After refreshing your Wazimap NG Profile, the new points should appear on the map.
It is possible to edit individual points within an existing point collection in Django. To do so, scroll down to the POINTS section, and this time click on Locations (see Figure 17, below).
The page that opens will contain all points for all Wazimap NG profiles. Points can be located in the same way as point collections:
If you know the point’s name, simply enter it in the Search bar (top left), and click Search;
Filter points by the associated Profile (top right), and locate the desired point in the list; or
Filter points by the point collection name.
Once located, click on the point’s name. On the page that opens, it is possible to edit a point’s name (see Figure 18, below), and its attributes (see Figure 19, below). Attributes appear in code format.
Referring to Figure 16, for each attribute associated with a point, there is a key and a value. A key corresponds to the column headings in the originally uploaded .csv file, and value to the cells in the respective column. For this reason, it is advisable NOT TO EDIT a key, but only a value.
Variables are data points and can also be grouped for aggregate fields. These form the basis from which the Profile Administrator creates Profile Indicators.
To create a new variable, first ensure that the dataset file is uploaded and that it has been processed on the system. If this is the case, then proceed to Variables in the admin system and select Add to create a new one.
First, select the dataset this variable is found in from the dropdown list, and continue by clicking the Save and continue editing button.
Select the universe that this variable applies to (leave blank for the entire population). Note that you can also create a new universe at this point, if necessary, by clicking on the plus (+) sign next to the universe drop down.
Select which field(s) to group the variable by - you will notice that these are the data columns in your dataset file. These become sub-indicators. You will typically want one or perhaps two of these.
Give your variable a meaningful name.
Click Save.
Variables are extracted as a background process and you will be alerted once they are complete.
Repeat this for as many variables as you need to create from your dataset, and for all the dataset files you have uploaded.
Once the variables have been created, the Profile Administrator can now proceed with creating and configuring the rest of the site.
It might be that you want to change the order in which the sub-indicators are shown. For example, you may want to swap Agree and Disagree in the chart below.
In the Admin Suite, find SubindicatorsGroups
Find the relevant indicator:
Then drag the subindicators into the desired order:
The order should now be changed on the front-end. To see it, you will need to hard-refresh your browser with ctrl + shift + R (this will be fixed soon).
Values are summed over all dimensions other than the indicator variable by default. That means an indicator on the column "financial year" from a dataset with columns "financial year" and "income source" will disaggregate by financial year, and show the sum of the different income sources, unless the user adds a filter on income source.
It sometimes doesn't make sense to sum values over a dimension. For example
If you don't know how many years are in a dataset, the total across years doesn't mean anything
If your data contains overlapping categories to support differing standards, e.g. ages 15-24 and 15-35, summing over these would lead to double-counting.
You can mark a column as non-aggregatable by un-checking the Can aggregate check-box. It is checked by default as most columns are fine to aggregate over.
This will mean that indicators from this dataset will automatically have a filter for this subindicator group, unless this group is used as the indicator variable (in which case it is already disaggregated).
Unlike user-added filters, filters added to disaggregate non-aggregatable columns cannot be removed.
You can specify the default value to be matched by the filter in the indicator config, e.g. to ensure that the latest financial year is filtered against by default. Users can still choose other options if they wish.
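The aggregation behaviour described in this section can be illustrated with a small Python sketch. It is not Wazimap's implementation: the function name aggregate and the figures in the sample rows are made up for illustration, using the "financial year" and "income source" columns from the example above.

```python
from collections import defaultdict


def aggregate(rows, indicator_field, filters=None):
    """Sum Count per value of indicator_field, after applying optional filters."""
    filters = filters or {}
    totals = defaultdict(float)
    for row in rows:
        # Rows that do not match every filter are left out of the sum.
        if all(row.get(k) == v for k, v in filters.items()):
            totals[row[indicator_field]] += float(row["Count"])
    return dict(totals)


rows = [
    {"financial year": "2019", "income source": "Rates", "Count": "10"},
    {"financial year": "2019", "income source": "Grants", "Count": "5"},
    {"financial year": "2020", "income source": "Rates", "Count": "7"},
]

# Indicator on financial year: income sources are summed by default.
by_year = aggregate(rows, "financial year")

# Indicator on income source with financial year treated as non-aggregatable:
# a filter on one year is always applied instead of summing across years.
by_source_2020 = aggregate(rows, "income source", {"financial year": "2020"})
```

The second call shows why a non-aggregatable column always carries a filter: without it, the totals per income source would silently add up an unknown number of years.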
Here are the steps to create a Profile Highlight (as shown in the image below):
STEP 1: Create and upload a dataset for the Profile Highlight of interest (e.g., Persons aged 60+), using the same method discussed under Creating Datasets.
STEP 2: Create a variable from your uploaded dataset, using the same method discussed under Creating Variables.
STEP 3: Under the Profile Menu (in Django), select Add New Profile Highlight, and it will redirect you to the page, as shown below.
STEP 4: Select your Profile from the drop-down list.
STEP 5: Select the Variable you previously created from your dataset.
STEP 6: Add a label or title for the Profile Highlight.
STEP 7: Select the Sub-Indicator to be displayed.
STEP 8: Under Denominator, select Absolute Value.
STEP 9: Scroll down, hit Save, and refresh the Wazi Profile. Your Profile Highlight should now show the value for the selected geography.
Categories can be created and managed by navigating to Indicator Categories in the admin site.
Provide a name for the category
Select the profile to be associated with this category (default is Youth Explorer but there might be others in the future)
Provide descriptive text explaining what the category contains and any other details relevant to the users. This is displayed in the Rich Data View.
Categories, subcategories, and profile indicators can be reordered by dragging the handle in the ordering column.
The ordering handle is only available when sorted by that column.
A profile indicator should be used when you want to display an indicator to the user. A profile indicator is presented as a bar chart in the rich data view and is also available under the data mapper.
If your indicator only has one subindicator (i.e. only one bar in the bar chart), consider using a Profile Key Metric instead.
To create new profile indicators, log into the backend admin section with your admin account and click Add next to Profile indicators.
Select the profile to associate this indicator with (there might only be a single one).
Select the variable on which this is based, from the dropdown list.
For the content type dropdown, if the indicator is based on qualitative data select HTML, otherwise leave it as Indicator for a chart type indicator.
Provide a meaningful name for this profile indicator in the label field - this is what users will see.
Select the category and sub-category this indicator will be housed under (shown in both the Data Mapper and the Rich Data view).
Add a textual description of the indicator - this will be shown in the Rich Data view just below the relevant graph.
Select the choropleth method to be used - either sibling or sub-indicator. This determines what to use as the denominator when rendering the choropleth. Sibling level would be the sum of the same geography types (e.g. when viewing number of households in WC this would compare WC to the sum of all provinces). Sub-indicator method would tally up the values for the children of the current geography level (e.g. households in WC would then tally up households for all districts in WC).
Sub-indicators will be shown to you once you have saved for the first time. Sub-indicators can be reordered by dragging and dropping them.
Confirm the indicator is now visible on the frontend by doing a hard refresh (ctrl-shift-r).
Additional configuration is possible, such as how numbers are formatted. Documentation can be found in the Wazimap technical documentation.
Point collections refer to a number of different locations (points on a map) and are typically grouped by a specific subject matter, for example a dataset of schools in South Africa.
Collections are created by uploading a csv file and associating the file with a theme and sub-theme.
Prepare a csv file containing at least the following fields: Name, longitude, latitude. You can see an example below:
In addition to these fields, various other fields can also be included as attributes for the point. These are shown on the point tooltip. An example of a file with additional attributes is shown below:
Include the best identifier or identifying information
It's a good idea to include a standard identifier for the points in your dataset. This can make future updates and cross-referencing much easier. For example, for official facilities like public schools in South Africa, use their EMIS number.
The "Name" column of your point collection cannot be a number data type. It may be necessary to convert the column to a text/string format before uploading. This can be done easily in Microsoft Excel and is illustrated in the figure below.
NOTE: It may sometimes be more suitable to change which column is given the "Name" heading as this is what is displayed in the tooltip on the map. In the screenshot below, the "Description" column would be a more appropriate choice to be made the "Name" column.
Once the file is ready to be uploaded, navigate to Point Collections and click Add, which will allow you to name the collection and upload the file.
Select the profile which should be associated with these points and select the theme they belong to. Provide a label and the source of the data and proceed to upload.
Uploads to existing point collections add data without replacing existing points.
To add additional points to a point collection, prepare the file with the new points to be added, as outlined in the Preparing the points dataset section.
Select the point collection you would like to update, upload the file with the new additional point data, and proceed to save the changes.
You can edit information for a specific point by selecting it under Points > Locations.
To rectify, revise, or update previously uploaded data in bulk, delete the existing point collection and recreate it from an adjusted file containing all the data.
Long term maintenance to a point dataset often involves the following tasks:
removing points that are no longer relevant
adding points that are not already in the database
updating data about points already in the database (e.g. opening hours, services provided, correcting/improving a description)
To be able to do this, you need to be able to compare your updated data to the data you have already uploaded to a collection in Wazimap.
To match incoming data to your existing data, you need an identifier. Failing that, you will need to use whatever other identifying information you have available.
An identifier is a value which uniquely and consistently identifies an object - in this case, a point in your point collection.
It is important to include some kind of identifier in your point data to facilitate updates to the data. Downstream users may also need to be able to rely on your identifiers being unique and consistent to be able to incorporate your updates if they keep a copy of your point data.
If your data does not have an official identifier that is consistent over time, and unique per point, think about how you will check if you have any duplicates and check whether newly-provided points are already in your database - e.g. by name, address, and so on. Think about what your users will be able to provide you, and include that in what is shown to them as well.
Consider making up and maintaining your own unique consistent identifier. We suggest using UUIDs because they are globally unique, and not just unique within one table. This is important in case you want to combine tables later, e.g. if you want to merge "public schools" and "private schools" into one "schools" table and continue to use your unique identifier.
You can use the following formula in Excel to create a UUID:
When filled down a column, these identifiers will look like
Make sure to copy and paste as text (not the formula) into a new column and use the text version of it going forward. You don't want it to calculate a new random UUID for existing points.
In Excel, see Conditionally format duplicate values
In Workbench
add a deduplication step
select a column whose values probably ought to be unique
look for rows where duplicate number is greater than 1
If you have a consistent unique identifier:
In Excel, use VLOOKUP()
In Workbench, perhaps Join Tabs might work
If you have messy data with different capitalisation and potential spelling mistakes:
Try using OpenRefine with CSV reconciliation
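When a consistent unique identifier is available, the comparison between existing and incoming data can also be scripted. The sketch below is one possible approach in Python rather than a Wazimap feature; the function name compare_by_identifier and the uuid column name are assumptions.

```python
def compare_by_identifier(existing, incoming, id_field="uuid"):
    """Classify points as added, removed, or updated by matching identifiers."""
    old = {row[id_field]: row for row in existing}
    new = {row[id_field]: row for row in incoming}
    return {
        # Identifiers only in the incoming data are new points.
        "added": sorted(new.keys() - old.keys()),
        # Identifiers only in the existing data are no longer relevant.
        "removed": sorted(old.keys() - new.keys()),
        # Identifiers in both, where any attribute differs, need updating.
        "updated": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


existing = [{"uuid": "a", "name": "A"}, {"uuid": "b", "name": "B"}, {"uuid": "c", "name": "C"}]
incoming = [{"uuid": "b", "name": "B"}, {"uuid": "c", "name": "C2"}, {"uuid": "d", "name": "D"}]
changes = compare_by_identifier(existing, incoming)
```

The three lists map onto the three maintenance tasks described earlier: adding points, removing points, and updating data about points already in the database.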
Profile configuration is generally carried out by Wazimap Support.
Please email support@wazimap.co.za to request changes.
Public profiles can be viewed by anyone on the internet.
Private profiles can only be viewed by users who are assigned permission for that profile. Users have to log in to view that profile.
To create a Profile Collection, scroll down to the POINTS section again, and this time click on the +Add button next to Profile Collections (see Figure 10, below).
On the following page, do the following (refer to Figure 11, below):
Select (from the dropdown list) the Profile which should be associated with your Profile Collection;
Select (from the dropdown list) the Theme you created;
Select (from the dropdown list) the Collection (point collection) you uploaded;
Give your Profile Collection an appropriate Label (this will be displayed on Wazimap NG); and
Scroll down and click Save in the bottom right hand corner.
Currently the Icon and Colour functionalities are not working. These fields can be left empty.
As shown earlier in Figure 1, it is possible to filter points based on their attributes (e.g., Responsible Institution). To add filters to your Point Collection, include the attributes you want to filter your points by in the Configuration panel using a filterable_fields array (see Figure 12, below).
Any values in the filterable_fields array that are not attributes of your Point Collection will be ignored.
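As a rough illustration, the configuration text can be generated from a list of attribute names. The JSON shape below (an object with a single filterable_fields key holding a list) is an assumption based on the description above, since the figure itself is not reproduced here; the helper name build_configuration and the sample attribute names other than Responsible Institution are hypothetical.

```python
import json


def build_configuration(filterable_fields, point_attributes):
    """Return JSON text listing only the filters that exist as point attributes."""
    # Wazimap NG ignores unknown values anyway; dropping them here simply
    # makes typos in attribute names easier to spot before saving.
    kept = [f for f in filterable_fields if f in point_attributes]
    return json.dumps({"filterable_fields": kept}, indent=2)


attributes = ["name", "Responsible Institution", "Status"]
print(build_configuration(["Responsible Institution", "Colour"], attributes))
```

Comparing the printed list with what was requested shows which filter names did not match a column in the uploaded .csv file.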
In addition to adding filters, you can define the field type of columns in your Profile Collection to render data according to HTML code. This is particularly useful for linking to additional, external information (e.g., linking to the source of your points).
This requires that the HTML code be included in your dataset when uploading your Point Collection (see, for example, the code used in Figure 13, below). To define the field type, include the attribute (column) you want to define in the Configuration panel using a field_type array (refer to Figure 12, above).
The HTML code for adding links is:
<a href='url' target='_blank'>name</a>
In Figure 13, above, the url is https://ws.dws.gov.za/IRIS/dashboard_waste.aspx, and the name is South African Department of Water & Sanitation. On Wazimap NG, in the More Info section of each point’s tooltip, this data will be displayed as hyperlinked text (see Figure 14, below).
In addition to the href HTML attribute, the following are also allowed: class, target, data-*, and style.
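If the link cells are generated by a script rather than typed by hand, a small helper can build the markup in the pattern given above. This is a sketch only: make_link is a hypothetical name, and escaping special characters with the standard html module is a precaution added here, not something the handbook prescribes.

```python
from html import escape


def make_link(url, name):
    """Build the anchor markup for one cell, escaping both the url and the name."""
    return "<a href='{}' target='_blank'>{}</a>".format(
        escape(url, quote=True), escape(name)
    )


cell = make_link(
    "https://ws.dws.gov.za/IRIS/dashboard_waste.aspx",
    "South African Department of Water & Sanitation",
)
```

The resulting text is what would be placed in the relevant column of the .csv file, with the ampersand in the name written as an HTML entity.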
---
If all steps were followed correctly, your point collection will now display on your Wazimap NG Profile (sometimes a hard refresh of the Wazimap NG page is required for changes to reflect). The sections that follow offer some additional information for displaying point collections on your Wazimap NG profile.
Label data sources as {{ dataset title }} - {{ organisation }} so that users can more easily find the right data source.
Link to the official page about the actual dataset, if one exists, otherwise to the homepage of the source organisation.
Prefer writing source names in full, rather than abbreviated, e.g. Our World in Data rather than OWID.
Check whether the source has a preferred way to be cited and try to use that.
Select Value as the default between Value/Percentage presentation in the chart/table
Disable the Value/Percentage toggle
Use "" as the format string, i.e. don't use the SI unit formatting, because then 0.789 will look like 789m and people will think it means millions.
If you're going to use format strings that format with SI units, ensure you configure no more than 2 decimal places or 3 significant digits, otherwise you could end up with numbers presented as 12.345k, and people will misread the . as a thousand separator due to the three decimal places and think this is 12 345 000.
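The two pitfalls above are easy to reproduce. The following Python function is a rough stand-in for SI-prefix formatting, written only to show where the confusing strings come from; it is not the formatter Wazimap uses, and si_format is a hypothetical name.

```python
import math

# Only the prefixes needed for this illustration.
SI_PREFIXES = {-1: "m", 0: "", 1: "k", 2: "M", 3: "G"}


def si_format(value, significant_digits):
    """Format a number with an SI prefix and a fixed number of significant digits."""
    if value == 0:
        return "0"
    # Pick the power of 1000 the value falls in, limited to the prefixes above.
    exponent = math.floor(math.log10(abs(value)) / 3)
    exponent = max(-1, min(3, exponent))
    scaled = value / 10 ** (3 * exponent)
    # Decimal places left over after the digits before the decimal point.
    decimals = max(0, significant_digits - 1 - math.floor(math.log10(abs(scaled))))
    return f"{scaled:.{decimals}f}{SI_PREFIXES[exponent]}"


print(si_format(0.789, 3))   # 789m, where m means milli, not millions
print(si_format(12345, 5))   # 12.345k, easily misread as 12 345 thousand
print(si_format(12345, 3))   # 12.3k, which is harder to misread
```

Limiting the output to three significant digits avoids the three-decimal case that looks like a thousands separator.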
Point themes and Profile Collections are vital in displaying data using the Point Mapper on Wazimap. Themes are similar to the Indicator Categories in the Data Mapper while Profile Collections are similar to the Profile Indicators of the Data Mapper.
Themes can be created and managed by navigating to Themes under the Points section of the Wazimap-NG admin page.
Select which profile you want to create the theme for
Provide a name for the theme
Select an appropriate icon to display
Theme colour is currently fixed according to the order of themes:
Theme colour will be more configurable in a future update. Please let us know what your needs are.
Profile Collections can be created and managed by navigating to Profile Collections under the Points section of the Wazimap-NG admin page.
Select which profile you would like the Profile Collection to be associated with
Select which Theme you would like the new Profile Collection to fall under
Select the Point Collection that has the data you would like to be represented by the Profile Collection
Decide on the Label for the Profile Collection (this will be the text that is displayed in the front-end)
NOTE: Currently the Profile Collection Icon and Colour functionality is not working. These fields can be left empty.
It is important that users can tell whether a value in an indicator is zero, or missing.
Presenting the gap in the data is often just as helpful or important as presenting the data itself. It would also often misrepresent the facts to present a gap in the data as if it is zero.
On the other hand, it can be very inefficient to try to express every possible zero in a dataset. The simplest way to express zeros in a dataset is to simply have a row for every possible combination of attributes for each geographic area represented in the dataset. That would result in incredibly large datasets, where the value (Count column) of most of the rows would often tend to be zero.
Wazimap tries to support minimally-sized datasets by making some assumptions about the data, while also trying to support the presentation of missing data.
Wazimap presents a value for a geographic area as "missing" when there are no rows of data for that geographic area in the dataset backing an indicator. As soon as there is at least one row of data for a geography in an indicator, all subindicator and filter combinations will be presented as zero instead of missing. See below.
On a choropleth plotting hospital beds per 1,000 people in countries in Africa, it would be wrong to plot countries with missing data as having zero beds.
When a subindicator does not occur in the data for a given geographic area, it will be excluded from the chart.
For example, when years are missing from an indicator on misspending, those years are not shown for that geography.
This behaviour is important in instances where a subindicator group can have a very large variety of different values, and only a small number are applicable to a specific geographic area. For example, election results showing the votes received by a party should only show the parties that contested that geographic area. If subindicators are included that did not contest that area, the chart would include hundreds of irrelevant items.
When a dataset does not contain explicit zero rows for a certain combination of subindicators, and a filter is applied excluding the available data, the subindicators without data are excluded from the chart.
When a row in the dataset has the Count value of zero, it is of course presented as zero.
In the data mapper, when there is some data row for a given geographic area in an indicator, every combination of filters would be presented as zero even if there was no row in the dataset for that combination of attribute values.
In the example below, the number of white Tshivenda speakers is explicitly presented as zero, even though there is no such row in the dataset. The data mapper assumes that the data for a geography is complete if there is some row for that geography, whether for a different subindicator, or for the selected subindicator but with a different combination of filters.
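The missing-versus-zero rule described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not Wazimap's actual implementation: the function name `data_mapper_value`, the `Language` and `Race` column names, and the counts are all made up for the example.

```python
def data_mapper_value(rows, geography, filters):
    """Return None for "missing", otherwise the count for the filter selection.

    Hypothetical helper: mirrors the behaviour described in this handbook,
    it is not taken from the Wazimap codebase.
    """
    geo_rows = [row for row in rows if row["Geography"] == geography]
    if not geo_rows:
        # No rows at all for this geography: present it as missing.
        return None
    matching = [
        row for row in geo_rows
        if all(row.get(column) == value for column, value in filters.items())
    ]
    # At least one row exists for the geography, so the data is assumed to be
    # complete and an absent combination is presented as zero.
    return sum(row["Count"] for row in matching)


# Made-up rows for illustration only.
rows = [
    {"Geography": "ZAF", "Language": "Tshivenda", "Race": "Black African", "Count": 120},
    {"Geography": "ZAF", "Language": "English", "Race": "White", "Count": 45},
]

print(data_mapper_value(rows, "ZAF", {"Language": "Tshivenda", "Race": "White"}))
print(data_mapper_value(rows, "BWA", {"Language": "Tshivenda", "Race": "White"}))
```

The first call finds rows for the geography but none for the selected combination, so it yields zero. The second finds no rows for the geography at all, so it yields `None`, which stands in for "missing".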
Because of the assumptions Wazimap makes about your data, as described above, we don't currently support the following. If there is demand, we can consider adding support, perhaps by making the behaviour configurable per dataset or indicator. We can potentially also help you shape your datasets and indicators to achieve your objectives within the above behaviour.
The rich data view currently just hides rows without data for a given subindicator or filter selection. It does not support showing a label on a chart axis with a blank space to make the lack of data for that indicator visually explicit.
The data mapper currently does not support partial data for a geography. If an indicator has one datum for that geography, all combinations of subindicator and filters will show zeros for that geography where other values are not available.
In these cases, it can be helpful to indicate in the description of your dataset that the data may not be complete, and when it was last updated.
As a workaround, you can show gaps in an entire subindicator by separating a dataset into a dataset and indicator per subindicator. The indicators where no values are available for a geography will then present that geography as blank (grey).
Below are the Geography Codes that should be used for .csv files uploaded to Wazimap Profiles. Be sure to use the Geography Codes corresponding to your Wazimap Profile's Geography Hierarchy.
Algeria: DZA
Angola: AGO
Benin: BEN
Botswana: BWA
Burkina Faso: BFA
Burundi: BDI
Cape Verde: CPV
Cameroon: CMR
Central African Republic: CAF
Chad: TCD
Comoros: COM
Congo: COG
Djibouti: DJI
Democratic Republic of Congo: COD
Egypt: EGY
Equatorial Guinea: GNQ
Eritrea: ERI
eSwatini: SWZ
Ethiopia: ETH
Gabon: GAB
Gambia: GMB
Ghana: GHA
Guinea: GIN
Guinea-Bissau: GNB
Ivory Coast: CIV
Kenya: KEN
Lesotho: LSO
Liberia: LBR
Libya: LBY
Madagascar: MDG
Malawi: MWI
Mali: MLI
Mauritania: MRT
Mauritius: MUS
Mayotte: MYT
Morocco: MAR
Mozambique: MOZ
Namibia: NAM
Niger: NER
Nigeria: NGA
Rwanda: RWA
Réunion: REU
Saint Helena: SHN
Sao Tome and Principe: STP
Senegal: SEN
Seychelles: SYC
Sierra Leone: SLE
Somalia: SOM
South Africa: ZAF
South Sudan: SSD
Sudan: SDN
Tanzania: TZA
Togo: TGO
Tunisia: TUN
Uganda: UGA
Western Sahara: ESH
Zambia: ZMB
Zimbabwe: ZWE
These are the steps needed to generate datasets for Africa Data Hub (ADH).
The ADH scripts can be found at https://github.com/OpenUpSA/wazimap-adh-data under the folder COVID Countries and Africa Admin 1.
Download owid-covid-latest.csv from https://github.com/owid/covid-19-data/tree/master/public/data
Download from https://github.com/dsfsi/covid19za/tree/master/data the following files: covid19za_provincial_cumulative_timeline_confirmed.csv, covid19za_provincial_cumulative_timeline_deaths.csv, covid19za_provincial_cumulative_timeline_recoveries.csv
Download from https://data.humdata.org/organization/hera-humanitarian-emergency-response-africa all the CSV files, across the various countries, whose names end with Coronavirus (Covid-19) Subnational, e.g. Niger: Coronavirus (Covid-19) Subnational. All the datasets from this source should be saved in a folder named HERA.
For each CSV file in the HERA folder, change the column names from
['CONTAMINES', 'DECES', 'GUERIS', 'CONTAMINES_FEMME', 'CONTAMINES_HOMME', 'CONTAMINES_GENRE_NON_SPECIFIE']
These datasets should be saved under the folder wazimap-adh-data-main/COVID Countries and Africa Admin 1. After downloading the datasets, run the scripts in the following order:
Admin0_dataTransformation_v2.ipynb
Africa_Admin1_dataTransformation.ipynb
Data_Aggregation.ipynb
Data_Aggregation.ipynb will generate cases_monthly and death_monthly.
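Before running the notebooks, it can help to confirm that the downloaded files are where the scripts expect them. The sketch below is a hypothetical pre-flight check, not part of the ADH repository. It assumes the HERA folder sits inside the data folder named above, and the function name `missing_inputs` is made up for the example.

```python
from pathlib import Path

# File names as listed in the download steps above.
REQUIRED_FILES = [
    "owid-covid-latest.csv",
    "covid19za_provincial_cumulative_timeline_confirmed.csv",
    "covid19za_provincial_cumulative_timeline_deaths.csv",
    "covid19za_provincial_cumulative_timeline_recoveries.csv",
]


def missing_inputs(data_dir):
    """Return the names of expected inputs that are not present in data_dir."""
    data_dir = Path(data_dir)
    missing = [name for name in REQUIRED_FILES if not (data_dir / name).is_file()]
    # Assumption: the HERA folder lives inside the data folder.
    hera = data_dir / "HERA"
    if not hera.is_dir() or not any(hera.glob("*.csv")):
        missing.append("HERA/*.csv")
    return missing


problems = missing_inputs("wazimap-adh-data-main/COVID Countries and Africa Admin 1")
print("Missing inputs:", problems if problems else "none")
```

An empty list means every expected input was found and the notebooks can be run in the order given above.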
Cells and columns in CSV files don't have well-defined types, so programs reading those files generally infer the type from the values.
This can be a problem when opening a file with phone numbers, which often start with a zero and look like numbers to programs reading CSVs.
When reading a CSV file, see if you can specify that such columns should be read as Text rather than letting the program infer the type.
After reading, if the columns were read as text, the zero-prefixes will remain. If the program read it as numbers, the zero prefix will have been lost.
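The difference between reading a column as text and as a number can be seen in a short Python sketch. It uses only the standard library `csv` module, which returns every cell as text. The file contents, column names, and phone number are made up for illustration. If you use pandas instead, the `dtype` argument of `read_csv` (for example `dtype={"phone": str}`) serves the same purpose.

```python
import csv
import io

# Made-up file contents for illustration.
raw = "name,phone\nClinic A,0215551234\n"

# The csv module keeps every cell as text, so the leading zero survives.
rows = list(csv.DictReader(io.StringIO(raw)))
as_text = rows[0]["phone"]

# Converting the same cell to a number drops the leading zero.
as_number = int(rows[0]["phone"])

# Writing the text values back out preserves the zero in the saved file.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "phone"])
writer.writeheader()
writer.writerows(rows)
saved = out.getvalue()

print(as_text)    # 0215551234
print(as_number)  # 215551234
```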
To check that the file was saved correctly, you can open it in a text editor like Notepad to check that the zeros are still there:
In our context, a dataset is a set of data with a geography attribute, a count attribute, and one or more dimension attributes. A dataset can be constructed from one or more files structured in a specific format (see Creating Datasets) and uploaded to the dataset in Wazimap. Datasets are the data source that Variables are created from. The platform allows for multiple datasets from various sources to be uploaded and displayed to users. Datasets are uploaded and managed by a Data Administrator.
In Wazimap a Subindicator is what we call an attribute value in one of the classifying columns or dimensions of a dataset.
In OLAP terms, a subindicator corresponds to a member of a dimension.
In statistics, a subindicator corresponds to a category of a categorical variable.
Subindicators are so named because they represent the choices offered when plotting a choropleth.
A subindicator group represents the set of subindicators of a particular attribute or column in the original dataset.
In OLAP terms, a subindicator group corresponds to a dimension.
It corresponds directly to the columns in a dataset, other than the Geography and Count columns.
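The relationship between dataset columns, subindicator groups, and subindicators can be sketched as follows. This is an illustration under stated assumptions rather than Wazimap code: the function name `subindicator_groups` is hypothetical, and the `Age` and `Gender` columns and their values are made up. Only the `Geography` and `Count` column names come from the description above.

```python
import csv
import io


def subindicator_groups(csv_text):
    """Map each subindicator group (column) to its subindicators (distinct values)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every column other than Geography and Count is a subindicator group.
    group_names = [c for c in reader.fieldnames if c not in ("Geography", "Count")]
    groups = {name: [] for name in group_names}
    for row in reader:
        for name in group_names:
            # Each distinct value in the column is a subindicator.
            if row[name] not in groups[name]:
                groups[name].append(row[name])
    return groups


# Made-up dataset for illustration.
sample = (
    "Geography,Age,Gender,Count\n"
    "ZAF,15-19,Female,10\n"
    "ZAF,15-19,Male,12\n"
    "ZAF,20-24,Female,9\n"
)

print(subindicator_groups(sample))
# {'Age': ['15-19', '20-24'], 'Gender': ['Female', 'Male']}
```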
Variables are datapoints from which profile indicators are created, and are created by the Data Administrator. Multiple variables can be created from the same dataset.
Most of the time, a variable simply exposes a subindicator group for use as categories in an indicator.
Variables exist for more complicated cases, where percentages need to be calculated using a different population than simply the total of all subindicators. This is done by associating a universe with the variable.
Universe refers to the population to which the indicators are applied. There can be multiple universes if required and it can also be left blank to apply to the entire population.
Profile indicators are created by the Profile Administrator and are presented to the user on the website. Indicators belong to categories (e.g. Demographics) and can belong to sub-categories (e.g. General Population). In addition, they can also have sub-indicators (e.g. Age could take the individual age brackets as sub-indicators).
Key metrics are values of significance as decided upon by the Profile Administrator. These are used to showcase and callout highlighted values both in the rich data (profile) view, as well as on the map view. Key metrics can be shown as a percentage or absolute numbers as defined by the Profile Administrator.
The Data Mapper provides an interface for users to plot indicators on the map. Only indicators available for plotting are shown, and these might change depending on the geography level and the data available for that level. Please note that all indicators are shown in the Rich Data View.
What was once referred to as a profile view on Wazimap is now the Rich Data View, which provides charted exploration of the available data indicators. This view also reveals the source of each indicator along with a description for the categories and indicators (optional and set by Profile Administrators). This view also allows a chart to be downloaded (to be used elsewhere) and will soon allow for a chart to be embedded and for data to be downloaded. The Rich Data View also supports a print-friendly view allowing for easier sharing and dissemination.
The point menu houses point data themes and collections and allows for these to be overlaid on the map.
Point data refers to coordinate based data rather than a dataset shaped within a geographical boundary. This allows for points to be overlaid on various other indicators.
Ensure any data pertaining to people are rounded to the nearest whole number.
Do not use "M" and "k". Use full figures with a thousands separator.
Examples:
data entry refers to 104678.9 people --> this should be displayed as 104,679.
data entry refers to 109987789.49 people --> this should be displayed as 109,987,789.
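The rounding and separator rules above can be sketched in Python. The function name `format_people_count` is hypothetical. The sketch assumes that "nearest whole number" means conventional round-half-up, which is why it uses `decimal` rather than the built-in `round` (which rounds halves to the nearest even number).

```python
from decimal import Decimal, ROUND_HALF_UP


def format_people_count(value):
    """Round to the nearest whole person and add thousands separators."""
    # Assumption: halves round up (0.5 -> 1), unlike Python's built-in round().
    whole = Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return f"{int(whole):,}"


print(format_people_count(104678.9))      # 104,679
print(format_people_count(109987789.49))  # 109,987,789
```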
Prefer full financial years, as in 2016-2017 rather than 2017 or 2016-17. This is because:
Most people don't know that in the municipal sphere (unlike national) 2017 means 2016-2017.
Many people won't realise that the -17 means the next year. Writing it in full is much easier to understand correctly the first time.
Order financial years in subindicator groups in reverse chronological order, that is, 2018-2019 and then 2017-2018. People are most often interested in the latest financial year.
Always mark the financial year column as non-aggregatable. Summing over financial years usually doesn't make sense for an arbitrary number of financial years and can easily lead to surprises for users.
When the financial year column is not the variable, add a default filter to match the latest financial year. Marking financial year as non-aggregatable will already add a filter for it, but adding default filter configuration will ensure that a sensible value is selected in the filter.
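When preparing a dataset, the labelling and ordering conventions above can be applied with a small helper. The sketch below is illustrative only and the function names are hypothetical. It assumes the municipal convention described above, where a bare 2017 means 2016-2017, so check that this holds for your source before using it.

```python
import re


def full_financial_year(label):
    """Expand a financial year label to the full YYYY-YYYY form."""
    label = label.strip()
    if re.fullmatch(r"\d{4}-\d{4}", label):
        return label
    short = re.fullmatch(r"(\d{4})-(\d{2})", label)
    if short:
        start = int(short.group(1))
        if (start + 1) % 100 != int(short.group(2)):
            raise ValueError(f"Years are not consecutive: {label!r}")
        return f"{start}-{start + 1}"
    if re.fullmatch(r"\d{4}", label):
        # Assumption: municipal convention, a bare year names the END year.
        end = int(label)
        return f"{end - 1}-{end}"
    raise ValueError(f"Unrecognised financial year: {label!r}")


def latest_first(labels):
    """Return the distinct full labels in reverse chronological order."""
    return sorted({full_financial_year(label) for label in labels}, reverse=True)


print(latest_first(["2017", "2018-19", "2017-2018"]))
# ['2018-2019', '2017-2018', '2016-2017']
```

The first item of the ordered list is also the natural choice for the default filter value when the financial year column is not the variable.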
The reasoning for our preferred age bands is as follows:
Ideally bands should align with voting age
Ideally bands should align across datasets e.g. demographics should align with election data
Perhaps 0-17, 18-19, 20-29, 30-39, 40-49, 50-59, 60+
Label data sources as {{ dataset title }} - {{ organisation }} so that users can more easily find the right data source.
as opposed to