# Creating Datasets

{% embed url="<https://www.youtube.com/watch?v=W1x_oNzumAE>" %}

## Preparing the dataset

Before a dataset can be uploaded, the data needs to be cleaned and shaped into the correct format. As a rule, the more disaggregated the dataset, the better: a single disaggregated dataset can be (re)used for multiple indicators and also allows for multivariable analysis.

The system accepts files in CSV, XLS and XLSX formats. Each file must follow a specific structure and always contain the following columns:

```
Geography,Count
```

Valid values for the `Geography` column are:

* the country code (ZA)
* province codes (GAU, LIM, WC, etc.)
* Municipal Demarcation Board codes (CTP, WC024)
* ward IDs (10204020 for Stellenbosch Ward 20 in the 2016 demarcation)
* lower-level numerical geography codes (e.g. 160001, 175005)

{% hint style="warning" %}
The field columns sit between the `Geography` and `Count` columns. These could be `Age`, `Race`, `Education level`, etc. See the example below:
{% endhint %}

| Geography | Age | Race          | Child Ever Born   | Count |
| --------- | --- | ------------- | ----------------- | ----- |
| ZA        | 16  | Black African | never given birth | 1     |
| ZA        | 16  | Coloured      | never given birth | 2     |
| ZA        | 19  | Black African | Unspecified       | 5     |
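As a rough sketch of the shaping step, assuming the source data is a list of individual records, the rows can be grouped and counted with the Python standard library. The `records` sample and the `to_count_rows` and `to_csv` helpers are hypothetical names used for illustration; they are not part of Wazimap.

```python
import csv
import io
from collections import Counter

# Hypothetical person-level records; real data would come from your own source.
records = [
    {"Geography": "ZA", "Age": "16", "Race": "Black African", "Child Ever Born": "never given birth"},
    {"Geography": "ZA", "Age": "16", "Race": "Coloured", "Child Ever Born": "never given birth"},
    {"Geography": "ZA", "Age": "16", "Race": "Coloured", "Child Ever Born": "never given birth"},
]


def to_count_rows(records, fields):
    """Count how often each (Geography, *fields) combination occurs."""
    keys = ["Geography", *fields]
    counts = Counter(tuple(record[key] for key in keys) for record in records)
    header = keys + ["Count"]  # Geography first, Count last, fields in between
    rows = [list(combo) + [count] for combo, count in sorted(counts.items())]
    return header, rows


def to_csv(header, rows):
    """Serialise the header and rows as CSV text ready to save and upload."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header, rows = to_count_rows(records, ["Age", "Race", "Child Ever Born"])
csv_text = to_csv(header, rows)
```

Keeping every field as its own column, rather than pre-aggregating, preserves the disaggregation that lets one dataset serve several indicators.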

{% hint style="info" %}
Column name requirements:

* Must be unique when all column names are converted to lower case
* Must start with a letter
* Can contain letters, numbers, and spaces
  {% endhint %}
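The column name requirements above can be checked before uploading. The following is a minimal sketch under the assumption that "letters" means ASCII letters; `check_column_names` is a hypothetical helper and does not reflect how Wazimap itself validates files.

```python
import re

# Assumption: a name starts with an ASCII letter, followed by letters, digits or spaces.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9 ]*")


def check_column_names(columns):
    """Return a list of human-readable problems; an empty list means no issues were found."""
    problems = []
    seen = set()
    for name in columns:
        if not NAME_PATTERN.fullmatch(name):
            problems.append(f"invalid name: {name!r}")
        lowered = name.lower()
        if lowered in seen:
            # Names must stay unique after lower-casing, so "age" and "Age" clash.
            problems.append(f"duplicate name (case-insensitive): {name!r}")
        seen.add(lowered)
    return problems


ok = check_column_names(["Geography", "Age", "Race", "Child Ever Born", "Count"])
bad = check_column_names(["Geography", "age", "Age", "2nd language", "Count"])
```

Here `ok` is empty, while `bad` reports the case-insensitive duplicate and the name that starts with a digit.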

Once the dataset has been sourced and shaped, it is ready to upload.

## Uploading the dataset

Log into the backend administration section of the website, navigate to **`Datasets`** and click **`Add`**. Give the dataset file a meaningful name, select the applicable geographical boundaries and capture the source information (this is displayed to users to help them understand where the data came from). Then upload the file from your machine.

Uploading the file kicks off a background task to process the file.

The system will alert you once this is complete. You may also check on the status of the job by viewing the queue **`Django Q > queued tasks`**.

Once the file has been processed, you can proceed with [creating indicators](/wazimap-ng/profile-admin/uploading-datasets.md).

## Dataset permissions and sharing

Datasets can be marked `Public` or `Private`.

**Public datasets**, and variables derived from them, can be used on any profile in Wazimap. This enables reuse of valuable datasets without each profile having to source and upload the data separately.

**Private datasets**, and variables derived from them, can only be used on the profile they belong to.

## Qualitative datasets

Qualitative datasets can be uploaded and used to create qualitative indicators. A qualitative dataset must identify the relevant geography and provide the content for it. The content can be plain text or HTML. The example below shows how the dataset should be laid out.

| Geography | Content                           |
| --------- | --------------------------------- |
| EC        | This is qualitative data          |
| WC        | \<p>This is qualitative data\</p> |
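If the qualitative file is generated programmatically, a CSV writer takes care of quoting HTML content that contains commas or quotation marks. This is a small sketch using Python's standard `csv` module; `qualitative_csv` and the sample entries are hypothetical.

```python
import csv
import io


def qualitative_csv(entries):
    """Build CSV text with a Geography column and a Content column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Geography", "Content"])
    for geography, content in entries:
        # The writer quotes fields containing commas or quotes automatically.
        writer.writerow([geography, content])
    return buffer.getvalue()


qualitative_text = qualitative_csv([
    ("EC", "This is qualitative data"),
    ("WC", "<p>This is qualitative data, with a comma</p>"),
])
```

Reading `qualitative_text` back with `csv.reader` returns the HTML content unchanged as a single field.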

In the content type dropdown menu, select "Qualitative". Then continue to [create a variable](/wazimap-ng/profile-admin/uploading-datasets.md) in the same way as you would for a quantitative dataset.

