Procurement Data Crash Course

Web scraping by inspecting network traffic

How to scrape data from the South African E-Tender portal


Last updated 2 years ago

As discussed in previous lessons, tender notices and supporting documents issued by public bodies should be added to the central E-Tender portal run by national government (https://www.etenders.gov.za/). The data uploaded there is hard to work with for two reasons. It is incomplete, because many important pieces of information are stored in attached documents rather than in the site database, and you cannot easily download it for analysis.

In this topic, we will show you how to address the second of those challenges by scraping data from the website using its own search query URLs.

There's some terminology to understand.

  • Application Programming Interface (API): An API is used when two software applications want to talk to each other. In this case, the API connects your web browser to the E-Tenders database. Database search queries are passed in the part of the URL that follows the question mark in your browser's address bar.

  • JavaScript Object Notation (JSON): Data is passed from the database to your browser in response to an API query, in a data format known as JSON. JSON is a little bit similar to a CSV file, but with the data arranged in a different format to allow more flexibility.
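To make these two terms concrete, here is a minimal Python sketch. It splits one of the portal's request URLs at the question mark to show the query parameters, then parses a small piece of JSON. The sample records and their field names (`department`, `description`, `documents`) are invented for illustration and are not the portal's actual fields.

```python
import json
from urllib.parse import urlsplit, parse_qs

# A request URL of the kind the portal used at the time of writing.
url = "https://www.etenders.gov.za/Home/TenderOpportunities/?status=1&_=1654507040789"

# Everything after the question mark is the query string.
query = urlsplit(url).query
params = parse_qs(query)  # maps each parameter name to a list of values

# A made-up JSON response: a list of records, one of which nests a list.
# These field names are hypothetical, not the portal's real ones.
sample_json = """
[
  {"department": "Health", "description": "Supply of masks",
   "documents": ["notice.pdf", "spec.pdf"]},
  {"department": "Transport", "description": "Road repairs", "documents": []}
]
"""
records = json.loads(sample_json)

print(query)
print(params["status"])
print(records[0]["documents"])
```

The nested `documents` list is the kind of flexibility JSON offers: a value can itself be a list or another object, which a flat table of rows and columns cannot hold in a single cell without some conversion.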

Step 1: Visit the E-Tender portal

Open the E-Tender portal (https://www.etenders.gov.za/) and click browse opportunities.

Notice there are four categories to choose from: Currently Advertised, Awarded, Closed and Cancelled tenders.

What if you wanted to create a list of all currently advertised tenders that you can import into your spreadsheet software to analyse?

Step 2: Find the API links

If you click on Currently Advertised, you should see a table appear in the middle of the page with tender information. This table is populated by an API call, and you can find the specific link by pressing F12 to open your browser's Inspect function (the developer tools).

You can also hover your mouse over the table, right-click and select Inspect.

In the Inspect window, click on the Network tab. Your screen should look something like this.

You may need to reload the page at this point.

Now click the Fetch/XHR button to filter the output of this screen, then click on the result. You should see the API request being sent to collect data.

Double click on the result under Name. Its name should start with ?status=1&_= and finish with a long number. That number looks like a timestamp in milliseconds, which is commonly added to stop the browser caching the response, rather than an identifier for a record. Double clicking opens the URL that’s returning data in a new browser tab, where you should see something like this.

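This is the JSON output that is sent in response to the API request. Here are examples of the API requests at the time of writing.

  • Currently advertised: https://www.etenders.gov.za/Home/TenderOpportunities/?status=1&_=1654507040789

  • Awarded: https://www.etenders.gov.za/Home/TenderOpportunities/?status=2&_=1654507040789

  • Closed: https://www.etenders.gov.za/Home/TenderOpportunities/?status=3&_=1654507040789

  • Cancelled: https://www.etenders.gov.za/Home/TenderOpportunities/?status=4&_=1654507040789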
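The request URLs for the four categories appear to differ only in the `status` value, so they can be built programmatically. The sketch below is one way to do that in Python, under stated assumptions: only `status=1` (Currently advertised) is confirmed by the walkthrough, and the other three codes are inferred from the order of the example URLs. The `_` value is treated here as a millisecond timestamp, which is an interpretation rather than something the portal documents. The helper names are made up for this example.

```python
import json
import time
from datetime import datetime, timezone
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.etenders.gov.za/Home/TenderOpportunities/"

# Assumption: codes 2-4 are inferred from the example URLs, not documented.
STATUS_CODES = {
    "Currently advertised": 1,
    "Awarded": 2,
    "Closed": 3,
    "Cancelled": 4,
}


def build_url(category, timestamp_ms=None):
    """Build the request URL for one tender category."""
    if timestamp_ms is None:
        # Assumption: "_" is the current time in milliseconds.
        timestamp_ms = int(time.time() * 1000)
    query = urlencode({"status": STATUS_CODES[category], "_": timestamp_ms})
    return f"{BASE_URL}?{query}"


def decode_timestamp(timestamp_ms):
    """Interpret a millisecond timestamp as a UTC date and time."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


def fetch_json(url):
    """Download and parse the JSON at url (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


print(build_url("Awarded", 1654507040789))
print(decode_timestamp(1654507040789).date())
```

`fetch_json` is defined but not called here, because it needs a live connection and the portal's URLs or response format may have changed since the time of writing. If it still works, `fetch_json(build_url("Currently advertised"))` would return the same data you see in the browser tab.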

Step 3: Getting the JSON into a spreadsheet

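Once you have the JSON data in your browser, you can save it onto your local machine. Just right click and choose Save in the menu. Save as “your_name.json”.

Spreadsheet software can't read JSON files directly, however, so next you'll need to convert this data to a CSV file. Our favourite tool for this conversion is https://csvjson.com/.

On that site, click JSON to CSV, then click browse file to upload the JSON file you just saved and wait for it to upload. Then click Convert.

After the conversion has taken place, you'll be able to click download to save the CSV file of all the currently advertised tenders on your desktop.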
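If you prefer not to upload the file to a website, the same conversion can be done locally. This is a minimal sketch in Python, assuming the saved file holds a list of JSON objects. The real response may wrap that list in an outer object, in which case you would pick out the list first. The function name and the sample field names are hypothetical.

```python
import csv
import io
import json


def records_to_csv(records):
    """Convert a list of JSON objects (dicts) into CSV text."""
    # Collect every key that appears, in first-seen order, for the header row.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Nested lists or objects do not fit in one cell, so keep them as JSON text.
        row = {
            key: json.dumps(value) if isinstance(value, (list, dict)) else value
            for key, value in record.items()
        }
        writer.writerow(row)  # keys missing from a record become empty cells
    return buffer.getvalue()


# Made-up records standing in for the contents of the saved JSON file.
sample = json.loads(
    '[{"department": "Health", "description": "Supply of masks"},'
    ' {"department": "Transport", "contact": "info@example.org"}]'
)
csv_text = records_to_csv(sample)
print(csv_text)
```

To use this on your own download, you would load the records with `json.load` from the file you saved and write the returned text to a file ending in `.csv`.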

Step 4: Import the data to Google Sheets

Importing a CSV into Google Sheets is easy. Create a new spreadsheet and call it Currently Advertised. Now, under the File menu, choose Import, select your CSV file then Replace Current Sheet and finally Import data.

Now you have a spreadsheet with the details of all currently open tenders on the E-Tender portal, including department name, contact details and a brief description. You can do the same for closed, cancelled and awarded tenders too.

Links referenced on this page:

  • E-Tender portal run by national government: https://www.etenders.gov.za/
  • Currently advertised: https://www.etenders.gov.za/Home/TenderOpportunities/?status=1&_=1654507040789
  • Awarded: https://www.etenders.gov.za/Home/TenderOpportunities/?status=2&_=1654507040789
  • Closed: https://www.etenders.gov.za/Home/TenderOpportunities/?status=3&_=1654507040789
  • Cancelled: https://www.etenders.gov.za/Home/TenderOpportunities/?status=4&_=1654507040789
  • JSON to CSV converter: https://csvjson.com/