In this post I would like to share our (@solidlines) experience connecting KoboToolbox and DHIS2. This experience was previously shared in a weekly call of the DHIS2 Integration Working Group.
Context
An organization plans to collect surveys in KoboToolbox (a free toolkit for collecting data). However, they also want the surveys in a DHIS2 instance. To avoid entering the data twice, the idea is to transfer the survey information automatically from Kobo to DHIS2. The destination in DHIS2 is a single event program.
The Kobo data is available through the Kobo API, but some data curation was needed along the way.
As a requirement, the integration process should run every 2 hours.
In addition, notification emails would be sent for different purposes (the process has started, there was a problem with the DHIS2 payload…) in order to monitor the automatic process.
Approach/Solution
In order to accomplish this, an ETL (Extract, Transform, and Load) process was set up using Apache NiFi (an easy-to-use, powerful, and reliable system to process and distribute data). The main steps of the process are:
- Retrieve data from Kobo using the API (CSV) every 2 hours. Each row is a survey submitted in Kobo (see the first sketch after this list).
- Internal pre-processing (data curation):
  - Select only the latest surveys to be processed. There was an agreement to process only the surveys of the last 10 days.
  - Organisation unit mapping between the Kobo OU code system and DHIS2, using an external CSV that needs to be kept up to date (see the second sketch after this list).
- Internal: split the single CSV file (which contains multiple surveys) into multiple JSON files, the future payloads.
- Generate the payload to be sent to DHIS2, using JOLT (a JSON-to-JSON transform library). This step also includes a value mapping process, mainly for DHIS2 optionSets (see the third sketch after this list).
- Double-check whether the survey was already uploaded (using the DHIS2 API).
- If the survey was not previously uploaded, send the data (event payload) to DHIS2. Both of these steps appear in the last sketch below.
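
To make the steps above more concrete, here are a few minimal Python sketches of what the flow does. The real implementation runs inside Apache NiFi processors, so these are illustrations only. First, retrieving the submissions and keeping only the last 10 days. The export URL, asset UID, and API token are placeholders, and the ISO 8601 format of Kobo's `_submission_time` metadata column is an assumption.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

import requests

# Placeholder values -- the real server, asset UID, and token differ.
KOBO_CSV_URL = "https://kf.kobotoolbox.org/api/v2/assets/<ASSET_UID>/data.csv"
KOBO_TOKEN = "<api-token>"


def fetch_recent_submissions(days: int = 10) -> list[dict]:
    """Download the survey submissions as CSV and keep only the last `days` days."""
    resp = requests.get(
        KOBO_CSV_URL,
        headers={"Authorization": f"Token {KOBO_TOKEN}"},
        timeout=60,
    )
    resp.raise_for_status()

    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    recent = []
    for row in csv.DictReader(io.StringIO(resp.text)):
        # "_submission_time" is Kobo's submission metadata column;
        # assumed to be ISO 8601 without a timezone suffix (treated as UTC).
        submitted = datetime.fromisoformat(row["_submission_time"]).replace(
            tzinfo=timezone.utc
        )
        if submitted >= cutoff:
            recent.append(row)
    return recent
```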
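Next, the organisation unit mapping against the external CSV. The column names (`kobo_ou_code`, `dhis2_ou_uid`, `ou_code`) are hypothetical; rows whose code is missing from the mapping are set aside so the mapping file can be updated.

```python
import csv


def load_ou_mapping(path: str) -> dict[str, str]:
    """Load the Kobo-code -> DHIS2 org unit UID mapping from an external CSV."""
    with open(path, newline="", encoding="utf-8") as f:
        return {r["kobo_ou_code"]: r["dhis2_ou_uid"] for r in csv.DictReader(f)}


def map_org_units(rows: list[dict], mapping: dict[str, str]) -> tuple[list[dict], list[dict]]:
    """Attach the DHIS2 org unit UID to each survey row, if the code is known."""
    mapped, unmatched = [], []
    for row in rows:
        uid = mapping.get(row.get("ou_code", ""))
        if uid is None:
            unmatched.append(row)  # signal that the mapping CSV needs updating
        else:
            row["orgUnit"] = uid
            mapped.append(row)
    return mapped, unmatched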
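The payload generation is done with a JOLT JSON-to-JSON transform in the actual flow; the sketch below does the equivalent in plain Python so the optionSet value mapping is visible. All UIDs and column names are made up for illustration.

```python
# Hypothetical column -> data element UID mapping, plus an optionSet value map.
DATA_ELEMENTS = {"respondent_sex": "aBcDeFgHiJ1", "age_group": "aBcDeFgHiJ2"}
OPTION_MAP = {"respondent_sex": {"f": "FEMALE", "m": "MALE"}}

PROGRAM_UID = "pRoGrAmUid1"  # the single event program (placeholder UID)


def row_to_event(row: dict) -> dict:
    """Build one DHIS2 event payload from one Kobo submission row."""
    data_values = []
    for column, de_uid in DATA_ELEMENTS.items():
        value = row.get(column)
        if value is None:
            continue
        # Value mapping, mainly for DHIS2 optionSets.
        value = OPTION_MAP.get(column, {}).get(value, value)
        data_values.append({"dataElement": de_uid, "value": value})
    return {
        "program": PROGRAM_UID,
        "orgUnit": row["orgUnit"],
        "eventDate": row["_submission_time"][:10],  # date part of the timestamp
        "status": "COMPLETED",
        "dataValues": data_values,
    }
```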
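Finally, the duplicate check and the upload. The post does not describe how duplicates are detected, so this sketch assumes the Kobo submission id is stored in a dedicated data element and compared against events already in DHIS2. It uses the classic `/api/events` endpoint; DHIS2 2.36+ exposes the same functionality under `/api/tracker`.

```python
import requests

DHIS2_BASE = "https://dhis2.example.org"  # placeholder instance
AUTH = ("admin", "district")              # demo credentials; replace in practice
SUBMISSION_DE = "aBcDeFgHiJ3"             # hypothetical data element holding the Kobo id


def already_uploaded(event: dict, kobo_id: str) -> bool:
    """Naive duplicate check: list events for the program/org unit and look
    for one whose submission-id data element equals the Kobo id."""
    resp = requests.get(
        f"{DHIS2_BASE}/api/events.json",
        params={"program": event["program"], "orgUnit": event["orgUnit"], "paging": "false"},
        auth=AUTH,
        timeout=60,
    )
    resp.raise_for_status()
    for ev in resp.json().get("events", []):
        for dv in ev.get("dataValues", []):
            if dv["dataElement"] == SUBMISSION_DE and dv["value"] == kobo_id:
                return True
    return False


def send_event(event: dict) -> None:
    """Post one event payload to DHIS2."""
    resp = requests.post(f"{DHIS2_BASE}/api/events", json=event, auth=AUTH, timeout=60)
    resp.raise_for_status()
```

Listing every event for the org unit is crude but simple; in practice you would constrain the query by date range or filter server-side.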
The screenshot below shows the ETL process as configured in Apache NiFi.