KoboCollect API v2

1

Overview

This Standard Operating Procedure (SOP) provides a comprehensive guide for downloading data from KoboCollect (KoBoToolbox) to Excel format using the API v2. The process involves authentication, project selection, data export, and file download.

Note: KoboCollect API v2 is the recommended version for data export operations as it provides better performance and more features than v1.

Screenshot: KoboCollect Dashboard Overview

Add screenshot showing the downloaded Excel file opened in Excel or spreadsheet application

2

Prerequisites

Before starting the data export process, ensure you have the following:

Active KoboCollect account with appropriate permissions
API token or username/password credentials
Knowledge of your project/form ID
Basic understanding of REST API concepts
Tool for making API requests (cURL, Postman, Python, etc.)

Important: Ensure you have the necessary permissions to access and export data from the specific forms you want to download.

Screenshot: KoboCollect Account Settings

Add screenshot showing account settings and API token generation

3

Authentication setup

KoboCollect API v2 supports multiple authentication methods. The most secure method is using API tokens.

Generate API token

Log in to your KoboCollect account
Navigate to Account Settings
Go to the "API Token" section
Click "Generate New Token"
Copy and securely store the token

Screenshot: API Token Generation

Add screenshot showing the API token generation interface

Test authentication

Test your authentication using the following cURL command:

cURL

curl -H "Authorization: Token YOUR_API_TOKEN" \
     -H "Content-Type: application/json" \
     https://kf.kobotoolbox.org/api/v2/assets/

Success Response: If authentication is successful, you'll receive a JSON response with your projects/assets list.

4

Get project list

Retrieve a list of all your projects/forms to identify the specific form you want to export data from.

Endpoint: GET https://kf.kobotoolbox.org/api/v2/assets/

cURL

curl -H "Authorization: Token YOUR_API_TOKEN" \
     -H "Content-Type: application/json" \
     https://kf.kobotoolbox.org/api/v2/assets/

Python

import requests

# Configuration
API_TOKEN = "YOUR_API_TOKEN"
BASE_URL = "https://kf.kobotoolbox.org/api/v2"

headers = {
    "Authorization": f"Token {API_TOKEN}",
    "Content-Type": "application/json"
}

# Get projects list
response = requests.get(f"{BASE_URL}/assets/", headers=headers)
projects = response.json()

# Display projects
for project in projects['results']:
    print(f"Name: {project['name']}")
    print(f"UID: {project['uid']}")
    print(f"Asset Type: {project['asset_type']}")
    print("---")

Response parameters

Parameter	Description	Example
`uid`	Unique identifier for the form	aBcD1234EfGh5678
`name`	Human-readable name of the form	Household Survey 2024
`asset_type`	Type of asset (survey, template, etc.)	survey
`deployment__submission_count`	Number of submissions	156

Screenshot: Project List Response

Add screenshot showing the JSON response with project list

5

Access form data

Once you have identified your form UID, you can access the form's submissions data. This step shows you how to preview the data before exporting.

Endpoint: GET https://kf.kobotoolbox.org/api/v2/assets/{FORM_UID}/data/

cURL

curl -H "Authorization: Token YOUR_API_TOKEN" \
     -H "Content-Type: application/json" \
     "https://kf.kobotoolbox.org/api/v2/assets/FORM_UID/data/?format=json"

Python

import requests
import json

# Configuration
API_TOKEN = "YOUR_API_TOKEN"
BASE_URL = "https://kf.kobotoolbox.org/api/v2"
FORM_UID = "YOUR_FORM_UID"

headers = {
    "Authorization": f"Token {API_TOKEN}",
    "Content-Type": "application/json"
}

# Get form data
params = {
    "format": "json",
    "limit": 10  # Limit to first 10 records for preview
}

response = requests.get(
    f"{BASE_URL}/assets/{FORM_UID}/data/", 
    headers=headers, 
    params=params
)

data = response.json()
print(f"Total submissions: {data['count']}")
print(f"Showing first {len(data['results'])} records:")
print(json.dumps(data['results'][0], indent=2))  # Pretty print first record

Tip: Use the limit parameter to preview a small number of records before downloading the entire dataset.

Screenshot: Form Data Preview

Add screenshot showing the JSON response with form submission data

6

Export data

Create an export job to generate the Excel file. KoboCollect uses an asynchronous export system where you first create an export request, then download the file once it's ready.

Endpoint: POST https://kf.kobotoolbox.org/api/v2/assets/{FORM_UID}/exports/

Export parameters

Parameter	Description	Options
`type`	Export format	xls, xlsx, csv, json, xml
`source`	Data source	submissions (default)
`lang`	Language for column headers	en, fr, es, etc.
`hierarchy_in_labels`	Include group names in headers	true, false
`group_sep`	Group separator character	/, -, _, etc.

cURL

curl -X POST \
     -H "Authorization: Token YOUR_API_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{
       "type": "xlsx",
       "source": "submissions",
       "lang": "en",
       "hierarchy_in_labels": true,
       "group_sep": "/"
     }' \
     https://kf.kobotoolbox.org/api/v2/assets/FORM_UID/exports/

Python

import requests
import time

# Configuration
API_TOKEN = "YOUR_API_TOKEN"
BASE_URL = "https://kf.kobotoolbox.org/api/v2"
FORM_UID = "YOUR_FORM_UID"

headers = {
    "Authorization": f"Token {API_TOKEN}",
    "Content-Type": "application/json"
}

# Create export request
export_data = {
    "type": "xlsx",
    "source": "submissions",
    "lang": "en",
    "hierarchy_in_labels": True,
    "group_sep": "/"
}

response = requests.post(
    f"{BASE_URL}/assets/{FORM_UID}/exports/",
    headers=headers,
    json=export_data
)

export_info = response.json()
export_uid = export_info['uid']
print(f"Export created with UID: {export_uid}")
print(f"Status: {export_info['status']}")

# Check export status
def check_export_status(export_uid):
    response = requests.get(
        f"{BASE_URL}/assets/{FORM_UID}/exports/{export_uid}/",
        headers=headers
    )
    return response.json()

# Wait for export to complete
while True:
    status_info = check_export_status(export_uid)
    print(f"Export status: {status_info['status']}")
    
    if status_info['status'] == 'complete':
        download_url = status_info['result']
        print(f"Export complete! Download URL: {download_url}")
        break
    elif status_info['status'] == 'error':
        print("Export failed!")
        break
    else:
        print("Export in progress... waiting 5 seconds")
        time.sleep(5)

Note: Large datasets may take several minutes to export. The API will return a status of "processing" until the export is complete.

Screenshot: Export Creation Response

Add screenshot showing the export creation response with UID and status

7

Download Excel file

Once the export is complete, you can download the Excel file using the provided download URL.

Endpoint: GET {DOWNLOAD_URL}

cURL

curl -H "Authorization: Token YOUR_API_TOKEN" \
     -o "exported_data.xlsx" \
     "DOWNLOAD_URL"

Python

import requests
from datetime import datetime

# Configuration
API_TOKEN = "YOUR_API_TOKEN"
DOWNLOAD_URL = "YOUR_DOWNLOAD_URL"  # Obtained from export status

headers = {
    "Authorization": f"Token {API_TOKEN}"
}

# Download the file
response = requests.get(DOWNLOAD_URL, headers=headers)

if response.status_code == 200:
    # Generate filename with timestamp
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    filename = f"kobocollect_export_{timestamp}.xlsx"
    
    # Save the file
    with open(filename, 'wb') as f:
        f.write(response.content)
    
    print(f"File downloaded successfully: {filename}")
    print(f"File size: {len(response.content)} bytes")
else:
    print(f"Download failed with status code: {response.status_code}")
    print(f"Error: {response.text}")

Success! Your Excel file has been downloaded and is ready for analysis.

Screenshot: Downloaded Excel File

Add screenshot showing the downloaded Excel file opened in Excel or spreadsheet application

8

Automation scripts

For regular data exports, you can use these complete automation scripts that handle the entire process from authentication to file download.

Complete Python script

Python

#!/usr/bin/env python3
"""
KoboCollect Data Export Script
Automatically exports data from KoboCollect to Excel format
"""

import requests
import time
import json
import os
from datetime import datetime
import argparse

class KoboExporter:
    def __init__(self, api_token, base_url="https://kf.kobotoolbox.org/api/v2"):
        self.api_token = api_token
        self.base_url = base_url
        self.headers = {
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json"
        }
    
    def get_projects(self):
        """Get list of all projects/forms"""
        try:
            response = requests.get(f"{self.base_url}/assets/", headers=self.headers)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            print(f"Error fetching projects: {e}")
            return None
    
    def list_projects(self):
        """Display available projects"""
        projects = self.get_projects()
        if not projects:
            return
        
        print("\nAvailable projects:")
        print("-" * 80)
        for i, project in enumerate(projects['results'], 1):
            submission_count = project.get('deployment__submission_count', 0)
            print(f"{i:2d}. {project['name']}")
            print(f"    UID: {project['uid']}")
            print(f"    Submissions: {submission_count}")
            print(f"    Type: {project['asset_type']}")
            print()
    
    def create_export(self, form_uid, export_type="xlsx", **kwargs):
        """Create an export job"""
        export_data = {
            "type": export_type,
            "source": "submissions",
            "lang": kwargs.get("lang", "en"),
            "hierarchy_in_labels": kwargs.get("hierarchy_in_labels", True),
            "group_sep": kwargs.get("group_sep", "/")
        }
        
        try:
            response = requests.post(
                f"{self.base_url}/assets/{form_uid}/exports/",
                headers=self.headers,
                json=export_data
            )
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            print(f"Error creating export: {e}")
            return None
    
    def check_export_status(self, form_uid, export_uid):
        """Check the status of an export job"""
        try:
            response = requests.get(
                f"{self.base_url}/assets/{form_uid}/exports/{export_uid}/",
                headers=self.headers
            )
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            print(f"Error checking export status: {e}")
            return None
    
    def download_file(self, download_url, filename=None):
        """Download the exported file"""
        if not filename:
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            filename = f"kobocollect_export_{timestamp}.xlsx"
        
        try:
            response = requests.get(download_url, headers=self.headers)
            response.raise_for_status()
            
            with open(filename, 'wb') as f:
                f.write(response.content)
            
            return filename
        except requests.exceptions.RequestException as e:
            print(f"Error downloading file: {e}")
            return None
    
    def export_data(self, form_uid, output_dir="./downloads", **kwargs):
        """Complete export process"""
        print(f"Starting export for form: {form_uid}")
        
        # Create output directory if it doesn't exist
        os.makedirs(output_dir, exist_ok=True)
        
        # Create export
        print("Creating export job...")
        export_info = self.create_export(form_uid, **kwargs)
        if not export_info:
            return False
        
        export_uid = export_info['uid']
        print(f"Export job created: {export_uid}")
        
        # Wait for completion
        print("⏳ Waiting for export to complete...")
        max_attempts = 60  # 5 minutes maximum wait time
        attempt = 0
        
        while attempt < max_attempts:
            status_info = self.check_export_status(form_uid, export_uid)
            if not status_info:
                return False
            
            status = status_info['status']
            print(f"   Status: {status}")
            
            if status == 'complete':
                download_url = status_info['result']
                break
            elif status == 'error':
                print("Export failed!")
                return False
            else:
                time.sleep(5)
                attempt += 1
        
        if attempt >= max_attempts:
            print("⏰ Export timeout!")
            return False
        
        # Download file
        print("Downloading file...")
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = os.path.join(output_dir, f"export_{form_uid}_{timestamp}.xlsx")
        
        downloaded_file = self.download_file(download_url, filename)
        if downloaded_file:
            file_size = os.path.getsize(downloaded_file)
            print(f"Download complete: {downloaded_file}")
            print(f"File size: {file_size:,} bytes")
            return True
        
        return False

def main():
    parser = argparse.ArgumentParser(description="Export data from KoboCollect")
    parser.add_argument("--token", required=True, help="API Token")
    parser.add_argument("--form-uid", help="Form UID to export")
    parser.add_argument("--list-projects", action="store_true", help="List available projects")
    parser.add_argument("--output-dir", default="./downloads", help="Output directory")
    parser.add_argument("--format", default="xlsx", choices=["xlsx", "xls", "csv", "json"], help="Export format")
    
    args = parser.parse_args()
    
    # Initialize exporter
    exporter = KoboExporter(args.token)
    
    if args.list_projects:
        exporter.list_projects()
        return
    
    if not args.form_uid:
        print("Please provide --form-uid or use --list-projects to see available forms")
        return
    
    # Export data
    success = exporter.export_data(
        args.form_uid,
        output_dir=args.output_dir,
        export_type=args.format
    )
    
    if success:
        print("Export completed successfully!")
    else:
        print("Export failed!")

if __name__ == "__main__":
    main()

# Usage examples:
# python kobo_export.py --token YOUR_TOKEN --list-projects
# python kobo_export.py --token YOUR_TOKEN --form-uid YOUR_FORM_UID
# python kobo_export.py --token YOUR_TOKEN --form-uid YOUR_FORM_UID --format csv

Batch export script

Python

#!/usr/bin/env python3
"""
Batch Export Script for Multiple Forms
"""

import json
from kobo_export import KoboExporter
import time

def batch_export(api_token, form_uids, output_dir="./batch_downloads"):
    """Export multiple forms in batch"""
    exporter = KoboExporter(api_token)
    
    results = {
        "successful": [],
        "failed": []
    }
    
    print(f"Starting batch export of {len(form_uids)} forms...")
    
    for i, form_uid in enumerate(form_uids, 1):
        print(f"\nProcessing form {i}/{len(form_uids)}: {form_uid}")
        
        success = exporter.export_data(form_uid, output_dir=output_dir)
        
        if success:
            results["successful"].append(form_uid)
        else:
            results["failed"].append(form_uid)
        
        # Add delay between exports to avoid rate limiting
        if i < len(form_uids):
            print("⏳ Waiting 10 seconds before next export...")
            time.sleep(10)
    
    # Summary
    print("\n" + "="*50)
    print("BATCH EXPORT SUMMARY")
    print("="*50)
    print(f"Successful: {len(results['successful'])}")
    print(f"Failed: {len(results['failed'])}")
    
    if results['failed']:
        print("\nFailed exports:")
        for form_uid in results['failed']:
            print(f"   - {form_uid}")
    
    return results

# Example usage
if __name__ == "__main__":
    API_TOKEN = "YOUR_API_TOKEN"
    FORM_UIDS = [
        "form_uid_1",
        "form_uid_2", 
        "form_uid_3"
    ]
    
    batch_export(API_TOKEN, FORM_UIDS)

Screenshot: Automation Script Output

Add screenshot showing the console output of the automation script running

9

Troubleshooting

Common issues and their solutions when working with KoboCollect API v2:

Authentication issues

Error: 401 Unauthorized

Solutions:

Verify your API token is correct and active
Check if the token has expired
Ensure proper Authorization header format: Token YOUR_API_TOKEN
Try regenerating your API token

Form access issues

Error: 404 Not Found or 403 Forbidden

Solutions:

Verify the form UID is correct
Check if you have permission to access the form
Ensure the form is deployed and has submissions
Contact the form owner for access permissions

Export issues

Issue: Export stuck in "processing" status

Solutions:

Wait longer - large datasets can take 10-30 minutes
Check server status on KoboToolbox status page
Try exporting smaller date ranges
Contact KoboToolbox support if persistent

Data quality issues

Issue: Missing or incorrect data in export

Solutions:

Check export parameters (language, format settings)
Verify data exists in the web interface
Try different export formats (CSV vs Excel)
Check if filters are applied in the export request

Network issues

Python

# Robust error handling example
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_robust_session():
    session = requests.Session()
    
    # Define retry strategy
    retry_strategy = Retry(
        total=3,
        backoff_factor=1,
        status_forcelist=[429, 500, 502, 503, 504],
    )
    
    # Mount adapter with retry strategy
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    
    return session

# Usage
session = create_robust_session()
try:
    response = session.get(url, headers=headers, timeout=30)
    response.raise_for_status()
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

Common error codes

Error Code	Description	Solution
400	Bad Request	Check request parameters and JSON format
401	Unauthorized	Verify API token and permissions
403	Forbidden	Check form access permissions
404	Not Found	Verify form UID and endpoint URL
429	Rate Limited	Reduce request frequency, add delays
500	Server Error	Try again later, contact support if persistent

Screenshot: Error Response Example

Add screenshot showing an error response in the API testing tool

Pro Tip: Always implement proper error handling and logging in your production scripts to diagnose issues quickly.

Table of contents

Overview

Prerequisites

Authentication setup

Generate API token

Test authentication

Get project list

Response parameters

Access form data

Export data

Export parameters

Download Excel file

Automation scripts

Complete Python script

Batch export script

Troubleshooting

Authentication issues

Form access issues

Export issues

Data quality issues

Network issues

Common error codes