1

Overview

This Standard Operating Procedure (SOP) provides a comprehensive guide for downloading data from KoboCollect (KoBoToolbox) to Excel format using the API v2. The process involves authentication, project selection, data export, and file download.

Note: KoboCollect API v2 is the recommended version for data export operations as it provides better performance and more features than v1.
Screenshot: KoboCollect Dashboard Overview
Add screenshot showing the downloaded Excel file opened in Excel or spreadsheet application
2

Prerequisites

Before starting the data export process, ensure you have the following:

  • Active KoboCollect account with appropriate permissions
  • API token or username/password credentials
  • Knowledge of your project/form ID
  • Basic understanding of REST API concepts
  • Tool for making API requests (cURL, Postman, Python, etc.)
Important: Ensure you have the necessary permissions to access and export data from the specific forms you want to download.
Screenshot: KoboCollect Account Settings
Add screenshot showing account settings and API token generation
3

Authentication setup

KoboCollect API v2 supports multiple authentication methods. The most secure method is using API tokens.

Generate API token

  1. Log in to your KoboCollect account
  2. Navigate to Account Settings
  3. Go to the "API Token" section
  4. Click "Generate New Token"
  5. Copy and securely store the token
Screenshot: API Token Generation
Add screenshot showing the API token generation interface

Test authentication

Test your authentication using the following cURL command:

cURL
curl -H "Authorization: Token YOUR_API_TOKEN" \
     -H "Content-Type: application/json" \
     https://kf.kobotoolbox.org/api/v2/assets/
Success Response: If authentication is successful, you'll receive a JSON response with your projects/assets list.
4

Get project list

Retrieve a list of all your projects/forms to identify the specific form you want to export data from.

Endpoint: GET https://kf.kobotoolbox.org/api/v2/assets/
cURL
curl -H "Authorization: Token YOUR_API_TOKEN" \
     -H "Content-Type: application/json" \
     https://kf.kobotoolbox.org/api/v2/assets/
Python
import requests

# Configuration
API_TOKEN = "YOUR_API_TOKEN"
BASE_URL = "https://kf.kobotoolbox.org/api/v2"

headers = {
    "Authorization": f"Token {API_TOKEN}",
    "Content-Type": "application/json"
}

# Get projects list
response = requests.get(f"{BASE_URL}/assets/", headers=headers)
projects = response.json()

# Display projects
for project in projects['results']:
    print(f"Name: {project['name']}")
    print(f"UID: {project['uid']}")
    print(f"Asset Type: {project['asset_type']}")
    print("---")

Response parameters

Parameter Description Example
uid Unique identifier for the form aBcD1234EfGh5678
name Human-readable name of the form Household Survey 2024
asset_type Type of asset (survey, template, etc.) survey
deployment__submission_count Number of submissions 156
Screenshot: Project List Response
Add screenshot showing the JSON response with project list
5

Access form data

Once you have identified your form UID, you can access the form's submissions data. This step shows you how to preview the data before exporting.

Endpoint: GET https://kf.kobotoolbox.org/api/v2/assets/{FORM_UID}/data/
cURL
curl -H "Authorization: Token YOUR_API_TOKEN" \
     -H "Content-Type: application/json" \
     "https://kf.kobotoolbox.org/api/v2/assets/FORM_UID/data/?format=json"
Python
import requests
import json

# Configuration
API_TOKEN = "YOUR_API_TOKEN"
BASE_URL = "https://kf.kobotoolbox.org/api/v2"
FORM_UID = "YOUR_FORM_UID"

headers = {
    "Authorization": f"Token {API_TOKEN}",
    "Content-Type": "application/json"
}

# Get form data
params = {
    "format": "json",
    "limit": 10  # Limit to first 10 records for preview
}

response = requests.get(
    f"{BASE_URL}/assets/{FORM_UID}/data/", 
    headers=headers, 
    params=params
)

data = response.json()
print(f"Total submissions: {data['count']}")
print(f"Showing first {len(data['results'])} records:")
print(json.dumps(data['results'][0], indent=2))  # Pretty print first record
Tip: Use the limit parameter to preview a small number of records before downloading the entire dataset.
Screenshot: Form Data Preview
Add screenshot showing the JSON response with form submission data
6

Export data

Create an export job to generate the Excel file. KoboCollect uses an asynchronous export system where you first create an export request, then download the file once it's ready.

Endpoint: POST https://kf.kobotoolbox.org/api/v2/assets/{FORM_UID}/exports/

Export parameters

Parameter Description Options
type Export format xls, xlsx, csv, json, xml
source Data source submissions (default)
lang Language for column headers en, fr, es, etc.
hierarchy_in_labels Include group names in headers true, false
group_sep Group separator character /, -, _, etc.
cURL
curl -X POST \
     -H "Authorization: Token YOUR_API_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{
       "type": "xlsx",
       "source": "submissions",
       "lang": "en",
       "hierarchy_in_labels": true,
       "group_sep": "/"
     }' \
     https://kf.kobotoolbox.org/api/v2/assets/FORM_UID/exports/
Python
import requests
import time

# Configuration
API_TOKEN = "YOUR_API_TOKEN"
BASE_URL = "https://kf.kobotoolbox.org/api/v2"
FORM_UID = "YOUR_FORM_UID"

headers = {
    "Authorization": f"Token {API_TOKEN}",
    "Content-Type": "application/json"
}

# Create export request
export_data = {
    "type": "xlsx",
    "source": "submissions",
    "lang": "en",
    "hierarchy_in_labels": True,
    "group_sep": "/"
}

response = requests.post(
    f"{BASE_URL}/assets/{FORM_UID}/exports/",
    headers=headers,
    json=export_data
)

export_info = response.json()
export_uid = export_info['uid']
print(f"Export created with UID: {export_uid}")
print(f"Status: {export_info['status']}")

# Check export status
def check_export_status(export_uid):
    response = requests.get(
        f"{BASE_URL}/assets/{FORM_UID}/exports/{export_uid}/",
        headers=headers
    )
    return response.json()

# Wait for export to complete
while True:
    status_info = check_export_status(export_uid)
    print(f"Export status: {status_info['status']}")
    
    if status_info['status'] == 'complete':
        download_url = status_info['result']
        print(f"Export complete! Download URL: {download_url}")
        break
    elif status_info['status'] == 'error':
        print("Export failed!")
        break
    else:
        print("Export in progress... waiting 5 seconds")
        time.sleep(5)
Note: Large datasets may take several minutes to export. The API will return a status of "processing" until the export is complete.
Screenshot: Export Creation Response
Add screenshot showing the export creation response with UID and status
7

Download Excel file

Once the export is complete, you can download the Excel file using the provided download URL.

Endpoint: GET {DOWNLOAD_URL}
cURL
curl -H "Authorization: Token YOUR_API_TOKEN" \
     -o "exported_data.xlsx" \
     "DOWNLOAD_URL"
Python
import requests
from datetime import datetime

# Configuration
API_TOKEN = "YOUR_API_TOKEN"
DOWNLOAD_URL = "YOUR_DOWNLOAD_URL"  # Obtained from export status

headers = {
    "Authorization": f"Token {API_TOKEN}"
}

# Download the file
response = requests.get(DOWNLOAD_URL, headers=headers)

if response.status_code == 200:
    # Generate filename with timestamp
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    filename = f"kobocollect_export_{timestamp}.xlsx"
    
    # Save the file
    with open(filename, 'wb') as f:
        f.write(response.content)
    
    print(f"File downloaded successfully: {filename}")
    print(f"File size: {len(response.content)} bytes")
else:
    print(f"Download failed with status code: {response.status_code}")
    print(f"Error: {response.text}")
Success! Your Excel file has been downloaded and is ready for analysis.
Screenshot: Downloaded Excel File
Add screenshot showing the downloaded Excel file opened in Excel or spreadsheet application
8

Automation scripts

For regular data exports, you can use these complete automation scripts that handle the entire process from authentication to file download.

Complete Python script

Python
#!/usr/bin/env python3
"""
KoboCollect Data Export Script
Automatically exports data from KoboCollect to Excel format
"""

import requests
import time
import json
import os
from datetime import datetime
import argparse

class KoboExporter:
    def __init__(self, api_token, base_url="https://kf.kobotoolbox.org/api/v2"):
        self.api_token = api_token
        self.base_url = base_url
        self.headers = {
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json"
        }
    
    def get_projects(self):
        """Get list of all projects/forms"""
        try:
            response = requests.get(f"{self.base_url}/assets/", headers=self.headers)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            print(f"Error fetching projects: {e}")
            return None
    
    def list_projects(self):
        """Display available projects"""
        projects = self.get_projects()
        if not projects:
            return
        
        print("\nAvailable projects:")
        print("-" * 80)
        for i, project in enumerate(projects['results'], 1):
            submission_count = project.get('deployment__submission_count', 0)
            print(f"{i:2d}. {project['name']}")
            print(f"    UID: {project['uid']}")
            print(f"    Submissions: {submission_count}")
            print(f"    Type: {project['asset_type']}")
            print()
    
    def create_export(self, form_uid, export_type="xlsx", **kwargs):
        """Create an export job"""
        export_data = {
            "type": export_type,
            "source": "submissions",
            "lang": kwargs.get("lang", "en"),
            "hierarchy_in_labels": kwargs.get("hierarchy_in_labels", True),
            "group_sep": kwargs.get("group_sep", "/")
        }
        
        try:
            response = requests.post(
                f"{self.base_url}/assets/{form_uid}/exports/",
                headers=self.headers,
                json=export_data
            )
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            print(f"Error creating export: {e}")
            return None
    
    def check_export_status(self, form_uid, export_uid):
        """Check the status of an export job"""
        try:
            response = requests.get(
                f"{self.base_url}/assets/{form_uid}/exports/{export_uid}/",
                headers=self.headers
            )
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            print(f"Error checking export status: {e}")
            return None
    
    def download_file(self, download_url, filename=None):
        """Download the exported file"""
        if not filename:
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            filename = f"kobocollect_export_{timestamp}.xlsx"
        
        try:
            response = requests.get(download_url, headers=self.headers)
            response.raise_for_status()
            
            with open(filename, 'wb') as f:
                f.write(response.content)
            
            return filename
        except requests.exceptions.RequestException as e:
            print(f"Error downloading file: {e}")
            return None
    
    def export_data(self, form_uid, output_dir="./downloads", **kwargs):
        """Complete export process"""
        print(f"Starting export for form: {form_uid}")
        
        # Create output directory if it doesn't exist
        os.makedirs(output_dir, exist_ok=True)
        
        # Create export
        print("Creating export job...")
        export_info = self.create_export(form_uid, **kwargs)
        if not export_info:
            return False
        
        export_uid = export_info['uid']
        print(f"Export job created: {export_uid}")
        
        # Wait for completion
        print("⏳ Waiting for export to complete...")
        max_attempts = 60  # 5 minutes maximum wait time
        attempt = 0
        
        while attempt < max_attempts:
            status_info = self.check_export_status(form_uid, export_uid)
            if not status_info:
                return False
            
            status = status_info['status']
            print(f"   Status: {status}")
            
            if status == 'complete':
                download_url = status_info['result']
                break
            elif status == 'error':
                print("Export failed!")
                return False
            else:
                time.sleep(5)
                attempt += 1
        
        if attempt >= max_attempts:
            print("⏰ Export timeout!")
            return False
        
        # Download file
        print("Downloading file...")
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = os.path.join(output_dir, f"export_{form_uid}_{timestamp}.xlsx")
        
        downloaded_file = self.download_file(download_url, filename)
        if downloaded_file:
            file_size = os.path.getsize(downloaded_file)
            print(f"Download complete: {downloaded_file}")
            print(f"File size: {file_size:,} bytes")
            return True
        
        return False

def main():
    parser = argparse.ArgumentParser(description="Export data from KoboCollect")
    parser.add_argument("--token", required=True, help="API Token")
    parser.add_argument("--form-uid", help="Form UID to export")
    parser.add_argument("--list-projects", action="store_true", help="List available projects")
    parser.add_argument("--output-dir", default="./downloads", help="Output directory")
    parser.add_argument("--format", default="xlsx", choices=["xlsx", "xls", "csv", "json"], help="Export format")
    
    args = parser.parse_args()
    
    # Initialize exporter
    exporter = KoboExporter(args.token)
    
    if args.list_projects:
        exporter.list_projects()
        return
    
    if not args.form_uid:
        print("Please provide --form-uid or use --list-projects to see available forms")
        return
    
    # Export data
    success = exporter.export_data(
        args.form_uid,
        output_dir=args.output_dir,
        export_type=args.format
    )
    
    if success:
        print("Export completed successfully!")
    else:
        print("Export failed!")

if __name__ == "__main__":
    main()

# Usage examples:
# python kobo_export.py --token YOUR_TOKEN --list-projects
# python kobo_export.py --token YOUR_TOKEN --form-uid YOUR_FORM_UID
# python kobo_export.py --token YOUR_TOKEN --form-uid YOUR_FORM_UID --format csv

Batch export script

Python
#!/usr/bin/env python3
"""
Batch Export Script for Multiple Forms
"""

import json
from kobo_export import KoboExporter
import time

def batch_export(api_token, form_uids, output_dir="./batch_downloads"):
    """Export multiple forms in batch"""
    exporter = KoboExporter(api_token)
    
    results = {
        "successful": [],
        "failed": []
    }
    
    print(f"Starting batch export of {len(form_uids)} forms...")
    
    for i, form_uid in enumerate(form_uids, 1):
        print(f"\nProcessing form {i}/{len(form_uids)}: {form_uid}")
        
        success = exporter.export_data(form_uid, output_dir=output_dir)
        
        if success:
            results["successful"].append(form_uid)
        else:
            results["failed"].append(form_uid)
        
        # Add delay between exports to avoid rate limiting
        if i < len(form_uids):
            print("⏳ Waiting 10 seconds before next export...")
            time.sleep(10)
    
    # Summary
    print("\n" + "="*50)
    print("BATCH EXPORT SUMMARY")
    print("="*50)
    print(f"Successful: {len(results['successful'])}")
    print(f"Failed: {len(results['failed'])}")
    
    if results['failed']:
        print("\nFailed exports:")
        for form_uid in results['failed']:
            print(f"   - {form_uid}")
    
    return results

# Example usage
if __name__ == "__main__":
    API_TOKEN = "YOUR_API_TOKEN"
    FORM_UIDS = [
        "form_uid_1",
        "form_uid_2", 
        "form_uid_3"
    ]
    
    batch_export(API_TOKEN, FORM_UIDS)
Screenshot: Automation Script Output
Add screenshot showing the console output of the automation script running
9

Troubleshooting

Common issues and their solutions when working with KoboCollect API v2:

Authentication issues

Error: 401 Unauthorized

Solutions:

  • Verify your API token is correct and active
  • Check if the token has expired
  • Ensure proper Authorization header format: Token YOUR_API_TOKEN
  • Try regenerating your API token

Form access issues

Error: 404 Not Found or 403 Forbidden

Solutions:

  • Verify the form UID is correct
  • Check if you have permission to access the form
  • Ensure the form is deployed and has submissions
  • Contact the form owner for access permissions

Export issues

Issue: Export stuck in "processing" status

Solutions:

  • Wait longer - large datasets can take 10-30 minutes
  • Check server status on KoboToolbox status page
  • Try exporting smaller date ranges
  • Contact KoboToolbox support if persistent

Data quality issues

Issue: Missing or incorrect data in export

Solutions:

  • Check export parameters (language, format settings)
  • Verify data exists in the web interface
  • Try different export formats (CSV vs Excel)
  • Check if filters are applied in the export request

Network issues

Python
# Robust error handling example
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_robust_session():
    session = requests.Session()
    
    # Define retry strategy
    retry_strategy = Retry(
        total=3,
        backoff_factor=1,
        status_forcelist=[429, 500, 502, 503, 504],
    )
    
    # Mount adapter with retry strategy
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    
    return session

# Usage
session = create_robust_session()
try:
    response = session.get(url, headers=headers, timeout=30)
    response.raise_for_status()
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

Common error codes

Error Code Description Solution
400 Bad Request Check request parameters and JSON format
401 Unauthorized Verify API token and permissions
403 Forbidden Check form access permissions
404 Not Found Verify form UID and endpoint URL
429 Rate Limited Reduce request frequency, add delays
500 Server Error Try again later, contact support if persistent
Screenshot: Error Response Example
Add screenshot showing an error response in the API testing tool
Pro Tip: Always implement proper error handling and logging in your production scripts to diagnose issues quickly.