Version: 0.15 (Latest)

Apache Airflow

Apache Airflow is an open-source platform for developing, scheduling, and monitoring batch-oriented workflows. FastTransfer runs inside Airflow DAGs through the DockerOperator, using the official FastTransfer Docker image.

Docker Image

FastTransfer provides an official Docker image available on Docker Hub:

Image: arpeio/fasttransfer:latest
Repository: https://hub.docker.com/repository/docker/arpeio/fasttransfer/general

Prerequisites

Install Docker Provider

To use FastTransfer with Airflow, you need to install the Docker provider package:

pip install apache-airflow-providers-docker

Or add it to your Airflow requirements:

apache-airflow-providers-docker
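
To confirm that the provider is visible to the Python environment that runs Airflow, a small check along these lines can help. This is a minimal sketch using only the standard library; the helper name provider_version is hypothetical and not part of Airflow or FastTransfer.

```python
from importlib.metadata import PackageNotFoundError, version


def provider_version(dist: str = "apache-airflow-providers-docker"):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist)
    except PackageNotFoundError:
        return None


found = provider_version()
if found is None:
    print("Docker provider not installed: pip install apache-airflow-providers-docker")
else:
    print(f"Docker provider version: {found}")
```

Run it with the same interpreter as the Airflow scheduler and workers, since each environment has its own set of installed packages.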

Configure Airflow Connections

Airflow connections allow you to securely store and manage credentials for external systems. FastTransfer integration requires connections for your source and target databases.

A) Source Database Connection (MSSQL Example)

Create an ODBC connection for your MSSQL source database:

  1. Navigate to Admin > Connections in the Airflow UI
  2. Click + to create a new connection
  3. In the Connection Type dropdown, select ODBC
  4. Configure the connection:

Connection Settings:

  • Connection Id: mssql_tpch10 (or your preferred name)
  • Connection Type: ODBC (select this from the dropdown)
  • Host: 192.168.65.254 (your database host)
  • Port: 11433 (your database port)
  • Login: Your database username (e.g., FastLogin)
  • Password: Your database password
  • Extra: {"database":"tpch10"}

Tip: For ODBC connections, specify the database name in the Extra JSON field as {"database":"your_database_name"}.

B) Target Database Connection (PostgreSQL Example)

Create a PostgreSQL connection for your target database:

  1. Navigate to Admin > Connections in the Airflow UI
  2. Click + to create a new connection
  3. In the Connection Type dropdown, select Postgres
  4. Configure the connection:

Connection Settings:

  • Connection Id: pg_target (or your preferred name)
  • Connection Type: Postgres (select this from the dropdown)
  • Host: 192.168.65.254 (your database host)
  • Port: 25433 (your database port)
  • Schema: postgres (this is the database name)
  • Login: postgres (your database username)
  • Password: Your database password

Tip: For Postgres connections, the database name is specified in the Schema field.
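
As an alternative to the UI, Airflow can also read connections from environment variables named AIRFLOW_CONN_<CONN_ID>, with a JSON value in recent Airflow versions. The sketch below builds those values for the two connections described above. It is an illustration under that assumption, not part of the original setup: the helper conn_env_var is hypothetical and the passwords are placeholders.

```python
import json


def conn_env_var(conn_id: str, **fields):
    """Return (name, value) for an Airflow connection environment variable."""
    name = f"AIRFLOW_CONN_{conn_id.upper()}"
    # Drop unset fields so the JSON only carries what was provided.
    value = json.dumps({k: v for k, v in fields.items() if v is not None})
    return name, value


# Source: ODBC connection, database name stored in Extra.
mssql_name, mssql_value = conn_env_var(
    "mssql_tpch10",
    conn_type="odbc",
    host="192.168.65.254",
    port=11433,
    login="FastLogin",
    password="<your-password>",  # placeholder
    extra={"database": "tpch10"},
)

# Target: Postgres connection, database name stored in Schema.
pg_name, pg_value = conn_env_var(
    "pg_target",
    conn_type="postgres",
    host="192.168.65.254",
    port=25433,
    schema="postgres",
    login="postgres",
    password="<your-password>",  # placeholder
)

print(f"{mssql_name}='{mssql_value}'")
print(f"{pg_name}='{pg_value}'")
```

Connections defined this way do not appear in the Airflow UI, so check the Airflow documentation for your version before relying on them.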

Integration with DockerOperator

The DockerOperator from the Docker provider allows you to execute FastTransfer commands in a Docker container directly from your Airflow DAGs.

Complete DAG Example

Create a DAG file (e.g., fasttransfer_orders_mssql_to_pg.py) in your Airflow DAGs folder:

from datetime import datetime

from airflow import DAG
from airflow.sdk.bases.hook import BaseHook
from airflow.providers.docker.operators.docker import DockerOperator


def _conn_parts(conn_id: str):
    """
    Extract connection details from Airflow connection.

    Works with:
    - ODBC connections (for MSSQL): database in Extra JSON {"database":"name"}
    - Postgres connections: database in Schema field

    Returns:
        tuple: (host, port, user, password, database)
    """
    c = BaseHook.get_connection(conn_id)
    host = c.host
    port = c.port
    user = c.login
    pwd = c.password

    # Database in Extra {"database": "..."} (for ODBC MSSQL) or in Schema (for Postgres)
    db = c.extra_dejson.get("database") or c.schema

    if not db:
        raise ValueError(
            f"Connection '{conn_id}': database not found. Add it to Extra JSON "
            f'{{"database":"..."}} (for ODBC) or Schema field (for Postgres).'
        )

    return host, port, user, pwd, db


with DAG(
    dag_id="fasttransfer_orders_mssql_to_pg",
    start_date=datetime(2025, 1, 1),
    schedule=None,
    catchup=False,
    tags=["fasttransfer", "mssql", "postgres", "docker"],
) as dag:

    # Extract connection details
    mssql_host, mssql_port, mssql_user, mssql_pwd, mssql_db = _conn_parts("mssql_tpch10")
    pg_host, pg_port, pg_user, pg_pwd, pg_db = _conn_parts("pg_target")

    # FastTransfer Docker task
    transfer_task = DockerOperator(
        task_id="transfer_orders",
        image="arpeio/fasttransfer:latest",
        docker_url="tcp://host.docker.internal:2375",
        api_version="auto",
        auto_remove="success",
        mount_tmp_dir=False,  # Avoids the remote-engine tmp mount warning
        do_xcom_push=False,
        command=[
            "--sourceconnectiontype", "mssql",
            "--sourceserver", f"{mssql_host},{mssql_port}",
            "--sourcedatabase", mssql_db,
            "--sourceuser", mssql_user,
            "--sourcepassword", mssql_pwd,
            "--sourceschema", "dbo",
            "--sourcetable", "orders",

            "--targetconnectiontype", "pgcopy",
            "--targetserver", f"{pg_host}:{pg_port}",
            "--targetdatabase", pg_db,
            "--targetuser", pg_user,
            "--targetpassword", pg_pwd,
            "--targetschema", "public",
            "--targettable", "orders",

            "--loadmode", "Truncate",
            "--mapmethod", "Name",
            "--method", "Ntile",
            "--degree", "12",
            "--distributekeycolumn", "o_orderkey",
            "--nobanner",
        ],
        network_mode="bridge",
    )

Key Parameters:

  • image: Specifies the FastTransfer Docker image
  • docker_url: Docker daemon endpoint
  • command: List of FastTransfer CLI arguments
  • auto_remove: Automatically remove the container after successful execution
  • mount_tmp_dir: Set to False to avoid warnings with remote Docker daemons
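
If several tables are transferred with the same settings, the command list can be produced by a small function instead of being repeated in each task. The following is a sketch only: fasttransfer_command and its dictionary keys are hypothetical names, and the flags are limited to those used in the DAG above. Note that the MSSQL server is written as host,port while the PostgreSQL server is written as host:port.

```python
def fasttransfer_command(src, tgt, table, key_column, degree=12):
    """Build the FastTransfer argument list for an MSSQL to PostgreSQL copy."""
    return [
        "--sourceconnectiontype", "mssql",
        "--sourceserver", f"{src['host']},{src['port']}",  # comma for MSSQL
        "--sourcedatabase", src["database"],
        "--sourceuser", src["user"],
        "--sourcepassword", src["password"],
        "--sourceschema", src["schema"],
        "--sourcetable", table,

        "--targetconnectiontype", "pgcopy",
        "--targetserver", f"{tgt['host']}:{tgt['port']}",  # colon for PostgreSQL
        "--targetdatabase", tgt["database"],
        "--targetuser", tgt["user"],
        "--targetpassword", tgt["password"],
        "--targetschema", tgt["schema"],
        "--targettable", table,

        "--loadmode", "Truncate",
        "--mapmethod", "Name",
        "--method", "Ntile",
        "--degree", str(degree),  # CLI arguments must be strings
        "--distributekeycolumn", key_column,
        "--nobanner",
    ]


# Example values taken from the connections above; passwords are placeholders.
source = {"host": "192.168.65.254", "port": 11433, "database": "tpch10",
          "user": "FastLogin", "password": "<secret>", "schema": "dbo"}
target = {"host": "192.168.65.254", "port": 25433, "database": "postgres",
          "user": "postgres", "password": "<secret>", "schema": "public"}

cmd = fasttransfer_command(source, target, "orders", "o_orderkey")
print(cmd[:4])
```

The returned list can be passed directly as the command argument of DockerOperator.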

Helper Functions Explained

The DAG includes a utility function to streamline connection management:

_conn_parts(conn_id)

Extracts connection details from an Airflow connection. This function:

  • Retrieves host, port, username, and password from the connection
  • Extracts the database name from either the Extra JSON field (for ODBC connections) or the Schema field (for Postgres connections)
  • Returns all components needed to build the FastTransfer connection parameters
  • Raises a ValueError if no database name is found (host, port, and credentials are not checked)

This approach ensures that credentials are managed centrally in Airflow and never hardcoded in DAG files.
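
The database lookup can be tried without a running Airflow instance. The sketch below reproduces only the fallback rule from _conn_parts (Extra first, then Schema) against stand-in objects; resolve_database and the SimpleNamespace stand-ins are hypothetical and exist for illustration only.

```python
from types import SimpleNamespace


def resolve_database(conn, conn_id="<unknown>"):
    """Return the database name from Extra JSON, falling back to Schema."""
    db = conn.extra_dejson.get("database") or conn.schema
    if not db:
        raise ValueError(f"Connection '{conn_id}': database not found.")
    return db


# Stand-ins exposing the two attributes the lookup reads.
odbc_like = SimpleNamespace(extra_dejson={"database": "tpch10"}, schema=None)
pg_like = SimpleNamespace(extra_dejson={}, schema="postgres")
empty = SimpleNamespace(extra_dejson={}, schema=None)

print(resolve_database(odbc_like, "mssql_tpch10"))  # tpch10
print(resolve_database(pg_like, "pg_target"))       # postgres

try:
    resolve_database(empty, "broken_conn")
except ValueError as exc:
    print(exc)
```

Because Extra takes precedence, a connection that sets both fields resolves to the value stored in Extra.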

Logging

All FastTransfer execution logs are captured by Airflow and can be viewed in the Airflow UI:

  1. Navigate to DAGs in the Airflow UI
  2. Click on your DAG (e.g., fasttransfer_orders_mssql_to_pg)
  3. Click on the task execution (green/red square in the Grid view)
  4. Click the Logs tab

Example Log Output

The logs provide detailed information about:

  • FastTransfer version and configuration
  • Source and target connection details (with masked passwords)
  • Data transfer progress
  • Performance metrics (rows/second, total time)
  • Exit status
[2026-02-24 11:20:50] INFO - Running docker container: arpeio/fasttransfer:latest
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.450 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- The "FastTransfer_Settings.json" file does not exist. Using default settings. Console Only with loglevel=Information
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.452 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- FastTransfer - running in trial mode – trial mode will end on 2026‑02‑27 - normal licensed mode will then start (6 day(s) left).
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.655 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Starting
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.656 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- FastTransfer Version : 0.15.0.0 Architecture : X64 - Framework : .NET 8.0.23
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.656 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- OS : Debian GNU/Linux 13 (trixie)
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.656 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Process ID : 1
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.656 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Generated Run ID : ad302781-9658-4549-ae1f-59178edea7ed
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.656 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Source Database : tpch10
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.656 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Source SqlInstance : 192.168.65.254,11433
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.656 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Source Connection Type : mssql
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.656 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Target Type : pgcopy
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.656 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Target Database : postgres
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.656 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Target Server : 192.168.65.254:25433
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.890 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Source Connection String : Data Source=192.168.65.254,11433;Initial Catalog=tpch10;User ID=FastLogin;Password=xxxxx;Connect Timeout=120;Encrypt=True;Trust Server Certificate=True;Application Name=FastTransfer;Application Intent=ReadOnly;Command Timeout=10800
[2026-02-24 11:20:50] INFO - 2026-02-24T10:20:50.890 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Target Connection String : Host=192.168.65.254;Port=25433;Database=postgres;Trust Server Certificate=True;Application Name=FastTransfer;Timeout=15;Command Timeout=10800;Username=postgres;Password=xxxxx
[2026-02-24 11:20:51] INFO - 2026-02-24T10:20:51.003 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Start Loading Data using distribution method NTile
[2026-02-24 11:20:51] INFO - 2026-02-24T10:20:51.015 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Start Loading Data using distribution method NTile
[2026-02-24 11:20:51] INFO - 2026-02-24T10:20:51.040 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Start Loading Data using distribution method NTile
[2026-02-24 11:20:51] INFO - 2026-02-24T10:20:51.067 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Start Loading Data using distribution method NTile
[2026-02-24 11:21:26] INFO - 2026-02-24T10:21:26.893 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Completed Load Query 7 for o_orderkey between 35000007 and 40000007 : 1250001 rows x 9 columns in 34027ms
[2026-02-24 11:21:27] INFO - 2026-02-24T10:21:27.092 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Completed Load Query 4 for o_orderkey between 20000004 and 25000004 : 1250001 rows x 9 columns in 34226ms
[2026-02-24 11:21:27] INFO - 2026-02-24T10:21:27.179 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Completed Load Query 1 for o_orderkey between 5000001 and 10000001 : 1250001 rows x 9 columns in 34313ms
[2026-02-24 11:21:29] INFO - 2026-02-24T10:21:29.227 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Completed Load Query 8 for o_orderkey between 40000032 and 45000032 : 1250001 rows x 9 columns in 36361ms
[2026-02-24 11:21:32] INFO - 2026-02-24T10:21:32.104 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Completed Load Query 9 for o_orderkey between 45000033 and 50000033 : 1250001 rows x 9 columns in 39238ms
[2026-02-24 11:21:34] INFO - 2026-02-24T10:21:34.985 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Completed Load Query 3 for o_orderkey between 15000003 and 20000003 : 1250001 rows x 9 columns in 42119ms
[2026-02-24 11:21:36] INFO - 2026-02-24T10:21:36.100 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Completed Load Query 0 for o_orderkey between 1 and 5000000 : 1250000 rows x 9 columns in 43234ms
[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.249 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Completed Load Query 5 for o_orderkey between 25000005 and 30000005 : 1250001 rows x 9 columns in 44383ms
[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.888 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Completed Load Query 6 for o_orderkey between 30000006 and 35000006 : 1250001 rows x 9 columns in 45022ms
[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.889 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Total rows : 15000000
[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.889 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Total columns : 9
[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.889 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Total cells : 135000000
[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.889 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Rows Throughput : 326465 rows/s
[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.889 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Cells Throughput : 2938190 cells/s
[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.889 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Transfert time : Elapsed=45946 ms
[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.889 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Total time : Elapsed=46585 ms
[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.889 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Completed Load
[2026-02-24 11:21:38] INFO - Task instance in success state
[2026-02-24 11:21:38] INFO - Previous state of the Task instance: TaskInstanceState.RUNNING
[2026-02-24 11:21:38] INFO - Task operator:<Task(DockerOperator): transfer_orders>
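
The summary lines in this output follow a "name : value" pattern after the last -|- separator, so they can be collected with a few lines of Python, for example to alert on throughput regressions. This is a sketch that assumes the log layout shown above stays stable; the function summarize and the chosen metric names are illustrative.

```python
import re

WANTED = {"Total rows", "Rows Throughput", "Total time"}


def summarize(lines):
    """Collect the first integer of selected 'name : value' summary lines."""
    summary = {}
    for line in lines:
        # The message is the last field after the ' -|- ' separators.
        message = line.rsplit(" -|- ", 1)[-1]
        name, sep, value = message.partition(" : ")
        if sep and name.strip() in WANTED:
            match = re.search(r"\d+", value)
            if match:
                summary[name.strip()] = int(match.group())
    return summary


# Three lines copied from the example output above.
sample = [
    "[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.889 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Total rows : 15000000",
    "[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.889 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Rows Throughput : 326465 rows/s",
    "[2026-02-24 11:21:37] INFO - 2026-02-24T10:21:37.889 +00:00 -|- -|- ad302781-9658-4549-ae1f-59178edea7ed -|- INFORMATION -|- -|- Total time : Elapsed=46585 ms",
]

print(summarize(sample))
```

Lines that do not match one of the selected names, such as the per-query completion messages, are ignored.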
Copyright © 2026 Architecture & Performance.