Migrating legacy continuous queries to InfluxDB 2.x tasks

You have a single InfluxDB 1.x continuous query (CQ) that has downsampled per-second IoT telemetry to hourly means for years, and you need to reproduce it exactly as a 2.x Flux task before decommissioning the 1.x instance. The trap is that the “obvious” translation — a task with every: 1h reading range(start: -1h) — compiles, runs green, and silently drops late edge data, skips backfill, or double-writes hours if anyone widens the window. This walkthrough takes one concrete CQ and converts it clause by clause into a task whose read window tiles the timeline contiguously, then provisions and validates it. It is the detailed companion to the semantic overview in Continuous Query Migration to Tasks, and it fits into the wider downsampling & aggregation pipeline design work of moving 1.x rollups onto the explicit task engine.

The CQ we will migrate throughout is:

sql

CREATE CONTINUOUS QUERY "cq_temp_1h" ON "iot"
BEGIN
  SELECT mean("value") AS "value"
  INTO "iot"."long"."temp_1h"
  FROM "iot"."autogen"."temp"
  GROUP BY time(1h), "host"
END

Prerequisites

An InfluxDB 1.8+ source you can query with influx -execute (to read the CQ) and an InfluxDB 2.7+ target with the Flux task engine enabled. InfluxDB 3.x does not run Flux tasks, so it is not a target for this walkthrough.
Flux 0.x, bundled with the 2.x target; no separate install.
A destination bucket (temp_1h in the examples) created before the first run, with an explicit retention period; a 2.x bucket does not inherit a retention policy the way the 1.x INTO target did — see retention policy design.
An operator or all-access token, or a scoped token with read on the source bucket, write on temp_1h, and read/write on tasks.
Python 3.9+ with influxdb-client 1.36+ for provisioning and validation.
The full CQ text captured verbatim from SHOW CONTINUOUS QUERIES, so the conversion can be checked clause for clause.

Solution walkthrough

1. Decompose the CQ clause by clause

Before writing any Flux, split the InfluxQL into its four load-bearing parts and note what each one implies. This is the whole migration in miniature — every clause becomes an explicit Flux stage or an explicit option task parameter, and the two implicit behaviors (run cadence and backfill) become deliberate choices rather than defaults.

CQ clause	Meaning	Flux target
`SELECT mean("value")`	aggregate function on a field	`aggregateWindow(fn: mean)`
`GROUP BY time(1h)`	the rollup window	`aggregateWindow(every: 1h)`
`GROUP BY "host"`	one series per host	tags retained through the pipe
`INTO "iot"."long"."temp_1h"`	destination + retention policy	`to(bucket: "temp_1h")` + bucket `retentionPeriod`
CQ run cadence (implicit)	how often it fired	`option task = {every: 1h}`
Automatic backfill (implicit)	missed intervals re-run	explicit replay task (Step 4)

The single most important observation: GROUP BY time(1h) and the CQ’s firing cadence are the same duration in 1.x but are two independent knobs in Flux — the aggregation window (aggregateWindow(every:)) and the task cadence (option task.every). Conflating them is the most common conversion bug and is covered as a gotcha below.
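
As a rough illustration of keeping those two knobs separate, the sketch below renders the task text from a small spec object. The RollupSpec class, the render_task helper, and the iot/autogen bucket name (assumed to be the 2.x bucket mapped from the 1.x database and retention policy) are hypothetical and are not part of InfluxDB or its client.

```python
from dataclasses import dataclass


@dataclass
class RollupSpec:
    """Hypothetical container for the pieces recovered from one CQ."""
    name: str
    source_bucket: str
    dest_bucket: str
    measurement: str
    field: str
    fn: str        # aggregate function, e.g. "mean"
    window: str    # GROUP BY time(...) -> aggregateWindow(every:)
    cadence: str   # how often the CQ fired -> option task.every
    offset: str    # delay before each run -> option task.offset


def render_task(spec: RollupSpec) -> str:
    """Render the Flux task text, one line per decomposed clause."""
    return "\n".join([
        f'option task = {{name: "{spec.name}", every: {spec.cadence}, offset: {spec.offset}}}',
        "",
        f'from(bucket: "{spec.source_bucket}")',
        "    |> range(start: -task.every - task.offset, stop: -task.offset)",
        f'    |> filter(fn: (r) => r._measurement == "{spec.measurement}" and r._field == "{spec.field}")',
        f"    |> aggregateWindow(every: {spec.window}, fn: {spec.fn}, createEmpty: false)",
        f'    |> to(bucket: "{spec.dest_bucket}")',
    ])


# Assumed values for the CQ in this walkthrough.
spec = RollupSpec("cq_temp_1h", "iot/autogen", "temp_1h", "temp", "value", "mean", "1h", "1h", "10m")
flux_text = render_task(spec)
print(flux_text)
```

Because window and cadence are separate fields, changing one (say, a 30m cadence over a 1h window) cannot silently change the other.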

2. Assemble the boundary-safe task

Now write the Flux. The read window is the part that carries correctness: it must read exactly one cadence, ending one offset in the past, so that consecutive runs meet at a shared boundary — no overlap (which double-counts) and no gap (which loses an hour).

flux

option task = {name: "cq_temp_1h", every: 1h, offset: 10m}

from(bucket: "iot/autogen")
    |> range(start: -task.every - task.offset, stop: -task.offset)
    |> filter(fn: (r) => r._measurement == "temp" and r._field == "value")
    |> aggregateWindow(every: 1h, fn: mean, createEmpty: false)
    |> to(bucket: "temp_1h")

Three parameters do the work. offset: 10m delays the run ten minutes past the top of the hour, so edge gateways that buffer readings during cellular gaps have flushed before the window is read. The range(start: -task.every - task.offset, stop: -task.offset) expression spans exactly the hour that closed ten minutes ago — because both bounds shift by the same offset, run N and run N+1 tile contiguously. And createEmpty: false suppresses the null rows aggregateWindow would otherwise emit for hosts that reported nothing, matching the CQ’s behavior and keeping destination cardinality bounded.
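
The tiling claim can be checked with plain date arithmetic. This is a minimal sketch, assuming each run resolves its relative bounds against an instant ten minutes past the hour and that consecutive runs are exactly one cadence apart; the read_window helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EVERY = timedelta(hours=1)
OFFSET = timedelta(minutes=10)


def read_window(now: datetime) -> tuple[datetime, datetime]:
    """Mirror range(start: -task.every - task.offset, stop: -task.offset)."""
    return now - EVERY - OFFSET, now - OFFSET


# Assumption: each run evaluates its relative bounds against an instant
# ten minutes past the hour, and consecutive runs are exactly EVERY apart.
first_run = datetime(2026, 1, 18, 0, 10, tzinfo=timezone.utc)
windows = [read_window(first_run + i * EVERY) for i in range(4)]

for (start, stop), (next_start, _) in zip(windows, windows[1:]):
    # stop is exclusive in range(), so equality means no gap and no overlap.
    assert stop == next_start

for start, stop in windows:
    print(start.isoformat(), "->", stop.isoformat())
```

The same check fails if only one bound is shifted by the offset, which is a quick way to see why both bounds must move together.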

The original CQ grouped by host. Flux carries the host tag through the pipe automatically, so aggregateWindow produces one mean per host per hour exactly as before — no explicit group() is required for this shape. If a downstream consumer expects the 1.x column layout (fields as columns rather than _field/_value pairs), add import "influxdata/influxdb/v1" and a |> v1.fieldsAsCols() stage before to(). The transformation logic itself follows the replay-safe conventions in Flux scripting for task automation.

3. Provision the task as code and run it once

Create the task through the client rather than the UI, so the definition lives in version control and can be diffed across staging and production. Triggering one manual run before trusting the schedule is the cheapest way to catch a token, bucket, or org-id mistake. This is the entry point to the broader Python client orchestration patterns.

python

import os
from influxdb_client import InfluxDBClient, TaskCreateRequest

client = InfluxDBClient(
    url=os.environ["INFLUX_URL"],
    token=os.environ["INFLUX_TOKEN"],
    org=os.environ["INFLUX_ORG"],
)

flux = """
option task = {name: "cq_temp_1h", every: 1h, offset: 10m}

from(bucket: "iot/autogen")
    |> range(start: -task.every - task.offset, stop: -task.offset)
    |> filter(fn: (r) => r._measurement == "temp" and r._field == "value")
    |> aggregateWindow(every: 1h, fn: mean, createEmpty: false)
    |> to(bucket: "temp_1h")
"""

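tasks_api = client.tasks_api()

# Pass the request via task_create_request; the task parameter expects a Task object.
created = tasks_api.create_task(
    task_create_request=TaskCreateRequest(
        org_id=os.environ["INFLUX_ORG_ID"],
        flux=flux,
        description="Migrated from 1.x cq_temp_1h",
        status="active",
    )
)

# Trigger one run immediately instead of waiting for the schedule.
run = tasks_api.run_manually(task_id=created.id)
print(f"Task {created.id} created; validation run {run.id} status {run.status}")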

Keep the 1.x CQ running in parallel while the task is being validated. Do not DROP CONTINUOUS QUERY cq_temp_1h until the parity check in the verification snippet passes.

4. Recover the historical gap with an explicit replay

The CQ engine re-ran missed intervals automatically; a scheduled task does not. To fill the window between the CQ’s last write and the task’s first run — or any later outage — run a one-off task whose range() names the historical window explicitly. Because the aggregate snaps to fixed hourly boundaries, re-running the replay overwrites the same series/timestamp keys instead of appending duplicates, so it is safe to run more than once.

flux

option task = {name: "cq_temp_1h_backfill", every: 1h}

from(bucket: "iot/autogen")
    // Replace with the exact gap you are recovering.
    |> range(start: 2026-01-18T00:00:00Z, stop: 2026-01-20T00:00:00Z)
    |> filter(fn: (r) => r._measurement == "temp" and r._field == "value")
    |> aggregateWindow(every: 1h, fn: mean, createEmpty: false)
    |> to(bucket: "temp_1h")

Delete this task once it completes, so it does not keep re-running against a frozen window.
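
The overwrite behavior can be modeled outside InfluxDB. The sketch below is a simplified stand-in, not the storage engine: a dict keyed by (host, window timestamp) plays the role of the destination bucket, and the hourly_means helper and sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

HOUR = timedelta(hours=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def hourly_means(points):
    """Group (host, time, value) points into epoch-aligned hourly means.

    Keys use the window's stop time, which is aggregateWindow's default
    output timestamp.
    """
    buckets = defaultdict(list)
    for host, ts, value in points:
        window_start = EPOCH + ((ts - EPOCH) // HOUR) * HOUR
        buckets[(host, window_start + HOUR)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Hypothetical raw readings inside the gap being replayed.
t0 = datetime(2026, 1, 18, 0, 0, tzinfo=timezone.utc)
raw = [
    ("gw-1", t0 + timedelta(minutes=5), 20.0),
    ("gw-1", t0 + timedelta(minutes=35), 22.0),
    ("gw-1", t0 + timedelta(minutes=65), 24.0),
    ("gw-2", t0 + timedelta(minutes=10), 18.0),
]

destination = {}
destination.update(hourly_means(raw))   # first replay
after_first = dict(destination)
destination.update(hourly_means(raw))   # second replay of the same window

assert destination == after_first       # same keys overwritten, nothing appended
print(len(destination), "rollup points after two replays")
```

The property holds only because the keys are derived from fixed window boundaries; keying on the raw reading time would not overwrite.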

Gotchas and edge cases

RESAMPLE EVERY ... FOR ... is not a single Flux stage. If the source CQ read CREATE CONTINUOUS QUERY ... RESAMPLE EVERY 30m FOR 2h ..., that clause encoded two things: a 30-minute cadence and a two-hour re-computation window that re-swept late data. It maps to option task = {every: 30m} plus a widened read window (range(start: -2h - task.offset, stop: -task.offset)) with the aggregate snapped to fixed boundaries so the overlapping re-computations overwrite rather than duplicate. Treating RESAMPLE as a plain rollup silently loses its late-data recovery. Sizing that offset and window against real arrival lag is covered in cron & interval scheduling logic.
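
To see how much re-computation that clause buys, the following sketch counts how many 30-minute runs read a window that fully contains one hourly bucket. It is an illustration under stated assumptions: the covering_runs helper is hypothetical, and each run instant is treated as the stop of its read window (the offset is taken as already subtracted).

```python
from datetime import datetime, timedelta, timezone

EVERY = timedelta(minutes=30)   # RESAMPLE EVERY 30m -> option task.every
FOR = timedelta(hours=2)        # RESAMPLE FOR 2h    -> width of range()
WINDOW = timedelta(hours=1)     # GROUP BY time(1h)  -> aggregateWindow(every:)


def covering_runs(bucket_start, run_times):
    """Return the runs whose read window fully contains one hourly bucket."""
    bucket_stop = bucket_start + WINDOW
    return [t for t in run_times if t - FOR <= bucket_start and bucket_stop <= t]


# Assumption: run instants are the read window's stop, i.e. the offset has
# already been subtracted.
day = datetime(2026, 1, 18, tzinfo=timezone.utc)
runs = [day + i * EVERY for i in range(48)]
bucket = day + timedelta(hours=10)

hits = covering_runs(bucket, runs)
print(f"{bucket.isoformat()} recomputed by {len(hits)} runs")
```

Each of those runs sees the complete hour, so late data arriving up to an hour after the bucket closes is still picked up. Windows clipped by the edges of the range see only part of an hour, so it may be worth aligning the range bounds to whole windows; that detail is outside this sketch.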

createEmpty: false vs true for sparse fleets. The CQ never emitted rows for hosts that reported nothing in an hour, so createEmpty: false reproduces it and keeps cardinality flat. Switch to createEmpty: true only if a downstream deadman or gap-fill step genuinely needs the null markers — otherwise a fleet of intermittently-reporting sensors inflates the destination with null-filled windows. The trade-off for sparse sensors is examined in fallback chains for missing data.

Epoch alignment and DST drift. InfluxQL snapped GROUP BY time(1h) to Unix-epoch multiples. aggregateWindow also aligns to the epoch by default, so a straight UTC rollup matches. But if the old system used a local-time report boundary (common for daily rollups feeding business dashboards), you must state location explicitly, or globally distributed sensors land in different hourly buckets than they did under 1.x — and the mismatch appears only twice a year at daylight-saving transitions.
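
The difference between the two alignments is easiest to see on a daily window. This is a minimal sketch, assuming a fixed UTC-05:00 offset as a stand-in for a real reporting zone; both helper functions are hypothetical, and a named zone with daylight saving would move the local boundary twice a year.

```python
from datetime import datetime, timedelta, timezone

DAY = 86_400  # seconds


def epoch_day_start(ts: datetime) -> datetime:
    """Truncate to a Unix-epoch multiple of one day (UTC midnight)."""
    seconds = int(ts.timestamp())
    return datetime.fromtimestamp(seconds - seconds % DAY, tz=timezone.utc)


def local_day_start(ts: datetime, tz: timezone) -> datetime:
    """Truncate to midnight in a given zone."""
    local = ts.astimezone(tz)
    return local.replace(hour=0, minute=0, second=0, microsecond=0)


# Assumption: a fixed UTC-05:00 offset stands in for a real reporting zone.
# A named zone with DST would shift this boundary twice a year.
report_zone = timezone(timedelta(hours=-5))
reading = datetime(2026, 1, 18, 2, 30, tzinfo=timezone.utc)

utc_bucket = epoch_day_start(reading)
local_bucket = local_day_start(reading, report_zone)
print("epoch-aligned day:", utc_bucket.isoformat())
print("local-time day:  ", local_bucket.astimezone(timezone.utc).isoformat())
```

The same reading lands in the 18 January bucket under epoch alignment and in the 17 January bucket under the local boundary, which is the kind of mismatch a parity check should be expected to surface.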

Last-digit value differences. Flux’s row-oriented arithmetic can differ from the 1.x engine at the final significant digit of a mean. If the aggregate feeds threshold logic where that digit matters, reconcile with precision mapping & rounding strategies before cutover rather than assuming byte-for-byte equality.
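
The effect is ordinary floating-point behavior rather than anything specific to either engine. The sketch below uses Python's own sum and math.fsum as stand-ins for two accumulation strategies; it does not reproduce the arithmetic of InfluxQL or Flux, and the three-decimal rounding is an arbitrary choice for illustration.

```python
import math

readings = [0.1] * 10

naive_total = sum(readings)        # left-to-right accumulation
exact_total = math.fsum(readings)  # error-compensated sum

naive_mean = naive_total / len(readings)
exact_mean = exact_total / len(readings)
print(repr(naive_mean), repr(exact_mean))

# The totals differ in the last digit even though the inputs are identical.
assert naive_total != exact_total

# Compare with a tolerance, or round both sides to the precision the
# threshold logic actually uses (3 decimals is an arbitrary choice here).
assert math.isclose(naive_mean, exact_mean, rel_tol=1e-12)
assert round(naive_mean, 3) == round(exact_mean, 3)
```

Either approach works for reconciliation; what matters is that the comparison rule is chosen before cutover and applied to both sides.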

Verification snippet

Do not drop the CQ until the task’s output reconciles with it point-for-point over an overlapping window. First confirm the run itself succeeded and inspect schedule-to-start latency by reading the _tasks system bucket directly, rather than trusting the UI checkmark:

flux

from(bucket: "_tasks")
    |> range(start: -24h)
    |> filter(fn: (r) => r._measurement == "runs")
    |> filter(fn: (r) => r.taskID == "TASK_ID_HERE")

Then compare the hourly point volume the migrated task produced against the same window still being written by the CQ — once the offset lag is accounted for, the counts should reconcile:

flux

from(bucket: "temp_1h")
    |> range(start: -24h)
    |> filter(fn: (r) => r._measurement == "temp" and r._field == "value")
    |> group(columns: ["host"])
    |> count()
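
Counts catch missing hours but not wrong values. A point-for-point comparison could look like the sketch below, assuming both rollups have been exported into dicts keyed by (host, timestamp); the parity_report helper and the sample data are hypothetical.

```python
import math


def parity_report(cq_points, task_points, rel_tol=1e-9):
    """Compare two {(host, timestamp): value} maps point for point."""
    cq_keys, task_keys = set(cq_points), set(task_points)
    mismatched = sorted(
        key for key in cq_keys & task_keys
        if not math.isclose(cq_points[key], task_points[key], rel_tol=rel_tol)
    )
    return {
        "missing_in_task": sorted(cq_keys - task_keys),
        "extra_in_task": sorted(task_keys - cq_keys),
        "mismatched": mismatched,
    }


# Hypothetical rollups exported from each side for the same window.
cq = {("gw-1", "2026-01-18T01:00:00Z"): 21.0, ("gw-1", "2026-01-18T02:00:00Z"): 24.0}
task = {("gw-1", "2026-01-18T01:00:00Z"): 21.0}

report = parity_report(cq, task)
print(report)
```

An empty report on all three keys is the signal to proceed. If the two sides stamp windows differently (InfluxQL uses the window start, while aggregateWindow defaults to the window stop), normalize the timestamps before comparing, or every key will appear as both missing and extra.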

From the CLI, confirm the task exists and is active before redirecting production reads:

bash

influx task list --org "$INFLUX_ORG"

Related

Continuous Query Migration to Tasks: the semantic map of every implicit CQ behavior and its explicit Flux equivalent.
Cron & interval scheduling logic — sizing the cadence, offset, and read window a migrated task fires on.
Fallback chains for missing data — the createEmpty and gap-fill trade-offs for sparse sensor fleets.
