If you’re using the Notion API template_id feature to create consistent database items, you’ve probably run into the least glamorous failure mode in automation: duplicates.
The painful part isn’t that duplicates happen—it’s why: retries, timeouts, fuzzy matching, and “create-first” agent behavior combine into a system that’s impossible to trust. This post lays out a production-grade, idempotent upsert pattern you can use with Notion (and with LLM-powered agents) so repeated runs converge on one canonical page.
We’ll cover:
- Why Notion automations create duplicate pages
- The Search → Match → Update → Else Create-from-Template workflow
- How to use template_id (and when not to)
- Idempotency keys, match evidence, diff-based writes, and write receipts
- A worked example: Email/PDF → Notion Trip record (a common operator workflow)
Why duplicates happen (the real root causes)
Most “duplicate page” bugs aren’t a Notion problem—they’re an automation contract problem.
Here are the patterns that cause duplicates in real systems:
1) No stable identifier (so every run is a fresh guess)
If your workflow identifies a record by something like “title contains John” or “date is close to today,” you don’t have an upsert; you have a best-effort guess that can land on a different page each run.
2) Fuzzy matching with no decision log
Even when your fuzzy match is good, the lack of recorded evidence makes it impossible to debug later:
- Which candidates were considered?
- What normalized fields were used?
- What was the confidence score / threshold?
- Why was Candidate A preferred over Candidate B?
Without this, the next operator who sees duplicates can’t fix the root cause; they can only clean up.
3) Create-first behavior during partial failures
The classic story:
- Run A creates a page, then times out before your system stores the returned page_id.
- Run B doesn’t know about Run A’s write, so it creates again.
This gets even worse with agentic runs (Claude Skills, tool-using agents, etc.) because a model might treat a 500 or timeout as “nothing happened” unless you explicitly design around it.
4) Template application is asynchronous
When you create a page using a template (template[type] = default or template_id), Notion returns quickly with a page object that may appear blank initially. The template content/properties are applied asynchronously, which can trip up follow-on steps that expect the page to be fully populated immediately.
The production pattern: Search → Match → Update → (Else) Create-from-Template
This is the workflow you want:
- Normalize input (deterministic)
- Query Notion for candidates (bounded)
- Match one record (explainable)
- Update (diff-based + allowlisted)
- Else Create page from template (template_id)
- Persist a write receipt (so retries become safe)
The key idea: upsert isn’t an API call—it’s a workflow with gates.
The “Upsert Gate” checklist
Before you write to Notion, your workflow should confirm:
- You have an idempotency key (or you’re explicitly in “human review” mode)
- Your candidate query is bounded (e.g., max 20 results)
- The match decision is either:
- Exact match on Tier 1 identifier, or
- Constrained fuzzy match with evidence, or
- No match → safe to create
- You can produce a proposed diff (what will change)
- You can produce a write receipt after the call
This is where workflow-first systems shine. In nNode terms, this is “white-boxing”: the logic is explicit, testable, and replayable.
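As a rough illustration, the checklist can be expressed as a pure function that returns the reasons a write should be blocked. This is a minimal sketch: `GateInput`, `MatchDecision`, and `checkUpsertGate` are hypothetical names invented for this example, not part of Notion's API or any SDK.

```typescript
// Hypothetical types for illustration only.
type MatchDecision =
  | { kind: "exact"; pageId: string }
  | { kind: "fuzzy"; pageId: string; evidence: string[] }
  | { kind: "none" };

interface GateInput {
  idempotencyKey?: string;
  humanReview: boolean;
  candidateCount: number;
  maxCandidates: number;
  decision?: MatchDecision;
  proposedDiff?: Record<string, unknown>;
}

// Returns a list of problems; an empty list means the write may proceed.
function checkUpsertGate(input: GateInput): string[] {
  const problems: string[] = [];
  if (!input.idempotencyKey && !input.humanReview) {
    problems.push("missing idempotency key");
  }
  if (input.candidateCount > input.maxCandidates) {
    problems.push("candidate query is not bounded");
  }
  if (!input.decision) {
    problems.push("no match decision");
  } else if (input.decision.kind === "fuzzy" && input.decision.evidence.length === 0) {
    problems.push("fuzzy match without evidence");
  }
  if (!input.proposedDiff) {
    problems.push("no proposed diff");
  }
  return problems;
}
```

Returning reasons rather than a boolean makes the gate result something you can log alongside the other run artifacts.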
Using template_id correctly (and what teams usually miss)
Notion’s Create Page endpoint supports a template body parameter:
- template[type] = "none" (default)
- template[type] = "default" (apply the data source’s default template)
- template[type] = "template_id" (apply a specific template)
Two important production constraints:
- When applying a template, you can’t pass children in the same Create Page request. If you need extra content, append blocks after the template finishes applying.
- Template application is asynchronous. If downstream steps depend on the template’s content, you must wait (webhook or polling) before you proceed.
How do you get the template ID?
In production, don’t hardcode template IDs unless you truly control the workspace. Prefer one of these:
- List templates for a data source and choose by name (and/or is_default).
- Copy the template URL from Notion and extract the UUID.
Also note: a template is a page. Your integration needs access to it.
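For the copy-the-URL option, here is a minimal sketch of pulling the ID out of a pasted link. It assumes the common URL shape in which the page ID is the last 32 hex characters of the path; `extractNotionId` is a hypothetical helper written for this post, not an SDK function.

```typescript
// Hypothetical helper: extract a page/template ID from a copied Notion URL.
// Assumes the ID appears as 32 hex characters at the end of the URL path.
function extractNotionId(url: string): string | null {
  // Drop the query string and fragment so a view ID (?v=...) is never picked up.
  const path = url.split(/[?#]/)[0];
  const match = path.match(/([0-9a-fA-F]{32})\/?$/);
  if (!match) return null;
  const hex = match[1].toLowerCase();
  // Re-insert dashes to get the 8-4-4-4-12 UUID form.
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}
```

Returning null instead of throwing lets the caller decide whether a missing ID is a configuration error or a reason to fall back to listing templates by name.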
Designing your idempotency key (3-tier strategy)
You need an idempotency key that is stable across retries.
Tier 1 (best): external system ID
If the record comes from another system, store the external ID in Notion:
- Email Message-ID
- Stripe invoice ID
- Linear issue ID
- Booking reference / PNR
- PDF invoice number
Create a Notion property like External ID (rich text) and treat it as unique.
Tier 2: deterministic composite key
When there is no external ID, build a normalized composite key:
- normalized customer email
- date (YYYY-MM-DD)
- vendor name
- amount (in cents)
Example composite key:
trip|jane.doe@example.com|2026-03-11|united-airlines|48213
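A minimal sketch of building such a key follows. The field names, the `slugify` helper, and the choice to round to integer cents are illustrative assumptions; adapt them to whatever fields you actually extract.

```typescript
// Hypothetical input shape; field names are illustrative.
interface CompositeKeyInput {
  recordType: string;
  customerEmail: string;
  date: string; // assumed to be already normalized to YYYY-MM-DD
  vendor: string;
  amount: number; // major currency units, e.g. 482.13
}

// Lowercase, collapse runs of non-alphanumerics into "-", trim stray dashes.
function slugify(value: string): string {
  return value
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function buildCompositeKey(input: CompositeKeyInput): string {
  // Integer cents avoid "$482.13" vs "482.130" style mismatches between runs.
  const cents = Math.round(input.amount * 100);
  return [
    input.recordType,
    input.customerEmail.trim().toLowerCase(),
    input.date,
    slugify(input.vendor),
    String(cents),
  ].join("|");
}
```

The point of normalizing every component is that two runs over the same source document produce byte-identical keys, even if the extraction step varies in whitespace or casing.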
Tier 3: human review queue
When you can’t produce a high-confidence key (or match confidence is low), don’t guess. Route to a review queue:
- Create a “Needs review” page
- Or write to a staging database
- Or alert an operator
This is how you keep automation reliable without pretending the world is clean.
Match logic that doesn’t bite you later
Think of matching as a series of increasingly permissive steps.
Step 1: exact match on Tier 1 key
If you have External ID, this should be a strict equality match.
Step 2: constrained fuzzy match (candidate set + scoring)
If you don’t have a unique key, do this:
- Query a candidate set with strict filters (e.g., same client, within a date window)
- Compute a score locally (don’t let the API “fuzzy search” for you)
- Select a winner only if score ≥ threshold and margin is large enough
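The scoring and selection steps might look like the sketch below. The weights (50/30/20), the field names, and the function names are assumptions chosen for illustration; the part that matters is that a winner needs both a minimum score and a clear margin over the runner-up.

```typescript
interface TripInput {
  travelerEmail: string;
  vendor: string;
  startDate: string; // YYYY-MM-DD
}
interface TripCandidate extends TripInput {
  id: string;
}
interface ScoredCandidate {
  id: string;
  score: number;
}

// Illustrative integer weights; tune these against your own data.
function scoreCandidate(input: TripInput, candidate: TripCandidate): number {
  let score = 0;
  if (input.travelerEmail === candidate.travelerEmail) score += 50;
  if (input.vendor === candidate.vendor) score += 30;
  if (input.startDate === candidate.startDate) score += 20;
  return score;
}

// Returns a winner only if it clears the threshold AND beats the runner-up by minMargin.
function pickWinner(
  scored: ScoredCandidate[],
  threshold: number,
  minMargin: number
): ScoredCandidate | null {
  const sorted = [...scored].sort((a, b) => b.score - a.score);
  const best = sorted[0];
  if (!best || best.score < threshold) return null;
  const runnerUp = sorted[1];
  if (runnerUp && best.score - runnerUp.score < minMargin) return null;
  return best;
}
```

A null result here is not a failure: it is the signal to fall through to the next step or to the review queue.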
Step 3: no safe match → create new
If your match isn’t confident, prefer create-new with a new idempotency key (and ideally tag it so later cleanup is easy).
Always record match evidence
Persist a small JSON “match report” somewhere (your DB, a log table, or even a Notion property if you must):
- normalized inputs
- query filters used
- candidate IDs
- score breakdown
- chosen page ID (or none)
This is the difference between “automation” and “mystery machine.”
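One possible shape for that report is sketched below. The field names mirror the list above but are otherwise assumptions, as are the sample values.

```typescript
// Hypothetical report shape; store it wherever your run artifacts live.
interface MatchReport {
  normalizedInputs: Record<string, string>;
  queryFilters: Record<string, unknown>;
  candidateIds: string[];
  // candidate ID -> scoring component -> points awarded
  scoreBreakdown: Record<string, Record<string, number>>;
  chosenPageId: string | null;
}

function serializeMatchReport(report: MatchReport): string {
  return JSON.stringify(report, null, 2);
}

// Example values are made up for illustration.
const exampleReport: MatchReport = {
  normalizedInputs: { travelerEmail: "jane.doe@example.com", startDate: "2026-03-11" },
  queryFilters: { property: "Start Date", window: "2026-03-10..2026-03-12" },
  candidateIds: ["page-a", "page-b"],
  scoreBreakdown: {
    "page-a": { email: 50, vendor: 30, date: 20 },
    "page-b": { email: 50, vendor: 0, date: 20 },
  },
  chosenPageId: "page-a",
};

console.log(serializeMatchReport(exampleReport));
```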
Safe updates: diff-based writes + property allowlists
Notion pages can hold lots of properties. Your automation should not try to be clever.
1) Allowlist properties
Keep a list of property names (or property IDs) your workflow is allowed to write:
- Name (title)
- External ID
- Status
- Start Date
- End Date
- Vendor
- Amount
Everything else is read-only to the automation.
2) Compute a diff before writing
Generate a patch that includes only fields that changed.
Benefits:
- fewer accidental overwrites
- better audit logs
- easier debugging
- reduces the chance of a partial “update” undoing a human’s edits
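A minimal sketch of the diff step, combined with the allowlist, is shown here. It works on simplified flat values rather than Notion's nested property objects, so you would first flatten the current page's properties into this shape; `computePatch` and `ALLOWED_PROPERTIES` are hypothetical names.

```typescript
// Simplified flat view of page properties (an assumption for this sketch).
type PropertyValues = Record<string, string | number | null>;

const ALLOWED_PROPERTIES: ReadonlySet<string> = new Set([
  "Name",
  "External ID",
  "Status",
  "Start Date",
  "End Date",
  "Vendor",
  "Amount",
]);

// Keep only allowlisted properties whose desired value differs from the current one.
function computePatch(
  current: PropertyValues,
  desired: PropertyValues,
  allowlist: ReadonlySet<string>
): PropertyValues {
  const patch: PropertyValues = {};
  for (const [name, value] of Object.entries(desired)) {
    if (!allowlist.has(name)) continue; // never touch properties we don't own
    if (current[name] !== value) patch[name] = value;
  }
  return patch;
}
```

If the resulting patch is empty, skip the API call entirely and record that the run was a no-op.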
3) Write receipts
After any Notion write, store a receipt:
- page_id
- idempotency key
- timestamp
- properties changed
- request id / run id
If you do this, retries become safe: a rerun can detect “already applied” and exit cleanly.
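One way to implement the “already applied” check is sketched below, using an in-memory store as a stand-in for your real database. The `patchFingerprint` field is an assumption added for this example so that identical writes can be recognized; all names are hypothetical.

```typescript
interface WriteReceipt {
  pageId: string;
  idempotencyKey: string;
  timestamp: string; // ISO 8601
  propertiesChanged: string[];
  runId: string;
  patchFingerprint: string; // assumption: added to detect identical writes
}

// Stable string form of a patch: keys are sorted so property order doesn't matter.
function fingerprintPatch(patch: Record<string, unknown>): string {
  const sorted = Object.keys(patch).sort().map((k) => [k, patch[k]]);
  return JSON.stringify(sorted);
}

// In-memory stand-in for a durable receipt table.
class InMemoryReceiptStore {
  private byKey = new Map<string, WriteReceipt[]>();

  save(receipt: WriteReceipt): void {
    const list = this.byKey.get(receipt.idempotencyKey) ?? [];
    list.push(receipt);
    this.byKey.set(receipt.idempotencyKey, list);
  }

  // Returns the earlier receipt if this exact patch was already written for this key.
  findApplied(idempotencyKey: string, patch: Record<string, unknown>): WriteReceipt | undefined {
    const fp = fingerprintPatch(patch);
    return (this.byKey.get(idempotencyKey) ?? []).find((r) => r.patchFingerprint === fp);
  }
}
```

A rerun would call `findApplied` before writing and exit early when it returns a receipt.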
Reliability in the messy world (timeouts, retries, and replays)
Here’s the rule:
Treat “unknown outcome” as “write may have happened.”
If a create/update call times out, your next run must:
- Re-run the query by idempotency key
- If found, continue from there
- If not found, only then attempt a create
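The rule can be sketched as a small retry loop that always queries before creating. The `UpsertDeps` interface is a hypothetical seam so the logic can be tested without the network; in practice `findByKey` and `create` would wrap your Notion calls. Note that this sketch does not protect against two runs executing concurrently.

```typescript
interface PageRef {
  id: string;
}

// Hypothetical dependency seam; real implementations would call the Notion API.
interface UpsertDeps {
  findByKey(key: string): Promise<PageRef | null>;
  create(key: string): Promise<PageRef>;
}

async function createOnce(
  deps: UpsertDeps,
  key: string,
  maxAttempts = 3
): Promise<{ page: PageRef; mode: "found" | "created" }> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Look first: an earlier attempt may have succeeded even though it looked like a failure.
    const existing = await deps.findByKey(key);
    if (existing) return { page: existing, mode: "found" };
    try {
      const page = await deps.create(key);
      return { page, mode: "created" };
    } catch (err) {
      lastError = err; // unknown outcome: loop back and re-query before retrying
    }
  }
  // One last look after the final failed attempt.
  const existing = await deps.findByKey(key);
  if (existing) return { page: existing, mode: "found" };
  throw lastError;
}
```

If a create call writes the page and then times out, the next loop iteration finds it by key and returns `mode: "found"` instead of creating a duplicate.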
Use resumable run state
For multi-step flows, persist a run record:
- run_id
- phase (normalized/query/match/write)
- notion_page_id (once known)
- receipt (after write)
This is exactly the kind of “workflow runtime” nNode is designed for: deterministic steps + persisted artifacts + safe replays.
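A run record along these lines might be sketched as follows. It assumes `phase` stores the last phase that completed, which is one possible convention; the type and function names are hypothetical.

```typescript
const PHASES = ["normalized", "query", "match", "write"] as const;
type Phase = (typeof PHASES)[number];

interface RunRecord {
  runId: string;
  phase: Phase; // assumption: the last phase that completed successfully
  notionPageId?: string; // set once known
  receiptId?: string; // set after the write
}

// Phases a resumed run still has to execute, in order.
function remainingPhases(record: RunRecord): Phase[] {
  return PHASES.slice(PHASES.indexOf(record.phase) + 1);
}
```

On restart, a worker loads the record and executes only the remaining phases, reusing `notionPageId` if an earlier attempt already discovered or created the page.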
Observability: the 5 artifacts to persist on every upsert
If you only adopt one thing from this post, make it this list. Persist these artifacts per run:
- Input normalization
- Query snapshot (IDs + key properties)
- Match decision report
- Proposed diff
- Write receipt
When duplicates happen (they will), these artifacts let you answer:
- Was it a bad key?
- Did query filters return too many candidates?
- Did an API call time out?
- Did the automation run twice concurrently?
Code: a production-grade Notion upsert with template_id (TypeScript)
Below is a practical sketch using the Notion JavaScript client. You’ll need to adapt property names to your database.
1) Normalize inputs + build an idempotency key
import crypto from "node:crypto";

export function normalizeEmail(email: string) {
  return email.trim().toLowerCase();
}

export function normalizeDateISO(date: Date) {
  // toISOString() is UTC-based: build the Date in UTC, or a local-midnight
  // value can shift to the previous or next calendar day.
  return date.toISOString().slice(0, 10); // YYYY-MM-DD
}

export function buildTripKey(params: {
  travelerEmail: string;
  startDate: Date;
  vendor: string;
}) {
  const raw = [
    "trip",
    normalizeEmail(params.travelerEmail),
    normalizeDateISO(params.startDate),
    params.vendor.trim().toLowerCase(),
  ].join("|");

  // Hash keeps it short + consistent
  const hash = crypto.createHash("sha256").update(raw).digest("hex").slice(0, 24);
  return { raw, hash };
}
2) Query candidates by idempotency key (Tier 1)
Assume you have an External Key rich text property.
import { Client } from "@notionhq/client";

const notion = new Client({ auth: process.env.NOTION_TOKEN });

export async function findByExternalKey(databaseId: string, externalKey: string) {
  const res = await notion.databases.query({
    database_id: databaseId,
    filter: {
      property: "External Key",
      rich_text: { equals: externalKey },
    },
    page_size: 10,
  });

  if (res.results.length === 1) return res.results[0];
  if (res.results.length > 1) {
    throw new Error(`Uniqueness violation: External Key=${externalKey}`);
  }
  return null;
}
3) Diff-based update (allowlisted)
type TripPatch = {
  Name?: string;
  Status?: "New" | "Confirmed" | "Needs review";
  Vendor?: string;
  StartDate?: string; // ISO date
  EndDate?: string;
};

export function buildNotionPropertiesPatch(patch: TripPatch) {
  const props: any = {};

  if (patch.Name) {
    props["Name"] = { title: [{ text: { content: patch.Name } }] };
  }
  if (patch.Status) {
    props["Status"] = { select: { name: patch.Status } };
  }
  if (patch.Vendor) {
    props["Vendor"] = { rich_text: [{ text: { content: patch.Vendor } }] };
  }
  if (patch.StartDate) {
    props["Start Date"] = { date: { start: patch.StartDate } };
  }
  if (patch.EndDate) {
    props["End Date"] = { date: { start: patch.EndDate } };
  }

  return props;
}

export async function updateTripPage(pageId: string, patch: TripPatch) {
  const properties = buildNotionPropertiesPatch(patch);
  return notion.pages.update({ page_id: pageId, properties });
}
4) Create-from-template using template_id
Important constraints:
- Don’t include children when using a template
- Expect the page to be returned “blank-ish” at first
export async function createTripFromTemplate(params: {
  parentDatabaseId: string;
  templateId: string;
  externalKey: string;
  name: string;
}) {
  return notion.pages.create({
    parent: { database_id: params.parentDatabaseId },

    // Apply a specific template
    template: {
      type: "template_id",
      template_id: params.templateId,
      // timezone: "America/New_York", // optional
    },

    // Provide only the minimum properties you *must* set at creation time
    properties: {
      "Name": { title: [{ text: { content: params.name } }] },
      "External Key": { rich_text: [{ text: { content: params.externalKey } }] },
      "Status": { select: { name: "New" } },
    },
  });
}
5) The full upsert (update-or-create)
export async function upsertTrip(params: {
  databaseId: string;
  templateId: string;
  travelerEmail: string;
  startDate: Date;
  vendor: string;
  tripTitle: string;
}) {
  const { hash: externalKey } = buildTripKey({
    travelerEmail: params.travelerEmail,
    startDate: params.startDate,
    vendor: params.vendor,
  });

  // Gate 1: find existing
  const existing = await findByExternalKey(params.databaseId, externalKey);

  if (existing) {
    // Gate 2: update only what you own
    await updateTripPage(existing.id, {
      Name: params.tripTitle,
      Vendor: params.vendor,
      StartDate: normalizeDateISO(params.startDate),
      Status: "Confirmed",
    });
    return { mode: "updated" as const, pageId: existing.id, externalKey };
  }

  // Gate 3: create
  const created = await createTripFromTemplate({
    parentDatabaseId: params.databaseId,
    templateId: params.templateId,
    externalKey,
    name: params.tripTitle,
  });
  return { mode: "created" as const, pageId: created.id, externalKey };
}
Waiting for the template to apply
If your next step depends on template content (e.g., you want to append blocks under a specific heading created by the template), you have two common options:
- Use webhooks and proceed when you receive the relevant “page is ready” event.
- Poll block children until the page is no longer empty (bounded retries).
Keep this step explicit; don’t “sleep 2 seconds and hope.”
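For the polling option, a bounded loop might look like the sketch below. The `listChildren` parameter is a hypothetical seam; with the JavaScript client it could wrap `notion.blocks.children.list({ block_id: pageId })`. The sketch assumes “ready” means “has at least one child block”, which will never become true for a template with an empty body, and the attempt count and delay are arbitrary defaults.

```typescript
// Hypothetical seam so the loop can be tested without calling Notion.
type ListChildren = (pageId: string) => Promise<{ results: unknown[] }>;

// Resolves true once the page has child blocks, false if attempts run out.
async function waitForTemplate(
  pageId: string,
  listChildren: ListChildren,
  opts: { maxAttempts: number; delayMs: number } = { maxAttempts: 10, delayMs: 500 }
): Promise<boolean> {
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    const { results } = await listChildren(pageId);
    if (results.length > 0) return true;
    // Don't sleep after the final attempt; just report failure.
    if (attempt < opts.maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, opts.delayMs));
    }
  }
  return false;
}
```

A false result should move the run into an explicit “template not ready” state (retry later or alert) rather than letting downstream steps proceed against a blank page.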
Worked example: Email/PDF → Notion Trip record (operator-grade)
Here’s a workflow that creates duplicates constantly unless you design it as an upsert.
Inputs
- An email with booking confirmation
- A PDF itinerary/invoice
- Extracted fields: traveler email, travel dates, vendor, booking reference
Recommended idempotency key
- Tier 1: booking reference / PNR (best)
- Tier 2: traveler email + start date + vendor
Data model (Notion)
Create a Trips database with:
- Name (title)
- External Key (rich text)
- Booking Reference (rich text)
- Status (select)
- Start Date / End Date (date)
- Source Email (url or rich text)
- Source Files (files)
Flow
- Extract fields (LLM + deterministic parsing)
- Normalize and compute External Key
- Query Trips by Booking Reference or External Key
- Update if found (diff-based)
- Else create from a trip template (template_id)
- Attach the email/PDF references
- Write receipt and stop
Where nNode fits
This is exactly the kind of workflow where you want:
- explicit gates (“don’t write until match confidence is high”)
- persisted artifacts (query snapshot, match report, diff, receipt)
- tool reliability (Notion + Drive + Gmail)
- safe replays
nNode is built around that operator-grade stance: workflows first (white-box), with agents used where they’re strongest (extraction, classification, summarization), and receipts everywhere so you can trust production runs.
Checklist: the “copy this into your runbook” section
Upsert Contract (Notion)
- A stable idempotency key exists (Tier 1 or Tier 2)
- Candidate query is bounded (page_size, date window, client filter)
- Match decision is explainable and persisted
- Updates are diff-based and property-allowlisted
- Create uses template_id or default template (no children in create)
- Template readiness is confirmed (webhook or bounded polling)
- A write receipt is persisted (page_id + changed fields + run_id)
- Retries treat timeouts as “write may have happened”
Anti-duplication rules of thumb
- Never create without first trying to query by idempotency key.
- Never fuzzy match across the entire database.
- Never let an agent “decide” to create without leaving a match report.
Closing: make your Notion automations boring
The goal isn’t “smart.” The goal is boring, repeatable correctness: every run produces the same result, and when something goes wrong you can explain why.
If you’re building Claude Skills or tool-using agents that touch real Notion systems, consider adopting a workflow-first runtime where upserts, retries, receipts, and template application are first-class primitives.
If you want to see what that looks like end-to-end—Notion upserts, template-driven creation, and replay-safe runs—take a look at nNode at nnode.ai.