Data Quality and Validation in FHIR Pipelines

Data quality in FHIR integrations isn’t a single check run at the end. It’s a series of gates applied at different stages of the pipeline, catching different classes of problem before they compound into downstream failures that are expensive to diagnose and correct.

The organizing frame for this article is pipeline placement: where in the data flow does each type of validation belong, why does the order matter, and what should you do when a gate fails? For the mechanics of how FHIR profiles and validators work, see FHIR Profiling. For terminology system mechanics in FHIR, see FHIR Terminology.

Data quality dimensions

Before deciding where to validate, it helps to have shared vocabulary for what you’re validating against:

| Dimension | Definition | Failure example |
| --- | --- | --- |
| Structural | Resource shape matches the FHIR JSON/XML schema | Missing resourceType, invalid element name |
| Conformance | Resource satisfies profile cardinality, must-support, and invariant rules | Patient.identifier absent when the profile requires min:1 |
| Terminology | Coded elements use codes from the required value sets | LOINC code not in the required observation value set |
| Referential integrity | References resolve to real, accessible resources | Observation.subject references a Patient that doesn’t exist |
| Semantic | Data conveys accurate clinical meaning | A “completed” Observation with no value, a future birthdate |
| Completeness | Expected data for the use case is present | Claims record missing diagnosis codes |

These dimensions are roughly ordered from “cheap to check” to “expensive to check”. Structural validation is fast and stateless. Semantic validation requires domain knowledge and may require human review. Most automated pipelines focus on structural through referential integrity; semantic validation needs supplemental tooling or human QA.

Validation layers

Structural validation

Structural validation confirms the resource is valid FHIR: it has a resourceType, elements use correct FHIR paths, data types match, and required base-FHIR elements (where cardinality is min:1 in the base spec) are present.

This is the cheapest check to run and should always happen first. A structural failure means the validator can’t proceed meaningfully with higher-level checks — you can’t evaluate a profile against a malformed payload.

Most FHIR libraries handle this automatically during parsing. A payload that fails structural validation typically results in a parse error before any explicit validation call.

// OperationOutcome from a structural error
{
  "resourceType": "OperationOutcome",
  "issue": [{
    "severity": "error",
    "code": "structure",
    "diagnostics": "Unknown element 'Patient.fullName': element not defined in base FHIR R4 Patient"
  }]
}

What to do on failure: Reject the payload. Log the source, error details, and payload identifier. Do not attempt higher-level validation. Route to a quarantine queue for investigation — structural failures often indicate a misconfigured source system rather than an isolated data error.
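
The reject-and-quarantine flow can be sketched in Python. This is a toy structural check, not a real FHIR parser: the element whitelist, `ingest`, and the `quarantine` list are all illustrative. A production pipeline would rely on its FHIR library’s parser, which raises these errors during deserialization.

```python
import json

# Hypothetical whitelist standing in for the base FHIR R4 Patient definition.
KNOWN_PATIENT_ELEMENTS = {"resourceType", "id", "identifier", "name", "birthDate"}

def structural_check(payload: str) -> list[str]:
    """Return a list of structural errors; an empty list means pass."""
    try:
        resource = json.loads(payload)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if resource.get("resourceType") != "Patient":
        return ["missing or unexpected resourceType"]
    return [f"Unknown element 'Patient.{element}'"
            for element in resource if element not in KNOWN_PATIENT_ELEMENTS]

quarantine: list[dict] = []  # holds rejected payloads for investigation

def ingest(payload: str) -> bool:
    """Structural gate: reject and quarantine, never attempt higher layers."""
    errors = structural_check(payload)
    if errors:
        quarantine.append({"payload": payload, "errors": errors})
        return False
    return True
```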

Profile-based validation

Profile validation confirms the resource satisfies the constraints in one or more StructureDefinition resources: cardinality constraints, must-support flags, fixed values, slice rules, and FHIRPath invariants.

This is the primary conformance layer for interoperability programs. A resource that passes structural validation but fails profile validation may be valid FHIR but not valid for the specific use case.

// OperationOutcome from a profile violation
{
  "resourceType": "OperationOutcome",
  "issue": [
    {
      "severity": "error",
      "code": "required",
      "location": ["Patient.identifier"],
      "diagnostics": "Patient.identifier: minimum required = 1, but only found 0 (from profile http://hl7.org/fhir/us/core/StructureDefinition/us-core-patient)"
    },
    {
      "severity": "error",
      "code": "invariant",
      "location": ["Patient.name[0]"],
      "diagnostics": "Constraint us-core-8 failed: Patient.name.given or Patient.name.family or both SHALL be present"
    }
  ]
}

Profile validation requires loading the StructureDefinition (and its dependency packages — extensions, value sets) into the validator. Running without the correct profiles loaded produces no profile-level errors even for non-conforming data, which is a silent false-pass.

What to do on failure: Depends on severity and gate position. At an ingestion gate, error-severity violations should reject the payload. warning-severity violations may be logged and passed. Always capture the specific issue codes and profile URLs — this is the information source system owners need to fix their payloads.
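
Dispositioning by severity reduces to a small routine over the OperationOutcome. A sketch follows; the issue shape matches the FHIR OperationOutcome resource, but the function name and action strings are our own convention.

```python
def gate_decision(outcome: dict) -> str:
    """Map OperationOutcome issue severities to an ingestion-gate action."""
    severities = {issue.get("severity") for issue in outcome.get("issue", [])}
    if severities & {"fatal", "error"}:
        return "reject"              # error-severity violations stop ingestion
    if "warning" in severities:
        return "pass-with-warnings"  # log, attach a quality flag, let through
    return "pass"
```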

Terminology validation

Terminology validation confirms that coded elements use codes from the required value sets at the required binding strength. A resource can pass structural and profile validation but still fail if it uses a code that’s semantically invalid — an unmapped local code, a retired code, or a code from the wrong system.

Terminology validation is more expensive than structural or profile validation because it requires either: (a) a terminology server connection for live lookups, or (b) pre-expanded value set bundles loaded into the validator.

// OperationOutcome from a terminology failure
{
  "resourceType": "OperationOutcome",
  "issue": [{
    "severity": "error",
    "code": "code-invalid",
    "location": ["Observation.code.coding[0]"],
    "diagnostics": "The code 'LOCLAB-123' from system 'http://local-lab.example.org/codes' is not in the value set 'http://hl7.org/fhir/ValueSet/observation-codes' (required binding)"
  }]
}

Binding strength determines enforcement:

| Binding strength | Validator behavior |
| --- | --- |
| required | Code must be in the value set — hard error if not |
| extensible | Code from outside the value set generates a warning if an equivalent in-set code exists |
| preferred | No error; recommendation only |
| example | Not enforced |

In practice, terminology validation is often the bottleneck in pipeline performance. Strategies for managing it:

  • Cache terminology server responses (LOINC, SNOMED, RxNorm code lookups change infrequently)
  • Pre-expand value sets in your validator’s package registry rather than making live calls
  • Run terminology validation asynchronously in parallel with profile validation for batch loads
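
The caching strategy can be sketched with a memoized lookup. Here `fetch_validate_code` is a stub standing in for a terminology-server round trip; the stub table and call counter exist only to make the caching behavior observable.

```python
from functools import lru_cache

# Stub "remote" answers and a call counter to make caching visible.
_REMOTE = {
    ("http://loinc.org", "8867-4"): True,                       # in value set
    ("http://local-lab.example.org/codes", "LOCLAB-123"): False,
}
CALLS = {"count": 0}

def fetch_validate_code(system: str, code: str) -> bool:
    CALLS["count"] += 1          # simulated network hop to a terminology server
    return _REMOTE.get((system, code), False)

@lru_cache(maxsize=4096)
def code_in_valueset(system: str, code: str) -> bool:
    """Cached membership check: repeat lookups never hit the server."""
    return fetch_validate_code(system, code)
```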

What to do on failure: For required bindings, treat as a reject at ingestion. For extensible violations, log and pass with a quality flag. Create a feedback loop to source system owners when local codes consistently fail to map — the fix is usually a code mapping table in the source system, not a one-time patch.

Referential integrity

Referential integrity validates that references within and across resources point to resources that actually exist and are accessible.

FHIR distinguishes between:

  • Literal references (reference: "Patient/123") — can be resolved by fetching the target
  • Logical references (identifier: { system: "...", value: "..." }) — resolved by lookup, not by ID
  • Contained resources — the target is embedded in the resource, no external fetch needed

// Observation with a reference to a Patient
{
  "resourceType": "Observation",
  "subject": { "reference": "Patient/456" }
}

Referential integrity validation asks: does Patient/456 exist on the FHIR server? This requires a network call and can’t be done purely offline.

Where it matters most:

  • Transaction Bundles: all references must resolve within the Bundle or to existing server resources
  • Batch ingestion: a resource that references a Patient not yet ingested will fail; dependency ordering matters
  • Cross-server references: references to resources on external servers may be intentional (logical references) or errors

What to do on failure: For transaction Bundles, reject the entire Bundle with an explanation. For batch ingestion, hold the dependent resource in a “pending” queue until its prerequisite arrives, then reprocess. Implement a maximum retry count and a dead-letter queue for resources whose dependencies never arrive.
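
The pending-queue-with-dead-letter pattern for batch ingestion might look like this; the in-memory store, queues, and retry limit are placeholders for real infrastructure.

```python
MAX_RETRIES = 3

store = {"Patient/123"}          # resource IDs already in the FHIR store
pending: list = []               # resources whose prerequisite hasn't arrived
dead_letter: list = []           # retries exhausted; needs manual review

def ingest_observation(obs: dict) -> None:
    """Store the Observation if its subject exists, else hold or dead-letter it."""
    ref = obs["subject"]["reference"]
    if ref in store:
        store.add(f"Observation/{obs['id']}")        # prerequisite resolved
    else:
        obs["_retries"] = obs.get("_retries", 0) + 1
        (dead_letter if obs["_retries"] > MAX_RETRIES else pending).append(obs)

def reprocess_pending() -> None:
    """Drain the pending queue and retry each held resource."""
    batch, pending[:] = list(pending), []
    for obs in batch:
        ingest_observation(obs)
```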

Validation in the pipeline

The three-gate model — pre-migration, ingestion, pre-publish — is the practical structure for most FHIR integration pipelines. Each gate has a different cost tolerance and a different failure response.

Source data
    ↓
Pre-Migration Cleanup     ← cheap checks; fix what you can before converting
    ↓
FHIR Conversion
    ↓
Ingestion Gate            ← strict gate; reject non-conforming resources
    ↓
Internal FHIR store
    ↓
Pre-Publish Gate          ← final gate before exposing to API consumers
    ↓
FHIR API surface

Pre-migration cleanup

Before converting legacy data to FHIR, run lightweight quality checks against the source:

  • Completeness scan: are required source fields present? (e.g., does every lab result have a patient ID?)
  • Format normalization: dates in consistent format, coded fields trimmed, delimiters consistent
  • Known code mapping coverage: what percentage of source codes have a FHIR mapping? What’s missing?
  • Duplicate detection: are there duplicate patient or encounter records that need deduplication before conversion?

This is the cheapest place to find and fix problems. Fixing a missing field in a source CSV row costs a fraction of debugging why an Observation failed validation after conversion, ingestion, and downstream processing.

The output of pre-migration cleanup is not a validation report — it’s a fix list and a coverage map. “These 230 lab codes have no LOINC mapping; here is the list” is actionable. “300 records failed validation” is noise until it’s triaged.
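
A coverage map is easy to compute from the source extract. A minimal sketch, with an invented mapping table and code list:

```python
def coverage_report(source_codes: list, mapping: dict) -> dict:
    """Return mapping coverage plus the actionable list of unmapped codes."""
    unmapped = sorted({c for c in source_codes if c not in mapping})
    covered = sum(c in mapping for c in source_codes)
    return {"coverage": covered / len(source_codes), "unmapped": unmapped}

# Example: two of three distinct local lab codes have a LOINC mapping.
loinc_map = {"GLU": "2345-7", "HGB": "718-7"}
report = coverage_report(["GLU", "HGB", "XYZ", "GLU"], loinc_map)
```

The `unmapped` list, not the percentage, is the deliverable: it names exactly which source codes need a mapping before conversion.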

Ingestion gate

The ingestion gate is the mandatory checkpoint before writing resources to the FHIR store. Run all four validation layers here:

  1. Structural — synchronous, fail immediately on error
  2. Profile — synchronous, fail on error severity; log warning severity
  3. Terminology — may be asynchronous if using a terminology server; fail on required binding violations
  4. Referential integrity — check within-Bundle references synchronously; cross-resource references may be deferred
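
The ordering and short-circuit behavior can be expressed as a small orchestrator. The checker functions here are placeholders that return lists of issue dicts; only the structural short-circuit logic is the point.

```python
def run_ingestion_gate(resource: dict, checkers: list) -> list:
    """Run ordered (layer_name, check_fn) pairs; collect all issues found.

    A structural failure stops the run, since later layers cannot
    meaningfully evaluate a malformed resource.
    """
    issues = []
    for layer, check in checkers:
        found = check(resource)
        issues.extend({"layer": layer, **issue} for issue in found)
        if layer == "structural" and found:
            break
    return issues
```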

The ingestion gate should produce a structured outcome for every resource:

{
  "resourceType": "Bundle",
  "type": "transaction-response",
  "entry": [
    {
      "response": {
        "status": "422 Unprocessable Entity",
        "outcome": {
          "resourceType": "OperationOutcome",
          "issue": [/* validation issues */]
        }
      }
    }
  ]
}

Partial batch handling: Decide up front whether your ingestion gate is atomic (fail the entire batch if any resource fails) or per-resource (accept valid resources, reject invalid ones). Atomic batches are simpler to reason about but discard valid data when a few records fail. Per-resource handling keeps good data flowing but requires careful quarantine and reprocessing logic.

Feedback loop: Ingestion failures are only useful if the source system gets actionable feedback. Build a reporting channel — at minimum, a daily digest of failure counts by issue type and profile, ideally a near-real-time notification for high-failure-rate feeds.

Pre-publish gate

The pre-publish gate runs before data is exposed to downstream API consumers. Its job is different from the ingestion gate — it’s not about catching malformed input, it’s about ensuring the data the API returns meets the quality bar your consumers expect.

Pre-publish checks include:

  • Completeness for the API use case: A Patient record with no name might be stored (with a flag), but should it be returned via the Patient Access API? Decide your policy.
  • Consistency: Cross-resource consistency checks that can’t be done at ingestion time (e.g., an Encounter that references a Practitioner no longer in the system)
  • Freshness: Data older than a threshold for time-sensitive resources (e.g., Coverage data that hasn’t been refreshed in 90 days)
  • Redaction: PHI elements that should be suppressed for specific consumer roles or access scopes

The pre-publish gate may be implemented as a query-time filter (dynamic, evaluated per-request) or as a pre-computed flag written to the resource (static, evaluated at ingestion and cached). Query-time filters are more accurate but add latency; pre-computed flags are faster but can become stale.
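
A pre-computed publishability flag might be derived like this. The field names, the 90-day freshness window, and the no-name policy are example choices for illustration, not rules mandated by any FHIR API.

```python
from datetime import date, timedelta

FRESHNESS_WINDOW = timedelta(days=90)    # example policy threshold

def publishable(patient: dict, today: date) -> bool:
    """Completeness and freshness checks for a hypothetical Patient record."""
    if not patient.get("name"):                      # completeness policy
        return False
    refreshed = date.fromisoformat(patient["lastRefreshed"])  # invented field
    return today - refreshed <= FRESHNESS_WINDOW     # freshness policy
```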

Reporting and feedback loops

Validation produces OperationOutcome resources for individual failures, but pipeline health requires aggregated metrics. Build a quality dashboard that tracks, at minimum:

| Metric | Why it matters |
| --- | --- |
| Ingestion rejection rate by source feed | Identifies degrading source systems early |
| Validation failure breakdown by issue code | Reveals systematic problems (e.g., a specific code never maps correctly) |
| Terminology lookup cache hit rate | Signals when terminology content needs refresh |
| Quarantine queue depth and age | Dead-letter backlog indicates unreprocessable records needing manual review |
| Profile conformance rate by resource type | Tracks conformance quality over time |

Feedback to source systems: The quality loop that produces improvements is: validate → categorize failures → route to source system owner with specific, actionable error details → measure improvement. Without routing failures back to owners, validation is just noise that accumulates in a log file.

Structured OperationOutcome responses make this tractable — you can parse issue codes and generate per-source-system summaries automatically. Free-text error messages make it a manual reporting exercise.
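
Per-source summaries fall out of a simple aggregation over (source, OperationOutcome) pairs. The record shape below is our own convention; the issue codes follow the FHIR OperationOutcome resource.

```python
from collections import Counter

def failure_summary(records: list) -> dict:
    """Count OperationOutcome issue codes per source feed."""
    summary: dict = {}
    for source, outcome in records:
        counts = summary.setdefault(source, Counter())
        counts.update(issue["code"] for issue in outcome.get("issue", []))
    return summary
```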

QA checklist

Before declaring a FHIR pipeline production-ready:

  • Structural validation enabled and tested against intentionally malformed payloads
  • Profile validation running with correct IG packages loaded (including extensions and value sets)
  • Terminology server or pre-expanded value sets available; cache warm-up tested
  • All required and extensible binding violations reviewed and dispositioned
  • Referential integrity checks defined for all reference elements in scope
  • Ingestion gate configured with explicit pass/fail/quarantine behavior
  • OperationOutcome responses structured and parseable (not just logged as strings)
  • Feedback channel to source system owners operational
  • Pre-publish gate reviewed for use-case completeness and redaction requirements
  • Validation failure reporting dashboard accessible to pipeline operators
  • Dead-letter queue and reprocessing workflow documented and tested

See also

  • FHIR Profiling: how FHIR profiles and validators work
  • FHIR Terminology: terminology system mechanics in FHIR

Section: interop · Content type: pattern · Audience: technical · Interoperability level: Semantic
Published: 07/09/2022 · Modified: 22/12/2025 · 16 min read
Keywords: data quality, validation, OperationOutcome, interoperability