
Data Flows

What Is a Data Flow in ArchRepo?

A Data Flow describes how data moves between systems, services, and data stores in the solution — what data travels, through which technical components, how frequently, and at what volume. Data Flows sit in the Data concern and provide the primary mechanism for documenting integration data paths from end to end.

Examples of Data Flows:

  • “Supplier invoice data received from EDI gateway and loaded into the ERP”
  • “Customer order details replicated from the Order Management System to the Data Warehouse nightly”
  • “Real-time product availability updates pushed from the inventory service to the API gateway”
  • “Legacy customer records migrated from the old CRM to the new platform during cutover”

Data Flows are referenced using the prefix DF: DF-1, DF-2, and so on.


Data Mapping

Each Data Flow has a Data Mapping tab where source-to-target field mappings can be recorded. The data mapping captures:

  • The source field name, data type, and any transformation applied
  • The target field name and data type
  • Any filtering, routing, or conditional logic that applies

This is distinct from the Description field, which provides the narrative overview of what the flow does. The data mapping is the technical specification that integration developers use to implement the flow. Populating it early — even at a high level — helps surface data quality issues, missing fields, and transformation complexity before development begins.
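ArchRepo records mappings in the Data Mapping tab itself, but the shape of one mapping row can be sketched in code. The following Python dataclass is illustrative only: the class name, field names, and the example values (a hypothetical supplier-invoice flow) are assumptions, not part of the ArchRepo data model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FieldMapping:
    """One row of a Data Flow's data mapping (illustrative structure, not ArchRepo's schema)."""
    source_field: str                      # field name in the source system
    source_type: str                       # e.g. "DECIMAL(12,2)"
    target_field: str                      # field name in the target system
    target_type: str                       # e.g. "Decimal"
    transformation: Optional[str] = None   # transformation applied in transit, if any
    condition: Optional[str] = None        # filtering/routing/conditional logic, if any

# Hypothetical row for a supplier-invoice flow:
row = FieldMapping(
    source_field="INV_AMT",
    source_type="DECIMAL(12,2)",
    target_field="invoice_amount",
    target_type="Decimal",
    transformation="round to 2 decimal places",
    condition="only rows where INV_STATUS = 'APPROVED'",
)
```

Capturing even this much per field is often enough to surface type mismatches and missing transformation rules before development starts.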


Sizing and Frequency

Two dedicated fields capture the capacity characteristics of each Data Flow:

  • Sizing: the volume of data involved — number of records, file sizes, payload sizes. Used for capacity planning, infrastructure sizing, and NFR definition.
  • Frequency: how often the flow occurs — real-time, event-driven, scheduled batch, or on-demand. Include the expected interval for scheduled flows (e.g. “nightly at 02:00”, “every 15 minutes during business hours”).

Accurate sizing and frequency information feeds directly into non-functional requirement definitions (throughput, latency, storage) and enables the team to estimate the infrastructure and performance requirements for each flow.
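As a worked example of how Sizing and Frequency combine into a throughput NFR, the arithmetic below uses entirely hypothetical numbers: a nightly batch flow of 50,000 records averaging 2 KB each, with a 30-minute processing window.

```python
# Hypothetical Sizing and Frequency values for a nightly batch flow.
records_per_run = 50_000
bytes_per_record = 2 * 1024   # 2 KB average payload per record
runs_per_day = 1              # "nightly at 02:00"

# Daily data volume, for storage and network capacity planning.
daily_volume_mb = records_per_run * bytes_per_record * runs_per_day / 1024**2
print(f"Daily volume: {daily_volume_mb:.1f} MB")        # ~97.7 MB

# Sustained rate needed to finish within a 30-minute batch window.
window_seconds = 30 * 60
records_per_second = records_per_run / window_seconds
print(f"Required rate: {records_per_second:.0f} records/s")  # ~28 records/s
```

The same calculation inverted (given a throughput ceiling, how long does the run take?) helps validate whether a proposed batch window is realistic.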


Traceability

Data Flows are connected to both the business layer and the technical layer:

Business layer — Data Flows support requirements, processes, outcomes, scenarios, and business rules, providing traceability from what the business needs to how the data moves to meet those needs.

Technical layer — Data Flows link to the APIs, streams, services, data stores, data sets, and technologies that participate in or implement the flow, providing a complete picture of the technical components involved.


Quality Attributes

Each Data Flow can be linked to the quality attribute mechanisms that govern it:

  • Availability — the availability patterns applied to ensure the flow is reliable
  • Recoverability — the recovery mechanisms in place if the flow fails
  • Performance — the performance targets and mechanisms for the flow
  • Security — the security controls applied to data in transit (encryption, access control, audit)
  • Deployability — the deployment patterns relevant to this flow

Linking quality attributes to Data Flows makes it explicit which non-functional concerns apply to each integration path and which mechanisms address them.


Transition States

Data Flows support Transition States, which are useful for documenting how integration data paths change as the solution evolves:

  • An existing flow may be replaced by a new direct API integration
  • A batch flow may become real-time
  • A legacy flow may be retired once migration is complete

Use Transition States to make the change from As-Is to To-Be explicit in the data architecture.


Categories

Data Flows can be assigned to categories to group them by integration domain or theme. The Data Flows by Category view organises the register by category, making it easier to review all flows within a particular domain (e.g. Finance, Logistics, Customer Data).


Fields Reference

See Data Flow Fields for a description of each field and guidance on what to record.
