Data Stores
What Is a Data Store in ArchRepo?
A Data Store is any physical or logical location where data is held in the solution. Data Stores sit in the Data concern and represent the physical layer of the data architecture — they are the containers that hold the Data Sets the solution creates, reads, updates, and deletes.
Examples of Data Stores:
- “Customer CRM database — PostgreSQL instance hosted in Azure”
- “Invoice archive — Cloudflare R2 object storage bucket”
- “Session cache — Redis cluster”
- “Data warehouse — Azure Data Lake Gen2 with Delta format”
- “Document store — MongoDB Atlas cluster”
- “Shared file store — Azure Blob Storage for inbound file drops”
Data Stores are referenced using the prefix DS- — DS-1, DS-2, and so on.
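As a minimal sketch of how the DS- prefix could be handled in tooling outside ArchRepo, the snippet below parses a reference into its number. The exact format (the prefix followed by a positive integer with no leading zeros) and the function name `parse_data_store_ref` are assumptions made for illustration, not part of ArchRepo.

```python
import re

# Assumed format: "DS-" followed by a positive integer with no leading zeros.
DATA_STORE_REF = re.compile(r"DS-([1-9][0-9]*)")


def parse_data_store_ref(ref: str) -> int:
    """Return the numeric part of a Data Store reference such as 'DS-2'."""
    match = DATA_STORE_REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a Data Store reference: {ref!r}")
    return int(match.group(1))


print(parse_data_store_ref("DS-2"))
```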
What Counts as a Data Store?
Any persistent location that holds data in the solution is a Data Store. The key question is: “Does this thing store data that the solution reads from or writes to?”
Common Data Store types:
| Type | Examples |
|---|---|
| Relational database | PostgreSQL, SQL Server, MySQL, Oracle |
| Document database | MongoDB, Cosmos DB, Firestore |
| Object / blob storage | Amazon S3, Cloudflare R2, Azure Blob Storage |
| Cache | Redis, Memcached, ElastiCache |
| Data warehouse / data lake | Snowflake, Azure Data Lake, BigQuery, Delta Lake |
| File system / file share | Network share, SFTP drop zone, local file system |
| Search index | Elasticsearch, Azure AI Search (formerly Azure Cognitive Search) |
| Graph database | Neo4j, Amazon Neptune |
| Message queue as store | A queue or topic used for durable message retention |
A single physical technology may have multiple Data Stores defined against it — for example, separate Data Stores for the production database and the archive database running on the same PostgreSQL instance.
Data Sets
A Data Store contains Data Sets — the logical, scoped collections of data that it holds. This is the logical-to-physical mapping in the data architecture:
- Data Sets define what the data is — its structure, entities, and attributes
- Data Stores define where it lives — the physical or logical store that holds it
The Data Stores v Data Sets collection view provides a completeness check, ensuring that every Data Set has been assigned to a Data Store and that every Data Store has its contents documented.
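The completeness check described above can be sketched in a few lines. This is an illustration only, not ArchRepo's implementation: the `DataStore` class, the `completeness_gaps` function, and the sample Data Set names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataStore:
    ref: str    # e.g. "DS-1"
    name: str
    data_sets: list[str] = field(default_factory=list)  # names of the Data Sets it holds


def completeness_gaps(
    data_sets: list[str], stores: list[DataStore]
) -> tuple[list[str], list[str]]:
    """Return (Data Sets assigned to no store, refs of stores with no Data Sets)."""
    assigned = {ds for store in stores for ds in store.data_sets}
    unassigned_sets = [ds for ds in data_sets if ds not in assigned]
    empty_stores = [store.ref for store in stores if not store.data_sets]
    return unassigned_sets, empty_stores


# Hypothetical sample data.
stores = [
    DataStore("DS-1", "Customer CRM database", ["Customer", "Contact"]),
    DataStore("DS-2", "Invoice archive"),
]
gaps = completeness_gaps(["Customer", "Contact", "Invoice"], stores)
print(gaps)  # (['Invoice'], ['DS-2'])
```

Here the "Invoice" Data Set has no home and DS-2 has no documented contents, which are the two kinds of gap the collection view is meant to surface.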
Streams
A Data Store can be connected to event streams or message queues in two directions:
| Relationship | What it means |
|---|---|
| Generates Stream | Data changes in this store trigger events published to a stream — for example, a change data capture (CDC) feed from a database |
| Receives Stream | This store is populated by events arriving from a stream — for example, a stream consumer that writes events to a data lake |
These relationships connect the persistent storage layer to the event-driven architecture of the solution.
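One way to picture the two relationships is as typed links between a Data Store and a stream. The sketch below is a hypothetical model, not ArchRepo's data structure: `StreamLink`, `stores_bridged_by`, the stream name, and the store references are assumptions. It pairs each store that generates a stream with each store that receives it, which traces how data moves from one store to another through the event layer.

```python
from dataclasses import dataclass
from enum import Enum


class StreamRelationship(Enum):
    GENERATES = "Generates Stream"
    RECEIVES = "Receives Stream"


@dataclass(frozen=True)
class StreamLink:
    store_ref: str                    # e.g. "DS-1"
    relationship: StreamRelationship
    stream: str                       # hypothetical stream name


def stores_bridged_by(stream: str, links: list[StreamLink]) -> list[tuple[str, str]]:
    """Pair every store that generates `stream` with every store that receives it."""
    producers = [
        link.store_ref for link in links
        if link.stream == stream and link.relationship is StreamRelationship.GENERATES
    ]
    consumers = [
        link.store_ref for link in links
        if link.stream == stream and link.relationship is StreamRelationship.RECEIVES
    ]
    return [(p, c) for p in producers for c in consumers]


# Hypothetical example: a CDC feed from a database (DS-1) populating a data lake (DS-4).
links = [
    StreamLink("DS-1", StreamRelationship.GENERATES, "customer-changes"),
    StreamLink("DS-4", StreamRelationship.RECEIVES, "customer-changes"),
]
print(stores_bridged_by("customer-changes", links))  # [('DS-1', 'DS-4')]
```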
Technology
The Uses Technology relationship links a Data Store to the technology that implements it. The Data Stores v Technology collection view shows the complete technology footprint of the data architecture — which stores rely on which technologies — and is useful for technology rationalisation reviews and vendor dependency analysis.
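As a rough illustration of the footprint idea, the sketch below groups Data Stores by the technology they use. The (store, technology) pairs and the `technology_footprint` function are hypothetical and do not reflect how ArchRepo stores or renders the view.

```python
from collections import defaultdict

# Hypothetical "Uses Technology" relationships as (Data Store ref, technology) pairs.
uses_technology = [
    ("DS-1", "PostgreSQL"),
    ("DS-2", "PostgreSQL"),
    ("DS-3", "Redis"),
]


def technology_footprint(pairs: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group Data Store refs by the technology that implements them."""
    footprint: dict[str, list[str]] = defaultdict(list)
    for store_ref, technology in pairs:
        footprint[technology].append(store_ref)
    return dict(footprint)


footprint = technology_footprint(uses_technology)
print(footprint)  # {'PostgreSQL': ['DS-1', 'DS-2'], 'Redis': ['DS-3']}
```

A technology with many dependent stores is a likely focus for vendor dependency analysis, while a technology used by a single store may be a candidate for rationalisation.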
Quality Attributes
Data Stores are a critical focus for quality attribute design. Each Data Store can be linked to the mechanisms that govern it:
| Quality Attribute | Key considerations for Data Stores |
|---|---|
| Availability | Uptime targets; failover configuration; replication |
| Scalability | Horizontal/vertical scaling strategy; partitioning |
| Recoverability | Backup schedule; RPO/RTO targets; point-in-time recovery |
| Performance | Query performance; indexing strategy; throughput targets |
| Security | Encryption at rest; access control; audit logging |
| Deployability | Infrastructure-as-code; environment promotion strategy |
| Observability | Monitoring, alerting, and logging for the store |
Transition States
Data Stores support Transition States, which are useful for documenting how the physical storage layer changes as the solution evolves. Examples include a legacy on-premises database being replaced by a cloud-hosted equivalent, or a file-based store being superseded by a purpose-built data lake.
Fields Reference
See Data Store Fields for a description of each field and guidance on what to record.