
Storage Architecture

DgiDgi uses a layered storage architecture optimized for multi-tenant SaaS, combining relational data in PostgreSQL with object storage in Cloudflare R2 and Supabase Storage.

Storage Overview

Storage Layers

| Layer | Technology | Purpose |
| --- | --- | --- |
| Relational Data | PostgreSQL (Supabase) | Structured data, relationships, queries |
| Tenant Object Storage | Cloudflare R2 | Files, attachments, artifacts (regional) |
| Platform Knowledge | Supabase Storage | Assurance knowledge base |
| Cache | Redis (Upstash) | Session data, job queues |
| Edge Cache | Cloudflare KV | Rate limits, config caching |

Chat Media & Attachments

Chat sessions can include file attachments (images, documents, etc.) that are stored in regional R2 buckets with metadata tracked in PostgreSQL.

Storage Schema

File Types

| Type | Description | Storage Location |
| --- | --- | --- |
| attachment | General file uploads | sessions/{id}/attachments/ |
| image | Image files (PNG, JPG, etc.) | sessions/{id}/images/ |
| audio | Audio files (voice messages) | sessions/{id}/audio/ |
| video | Video files | sessions/{id}/video/ |
| document | Documents (PDF, etc.) | sessions/{id}/documents/ |
| artifact | Generated outputs | projects/{id}/artifacts/ |
| export | Export files | projects/{id}/exports/ |
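
The mapping from file type to storage location can be expressed as a small lookup. The following is a minimal sketch rather than the project's actual code: `FileType`, `LOCATIONS`, and `storagePrefix` are hypothetical names, and it assumes `{id}` is a session ID for session-scoped types and a project ID for `artifact` and `export`.

```typescript
// Hypothetical names: FileType, LOCATIONS and storagePrefix are illustrative only.
type FileType =
  | "attachment" | "image" | "audio" | "video" | "document"
  | "artifact" | "export";

// Scope and folder for each file type, copied from the File Types table.
const LOCATIONS: Record<FileType, { scope: "sessions" | "projects"; folder: string }> = {
  attachment: { scope: "sessions", folder: "attachments" },
  image: { scope: "sessions", folder: "images" },
  audio: { scope: "sessions", folder: "audio" },
  video: { scope: "sessions", folder: "video" },
  document: { scope: "sessions", folder: "documents" },
  artifact: { scope: "projects", folder: "artifacts" },
  export: { scope: "projects", folder: "exports" },
};

// Returns the storage location prefix for a file type and its owning session or project.
function storagePrefix(type: FileType, id: string): string {
  const { scope, folder } = LOCATIONS[type];
  return `${scope}/${id}/${folder}/`;
}
```

For example, `storagePrefix("image", "s1")` yields `sessions/s1/images/`.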

Lifecycle Policies

Automatic cleanup policies manage storage retention:

| Policy | File Types | Age | Action |
| --- | --- | --- | --- |
| Temp Cleanup | temp, cache | 7 days | Delete |
| Export Cleanup | export | 30 days | Delete |
| Deleted Cleanup | Soft-deleted | 30 days | Purge |
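
To make the policy table concrete, the sketch below evaluates which action would apply to a single file. It is an illustration under stated assumptions, not the actual cleanup job: the `StoredFile` shape and `lifecycleAction` function are hypothetical, thresholds are treated as inclusive, and the age of a soft-deleted file is assumed to be measured from its deletion time.

```typescript
// Hypothetical shapes; the thresholds mirror the Lifecycle Policies table.
type LifecycleAction = "delete" | "purge" | "keep";

interface StoredFile {
  fileType: string;   // e.g. "temp", "cache", "export", "image"
  createdAt: Date;
  deletedAt?: Date;   // set when the file has been soft-deleted
}

const DAY_MS = 24 * 60 * 60 * 1000;

function ageInDays(from: Date, now: Date): number {
  return (now.getTime() - from.getTime()) / DAY_MS;
}

function lifecycleAction(file: StoredFile, now: Date): LifecycleAction {
  // Deleted Cleanup: purge soft-deleted files 30 days after deletion (assumed reference point).
  if (file.deletedAt !== undefined) {
    return ageInDays(file.deletedAt, now) >= 30 ? "purge" : "keep";
  }
  const age = ageInDays(file.createdAt, now);
  // Temp Cleanup: temp and cache files are deleted after 7 days.
  if ((file.fileType === "temp" || file.fileType === "cache") && age >= 7) return "delete";
  // Export Cleanup: exports are deleted after 30 days.
  if (file.fileType === "export" && age >= 30) return "delete";
  return "keep";
}
```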

Regional Storage Buckets

DgiDgi maintains one storage bucket per region for data residency compliance. The bucket names and the regulations they map to are listed under Environment Configuration.

Platform Knowledge Storage

The assurance knowledge base is stored in a dedicated Supabase Storage bucket:

Assurance Bucket: dgidgi-one-assurance

| Setting | Value |
| --- | --- |
| Access | Private (authenticated only) |
| MIME Types | application/json, text/plain |
| Max File Size | 50MB |
| Encryption | AES-256 |
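
A client or seeding script could pre-check files against these bucket settings before uploading. This is a minimal sketch: `validateAssuranceUpload` is a hypothetical helper, and it assumes "50MB" means 50 × 1024 × 1024 bytes. The bucket itself remains the authority on what is accepted.

```typescript
// Values taken from the bucket settings table; the helper name is hypothetical.
const ALLOWED_MIME_TYPES = ["application/json", "text/plain"];
const MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024; // assumes 50MB is read as 50 MiB

// Returns a list of problems; an empty list means the file passes the pre-check.
function validateAssuranceUpload(mimeType: string, sizeBytes: number): string[] {
  const errors: string[] = [];
  // Drop parameters such as "; charset=utf-8" before comparing.
  const baseType = (mimeType.split(";")[0] ?? "").trim().toLowerCase();
  if (!ALLOWED_MIME_TYPES.includes(baseType)) {
    errors.push(`MIME type not allowed: ${baseType}`);
  }
  if (sizeBytes > MAX_FILE_SIZE_BYTES) {
    errors.push(`File exceeds ${MAX_FILE_SIZE_BYTES} bytes`);
  }
  return errors;
}
```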

Content Structure

dgidgi-one-assurance/
├── assurance/
│   ├── frameworks/
│   │   ├── security/index.json (OWASP, NIST, CIS)
│   │   ├── compliance/index.json (SOC2, ISO 27001, PCI-DSS)
│   │   ├── privacy/index.json (GDPR, CCPA, LGPD)
│   │   └── ...
│   ├── controls/
│   │   ├── soc2.json
│   │   ├── iso27001.json
│   │   └── ...
│   ├── security/
│   │   ├── owasp/top10-2021.json
│   │   ├── nist/800-53.json
│   │   └── ...
│   ├── policies/
│   │   └── security-policies.json
│   └── standards/
│       └── coding-standards.json

Access Control

| Route | Access | Purpose |
| --- | --- | --- |
| GET /api/v1/assurance/* | Authenticated users | Read knowledge |
| POST /api/v1/platform/assurance/* | Platform admin only | Manage knowledge |
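
The two access rules can be read as a deny-by-default route check. The sketch below is illustrative only: the `Role` values and the `canAccess` function are hypothetical, and it assumes a platform admin also counts as an authenticated user for read access.

```typescript
// Hypothetical role names; null represents an unauthenticated caller.
type Role = "user" | "platform_admin";

interface AccessRule {
  method: string;
  prefix: string;
  allowed: Role[];
}

// Rules mirror the Access Control table.
const ACCESS_RULES: AccessRule[] = [
  { method: "GET", prefix: "/api/v1/assurance/", allowed: ["user", "platform_admin"] },
  { method: "POST", prefix: "/api/v1/platform/assurance/", allowed: ["platform_admin"] },
];

// Deny by default: access is granted only when a rule matches and lists the caller's role.
function canAccess(method: string, path: string, role: Role | null): boolean {
  if (role === null) return false;
  const rule = ACCESS_RULES.find(
    (r) => r.method === method.toUpperCase() && path.startsWith(r.prefix),
  );
  return rule !== undefined && rule.allowed.includes(role);
}
```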

Assurance Tenant Schema

Platform-level scans, evidence, and subscriptions are stored in PostgreSQL to keep tenant visibility separate from the static knowledge in the bucket. The definitions live inside packages/shared/src/assurance/tenant-data.schema.ts so the service layer and database migrations stay in sync with the documentation below:

  • assurance_traceability – records how each tenant implements a given control (framework/control IDs, implementation status/type, evidence refs, test dates, remediation plans, owners). Indexed by tenant and control for fast lookups.
  • assurance_assessments – stores assessment metadata (type, name, status, scores, results/responses, findings, recommendations) so you can run readiness checks or gap analyses per tenant.
  • assurance_findings – captures individual findings from any security/compliance scan (severity, category, title, file/line, recommendation, evidence path, resolved flag). Runs are joined via run_id and the table is filtered by severity/status indexes for dashboards.
  • assurance_evidence – keeps metadata for evidence artifacts (type, name, MIME, storage key, collected/expiry dates, collected by, related framework/control IDs, validity flags) so the UI can link findings to actual proof documents.
  • assurance_subscriptions – lets tenants opt in to regulatory updates (framework, notification preferences, delivery channels, recipients). A uniqueness constraint prevents duplicate subscription rows for the same framework and type.
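
As an illustration of how a dashboard might consume assurance_findings, the sketch below counts unresolved findings per severity for one run. Everything here is an assumption for illustration: the field names are camelCase renderings of the columns listed above, the four severity levels are hypothetical, and the real definitions in tenant-data.schema.ts may differ.

```typescript
// Hypothetical severity levels; the actual enum may differ.
type Severity = "critical" | "high" | "medium" | "low";

// Assumed row shape based on the columns described for assurance_findings.
interface AssuranceFinding {
  runId: string;
  severity: Severity;
  category: string;
  title: string;
  file?: string;
  line?: number;
  recommendation: string;
  evidencePath?: string;
  resolved: boolean;
}

// Counts unresolved findings per severity for a single run.
function openFindingsBySeverity(
  findings: AssuranceFinding[],
  runId: string,
): Record<Severity, number> {
  const counts: Record<Severity, number> = { critical: 0, high: 0, medium: 0, low: 0 };
  for (const finding of findings) {
    if (finding.runId === runId && !finding.resolved) {
      counts[finding.severity] += 1;
    }
  }
  return counts;
}
```

In production this aggregation would more likely run in SQL against the severity/status indexes; the in-memory version only shows the shape of the result.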

Relational Data (PostgreSQL)

All structured data is stored in PostgreSQL with Row-Level Security (RLS) for tenant isolation.

Key Tables by Domain

Row-Level Security

All tables enforce tenant isolation through RLS policies:

-- Example RLS policy (simplified)
CREATE POLICY tenant_isolation ON projects
USING (tenant_id = current_setting('app.tenant_id')::uuid);
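
For the policy above to work, each request has to set `app.tenant_id` before it queries tenant tables. One way to do this is PostgreSQL's `set_config` function with `is_local = true`, which scopes the value to the current transaction. The sketch below assumes a database client exposing a `query(text, values)` method (node-postgres clients have this shape); `Queryable` and `withTenant` are hypothetical names, not part of the DgiDgi codebase.

```typescript
// Minimal structural type for a database client (assumed shape).
interface Queryable {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Runs `work` inside a transaction with app.tenant_id set for that transaction only.
async function withTenant<T>(
  db: Queryable,
  tenantId: string,
  work: (db: Queryable) => Promise<T>,
): Promise<T> {
  await db.query("BEGIN");
  try {
    // The third argument (is_local = true) limits the setting to this transaction.
    await db.query("SELECT set_config('app.tenant_id', $1, true)", [tenantId]);
    const result = await work(db);
    await db.query("COMMIT");
    return result;
  } catch (err) {
    await db.query("ROLLBACK");
    throw err;
  }
}
```

Note that PostgreSQL does not apply RLS policies to superusers or to roles with the BYPASSRLS attribute, so the connection used here must run as a role that is subject to the policy.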

Object Storage Key Hierarchy

All objects follow a strict tenant-prefixed hierarchy:

tenants/{tenantId}/
├── branding/
│   ├── logo.png
│   └── favicon.ico
├── avatars/
│   └── {userId}.jpg
├── sessions/{sessionId}/
│   ├── attachments/
│   │   └── {fileId}
│   ├── images/
│   │   └── {fileId}
│   └── audio/
│       └── {fileId}
├── projects/{projectId}/
│   ├── exports/
│   │   └── {filename}
│   ├── reports/
│   │   └── {filename}
│   ├── artifacts/
│   │   └── {filename}
│   ├── documents/
│   │   └── {filename}
│   └── runs/{runId}/
│       ├── outputs/
│       │   └── {filename}
│       └── logs/
│           └── {filename}
└── workspaces/{workspaceId}/
    └── {assetType}/
        └── {filename}
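
Because isolation in object storage depends on the tenant prefix, key construction is a natural place to reject path traversal. The following sketch shows one way to build and check keys under this hierarchy. `tenantKey` and `belongsToTenant` are hypothetical helpers, and the validation rules are assumptions rather than the platform's actual checks.

```typescript
// Builds a key under tenants/{tenantId}/, rejecting segments that could escape the prefix.
function tenantKey(tenantId: string, ...segments: string[]): string {
  const parts = [tenantId, ...segments];
  for (const part of parts) {
    const unsafe =
      part === "" || part === "." || part === ".." ||
      part.includes("/") || part.includes("\\");
    if (unsafe) {
      throw new Error(`Invalid key segment: ${JSON.stringify(part)}`);
    }
  }
  return ["tenants", ...parts].join("/");
}

// Checks that an existing key sits under the given tenant's prefix.
// The trailing slash prevents "t1" from matching keys of tenant "t10".
function belongsToTenant(key: string, tenantId: string): boolean {
  return key.startsWith(`tenants/${tenantId}/`);
}
```

For example, `tenantKey("t1", "sessions", "s1", "images", "f1")` produces `tenants/t1/sessions/s1/images/f1`.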

Security Model

Bring Your Own Storage (BYOS)

Enterprise customers can connect their own S3-compatible storage for full data ownership.

Supported Providers

| Provider | Endpoint Format | Notes |
| --- | --- | --- |
| Amazon S3 | https://s3.{region}.amazonaws.com | Standard S3 |
| Cloudflare R2 | https://{account}.r2.cloudflarestorage.com | S3 compatible |
| Google Cloud Storage | https://storage.googleapis.com | Interoperability mode |
| Azure Blob Storage | https://{account}.blob.core.windows.net | S3 connector |
| MinIO | https://your-minio-server:9000 | Self-hosted |
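
The endpoint formats can be filled in programmatically. In this sketch, only the `"s3"` provider identifier appears in the documented configuration; `"r2"`, `"gcs"`, and `"azure"` are hypothetical identifiers, as is the `providerEndpoint` helper. MinIO is left out because its endpoint is whatever URL the self-hosted server uses.

```typescript
// Only "s3" is a documented identifier; the others are assumed for illustration.
type Provider = "s3" | "r2" | "gcs" | "azure";

// Templates copied from the Supported Providers table.
const ENDPOINT_TEMPLATES: Record<Provider, string> = {
  s3: "https://s3.{region}.amazonaws.com",
  r2: "https://{account}.r2.cloudflarestorage.com",
  gcs: "https://storage.googleapis.com",
  azure: "https://{account}.blob.core.windows.net",
};

// Substitutes {placeholders} in the provider's template and fails on missing values.
function providerEndpoint(provider: Provider, values: Record<string, string> = {}): string {
  return ENDPOINT_TEMPLATES[provider].replace(/\{(\w+)\}/g, (_match, name: string) => {
    const value = values[name];
    if (!value) {
      throw new Error(`Missing "${name}" for provider "${provider}"`);
    }
    return value;
  });
}
```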

BYOS Architecture

Configuration

Enable BYOS via API or Settings UI:

PUT /api/v1/regions/storage/{tenantId}
{
  "storageMode": "byos",
  "byosProvider": "s3",
  "byosEndpoint": "https://s3.eu-west-1.amazonaws.com",
  "byosBucket": "my-company-dgidgi-data",
  "byosRegion": "eu-west-1",
  "byosAccessKey": "AKIA...",
  "byosSecretKey": "..."
}
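
Since this payload carries credentials, a client may want to type it, pre-validate it, and redact the secret before logging. The sketch below uses the field names from the request body above. The validation rules and the `validateByosConfig` and `redactForLogging` helpers are assumptions for illustration; the server's own validation is authoritative.

```typescript
// Field names match the documented request body.
interface ByosConfig {
  storageMode: "byos";
  byosProvider: string;
  byosEndpoint: string;
  byosBucket: string;
  byosRegion: string;
  byosAccessKey: string;
  byosSecretKey: string;
}

// Assumed client-side checks; returns a list of problems (empty when none are found).
function validateByosConfig(config: ByosConfig): string[] {
  const errors: string[] = [];
  if (!config.byosEndpoint.startsWith("https://")) {
    errors.push("byosEndpoint must use https");
  }
  if (config.byosBucket.trim() === "") {
    errors.push("byosBucket is required");
  }
  if (config.byosAccessKey === "" || config.byosSecretKey === "") {
    errors.push("byosAccessKey and byosSecretKey are required");
  }
  return errors;
}

// Returns a copy that is safe to log: the secret key is masked.
function redactForLogging(config: ByosConfig): ByosConfig {
  return { ...config, byosSecretKey: "***" };
}
```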

Storage API

The storage API provides secure access to object storage:

Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/v1/storage/upload | POST | Upload file (server-proxied) |
| /api/v1/storage/upload-url | POST | Get signed URL for direct upload |
| /api/v1/storage/signed-url | GET | Get signed URL for download |
| /api/v1/storage/file/* | GET | Stream file content |
| /api/v1/storage/file | DELETE | Delete a file |
| /api/v1/storage/list | GET | List files in prefix |
| /api/v1/storage/status | GET | Check storage backend |

Upload Flow

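  • Option 1, server-proxied upload (POST /api/v1/storage/upload): best for small files under 10MB. Simple and secure.
  • Option 2, direct upload via signed URL (POST /api/v1/storage/upload-url): best for large files over 10MB. Faster, with less server load.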
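
A client can choose between the two upload paths by file size. The sketch below reads the two options as the server-proxied upload and the signed-URL direct upload from the Endpoints table, which is an inference from the descriptions. `planUpload` is a hypothetical helper, "10MB" is taken as 10 × 1024 × 1024 bytes, and a file of exactly that size is sent down the direct path by assumption.

```typescript
// Assumed threshold: 10MB read as 10 MiB.
const DIRECT_UPLOAD_THRESHOLD_BYTES = 10 * 1024 * 1024;

interface UploadPlan {
  strategy: "proxied" | "direct";
  method: "POST";
  endpoint: string;
}

// Picks the upload endpoint based on file size.
function planUpload(sizeBytes: number): UploadPlan {
  if (sizeBytes < DIRECT_UPLOAD_THRESHOLD_BYTES) {
    // Small file: send the bytes through the API server.
    return { strategy: "proxied", method: "POST", endpoint: "/api/v1/storage/upload" };
  }
  // Large file: request a signed URL, then upload straight to object storage.
  return { strategy: "direct", method: "POST", endpoint: "/api/v1/storage/upload-url" };
}
```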

Data Retention

| Data Type | Retention | Notes |
| --- | --- | --- |
| Chat messages | Indefinite | Stored in PostgreSQL |
| Session attachments | 90 days | Configurable per tenant |
| Build artifacts | 30 days | Auto-cleanup job |
| Logs | 7 days | Compressed after 24h |
| Working memory | 24 hours | Short-term agent memory |
| Episodic memory | Configurable | Long-term agent memory |
| Temporary files | 7 days | Auto-deleted |
| Exports | 30 days | Auto-deleted |

Backup & Recovery

PostgreSQL (Supabase)

  • Automatic Backups: Daily snapshots
  • Point-in-Time Recovery: Up to 7 days
  • Replication: Automatic failover

Object Storage (R2)

  • Durability: 99.999999999% (11 nines)
  • Replication: Automatic across regions
  • Versioning: Optional, disabled by default

Assurance Knowledge (Supabase Storage)

  • Durability: 99.999999999%
  • Backups: Supabase managed
  • Recovery: Platform admin can re-seed from source

Environment Configuration

# PostgreSQL (Supabase)
DATABASE_URL=postgresql://...
SUPABASE_URL=https://xxx.supabase.co
SUPABASE_SERVICE_ROLE_KEY=xxx

# Object Storage (Cloudflare R2) - Primary credentials
R2_ACCOUNT_ID=your-account-id
R2_ACCESS_KEY_ID=your-access-key
R2_SECRET_ACCESS_KEY=your-secret-key

# Regional Storage Buckets
R2_BUCKET_US=dgidgi-storage-us # Americas
R2_BUCKET_CA=dgidgi-storage-ca # Canada (PIPEDA)
R2_BUCKET_SA=dgidgi-storage-sa # South America (LGPD)
R2_BUCKET_EU=dgidgi-storage-eu # Europe (GDPR)
R2_BUCKET_ME=dgidgi-storage-me # Middle East (PDPL)
R2_BUCKET_AF=dgidgi-storage-af # Africa (POPIA)
R2_BUCKET_AP=dgidgi-storage-ap # Asia Pacific (PDPA, APPI)
R2_BUCKET_IN=dgidgi-storage-in # India (DPDP)
R2_BUCKET_AU=dgidgi-storage-au # Australia (Privacy Act)

# Cache (Redis/Upstash)
REDIS_URL=redis://...
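
At runtime, a service needs to turn a tenant's region into one of the `R2_BUCKET_*` variables above. The sketch below shows one way to do that. `Region` and `bucketForRegion` are hypothetical names, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Region codes correspond to the suffixes of the R2_BUCKET_* variables.
type Region = "US" | "CA" | "SA" | "EU" | "ME" | "AF" | "AP" | "IN" | "AU";

// Looks up the bucket name for a region and fails loudly if the variable is unset.
function bucketForRegion(
  region: Region,
  env: Record<string, string | undefined>,
): string {
  const variable = `R2_BUCKET_${region}`;
  const bucket = env[variable];
  if (!bucket) {
    throw new Error(`Missing environment variable ${variable}`);
  }
  return bucket;
}
```

In a Node.js service the second argument would typically be `process.env`. Failing on a missing variable, rather than falling back to another region's bucket, avoids silently writing data outside its residency region.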