Data Layer Architecture - Scroll Less Docs

Data Strategy

Scrollless employs an Offline-First strategy: the source of truth is a local database. The data pipeline follows a Raw-to-Summary architecture:

Ingestion: Raw system events (app opens, notifications, scroll deltas) are captured and stored.
Processing: A background engine aggregates these raw logs into daily summaries.
Projection: For real-time features (like limits), a mediator combines persisted DB data with “in-flight” memory data to provide a zero-latency view of usage.
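
The Processing step can be pictured with a minimal sketch. The types below (RawAppEvent, DailyAppUsage) and the summarize function are hypothetical stand-ins, not the app's actual entities; the sketch only shows the general shape of rolling raw session logs up into one row per app per day.

```kotlin
import java.time.Instant
import java.time.ZoneId

// Hypothetical raw event: one foreground session of an app.
data class RawAppEvent(val packageName: String, val startMillis: Long, val endMillis: Long)

// Hypothetical summary row, loosely modeled on the daily_app_usage table.
data class DailyAppUsage(
    val packageName: String,
    val dateString: String,
    val usageMillis: Long,
    val openCount: Int,
)

// Groups raw events by (package, local start date) and sums their durations.
// Sessions that cross midnight are attributed to their start date for brevity.
fun summarize(events: List<RawAppEvent>, zone: ZoneId): List<DailyAppUsage> =
    events
        .groupBy { event ->
            val date = Instant.ofEpochMilli(event.startMillis).atZone(zone).toLocalDate()
            event.packageName to date.toString()
        }
        .map { (key, group) ->
            DailyAppUsage(
                packageName = key.first,
                dateString = key.second,
                usageMillis = group.sumOf { it.endMillis - it.startMillis },
                openCount = group.size,
            )
        }
        .sortedWith(compareBy({ it.dateString }, { it.packageName }))
```

A real background engine would likely process events incrementally rather than re-aggregating the full log; the grouping key is the part that carries over.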

Key Repositories

The data layer is organized into functional repositories that abstract the complexity of the underlying database and services.

Usage & Tracking

ScrollDataRepository: The primary engine. It syncs system events from the Android UsageStatsManager, processes them into DailyAppUsageRecord and DailyDeviceSummary, and handles historical backfills.
UsageRepository: A high-level mediator that uses the UsageProjectionEngine to calculate “Used vs. Remaining” time by combining DB records with live session buffers.
JourneysRepository: Reconstructs a chronological timeline of user behavior (the “User Journey”) by correlating unlock sessions, app switches, and notification triggers.
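
As a rough illustration of how a journey timeline might be reconstructed, the sketch below sorts mixed events by timestamp and groups them under the unlock that preceded them. All type names here (JourneyEvent, Unlock, AppSwitch, NotificationPosted, JourneySegment) are assumptions made for the example; the real JourneysRepository may correlate events quite differently.

```kotlin
// Hypothetical event types; the real entities are not shown in this document.
sealed interface JourneyEvent { val timestampMillis: Long }
data class Unlock(override val timestampMillis: Long) : JourneyEvent
data class AppSwitch(override val timestampMillis: Long, val packageName: String) : JourneyEvent
data class NotificationPosted(override val timestampMillis: Long, val packageName: String) : JourneyEvent

// One unlock plus everything that happened until the next unlock.
data class JourneySegment(val unlockAtMillis: Long, val steps: List<JourneyEvent>)

fun buildJourney(events: List<JourneyEvent>): List<JourneySegment> {
    val open = mutableListOf<Pair<Long, MutableList<JourneyEvent>>>()
    for (event in events.sortedBy { it.timestampMillis }) {
        if (event is Unlock) {
            // Each unlock starts a new segment.
            open.add(event.timestampMillis to mutableListOf<JourneyEvent>())
        } else {
            // Events before the first unlock are dropped in this sketch.
            open.lastOrNull()?.second?.add(event)
        }
    }
    return open.map { (unlockAt, steps) -> JourneySegment(unlockAt, steps.toList()) }
}
```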

Metadata & Categorization

AppMetadataRepository: Manages the list of installed apps, caches high-resolution icons to the local file system, and determines if an app is “user-visible” (e.g., has a launcher icon).
AppCategoryRepository: Interfaces with a remote service to fetch and cache app categories (e.g., “Social”, “Productivity”).

Governance & Limits

LimitsRepository: Manages LimitGroup entities and their associations with specific apps. It handles the logic for “Quick Limits” and “Custom Groups.”
OutcomesRepository: Records the daily “Success” or “Failure” of a limit, tracking how often users snooze or exceed their goals.
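
To make the relationship between limits and outcomes concrete, here is a small sketch under stated assumptions: the field names on LimitGroup, the DailyOutcome shape, and the rule that usage equal to the limit still counts as a success are all hypothetical choices for illustration.

```kotlin
// Hypothetical shapes; field names are assumptions for illustration.
data class LimitGroup(val name: String, val packageNames: Set<String>, val dailyLimitMinutes: Int)

enum class Outcome { SUCCESS, FAILURE }

data class DailyOutcome(
    val groupName: String,
    val usedMinutes: Int,
    val snoozeCount: Int,
    val outcome: Outcome,
)

// Sums usage across the group's apps and compares it with the group's limit.
// Apps outside the group are ignored; apps with no usage count as zero.
fun evaluate(group: LimitGroup, usageMinutesByPackage: Map<String, Int>, snoozeCount: Int): DailyOutcome {
    val used = group.packageNames.map { usageMinutesByPackage[it] ?: 0 }.sum()
    val outcome = if (used <= group.dailyLimitMinutes) Outcome.SUCCESS else Outcome.FAILURE
    return DailyOutcome(group.name, used, snoozeCount, outcome)
}
```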

Infrastructure & Settings

BackupRepository: Handles full database portability. It uses GZIP compression to export and import the entire application state.
SettingsRepository: Manages user preferences and feature flags.

Schema & Performance (Data Engineering)

The database schema is designed for high-frequency writes and complex analytical queries.

Core Tables:
- raw_app_events: High-volume log of every app transition and notification.
- scroll_sessions: Aggregated scroll distances (pixels) per app session.
- daily_app_usage: Summarized usage time and open counts per app, per day.

Indexes: Critical columns like packageName and dateString are indexed to ensure that dashboard queries remain performant even with months of data.
Relationships: Uses a mix of flat tables for performance and @Relation mappings for complex objects like GroupWithApps.

The Projection Model (Real-time Accuracy)

To solve the “Sync Lag” problem (where the DB might be a few seconds behind the actual system state), the UsageProjectionEngine implements a Memory-Buffer strategy:

DB Base: Reads the last known usage from the database.
Pending Buffer: Adds sessions that have finished but haven’t been processed into the daily summary yet.
Live Session: Adds the duration of the currently active app session.
Result: Provides real-time accuracy for the UI, even when the database is a few seconds behind.
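
The three layers above add up to a simple sum. The following is a minimal sketch, not the actual UsageProjectionEngine: UsageSnapshot, Projection, and project are hypothetical names, and clamping the remaining time at zero is an assumption.

```kotlin
// Hypothetical inputs for one app (or limit group) on the current day.
data class UsageSnapshot(
    val dbBaseMillis: Long,                // last known usage from the database
    val pendingSessionMillis: List<Long>,  // finished sessions not yet summarized
    val liveSessionStartMillis: Long?,     // null when the app is not in the foreground
)

data class Projection(val usedMillis: Long, val remainingMillis: Long)

// used = DB base + pending buffer + live session; remaining never goes below zero.
fun project(snapshot: UsageSnapshot, limitMillis: Long, nowMillis: Long): Projection {
    val pending = snapshot.pendingSessionMillis.sum()
    val live = snapshot.liveSessionStartMillis?.let { (nowMillis - it).coerceAtLeast(0L) } ?: 0L
    val used = snapshot.dbBaseMillis + pending + live
    return Projection(used, (limitMillis - used).coerceAtLeast(0L))
}
```

Because nothing here touches disk, a function of this shape can be re-evaluated on every UI tick without waiting for the next database sync.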

Storage & Caching

Database: Room (AppDatabase) with automatic migrations.
File Storage: App icons are stored as .png files in the internal app_icons directory to avoid repeated overhead for binary data.
Network Cache: App categories are cached indefinitely once fetched, with a retry mechanism for “Pending” or “Uncategorized” apps.
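
The category caching rule can be sketched as follows. CategoryCache and its fetch parameter are hypothetical; the lambda stands in for the remote category service, and treating a failed fetch as "Pending" is an assumption made for this example.

```kotlin
// Hypothetical cache; the fetch lambda stands in for the remote category service
// and returns null when the service has no answer yet.
class CategoryCache(private val fetch: (String) -> String?) {
    private val cached = mutableMapOf<String, String>()

    // Returns a cached category without refetching, unless the stored value is a
    // placeholder ("Pending" or "Uncategorized"), in which case it retries.
    fun categoryFor(packageName: String): String {
        val existing = cached[packageName]
        if (existing != null && existing !in RETRYABLE) return existing
        val fetched = fetch(packageName) ?: existing ?: "Pending"
        cached[packageName] = fetched
        return fetched
    }

    private companion object {
        val RETRYABLE = setOf("Pending", "Uncategorized")
    }
}
```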

Threading & Concurrency

Data operations are strictly offloaded from the Main Thread to prevent UI stutters:

Dispatchers.IO: Used for all Disk and Network operations.
Dispatchers.Default: Used for heavy data processing (e.g., reconstructing the User Journey).
Flow: All repositories expose Flow-based APIs, allowing the UI to reactively update whenever the underlying data changes.
Suspend Functions: All write operations are marked as suspend, so they can only be called from a coroutine or another suspend function.

Data Integrity & Backups

The BackupRepository ensures data safety:

Atomic Transactions: Imports are wrapped in a single transaction; if the file is corrupt, the database rolls back to its previous state.
Validation: Every backup includes a header with versioning and device metadata to prevent importing incompatible schemas.
GZIP Compression: Reduces the size of large event logs by up to 90% for easier sharing and storage.
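
The sketch below illustrates the validate-then-commit idea behind these guarantees, using only the JDK's GZIP streams. The header layout, SCHEMA_VERSION, and FakeStore are hypothetical; the real BackupRepository presumably relies on a database transaction rather than an in-memory stand-in.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

const val SCHEMA_VERSION = 3  // hypothetical current schema version

// Writes a one-line header followed by the payload, then GZIP-compresses both.
fun exportBackup(payload: String, device: String): ByteArray {
    val raw = "SCROLLLESS_BACKUP;schema=$SCHEMA_VERSION;device=$device\n$payload"
    val buffer = ByteArrayOutputStream()
    GZIPOutputStream(buffer).use { it.write(raw.toByteArray(Charsets.UTF_8)) }
    return buffer.toByteArray()
}

// In-memory stand-in for the database so the rollback behavior is visible.
class FakeStore(var state: String)

// Decompresses and validates everything before touching the store, so a corrupt
// or incompatible file throws and leaves the previous state in place.
fun importBackup(bytes: ByteArray, store: FakeStore) {
    val raw = GZIPInputStream(ByteArrayInputStream(bytes)).use {
        it.readBytes().toString(Charsets.UTF_8)
    }
    val fields = raw.substringBefore('\n').split(';')
    require(fields.firstOrNull() == "SCROLLLESS_BACKUP") { "Not a backup file" }
    val schema = fields.firstOrNull { it.startsWith("schema=") }
        ?.removePrefix("schema=")
        ?.toIntOrNull()
    require(schema == SCHEMA_VERSION) { "Incompatible schema: $schema" }
    store.state = raw.substringAfter('\n')
}
```

Assigning to the store only on the last line is what keeps the import all-or-nothing in this sketch; with a real database the same effect would come from committing the transaction only after validation succeeds.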
