Data Strategy

Scrollless employs an Offline-First strategy: the source of truth is a local database. The data pipeline follows a Raw-to-Summary architecture:
  1. Ingestion: Raw system events (app opens, notifications, scroll deltas) are captured and stored.
  2. Processing: A background engine aggregates these raw logs into daily summaries.
  3. Projection: For real-time features (like limits), a mediator combines persisted DB data with “in-flight” memory data, so the usage view stays current without waiting for the next database sync.
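
The Processing step can be illustrated with a minimal sketch. The types and field names below (RawAppEvent, DailyAppUsage, durationMs) are assumptions made for illustration and are not taken from the Scrollless codebase; the real engine presumably works against the database rather than in-memory lists.

```kotlin
// Hypothetical types: field names are illustrative, not the actual schema.
data class RawAppEvent(val packageName: String, val dateString: String, val durationMs: Long)

data class DailyAppUsage(
    val packageName: String,
    val dateString: String,
    val totalMs: Long,
    val openCount: Int,
)

// Collapse raw events into one summary row per (app, day).
fun summarize(events: List<RawAppEvent>): List<DailyAppUsage> =
    events
        .groupBy { it.packageName to it.dateString }
        .map { (key, group) ->
            DailyAppUsage(
                packageName = key.first,
                dateString = key.second,
                totalMs = group.sumOf { it.durationMs },
                openCount = group.size,
            )
        }
        // Stable ordering so dashboards and tests see rows in a predictable order.
        .sortedWith(compareBy({ it.dateString }, { it.packageName }))
```

Keeping the raw log separate from the summary means the summary can be rebuilt from the raw events at any time, which is one way a historical backfill could be implemented.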

Key Repositories

The data layer is organized into functional repositories that abstract the complexity of the underlying database and services.

Usage & Tracking

  • ScrollDataRepository: The primary engine. It syncs system events from the Android UsageStatsManager, processes them into DailyAppUsageRecord and DailyDeviceSummary, and handles historical backfills.
  • UsageRepository: A high-level mediator that uses the UsageProjectionEngine to calculate “Used vs. Remaining” time by combining DB records with live session buffers.
  • JourneysRepository: Reconstructs a chronological timeline of user behavior (the “User Journey”) by correlating unlock sessions, app switches, and notification triggers.

Metadata & Categorization

  • AppMetadataRepository: Manages the list of installed apps, caches high-resolution icons to the local file system, and determines if an app is “user-visible” (e.g., has a launcher icon).
  • AppCategoryRepository: Interfaces with a remote service to fetch and cache app categories (e.g., “Social”, “Productivity”).

Governance & Limits

  • LimitsRepository: Manages LimitGroup entities and their associations with specific apps. It handles the logic for “Quick Limits” and “Custom Groups.”
  • OutcomesRepository: Records the daily “Success” or “Failure” of a limit, tracking how often users snooze or exceed their goals.

Infrastructure & Settings

  • BackupRepository: Handles full database portability. It exports and imports the entire application state with GZIP compression.
  • SettingsRepository: Manages user preferences and feature flags.

Schema & Performance (Data Engineering)

The database schema is designed for high-frequency writes and complex analytical queries.
  • Core Tables:
    • raw_app_events: High-volume log of every app transition and notification.
    • scroll_sessions: Aggregated scroll distances (pixels) per app session.
    • daily_app_usage: Summarized usage time and open counts per app, per day.
  • Indexes: Critical columns like packageName and dateString are indexed to ensure that dashboard queries remain performant even with months of data.
  • Relationships: Uses a mix of flat tables for performance and @Relation mappings for complex objects like GroupWithApps.

The Projection Model (Real-time Accuracy)

To solve the “Sync Lag” problem (where the DB might be a few seconds behind the actual system state), the UsageProjectionEngine implements a Memory-Buffer strategy:
  1. DB Base: Reads the last known usage from the database.
  2. Pending Buffer: Adds sessions that have finished but haven’t been processed into the daily summary yet.
  3. Live Session: Adds the duration of the currently active app session.
  4. Result: The sum of these three values gives the UI real-time accuracy.
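
The four steps above can be sketched as a single pure function. This is a sketch under stated assumptions, not the actual UsageProjectionEngine API: Session, Projection, projectUsage, and all parameter names are hypothetical.

```kotlin
// Hypothetical model: names are illustrative, not the real engine's API.
data class Session(val packageName: String, val startMs: Long, val endMs: Long)

data class Projection(val usedMs: Long, val remainingMs: Long)

fun projectUsage(
    dbBaseMs: Long,          // 1. usage already persisted in the daily summary
    pending: List<Session>,  // 2. finished sessions not yet aggregated
    liveStartMs: Long?,      // 3. start of the active session, or null if none
    nowMs: Long,
    limitMs: Long,
): Projection {
    val pendingMs = pending.sumOf { it.endMs - it.startMs }
    // Guard against clock skew producing a negative live duration.
    val liveMs = liveStartMs?.let { (nowMs - it).coerceAtLeast(0L) } ?: 0L
    val used = dbBaseMs + pendingMs + liveMs
    // 4. "Used vs. Remaining": remaining time never drops below zero.
    return Projection(usedMs = used, remainingMs = (limitMs - used).coerceAtLeast(0L))
}
```

The sketch assumes the caller has already filtered the pending sessions to the app or group being measured, and that a session moves out of the pending buffer at the moment it is written to the database; otherwise the same time would be counted twice.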

Storage & Caching

  • Database: Room (AppDatabase) with automatic migrations.
  • File Storage: App icons are stored as .png files in the internal app_icons directory to avoid repeated overhead for binary data.
  • Network Cache: App categories are cached indefinitely once fetched, with a retry mechanism for “Pending” or “Uncategorized” apps.
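
The caching rule for categories (keep resolved values indefinitely, retry placeholders) might look roughly like the following. CategoryCache and its fetch callback are hypothetical names introduced for this sketch; the real AppCategoryRepository presumably persists its cache and calls a remote service.

```kotlin
// Hypothetical sketch: an in-memory stand-in for the persisted category cache.
class CategoryCache(private val fetch: (String) -> String?) {
    private val cache = mutableMapOf<String, String>()

    fun categoryFor(packageName: String): String {
        val cached = cache[packageName]
        // Resolved categories are kept indefinitely; placeholders are retried.
        if (cached != null && cached != PENDING && cached != UNCATEGORIZED) return cached
        // A failed or empty fetch leaves the previous placeholder in place.
        val fetched = try { fetch(packageName) } catch (e: Exception) { null }
        val result = fetched ?: cached ?: PENDING
        cache[packageName] = result
        return result
    }

    companion object {
        const val PENDING = "Pending"
        const val UNCATEGORIZED = "Uncategorized"
    }
}
```

This version is not thread-safe and retries on every lookup; a production implementation would likely add synchronization and some form of backoff.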

Threading & Concurrency

Data operations are strictly offloaded from the Main Thread to prevent UI stutters:
  • Dispatchers.IO: Used for all Disk and Network operations.
  • Dispatchers.Default: Used for heavy data processing (e.g., reconstructing the User Journey).
  • Flow: All repositories expose Flow-based APIs, allowing the UI to reactively update whenever the underlying data changes.
  • Suspend Functions: All write operations are marked as suspend, so they can only be called from a coroutine or another suspend function.

Data Integrity & Backups

The BackupRepository ensures data safety:
  • Atomic Transactions: Imports are wrapped in a single transaction; if the file is corrupt, the database rolls back to its previous state.
  • Validation: Every backup includes a header with versioning and device metadata to prevent importing incompatible schemas.
  • GZIP Compression: Reduces the size of large event logs by up to 90% for easier sharing and storage.
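
As an illustration of header validation combined with GZIP compression, here is a minimal sketch using the JDK's java.util.zip streams. The header layout (a schema version followed by a device label), the constant SUPPORTED_SCHEMA_VERSION, and both function names are assumptions for this example and do not describe the actual backup file format.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.DataInputStream
import java.io.DataOutputStream
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Hypothetical value; the real schema version is not stated in this document.
const val SUPPORTED_SCHEMA_VERSION = 3

fun exportBackup(schemaVersion: Int, device: String, payload: ByteArray): ByteArray {
    val bytes = ByteArrayOutputStream()
    // Closing the outer stream finishes the GZIP trailer before we read the bytes.
    DataOutputStream(GZIPOutputStream(bytes)).use { out ->
        out.writeInt(schemaVersion) // header: schema version
        out.writeUTF(device)        // header: device metadata
        out.writeInt(payload.size)
        out.write(payload)
    }
    return bytes.toByteArray()
}

fun importBackup(backup: ByteArray): ByteArray =
    DataInputStream(GZIPInputStream(ByteArrayInputStream(backup))).use { input ->
        val version = input.readInt()
        // Reject incompatible backups before any data is touched.
        require(version == SUPPORTED_SCHEMA_VERSION) { "Incompatible backup schema: $version" }
        input.readUTF() // device metadata; informational only in this sketch
        val payload = ByteArray(input.readInt())
        input.readFully(payload)
        payload
    }
```

Because the header is checked and the whole payload is read before anything is returned, a corrupt or incompatible file fails with an exception first; the decoded payload would then be written to the database inside a single transaction, which is not shown here.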
Last modified on January 22, 2026