How I've been storing data on the web
Because localStorage was never meant to be a database.
To no surprise, there are tons of ways to store bytes on the web. From .mp3 files in Muse’s cloud system to the .mdx files behind this website’s content, it’s a lot of data. Let's see how we got here.
Humble beginnings
LocalStorage
Persistence for most web applications started with this popular API. localStorage is simple, synchronous, and persists across sessions. You can store small amounts of data as key-value pairs, like so:
```js
localStorage.setItem("theme", "dark")
const theme = localStorage.getItem("theme")
```
Nice, right? Until it wasn’t. When building production-level applications, the data structures and models are rarely so simple. We often need complex objects, nested data, and sometimes even binary blobs. In these cases, localStorage quickly becomes a bottleneck.
More importantly, it's synchronous. Every read and write blocks the main thread, so on page load the UI can freeze up for hundreds of milliseconds while it reads from localStorage. This is fine for small data, but once you go much beyond a few kilobytes it becomes a performance nightmare - you're storing serialized objects in string land, after all.
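To make "string land" concrete, here's a minimal sketch of the JSON round trip every non-string value has to make. The helper names (readJSON, writeJSON) and the KeyValueStore interface are my own for illustration; localStorage happens to satisfy that interface, so you can pass it in directly.

```ts
// Minimal shape shared by localStorage and sessionStorage (illustrative name).
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Everything is serialized to a string before it is stored.
function writeJSON(store: KeyValueStore, key: string, value: unknown): void {
  store.setItem(key, JSON.stringify(value));
}

// Reads parse the string back; a missing or corrupted entry yields the fallback.
function readJSON<T>(store: KeyValueStore, key: string, fallback: T): T {
  const raw = store.getItem(key);
  if (raw === null) return fallback;
  try {
    return JSON.parse(raw) as T;
  } catch {
    return fallback; // value was hand-edited or truncated
  }
}

// In the browser:
//   writeJSON(localStorage, "prefs", { theme: "dark" });
//   const prefs = readJSON(localStorage, "prefs", { theme: "light" });
```

Both the stringify and the parse run synchronously on the main thread, which is exactly why the cost grows with the size of what you store.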
LocalStorage done right
Web storage is not a single tool. It’s a toolbox. Some are sharp, others brittle. Some will cut you if you’re not careful, so we as engineers have the responsibility to choose the right tool for the job.
I say this because localStorage still has its place. Seleneo uses it (via Zustand + persist) to save lightweight state like theme, current draft ID, and canvas config. It works well for tiny, UI-first state that doesn’t need to sync or be deeply structured.
```js
import { create } from 'zustand'
import { persist } from 'zustand/middleware'

const useUI = create(
  persist(
    (set) => ({
      theme: 'light',
      setTheme: (t) => set({ theme: t })
    }),
    { name: 'seleneo-ui' }
  )
)
```
But beyond 1000KB or so, it gets scary.
RIP WebSQL (and why we can't have nice things)
Right before the influx of the modern reactive web, there was WebSQL. To put it shortly: SQLite in the browser. People back then were writing real SQL queries, and although I wasn't around in 2010 to use it in everyday workflows - it looked glorious. You could fuck around and pull this off in the client:
```sql
CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT, artist TEXT);
INSERT INTO tracks (title, artist) VALUES ('Song 1', 'Artist 1');
```
Every now and then I daydream about a world where WebSQL didn’t die. But it did. Mainly because:
- It was non-standardized (basically a wrapper over SQLite)
- Although, the community at the time loved it
- Firefox refused to implement it
- W3C deprecated it in favor of IndexedDB
- Mozilla felt that standardizing it would codify the quirks of the SQLite implementation
WebSQL was too good to live, and IndexedDB is what we got instead.
IndexedDB
The long-term vision for Muse is for it to run almost entirely on IndexedDB, but through Dexie. Native IndexedDB's API is not the most dev-friendly, so we use Dexie for its more convenient API and type safety. On the other hand, I know that T3 Chat used Dexie and later migrated to Convex.
```ts
import Dexie from 'dexie';
import { z } from "zod";

const TrackSchema = z.object({
  id: z.string(),
  title: z.string(),
  artist: z.string(),
  audioBlob: z.instanceof(Blob).optional(),
  favorite: z.boolean().default(false),
  playCount: z.number().default(0),
});

type Track = z.infer<typeof TrackSchema>;

class MusicDB extends Dexie {
  tracks!: Dexie.Table<Track, string>;

  constructor() {
    super('muse');

    // Only indexed fields are listed here. `favorite` is left out because
    // booleans are not valid IndexedDB keys, so it cannot be indexed.
    this.version(1).stores({
      tracks: 'id, artist, playCount'
    });
  }

  // Validate untrusted input before it reaches the database.
  async addTrack(trackData: unknown) {
    const validatedTrack = TrackSchema.parse(trackData);
    return this.tracks.add(validatedTrack);
  }
}

const db = new MusicDB();

async function getFavoriteTracks() {
  // Filter in JS instead of querying a boolean index.
  return db.tracks.filter((track) => track.favorite).toArray();
}
```
Your UX is only as fast as your slowest dependency. IndexedDB isn't RAM-fast, but it's close.
navigator.storage, and estimating sanity
Browser storage isn’t infinite. If you don’t ask how much space you’ve got, you’re asking to get evicted.
```js
const { usage, quota } = await navigator.storage.estimate()
console.log(`Used: ${usage}, Quota: ${quota}`)
```
In Muse, we check this when users start saving a lot of tracks. If you’re bumping against your quota, we warn you. If not, we carry on.
We also request persistent storage:
```js
await navigator.storage.persist()
```
This tells the browser: 'Hey, don’t wipe me unless you absolutely have to.' If you skip this step, your app data is fair game.
Users don’t see a modal. You don’t get a callback. Your storage just disappears.
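Here's a rough sketch of the "warn when you're bumping against your quota" check. The function name, the status labels, and the 80% threshold are assumptions for illustration, not Muse's actual code. The only browser facts it leans on are that estimate() returns an object with optional usage and quota fields, and that persist() resolves to a boolean telling you whether persistence was granted.

```ts
// Same shape as the object navigator.storage.estimate() resolves to.
interface StorageEstimateLike {
  usage?: number;
  quota?: number;
}

type QuotaStatus = "ok" | "warn" | "unknown";

// warnRatio is a hypothetical threshold: warn once 80% of the quota is used.
function quotaStatus(estimate: StorageEstimateLike, warnRatio = 0.8): QuotaStatus {
  const { usage, quota } = estimate;
  // Browsers may omit either field, so treat missing data as "unknown".
  if (usage === undefined || quota === undefined || quota === 0) return "unknown";
  return usage / quota >= warnRatio ? "warn" : "ok";
}

// In the browser:
//   const status = quotaStatus(await navigator.storage.estimate());
//   const persisted = await navigator.storage.persist(); // false means still evictable
```

Keeping the decision in a pure function means it can be tested without a browser, and the threshold can change without touching the call sites.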
PouchDB: offline dreams
I played around with PouchDB for Muse’s library sync. It’s a CouchDB-compatible client that uses IndexedDB under the hood. What you get:
- structured docs
- live sync
- attachment support (for blobs)
We didn’t ship it because of its size, its performance, and the need for a full CouchDB backend. But the dream is real. PouchDB makes offline-first feel native.
Storage strategy at scale (Muse edition)
Muse's storage system, as I mentioned, runs across a few layers. Let's break that down further with the context of its API and features:
- Dexie for structured metadata:
  - What's stored? Think playlists (from /api/playlists), individual song metadata (title, artist, album, duration, from /api/songs), user's favorite songs and playlists (from /users/me/favorites/songs and /users/me/favorites/playlists), and potentially user statistics (from /users/me/stats).
  - Why Dexie? It allows for rich querying. For example, "show me all songs by artist X in playlist Y, sorted by title" or "list all my favorite playlists." This makes the local cache highly effective for browsing the library without constant network calls.
  - UX Impact: Snappy UI. When you open Muse, your library, playlists, and favorites can load almost instantly from Dexie, even before the first network request for updates completes.
- Audio blobs in IndexedDB (via Dexie):
  - The Challenge: Audio files are large. Storing them requires careful management.
  - Process: When a user downloads a song (perhaps after getting a stream URL from /api/songs/{id}/stream and deciding to save it), the audio data is saved as a blob in an IndexedDB store.
  - Management: This is where navigator.storage.estimate() becomes vital. Muse would need to monitor available space and potentially implement a Least Recently Used (LRU) cache eviction strategy or allow users to manage downloads if space gets low. The navigator.storage.persist() call is also key to signal to the browser that this data is important.
- Storage API to monitor quotas: As above, crucial for managing large audio files. The app should proactively inform the user if they're nearing their quota.
- Persistent flag to request durability: Essential for an app like Muse where users expect their downloaded music to stick around.
- Cloudflare R2 for synced storage across devices:
  - What's synced? Primarily user-specific metadata: playlist definitions, liked songs/playlists, play counts (perhaps via /api/playlists/{id}/play or /api/songs/{id}/listen), and user settings. Audio blobs themselves are kept local to the device due to size and cost.
  - Strategy: This could involve a sync on app load, a periodic background sync if the app is open, or sync on specific actions (e.g., creating a new playlist). The goal is that if a user logs into Muse on a new device, their core library structure and preferences are rehydrated from R2.
We only sync metadata (not audio) to R2 — your downloads live locally. But if you log in from another device, we rehydrate your likes and playlists.
We chose this because:
- Audio is big
- Syncing blobs is expensive
- Most people stream anyway
So: local-first for audio, cloud-first for state. It’s worked surprisingly well.
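Since the audio lives locally, something has to decide what goes when space runs low. Below is a minimal sketch of the LRU idea mentioned above: given the downloaded tracks and a number of bytes to free, pick the least recently played ones first. The CachedTrack shape and the planEviction name are hypothetical, and this only plans the eviction; actually deleting the blobs would be a separate step.

```ts
// Hypothetical bookkeeping record kept alongside each downloaded blob.
interface CachedTrack {
  id: string;
  sizeBytes: number;
  lastPlayedAt: number; // epoch milliseconds
}

// Returns the ids to delete, least recently played first,
// stopping as soon as enough bytes would be freed.
function planEviction(tracks: CachedTrack[], bytesToFree: number): string[] {
  const oldestFirst = tracks.slice().sort((a, b) => a.lastPlayedAt - b.lastPlayedAt);
  const evict: string[] = [];
  let freed = 0;
  for (const track of oldestFirst) {
    if (freed >= bytesToFree) break;
    evict.push(track.id);
    freed += track.sizeBytes;
  }
  return evict;
}
```

If the cached tracks together are smaller than bytesToFree, the plan simply contains every track, so the caller should still re-check the quota afterwards.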
Seleneo and the lightweight case
Seleneo, the design tool, has simpler needs for now. It doesn’t deal with media blobs or require extensive offline capabilities for its core offering. We use localStorage + Zustand to store UI preferences (like theme), active draft IDs, and sidebar state.
```ts
import { create } from 'zustand'
import { persist, createJSONStorage } from 'zustand/middleware'

interface SeleneoUISettings {
  theme: 'light' | 'dark' | 'system';
  activeDraftId: string | null;
  isToolPanelOpen: boolean;
  lastUsedTool: string | null;
  // ... other UI specific settings
}

// Actions get their own interface so the store type covers them too.
interface SeleneoUIActions {
  setTheme: (theme: SeleneoUISettings['theme']) => void;
  setActiveDraftId: (id: string | null) => void;
  toggleToolPanel: () => void;
  setLastUsedTool: (tool: string | null) => void;
}

const useSeleneoUIStore = create<SeleneoUISettings & SeleneoUIActions>()(
  persist(
    (set) => ({
      theme: 'system',
      activeDraftId: null,
      isToolPanelOpen: true,
      lastUsedTool: null,
      setTheme: (theme) => set({ theme }),
      setActiveDraftId: (id) => set({ activeDraftId: id }),
      toggleToolPanel: () => set((state) => ({ isToolPanelOpen: !state.isToolPanelOpen })),
      setLastUsedTool: (tool) => set({ lastUsedTool: tool }),
    }),
    {
      name: 'seleneo-ui-settings', // unique name
      storage: createJSONStorage(() => localStorage), // specify localStorage
    }
  )
)
```
This works because:
- The data is small and not overly complex.
- It's primarily for enhancing the immediate user experience on that specific browser (e.g., remembering UI state).
- Synchronous access is acceptable for these small, quick reads/writes.
However, if Seleneo were to evolve to support, say, offline editing of complex vector designs or storing large embedded image assets locally for a draft, localStorage would quickly hit its limits in terms of both storage capacity and performance. At that point, migrating draft storage to IndexedDB would be the logical next step, potentially keeping localStorage for truly ephemeral UI state.
- LocalStorage: Good for small amounts (less than 5MB, ideally much less) of simple key-value data, like user preferences or UI state, where synchronous access is okay. Think Seleneo's current UI settings.
- IndexedDB: Best for larger, structured data, blobs (like Muse's audio files), needing asynchronous access, complex queries, or transactions. Think Muse's entire music library metadata and downloaded tracks.
That’s kind of the point. You don’t have to reach for the most advanced option. Pick the thing that makes sense for the scale you’re working at.
Nuggets from the field
Nugget/Tip | Explanation & Recommendation |
---|---|
IndexedDB Performance | Performance is generally good, but opening numerous stores simultaneously in a cold tab can be slow. Bundle your read operations if possible. |
Reactive Offline State | Dexie.js with liveQuery() offers a straightforward way to achieve reactive data that updates your UI automatically when underlying IndexedDB data changes. |
LocalStorage Usage | Avoid using localStorage for critical application state due to its synchronous nature and size limitations. It's better suited for persisting simple UI preferences. |
PouchDB Considerations | PouchDB is powerful for full database sync capabilities (especially with CouchDB backends) but can be heavy. Evaluate if its feature set is essential for your needs. |
navigator.storage API | This API is often overlooked. Use navigator.storage.estimate() to check storage quotas and navigator.storage.persist() to request persistent storage, reducing the likelihood of data eviction. |
Derived Data | Avoid storing data that can be derived from a source of truth. Recalculate it when needed to save space and maintain consistency. |
Storage Speed | Never assume client-side storage is fast. Always measure performance, especially with large datasets or frequent operations. |
Data Permanence | Don't assume data stored on the client is permanent. Browsers can evict data under storage pressure. Implement strategies for data backup or sync if permanence is critical. |
Diving Deeper into IndexedDB
While Dexie.js abstracts away much of IndexedDB's verbosity, understanding its core concepts can help you build more robust and performant applications.
Schema Versioning and Migrations
As your application evolves, so will your data structures. IndexedDB handles schema changes through database versions. When you open a database with a higher version number, an upgradeneeded event is fired, allowing you to modify the database structure (create/delete object stores, create/delete indexes).
Properly managing schema versions and migrations is crucial to prevent data loss or application errors for your users when you deploy updates.
Here's a conceptual Dexie example of a migration:
```ts
import Dexie from 'dexie';

const db = new Dexie('MyMusicAppDB');

// version 1: initial schema
db.version(1).stores({
  tracks: '++id, title, artist',
  playlists: '++id, name, trackIds',
});

// version 2: add album to tracks and a new settings store
db.version(2).stores({
  tracks: '++id, title, artist, album', // 'album' is now indexed
  playlists: '++id, name, trackIds',
  settings: 'key, value' // store for, say, global app settings
}).upgrade(tx => {
  // runs only for databases being upgraded from a version below 2
  return tx.table('tracks').toCollection().modify(track => {
    // existing tracks have no album yet, so give them a default
    if (typeof track.album === 'undefined') {
      track.album = 'Unknown Album';
    }
  });
});

// version 3: add genre to tracks
db.version(3).stores({
  tracks: '++id, title, artist, album, genre',
  playlists: '++id, name, trackIds',
  settings: 'key, value'
}).upgrade(tx => {
  return tx.table('tracks').toCollection().modify(track => {
    track.genre = track.genre || 'Unknown Genre';
  });
});

export default db;
```
In this example, each .version(N) call defines the schema for that version. The .upgrade() callback provides a transaction (tx) to safely modify existing data to fit the new schema.
Indexing Strategies
Indexes are the backbone of fast queries in IndexedDB.
- Choose wisely: Only index fields you'll frequently query or sort by. Too many indexes can slow down write operations.
- Compound indexes: IndexedDB supports compound indexes through array key paths, and Dexie exposes them with its [artist+album] schema syntax. Combining several properties into a single indexed string (e.g., artistAndAlbum: "ArtistName_AlbumName") also works when you need a custom sort key.
- unique constraint: Use this when a field must have unique values (e.g., a username).
- multiEntry: Useful for indexing arrays. If a track has an array of tags ['rock', '90s', 'alternative'], a multiEntry index on tags would allow you to efficiently find all tracks tagged 'rock'.
Bulk Operations
For performance, always prefer bulk operations over multiple individual operations when dealing with many records. Dexie provides:
- bulkAdd(): Adds an array of objects.
- bulkPut(): Adds or replaces an array of objects.
- bulkDelete(): Deletes an array of records by their primary keys.
These are significantly faster as they perform operations within a single transaction, reducing overhead.
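For really big imports you probably don't want one giant array in a single call either. A minimal sketch of a middle ground: split the records into batches and send each batch through one bulkPut() call instead of one put() per record. The BulkWritable interface, importInBatches, and the default batch size of 500 are illustrative assumptions; a Dexie table fits the interface because it has a bulkPut(items) method that returns a promise.

```ts
// Anything with a bulkPut(items) method, e.g. a Dexie table (illustrative interface).
interface BulkWritable<T> {
  bulkPut(items: T[]): Promise<unknown>;
}

// Split an array into consecutive slices of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new RangeError("size must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Writes the records batch by batch and returns how many were written.
async function importInBatches<T>(
  table: BulkWritable<T>,
  items: T[],
  batchSize = 500, // hypothetical default; measure before settling on a value
): Promise<number> {
  let written = 0;
  for (const batch of chunk(items, batchSize)) {
    await table.bulkPut(batch); // one call per batch
    written += batch.length;
  }
  return written;
}
```

The trade-off is that each batch commits on its own, so a failure halfway through leaves the earlier batches written. If you need all-or-nothing, wrap the whole import in one explicit transaction instead.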
IndexedDB and Web Workers
For very large datasets or computationally intensive data processing, consider offloading IndexedDB operations to a Web Worker. This keeps the main thread free, ensuring your UI remains responsive.
Moving IndexedDB logic to a Web Worker can dramatically improve perceived performance by preventing database operations from blocking UI updates, especially on initial data hydration or large data writes.
While direct Dexie usage in a worker requires some setup (as Dexie instances are not directly transferable), you can communicate with a worker that handles its own Dexie instance or uses the native IndexedDB API.
Bridging the Gap: Real-world Data Synchronization Patterns
Storing data locally is one piece of the puzzle; keeping it in sync with a server or across multiple devices is another. Both Muse and Seleneo (if it were to expand its cloud features) would employ synchronization strategies.
Optimistic Updates
This pattern dramatically improves perceived performance. The UI updates as if the operation was successful immediately, while the actual network request happens in the background.
Example: Favoriting a song in Muse
- User Action: Clicks the "favorite" icon for a song.
- UI Update: Icon changes state instantly.
- Local Data Update (Dexie): The song's isFavorite flag is set to true in the local IndexedDB.
- Network Request: A POST request is made to /users/me/favorites/songs (as per Muse API docs).
- Outcome:
  - Success: All good. The local state already matches the server.
  - Failure: The local change must be reverted, and the user notified. This is crucial.
```ts
async function toggleFavoriteSong(songId: string, currentStatus: boolean) {
  const newStatus = !currentStatus;
  // update the local DB and the UI immediately
  await db.tracks.update(songId, { isFavorite: newStatus });
  updateSongInUI(songId, { isFavorite: newStatus });

  try {
    const res = await fetch(`/api/users/me/favorites/songs`, {
      method: 'POST',
      headers: { /* ... */ },
      body: JSON.stringify({ songId, favorite: newStatus }),
    });
    // fetch only rejects on network failure, so treat HTTP errors as failures too
    if (!res.ok) throw new Error(`Favorite sync failed with status ${res.status}`);
  } catch (error) {
    // revert local change & notify user if something goes wrong
    console.error("Failed to sync favorite status:", error);
    await db.tracks.update(songId, { isFavorite: currentStatus });
    updateSongInUI(songId, { isFavorite: currentStatus });
  }
}
```
Background Sync vs. Immediate Sync
- Immediate Sync: For critical actions like login, registration, or a payment transaction. The user usually waits for the server response.
- Background Sync (Service Worker BackgroundSync API): For non-critical data like analytics, logging, or less urgent updates. The Service Worker can wait for a stable connection to send the data. Muse might use this for syncing play counts if immediate accuracy isn't paramount.
- Periodic Sync (Service Worker PeriodicBackgroundSync API): For regularly updating content, like fetching new playlist recommendations for Muse in the background.
Conflict Resolution
When data can be changed on multiple clients (e.g., two browser tabs editing the same Seleneo design, or Muse on two devices managing the same playlist), conflicts can arise.
- Last Write Wins (LWW): Simplest. The last update received by the server overwrites previous ones. Common but can lead to data loss.
- Timestamp-based: Each piece of data has a timestamp; the newest one wins.
- CRDTs (Conflict-free Replicated Data Types): More complex, but mathematically guarantee convergence. Overkill for many apps but powerful for collaborative tools.
- Manual Resolution: Prompt the user to resolve the conflict.
Muse's R2 sync for metadata likely uses a timestamp-based LWW or a server-side merge logic for things like playlist contents. Seleneo, if syncing complex designs, would need a robust strategy here.
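To show what timestamp-based LWW boils down to, here's a minimal sketch that merges a local and a remote list of records by id and keeps whichever copy has the newer updatedAt. The Versioned shape and the function name are assumptions for illustration; I'm not claiming this is how Muse's sync is implemented.

```ts
// Every synced record needs a stable id and a last-modified timestamp.
interface Versioned {
  id: string;
  updatedAt: number; // epoch milliseconds
}

// Merge two record lists; for each id the copy with the newer timestamp wins.
// On a tie, the copy seen first (the local one) is kept.
function mergeLastWriteWins<T extends Versioned>(local: T[], remote: T[]): T[] {
  const merged = new Map<string, T>();
  for (const record of local.concat(remote)) {
    const existing = merged.get(record.id);
    if (existing === undefined || record.updatedAt > existing.updatedAt) {
      merged.set(record.id, record);
    }
  }
  return Array.from(merged.values());
}
```

The catch with this approach is that it trusts the timestamps. Device clocks drift, so stamping updatedAt on the server is usually the safer option when you can.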
Service Workers and the Cache API: True Offline Power
Beyond storing structured data in IndexedDB, Service Workers and the Cache API are essential for building truly robust offline-first web applications.
A Service Worker is a script that your browser runs in the background, separate from a web page, opening the door to features that don't need a web page or user interaction.
Key Capabilities:
- Network Proxying: Intercept network requests and decide how to respond (e.g., serve from cache, fetch from network).
- Caching Assets: Store your app shell (HTML, CSS, JS) and static assets for instant loading and offline availability.
- Caching API Responses: Store responses from your backend APIs, making data available even when offline.
- Background Sync: Defer actions until the user has stable connectivity.
- Push Notifications: Engage users even when they aren't actively using the site.
Caching Strategies with the Cache API
The Cache API allows you to create and manage named caches of request/response pairs. Common strategies include:
- Cache First (Offline First):
- Check the cache for a response.
- If found, serve it.
- If not, fetch from the network, cache the response, and then serve it.
- Best for app shell assets and data that doesn't change frequently.
- Network First, then Cache:
- Try to fetch from the network.
- If successful, cache the response and serve it.
- If the network fails (e.g., offline), serve from the cache as a fallback.
- Good for data that changes frequently but should still be available offline (e.g., user's latest feed).
- Stale-While-Revalidate:
- Serve from cache immediately (if available) for a fast response.
- Simultaneously, fetch from the network in the background.
- If the network fetch is successful, update the cache for the next request.
- Offers a good balance of speed and freshness.
- Cache Only: Only serve from the cache. Useful for assets you know are versioned and won't change until the next app update.
- Network Only: Always bypass the cache and go to the network. For non-critical requests or things that must always be live.
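Stale-while-revalidate is the one that's easiest to get subtly wrong, so here's a minimal sketch of it. To keep it runnable outside a service worker, it works against a tiny SimpleCache interface and an injected fetchFresh function; both are stand-ins I made up for illustration. In a real service worker the cache would be the Cache API (whose match() and put() have the same general shape), and you'd store response.clone(), since a Response body can only be read once.

```ts
// Stand-in for the Cache API: look up by key, store by key (illustrative interface).
interface SimpleCache<T> {
  match(key: string): Promise<T | undefined>;
  put(key: string, value: T): Promise<void>;
}

async function staleWhileRevalidate<T>(
  key: string,
  cache: SimpleCache<T>,
  fetchFresh: (key: string) => Promise<T>,
): Promise<T> {
  const cached = await cache.match(key);

  // Always kick off a network fetch; on success, refresh the cache for next time.
  const refresh = fetchFresh(key).then(async (fresh) => {
    await cache.put(key, fresh);
    return fresh;
  });

  if (cached !== undefined) {
    // Serve the stale copy right away. Background failures are ignored
    // so they don't surface as unhandled rejections.
    refresh.catch(() => undefined);
    return cached;
  }

  // Nothing cached yet: the caller has to wait for the network.
  return refresh;
}
```

Note that the caller gets the old value on the second request and only sees the refreshed one on the request after that. That lag is the whole trade: speed now, freshness next time.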
Service workers have a distinct lifecycle (installing, waiting, activated). Understanding this is key to avoiding common pitfalls, like not seeing updated content immediately after deploying a new service worker. Tools like Workbox can simplify this.
For an application like Muse, a service worker could:
- Cache the core application shell (HTML, JS, CSS).
- Cache album art and track metadata fetched from an API.
- Potentially even cache recently played audio files (though this needs careful quota management).
This would allow users to open Muse, see their library, and play recently listened-to tracks even if they are completely offline.
Web Storage Security: Handle with Care
Client-side storage is inherently less secure than server-side storage. It's crucial to be aware of the risks.
localStorage and Cross-Site Scripting (XSS)
localStorage is synchronous and accessible via JavaScript on the same origin. If your site has an XSS vulnerability, an attacker can execute script that reads everything in your localStorage.
Never store sensitive information like session tokens, API keys, or personal user data in localStorage. An XSS attack could lead to complete account takeover. Use HttpOnly cookies for session tokens.
IndexedDB Security
IndexedDB is also origin-bound, meaning only scripts from the same origin can access a specific database. This provides a good baseline of security against other sites. However, like localStorage, if your own site has an XSS vulnerability, malicious scripts running on your origin can access and manipulate IndexedDB data.
Storing Sensitive Information
Generally, avoid storing highly sensitive, unencrypted data on the client. If you must store data that needs some level of protection client-side (e.g., user preferences that are sensitive but not critical secrets):
- Consider the threat model: What are you protecting against?
- Web Crypto API: For actual encryption/decryption tasks, the Web Crypto API provides cryptographic primitives. However, client-side encryption is complex. If the encryption key itself is derived from something accessible to client-side JavaScript (e.g., a user password typed into the app), an XSS attacker could potentially intercept that key.
Client-Side Encryption is Hard
While the Web Crypto API provides the tools, managing keys securely on the client-side is a significant challenge. Don't assume it makes data impervious if your site is compromised.
- Server-side is safer: For truly sensitive data, always prefer server-side storage and processing.
Choosing Your Storage Weapon: A Practical Decision Framework
Selecting the right client-side storage isn't just about picking the newest API; it's about matching the tool to the task, considering your application's specific needs. Here's a framework to guide your decision, using Muse and Seleneo as examples:
Consideration | Key Questions | localStorage (Seleneo - UI) | IndexedDB (Muse - Library & Audio) | Cache API (Muse - Assets/Offline) |
---|---|---|---|---|
Data Size & Complexity | How much data? Simple key-value or complex structured objects/blobs? | Small (less than 5MB, ideally less than 100KB). Simple JSON-serializable objects. (e.g., Seleneo's theme, active draft ID). | Large (MBs to GBs). Complex objects, arrays, blobs. (e.g., Muse's track metadata, playlists, downloaded MP3s). | Request/Response pairs. Good for app shell, static assets, API responses. (e.g., Muse's JS/CSS, album art). |
Querying Needs | Need to search/filter/sort data by specific fields? | No. Only key-based retrieval. | Yes. Supports indexes for efficient querying, sorting, and range scans. (e.g., "Find all songs by artist X"). | No. Cache key (Request object) based retrieval. |
Performance (Access Mode) | Synchronous or Asynchronous access? Impact on main thread? | Synchronous. Can block main thread if data is large or operations are frequent. Fine for tiny, quick reads/writes. | Asynchronous. Non-blocking, uses Promises/callbacks. Essential for large data or frequent operations. | Asynchronous. Non-blocking. |
Offline Requirements | Does data need to be available offline? How critical is it? | Yes, for its specific use (UI persistence). But not for core app functionality if it were more complex. | Yes, critical for Muse's offline playback of downloaded music and library browsing. | Yes, primary purpose is to enable offline access to cached assets and API responses. |
Transactions | Need to perform multiple operations as a single atomic unit? | No. | Yes. Supports transactions for data integrity across multiple operations. | No direct transaction model like IndexedDB, but operations are atomic. |
Data Sync Needs | Does this data need to be synced with a server or across devices? | Not inherently. Seleneo's localStorage is device-specific. Sync would be a separate concern. | Often, yes. Muse syncs metadata (not blobs) via R2. IndexedDB is the local cache/source of truth. | Can cache API responses that are part of a sync strategy, but doesn't handle sync logic itself. |
Ease of Use | API complexity? Need for libraries? | Very simple API. | Complex native API. Libraries like Dexie.js are highly recommended. | Relatively straightforward Promise-based API. Workbox can simplify further. |
Security Context | What kind of data is it? What are the risks if exposed via XSS? | Avoid sensitive data. Risk of XSS exposure. (Seleneo: UI prefs - low risk). | Avoid highly sensitive unencrypted data. Risk of XSS exposure. (Muse: Track metadata - moderate risk if user data is detailed; audio blobs - IP risk). | Can store API responses containing user data. Same-origin policy applies. Risk of MIME sniffing if Content-Type is not set correctly. |
Many applications start with localStorage for basic persistence. As features grow and data needs become more complex (like Seleneo potentially adding offline design storage), they might graduate to IndexedDB. Muse likely started with simpler storage and evolved its sophisticated multi-layered approach. Don't over-engineer from day one, but be prepared to refactor and adopt more powerful tools when your requirements demand it.
Final note
Web storage isn’t about choosing the “best” API — it’s about knowing your constraints. The right answer depends on what you're building, who you're building it for, and whether you care if it works offline.
I care. I think we all should.
Whether you're building a cloud music player or a drag-and-drop design studio, you’ll eventually ask yourself:
Where should this data live?
Make sure you have an answer.
And if you don’t yet — you will.
Happy storing, happy building 💽