Client side SQLite with remote file
This is a follow-up to the earlier SQLite example. The app we built previously relies on downloading the whole database file in advance, which imposes a practical limit on how large the database can be. We are going to overcome that limit with the help of HTTP range requests.
We start out by making a copy of the original example. The first change we need to make is moving the data retrieval code to a worker thread. Because data requested by SQLite will no longer be available immediately, the code must wait for it, and blocking waits are not permitted in the main thread.
Open vite.config.js and add the multithreaded option to rollup-plugin-zigar:
zigar({ optimize: 'ReleaseSmall', multithreaded: true }),
Add the following HTTP headers to the dev server options to enable shared memory:
server: {
headers: {
'Cross-Origin-Opener-Policy': 'same-origin',
'Cross-Origin-Embedder-Policy': 'require-corp',
}
},
You'll need to do the same when you deploy the app. If your hosting provider does not let you add HTTP headers (e.g. GitHub Pages), then you're out of luck.
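For hosts that do support custom headers, the configuration is usually a small static file. As a hypothetical example, on a host that uses a Netlify-style `_headers` file, it might look like this (the path pattern is an assumption; check your provider's documentation):

```
/*
  Cross-Origin-Opener-Policy: same-origin
  Cross-Origin-Embedder-Policy: require-corp
```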
If the database's extension had been changed to .txt, rename it back to .db and restore the assetsInclude setting:
assetsInclude: [ '**/*.db' ],
As of Zig 0.14.1, the standard library still does not properly support multithreading in WebAssembly. It needs to be patched in multiple places. If you haven't done so already, run the following command to apply the patch file:
npx rollup-plugin-zigar patch
Rename sqlite.zig to worker.zig. Open it and remove the @"meta(zigar)" declaration (it won't be used since the file is no longer the root).
Create a new sqlite.zig:
const std = @import("std");
const wasm_allocator = std.heap.wasm_allocator;
const zigar = @import("zigar");
const worker = @import("./worker.zig");
var work_queue: zigar.thread.WorkQueue(worker) = .{};
pub fn startup() !void {
try work_queue.init(.{
.allocator = wasm_allocator,
.n_jobs = 1,
});
}
pub fn shutdown(promise: zigar.function.Promise(void)) void {
work_queue.deinitAsync(promise);
}
pub const openDb = work_queue.promisify(worker.openDb);
pub const closeDb = work_queue.promisify(worker.closeDb);
pub const findAlbums = work_queue.promisify(worker.findAlbums);
pub const getTracks = work_queue.promisify(worker.getTracks);
pub const @"meta(zigar)" = struct {
pub fn isFieldString(comptime T: type, comptime _: std.meta.FieldEnum(T)) bool {
return true;
}
pub fn isDeclPlain(comptime T: type, comptime _: std.meta.DeclEnum(T)) bool {
return true;
}
};
Now our functions return promises instead of the actual results. We need to update our JavaScript to reflect that. Open src/App.jsx, add await in front of the call to openDb(), and replace the call to bytes() with blob():
let data
const dataPromise = (async () => {
const resp = await fetch(chinook)
data = await resp.blob()
await openDb('/db.sqlite3')
})()
We're switching from a Uint8Array to a Blob in order to test async file access. To read part of the blob, Zigar has to wait for the promise returned by its arrayBuffer() method to be fulfilled.
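As a standalone illustration (not part of the app's code), this is roughly what reading a byte range from a Blob looks like, and why it forces an asynchronous wait:

```javascript
// Sketch: reading a byte range from a Blob is inherently asynchronous,
// which is why SQLite's reads must run off the main thread.
async function readRange(blob, offset, len) {
  // slice() is cheap; the actual copy happens in arrayBuffer(),
  // which returns a promise that must be awaited
  const buffer = await blob.slice(offset, offset + len).arrayBuffer()
  return new Uint8Array(buffer)
}

const blob = new Blob([new Uint8Array([10, 20, 30, 40, 50])])
const chunk = await readRange(blob, 1, 3) // bytes 20, 30, 40
```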
In the useEffect handlers, change the call to findAlbums():
findAlbums(deferredSearchString || '%').then(setAlbums)
And getTracks():
getTracks(selectedAlbumId).then(setTracks)
Finally, add a call to startup() at the top:
startup()
Now the app is ready to run again:
npm run dev
It should work as before.
In the original app, we had to make sure that no query is performed until the database is downloaded. This made the start-up sequence a bit awkward. Now that SQLite is capable of waiting, we can clean up our code a bit.
We can get rid of the state variable ready. dataPromise is also no longer needed. We can make our open listener async instead:
__zigar.on('open', async (evt) => {
if (evt.path.endsWith('.sqlite3')) {
if (!data) {
const resp = await fetch(chinook)
data = await resp.blob()
}
return data
} else {
return false
}
})
We can move the call to openDb() into a useEffect hook:
useEffect(() => {
openDb('/db.sqlite3')
return () => closeDb()
}, [])
The original plan for this tutorial was to use an existing npmjs package to handle the data retrieval. That turned out to be unworkable due to a check for the Accept-Ranges header, which is frequently missing even when a server is capable of handling range requests.
So we have to roll our own solution. It's simple enough: instead of mimicking a blob, we're going to create a class that implements Zigar's [file interface](File interface). Create a new file web-file.js and paste in the following:
export class WebFile {
static async create(url) {
const resp = await fetch(url, { method: 'HEAD' });
const len = resp.headers.get("content-length");
if (!len) {
throw new Error('Missing Content-Length header');
}
const self = new WebFile();
self.url = url;
self.size = parseInt(len);
self.pos = 0;
return self;
}
async pread(len, offset) {
const end = offset + len;
const resp = await fetch(this.url, {
headers: {
Range: `bytes=${offset}-${end - 1}`,
},
});
if (!resp.headers.get('content-range')) {
throw new Error('Missing range header');
}
const buffer = await resp.arrayBuffer();
return new Uint8Array(buffer);
}
async read(len) {
const chunk = await this.pread(len, this.pos);
this.pos += chunk.length;
return chunk;
}
tell() {
return this.pos;
}
seek(offset, whence) {
let pos = -1;
switch (whence) {
case 0: pos = offset; break;
case 1: pos = this.pos + offset; break;
case 2: pos = this.size + offset; break;
}
if (!(pos >= 0 && pos <= this.size)) throw new Error('Invalid argument');
this.pos = pos;
return pos;
}
}
The code above is largely self-explanatory, I believe. We first make a HEAD request to the server to obtain the file's size. As pread calls come in, they're translated into Fetch API calls with a Range header.
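The seek() method follows the usual POSIX whence convention (0 = from the start, 1 = from the current position, 2 = from the end). A standalone sketch of the same arithmetic:

```javascript
// Illustration of the whence convention used by WebFile.seek():
// 0 = SEEK_SET (absolute), 1 = SEEK_CUR (relative), 2 = SEEK_END (from end)
function resolveSeek(pos, size, offset, whence) {
  let next = -1
  switch (whence) {
    case 0: next = offset; break        // from start of file
    case 1: next = pos + offset; break  // from current position
    case 2: next = size + offset; break // from end of file
  }
  if (!(next >= 0 && next <= size)) throw new Error('Invalid argument')
  return next
}

// e.g. with a 100-byte file positioned at byte 40:
resolveSeek(40, 100, 10, 1) // → 50
resolveSeek(40, 100, -4, 2) // → 96
```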
Import the class in App.jsx and use it in the open listener:
__zigar.on('open', async (evt) => {
if (evt.path.endsWith('.sqlite3')) {
return data ??= WebFile.create(chinook)
} else {
return false
}
})
Return to the browser. Open the development console and switch to the Network tab. When you reload the page, you should see a large number of small requests:

Retrieving the file in such small chunks is obviously inefficient. Luckily, we can easily fix this by increasing the database's page size.
Install DB Browser for SQLite if you don't have it on your computer
already. Open src/assets/chinook.db and switch to the Edit Pragmas tab. You'll see that the
page size is only 1K. Set it to 16K and press Save:

Or you can use the CLI tool sqlite3 to run the following commands:
PRAGMA page_size = 16384;
VACUUM;
After the change, the number of requests drops significantly:
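To see why the page size matters, consider the rough arithmetic. SQLite fetches one page per read, so the worst-case request count scales inversely with the page size (the file size below is illustrative):

```javascript
// Rough, illustrative arithmetic: one HTTP range request per page read,
// so the worst-case request count is the number of pages in the file.
function worstCaseRequests(fileSize, pageSize) {
  return Math.ceil(fileSize / pageSize)
}

const fileSize = 1_261_568 // illustrative database size, in bytes
worstCaseRequests(fileSize, 1024)  // 1 KiB pages → 1232 requests
worstCaseRequests(fileSize, 16384) // 16 KiB pages → 77 requests
```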

If you examine the network traffic, you'll notice that SQLite regularly reads a small region near the beginning of the file:

So if we cache this part of the file, we can eliminate these superfluous requests. Instead of making a HEAD request initially, we're going to perform a partial GET:
static async create(url, prefetch = 16384) {
const resp = await fetch(url, {
headers: {
Range: `bytes=0-${prefetch - 1}`,
},
});
const data = await resp.bytes();
let size;
if (resp.status === 206) {
const range = resp.headers.get("content-range");
size = parseInt(range.slice(range.indexOf('/') + 1)); // e.g. "bytes 0-16383/1261568"
} else {
size = data.length;
}
const self = new WebFile();
self.url = url;
self.size = size;
self.pos = 0;
self.cache = data;
return self;
}
Then in pread(), we check if the call can be satisfied with the data that was fetched:
async pread(len, offset) {
const end = offset + len;
if (end <= this.cache.length) {
return this.cache.slice(offset, end);
}
// ...
}
As a bonus, our code can now handle the situation where the server chooses to send the whole file (status code 200 instead of 206).
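The cache check itself is simple: a request is served locally only when the whole requested range lies inside the prefetched data. A standalone sketch of that decision:

```javascript
// Sketch of the cache check added to pread(): serve from the prefetched
// bytes only if the entire range [offset, offset + len) fits inside them;
// otherwise fall through to a network request.
function servedFromCache(cacheLength, offset, len) {
  return offset + len <= cacheLength
}

// with a 16384-byte prefetch:
servedFromCache(16384, 0, 4096)     // true: entirely within the cache
servedFromCache(16384, 16000, 1024) // false: overlaps the end, goes to network
```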
One mistake we made in the original example was neglecting to list our Zig functions as dependencies of our useEffect hooks. Without them, the hooks aren't rerun when these functions change. We'll fix that now:
useEffect(() => {
openDb('/db.sqlite3')
return () => closeDb()
}, [ openDb, closeDb ])
useEffect(() => {
findAlbums(deferredSearchString || '%').then(setAlbums)
}, [ deferredSearchString, findAlbums ])
useEffect(() => {
if (selectedAlbumId !== undefined) {
getTracks(selectedAlbumId).then(setTracks)
} else {
setTracks([])
}
}, [ selectedAlbumId, getTracks ])
The Chinook database is pretty small. Let us now try a larger sample file. Download northwind.db from this GitHub project. Move it into src/assets and change its page size to 64K. The resulting file should be around 25 MB.
Rework the app so that it uses the following queries:
SELECT a.CustomerID, a.CompanyName, a.Region
FROM Customers a
WHERE a.CompanyName LIKE '%' || ? || '%'
ORDER BY a.CompanyName
SELECT a.OrderID, a.OrderDate, SUM(b.UnitPrice * b.Quantity * (1 - b.Discount)) as Total FROM Orders a
INNER JOIN "Order Details" b ON a.OrderID = b.OrderID
WHERE CustomerID = ?
GROUP BY a.OrderDate
One searches for companies by name. The other returns the orders from a given company, calculating the total of each by summing the prices of the products involved.
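The per-order total computed by the second query follows the usual Northwind formula: each line item contributes UnitPrice * Quantity * (1 - Discount). The same arithmetic in JavaScript, with hypothetical line items:

```javascript
// Illustration of the SUM() expression from the second query:
// each line item contributes unitPrice * quantity * (1 - discount)
function orderTotal(details) {
  return details.reduce(
    (sum, d) => sum + d.unitPrice * d.quantity * (1 - d.discount), 0)
}

// hypothetical line items for one order:
orderTotal([
  { unitPrice: 10, quantity: 4, discount: 0 },    // contributes 40
  { unitPrice: 20, quantity: 2, discount: 0.25 }, // contributes 30
]) // → 70
```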
Once you have everything working again, you'll discover that the app is essentially unusable. The second query is simply too slow. If we look at the database, we see one potential problem: the total lack of indices:

So we ask SQLite itself what indices it wants:
$ sqlite3 northwind.db
SQLite version 3.37.2 2022-01-06 13:25:41
Enter ".help" for usage hints.
sqlite> .expert
sqlite> SELECT a.OrderID, a.OrderDate, SUM(b.UnitPrice * b.Quantity * (1 - b.Discount)) as Total FROM Orders a
INNER JOIN "Order Details" b ON a.OrderID = b.OrderID
WHERE CustomerID = ?
GROUP BY a.OrderDate;
CREATE INDEX Orders_idx_0a898c2d ON Orders(CustomerID, OrderDate);
SEARCH a USING COVERING INDEX Orders_idx_0a898c2d (CustomerID=?)
SEARCH b USING INDEX sqlite_autoindex_Order Details_1 (OrderID=?)
sqlite> .expert
sqlite> SELECT a.CustomerID, a.CompanyName, a.Region
FROM Customers a
WHERE a.CompanyName LIKE '%' || ? || '%'
ORDER BY a.CompanyName;
CREATE INDEX Customers_idx_a2d7b248 ON Customers(CompanyName);
SCAN a USING INDEX Customers_idx_a2d7b248
After adding the indices, performance seems to improve a little. The second query still takes an agonizingly long time to run though. Here SQLite is aggregating thousands of rows. It simply must read nearly all parts of the file. The only way to cut execution times down to a tolerable level is to limit the rows returned to something like 5.
You can see the web app in action
here and
here. For some reason Cloudflare's servers refuse to honor range requests for files with a .db extension. That's why the databases were given a .jpg extension instead.
You can find the complete source code for this example here and here.
Using HTTP range requests to handle a database on the client side seems to be a viable solution. Performance is okay as long as we avoid queries that aggregate information from many rows. For something like a CMS or a product catalog, it's adequate.
One feature of SQLite that can be very useful for a web app is full-text search. We'll explore its use in a future example.