Extralite 3.0.0 Released
I’m pleased to announce the release of Extralite 3.0.0. This release marks the addition of object graph transforms, the transition to modern defaults for opening SQLite databases, and a minor security enhancement.
Extralite is a fast and innovative SQLite wrapper for Ruby with a rich set of features. It provides multiple ways of retrieving data from SQLite databases, makes it possible to use SQLite databases in multi-threaded and multi-fibered Ruby apps, and includes a comprehensive set of tools for managing SQLite databases.
Object Graph Transforms
In the last few years I’ve been transitioning from using an ORM (such as ActiveRecord) as a database abstraction to a more explicit, less magical approach where queries are expressed directly in SQL, and rows are represented using hashes and arrays. In my opinion, this approach has numerous advantages: less dependencies, less object allocations, better performance.
But something that was missing for me was the ability to take a query involving table joins, and generate an object graph that represents the relationships between the different entities returned in the result set. Suppose we have a database with two tables: posts and authors. Each post has a single author, i.e. a one-to-one relationship, and each author may have multiple posts, so a one-to-many relationship. To get a list of posts together with their authors, we could use the following query:
select posts.id, posts.title, posts.content, authors.id, authors.name
from posts
left join authors
on posts.author_id = authors.id
The result set will look something like the following:
| posts.id | posts.title | posts.content | authors.id | authors.name |
|---|---|---|---|---|
| 1 | “foo” | “…” | 1 | “Jim” |
| 2 | “bar” | “…” | 1 | “Jim” |
| 3 | “baz” | “…” | 2 | “Joe” |
Extralite will happily convert this result set into an array of hashes:
[
{ "posts.id" => 1, ..., "authors.id" => 1, ... },
{ "posts.id" => 2, ..., "authors.id" => 1, ... },
{ "posts.id" => 3, ..., "authors.id" => 2, ... },
...
]
While this contains all the data we asked for from the database, the data remains flat. Also, the fact that the same author information is repeated across multiple records is wasteful, and we have no way of knowing (other than checking the author id) that we’re dealing with the same author. For a many-to-many relationship, it would look even worse. Consider the following query:
select posts.id, posts.title, tags.id, tags.name
from posts
left join posts_tags
on posts.id = posts_tags.post_id
left join tags
on tags.id = posts_tags.tag_id
The result set will be returned by Extralite in the form:
[
{ "posts.id" => 1, ..., "tags.id" => 1, ... },
{ "posts.id" => 1, ..., "tags.id" => 2, ... },
{ "posts.id" => 2, ..., "tags.id" => 2, ... },
{ "posts.id" => 2, ..., "tags.id" => 3, ... },
...
]
Here, each post or tag may be repeated across multiple rows, which makes it more difficult for us to make sense of this information - we’ll need to continuously check the post and tag id in order to ensure we’re not processing the same post or tag multiple times. What if we had a tool that allowed us to make sense of such a result set without needing to do special processing. What if we had a way to get the result set in a form that reflects the relationships between the different entities?
Enter object graph transforms: the basic idea is that when we issue a query to an Extralite database instance, we provide, along with the SQL string, a transform specification that tells Extralite how to form the rows in the result set. Let’s look again at the posts-tags example above. We might express the object graph transform as follows:
# select posts.id, posts.title, tags.id, tags.name from ...
transform = Extralite::Transform.new do
{
id: integer.identity,
title: text,
tags: [{
id: integer.identity,
name: text
}]
}
end
With this transform we map the four columns we requested in our SQL query to nested entities. The transform DSL not only lets us express the form of the object graph, it also lets us express optional type coercions (since by default SQLite does not enforce column types) and which column is used to identify which entity in order to prevent duplicates. Will also enclose the tag entity in square brackets, in order to let Extralite know that a post may have multiple tags. The result set generated using the transform will look like:
[
{
id: 1,
title: ...,
tags: [
{ id: 1, ... },
{ id: 2, ... }
]
},
{
id: 2,
title: ...
tags: [
{ id: 2, ... },
{ id: 3, ... }
]
}
]
This object graph will be generated in such a way as to remove all duplication. The first two rows pertaining to post #1 have been merged into a single post entity with two tags. In addition, tag #2, which appears in the list of tags for both post #1 and post #2, is actually the same object.
Under the hood, Extralite converts the transform spec defined using the transform DSL into corresponding internal C structs in the form of a tree. When a query is ran using a transform, Extralite will allocate storage for identity maps for the different types of entites (i.e. posts and tags) which are used to store entities that have already been created. For each row in the result set, Extralite traverses the transform tree representation, reading data from each column as needed, and either skipping columns for already existing entities or creating new ones. And, we also get a nice way to tell SQLite to coerce values coming from the database into the types we need. Right now, Extralite supports the following types: integer, float, text, boolean, and JSON.
I believe this new feature can be a game changer for developing Ruby web apps
without using an ORM (understanding why ORM’s are problematic is left as an
excercise to the faithful reader). The transform DSL is simple, but lets you
express one-to-one, one-to-many and many-to-many relationships with ease. In the
future, I might add support for more types (for example, a Time type coercion)
and perhaps other features such as composite identity (based on two or more
columns) and composite columns in general (merging multiple columns into a
single one).
Modern Defaults for SQLite Databases
Extralite version 3.0.0 also marks the transition to opening databases with
modern defaults as regards concurrency and other settings. Namely, starting from
version 3.0.0, when opening an Extralite database, it will automatically be set
to use WAL journaling, with the synchronous pragma set to NORMAL, as well as
turning on foreign key constraints. This is a breaking change from previous
versions.
If you need to support older databases without these settings, you can open the
database in legacy mode:
db = Extralite::Database.new('/foo:bar', legacy: true)
Better Security for Loading Extensions
Another small but important change from previous versions is that the loading of
extensions using SQL queries, i.e. select load_extension('/foo/bar'), is not
allowed. Starting from version 3.0.0, the only way to load an extension is using
the built-in Database#load_extension method.
Conclusion
Extralite aims to be a comprehensive and performant solution to working with SQLite databases. For more information, please visit the Extralite repository. You can also consult the Extralite docs.