Jump to content

What's new in SQLite 3.7

+ 2
  jak's Photo
Posted Sep 07 2010 06:55 PM

SQLite 3.7 and Write-Ahead Logging


SQLite is a full-featured relational database that is designed to be embedded directly into an application or utility. It has a relatively small runtime footprint and stores entire database instances in a single file.

SQLite is available on all major desktop and server platforms, plus every major smartphone platform. It has also be ported to many embedded and hand-held platforms. Thanks to integration into products such as the Firefox web browser and Apple’s iPhone, it is estimated that SQLite is the most popular relational database product in the world, in terms of installed database instances.

The new O’Reilly book Using SQLite explains how to integrate the SQLite library into your own applications and projects. In addition to covering the SQLite API, Using SQLite also provides an introduction to the SQL language and the basic concepts of relational database design. Using SQLite is designed for the developer that has never had much of a need for relational databases before, but finds themselves searching for a tool to store and manage complex data sets.

Using SQLite covers version 3.6.23.1 of SQLite. This was the last official release in the 3.6 series. As Using SQLite was finishing up its final editing and going to press, the SQLite team officially released the first 3.7 build. SQLite 3.7.0 did not introduce a large number of new features, but it included one very significant feature: a new transaction model known as write-ahead logging. Write-ahead logging offers some significant performance and concurrency benefits, and is worth a more detailed look.

Transactions


Like most relational databases, SQLite supports transactions. Transactions allow the application to batch together a series of database modifications into a single unit of work. Either the whole transaction is applied to the database, or the whole transaction is not applied to the database. It should not be possible to apply only part of a transaction. SQLite will automatically wrap each SQL command in a stand-alone transaction, or the application can explicitly open a transaction and group together a series of SQL commands into a single transaction.

Transactions are a fundamental database feature. The classic example of a transaction is a balance transfer between two accounts. The transfer requires that a specific amount must be subtracted from one account record, and that same amount must be added to another account record. Both modifications must take place, resulting in a successful transfer, or both modifications must fail, resulting in a failed transfer. Any other outcome (such as subtracting from one account and failing to add anything to the second account) is bad news.

Applying a transaction to the database is known as a commit. Aborting an in-progress transaction and undoing any modifications is known as a rollback. The application may manually rollback a transaction if, for example, a business logic error is encountered. The database engine may also rollback an in-progress transaction if, for example, there is no disk space left, or if the application attempts to commit data that violates the constraint policies of the database.

Rollback Journal


Traditionally, SQLite uses a rollback journal to support transactions. SQLite database files are divided into fixed-size pages, similar to virtual-memory pages. When a page in the database file needs to be updated, the original content of the page is first copied from the database file to the rollback journal. This creates a backup copy that can, if necessary, be used to undo the modification and restore the original content of the page. The page in the database file can then be safely modified.

If the transaction (and any associated modifications) needs to be rolled back, the contents of the original, unmodified pages are copied from the rollback journal back to the database file, overwriting any changes. This returns the database file to the same state it was in when the transaction was started. To commit a transaction and make the modifications a permanent part of the database, the rollback journal can simply be deleted (or otherwise marked invalid). This eliminates any rollback possibility, making the modifications a permeant part of the database.

An application or system failure (such as an application crash or a power outage) in the middle of a transaction will leave both the partially modified database file and the rollback journal in place. If the SQLite library ever discovers a database in this state, it is assumed that the database file is inconsistent, and will rollback any modifications to a known state before proceeding. This rolls back any partial, uncommitted transaction.

Because uncommitted modifications are written directly to the database file, writers require exclusive access to the database. Once a writer indicates its intention to modify the database file, it blocks all new read and write operations. The writer must also wait for any existing read operators to finish before it can start to modify the database file. In this way, readers block writers and writers block readers. There can be only one writer at a time.

Although the use of a rollback journal limits concurrency, it has one major advantage: the whole system can be implemented using nothing but basic file locks to coordinate access between database connections. File locks are a fairly basic feature that most operating systems provide. This allows SQLite to provide concurrent access by multiple database connections in a fully transaction safe manor, even in relatively primitive environments.

Write Ahead Logging


SQLite 3.7 introduces a new transaction model known as write-ahead logging, or WAL. Like the rollback journal, write-ahead logging uses an auxiliary file known as the WAL file. The WAL file replaces the rollback journal.

WAL Writes

To make a database modification under WAL, the updated pages are simply written to the end of the WAL file. The modifications are committed by appending a special commit flag to the WAL file. A rollback is accomplished by truncating the file back to the last commit flag, removing any subsequent modifications. Future transactions can continue to append changes to the WAL file, possibly including newer updates to a page already found earlier in the WAL file.

Because database modifications are simply appended to the WAL file in sequential order, disk I/O tends to be more linear when using WAL. In general, this improves performance and reduces file fragmentation. Further, because write transactions require modification to only one file, fewer disk synchronizations are required to flush the data to physical media. Disk synchronizations are very expensive (in terms of time), so reducing the number of sync operations can provide significant improvements to performance and transaction rate. Most applications will see a 4x to 8x increase in write performance when using WAL.

WAL Reads

Although WAL simplifies the write process, the read process becomes a bit more complex. To read a page from the database, SQLite must first check the WAL file for an updated page. If the WAL file contains no updates, the page can be fetched from the database file.

Unlike the database file, the pages within the WAL file are not in any specific order. A WAL file may even contain multiple generations of the same page, the last of which may not be committed. Using the WAL file alone, each page fetch would require a full scan of the WAL file.

To prevent full scans of the WAL file, SQLite maintains a separate WAL index. The index is stored in a shared memory block that must be accessible by all active database connections. The WAL index is only active when at least one application is accessing the database file. The index allows a reader to quickly determine if the WAL file contains an updated version of the desired page. The reader can then fetch the page directly from the appropriate location in the WAL file, or from the original database file.

WAL Concurrency

When a read transaction starts, it notes the position of the last valid commit in the WAL file. This commit flag acts as a bounds marker. As the read transaction progresses, it will not fetch pages from beyond that commit marker, even if other connections have recorded (and possibly committed) newer data. This allows a read transaction to ignore newer data and ensures that the database state seen by a read transaction does not change, even if new modifications have been applied (this behavior is a requirement of the transaction model).

The ability to ignore newer data allows readers to be decoupled from writers. Under WAL, readers do not block writers and writers do not block readers. There can still be only one writer at a time, but the write transactions can run independently of the read transactions. This allows a significantly higher level of concurrency between database connections, as read transactions will almost never block.

WAL Checkpointing

Although the sequential writes tend to improve performance, the WAL file cannot be allowed to grow without bounds. Even with the shared index, lookup efficiency will drop as the WAL file gets larger and larger. Eventually the contents of the WAL file need to be merged back into the original database file. This is known as checkpointing.

The checkpointing process copies pages from the WAL file back into the database file. Under WAL, this is the only time the database file itself is modified. Checkpointing can happen concurrently with other read and write transactions, but if other transactions are active, they may prevent a checkpoint from running to completion. In this case a partial checkpoint is performed.

The checkpoint process copies updated pages from the WAL file to the database file. The checkpoint will start from wherever any previous partial checkpoint left off. If there hasn't been a partial checkpoint, the checkpoint will start from the beginning of the WAL file.

The copy process continues up to the first (oldest) commit marker that is still being used by a read transaction as a bounds marker. If no read transactions are active, it will run to the last commit marker in the WAL file. If there is additional data beyond this point, either committed or uncommitted, the position of the marker is noted and the partial checkpoint is closed. If the checkpoint is able to run to the end of the WAL file, it is considered to be a full checkpoint. After a full checkpoint the WAL file can be truncated or removed.

An application can explicitly checkpoint a database, or SQLite can be configured to automatically checkpoint a database when the WAL file exceeds a given size. A full, automatic checkpoint will also be performed when the last connection to a database file is closed. This will remove the WAL file.

By default, SQLite will perform an automatic checkpoint when the WAL file exceeds 1000 page modifications. If a full checkpoint cannot be performed, the WAL file will not be reduced in size. This will cause every subsequent commit to attempt a new checkpoint. The checkpoints will continue until a full checkpoint is able to run to completion and truncate the file below the trigger mark.

WAL Performance

Although the average commit time is significantly faster, any individual commit that triggers an automatic checkpoint is likely to take longer. This makes individual commit times a little less predictable.

For most applications, the average performance is more important than any specific transaction. These applications will generally be well serviced by the automatic checkpoint system. For those applications that have more specific timing requirements, the automatic system can be disabled, allowing the application full control over when the database is checkpointed.

However checkpointing is done, the application must strike a balance between read performance and write performance. Frequent checkpoints helps keep the WAL file (and the index file) smaller, which keeps read performance high. This comes with a cost, however. Frequent checkpointing also increases the average commit time, reducing overall write performance.

Less frequent checkpoints allow the WAL file to grow larger, keeping the average write time lower and increasing overall write performance. A larger WAL file decreases read performance, however.

For a given application, the best checkpoint strategy depends heavily on the relative number of reads and writes. Although a WAL file can significantly improve average write performance, it can actually decrease read performance. The read degradation is small enough (typically only 1% to 2%) that it only shows up in very read-dominated applications, but these types of applications may be better served with a rollback journal. That, or they should be aggressively checkpointed in an attempt to keep the WAL file smaller for the best read performance.

WAL Costs and Benefits

A WAL file enables better read performance and higher levels of concurrency, but there is a cost for these benefits. Compared to the rollback journal, the WAL system is more complex and demands more from the operating system. The design also imposes some inherent limitations on file placement and access.

To access a WAL-enabled database, database connections must be able to utilize the WAL index. This requires support for shared memory blocks that can be accessed and updated from different processes. Additionally, because every database connection requires access to the shared memory index, concurrent connections to a WAL-enabled database must all reside on the same computer. WAL is not designed to work over network file systems or shares.

WAL is also unable to provide full transaction protection when dealing with multiple databases. SQLite allows a single database connection to access multiple database files. When using a rollback journal, transactions that modify multiple databases are normally atomic across all of the databases. WAL is unable to support this functionality. Transactions that involve WAL-enabled databases will provide atomic transaction support for all the modifications made to each individual database, but atomic commits across multiple databases are not supported.

Databases using WAL cannot change their page size, even when performing a VACUUM. The page size can only change when the database is using a rollback journal. It is also not possible to open a WAL-enabled database from a read-only directory. Because every connection to a WAL database must participate in the maintenance of the shared index, every connection must have write privileges for the database files and the directory that contains them.

A WAL database also requires more files. In addition to the database file itself, there is also the WAL log file and the shared memory index, which is often implemented as a memory-mapped file. Although the additional files are normally cleaned up and deleted when all database connections are closed, the additional files tend to persist when a database connection is open and active. If a user deletes one of these auxiliary files, it can result in a corrupt database. This may make WAL less desirable when using SQLite as an application document file, or in other situations where the database file and directory are exposed to the end-user.

Finally, the addition of the checkpointing operation means more configuration and database administration tasks that must be managed by the application. Although significant performance improvements can be found by using WAL, careless management of the checkpoint process can lead to reduced performance. If the checkpoint process is somehow blocked, and a full checkpoint is unable to run, the WAL file and shared memory index file can continue to grow in size. In the extreme case, this can actually fill the available shared memory address space, resulting in an application crash.

As with most performance minded things, WAL is a compromise between increased performance and increased complexity. If you're application demands better performance, the added complexity and limitations of WAL may be well worth it. For applications with less strict performance demands, the simplicity and flexibility of the rollback journal may still be a better choice.

WAL Commands and APIs


WAL adds a number of new pragma options and C API functions. These interfaces can be used to enable and disable WAL, as well as manage the checkpoint process.

Enabling WAL

WAL is enabled using the journal_mode option of the PRAGMA SQL command:

PRAGMA journal_mode = wal;

If the system is able to successfully convert the database to WAL mode, the value 'wal' will be returned. If the database could not be converted, the pervious journal mode will be returned. Unlike other journal modes, a database file that is put in WAL mode will remain in WAL mode until it is explicitly set to another mode. The WAL mode will persist, even if the database is closed and re-opened. Also unlike other journal modes, being in WAL model is a function of the database, not the database connection. If a database has been put into WAL mode, all connections to the database must be using WAL mode or they cannot access the file.

WAL PRAGMAs

In addition to the new journal mode, WAL introduces two additional pragma options. The first pragma is used to configure the automatic checkpoint trigger:

PRAGMA wal_autocheckpoint;

PRAGMA wal_autocheckpoint = page_count;

The wal_autocheckpoint pragma gets or sets the automatic checkpoint trigger for all WAL databases associated with this database connection. By default, an automatic checkpoint will be done when the WAL file exceeds 1000 pages. The trigger level is a function of the database connection, and not the database files. Different database connections to the same WAL-enabled database file can have different trigger levels, but each database connection uses the same trigger for every WAL enabled database.

An application can also force an explicit checkpoint with the wal_checkpoint pragma:

PRAGMA wal_checkpoint;
PRAGMA database.wal_checkpoint;

The wal_checkpoint pragma forces an explicit checkpoint, regardless of the size of the WAL file. If the database is not in WAL mode, this pragma is silently ignored.

These two pragma commands allow basic control over the automatic checkpoint system from the SQL environment

WAL APIs


The WAL pragma options are fairly simple wrappers around some of the new WAL APIs. In total, WAL introduces three new C API calls that are used to control the checkpoint functionality. There is no API call to enable WAL mode. That must be done with the PRAGMA journal_mode = wal SQL command.

The first API call simply performs a checkpoint:

int sqlite3_wal_checkpoint( sqlite3 *db, const char *db_name );

  • db: A database connection.
  • db_name: A logical database name, given in UTF-8.
  • Returns: An SQLite result code.

This function causes a checkpoint to be performed, regardless of the current WAL file size. This can only be called when the calling database connection is in auto-commit mode, with no current transaction open. If the given database is not in WAL mode, nothing happens.

The automatic trigger level can also be set with the C API:

int sqlite3_wal_autocheckpoint( sqlite3 *db, int page_count );

  • db: A database connection.
  • page_count: The new trigger level.
  • Returns: An SQLite result code.

This call sets the automatic checkpoint level for this database connection. This trigger is used by for all WAL-enabled databases associated with the given database connection. Passing a trigger value of zero or less will disable automatic checkpointing.

For more fine-grained control, an application can provide its own callback. Once registered, this callback will be called any time this database connection successfully commits a transaction involving a WAL enabled database.

void* sqlite3_wal_hook( sqlite3 *db, wal_callback, void *udp );

int wal_callback( void *udp, sqlite3 *db, const char *db_name, int page_count );

  • db: A database connection.
  • wal_callback: An application defined callback function.
  • udp: An application defined user data pointer. This value is made available to the WAL callback.
  • db_name: The logical name of a database, given in UTF-8.
  • page_count: The current size of the WAL file, in pages.
  • Returns (sqlite3_wal_hook): Any previously registered udp value.
  • Returns (wal_callback): An SQLite result code.

This function is used to register a WAL commit callback. The callback will be called any time a successful commit happens on a WAL-enabled database. The callback is free to call sqlite3_wal_checkpoint(). Each database connection has only one callback, which is applied to all of the attached databases.

Be aware that sqlite3_wal_autocheckpoint() API and the wal_autocheckpoint pragma command both use sqlite3_wal_hook() to register an internal WAL callback. Any call to sqlite3_wal_hook() will disable automatic checkpointing. Setting an automatic checkpoint trigger level will also over-write any application registered WAL callback. If an application registers a WAL callback, it is fully responsible for checkpointing the WAL files.

Summary


SQLite 3.7 introduces a new transaction model known as write-ahead logging, or WAL. WAL offers considerably better average write performance, along with better concurrency between database connections. Although WAL continues to serialize writes, multiple readers and a writer can all simultaneously access the database.

To enable WAL, the SQLite file driver must support some type of shared memory system. WAL can only be used when all of the database users reside on the same host system, so that they can all share the same memory block. Additional restrictions exist. Currently, the SQLite distribution supports WAL on desktop versions of Microsoft Windows and Unix based operating systems, including Linux and Mac OS X.

WAL introduces the concept of checkpointing. The checkpoint procedure brings the WAL file and the database file back into sync. By default, SQLite will automatically checkpoint the WAL file anytime it exceeds 1000 pages in length. An application can set a different trigger value, or the application can manually checkpoint the database. The database is always automatically checkpointed when the last database connection is closed.

Overall, WAL mode provides the developer with more options. WAL is ordinarily much faster, but it has a fair number of limitations that may make it undesirable for some types of applications. If your main requirement is raw performance, however, it is a great option to check out.

For more information on WAL, see the official documentation.

Cover of Using SQLite
Learn more about this topic from Using SQLite. 

You can build database-backed applications for the desktop, Web, embedded systems, or operating systems without linking to heavy-duty client-server databases such as Oracle and MySQL. This book shows you how to use SQLite, a small and lightweight relational database engine that you can build directly into your application. With SQLite, you'll discover how to develop a database-backed application that remains manageable in size and complexity. Using SQLite guides you every step of the way.

Learn More Read Now on Safari


Tags:
0 Subscribe


0 Replies