Dolt – data version control

Dolt is an embedded, single-node DBMS that incorporates Git-style versioning as a first-class entity. It behaves much like Git: you can think of it as a content-addressable local database whose main objects are tables rather than files.

How to use Dolt?

A developer using Dolt creates a local repository. The repository contains tables, which can be read and updated with SQL. As in Git, writes are staged until the user issues a commit. Once the commit is done, the writes are appended to permanent storage. Branch and merge semantics are also supported, so tables can evolve at different paces for different users. This allows loose collaboration on data, as well as multiple views of the same core data.
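The stage-then-commit flow described above can be pictured with a small toy model. This is only a sketch of the idea, not how Dolt is implemented: the class `ToyRepo` and its methods are hypothetical names made up for illustration, and real Dolt tables are changed through SQL rather than Python calls.

```python
# Toy model only: not Dolt's implementation. All names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    committed: dict = field(default_factory=dict)  # primary key -> row, permanent
    staged: dict = field(default_factory=dict)     # pending writes, not yet permanent
    log: list = field(default_factory=list)        # append-only commit history

    def write(self, key, row):
        """Stage a write; committed data is untouched until commit()."""
        self.staged[key] = row

    def read(self, key):
        """Reads see staged writes first, then committed data."""
        if key in self.staged:
            return self.staged[key]
        return self.committed.get(key)

    def commit(self, message):
        """Append the staged writes to the history and clear the stage."""
        self.log.append((message, dict(self.staged)))
        self.committed.update(self.staged)
        self.staged.clear()


repo = ToyRepo()
repo.write(1, {"name": "alice"})
print(repo.committed)  # {} : nothing is permanent before the commit
repo.commit("add alice")
print(repo.committed)  # {1: {'name': 'alice'}}
```

The point of the split is that a write only becomes part of the permanent history at the moment of the commit, which is the same idea Git applies to files.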

Merge conflicts are detected for both schema conflicts and data conflicts. Data conflicts are cell-based rather than line-based. Remote repositories allow cooperation among different repository instances, and clone, push, and pull semantics are all available to developers.
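To see why cell-based conflict detection matters, here is a minimal sketch of a three-way merge applied to a single row, column by column. It is an illustration under assumptions, not Dolt's merge code: the function `merge_rows` is a hypothetical name, and the sketch assumes all three versions of the row share the same columns.

```python
def merge_rows(base, ours, theirs):
    """Three-way merge of one row, one cell at a time.

    Returns (merged_row, conflicts). A cell conflicts only when both
    sides changed it, and changed it to different values.
    Assumes all three rows have the same columns (same schema).
    """
    merged, conflicts = {}, []
    for col in base:
        b, o, t = base[col], ours[col], theirs[col]
        if o == t:        # both sides agree (or neither changed the cell)
            merged[col] = o
        elif o == b:      # only their side changed this cell
            merged[col] = t
        elif t == b:      # only our side changed this cell
            merged[col] = o
        else:             # both changed it differently: a real conflict
            merged[col] = b
            conflicts.append(col)
    return merged, conflicts


base = {"id": 1, "name": "alice", "city": "Oslo"}
ours = {"id": 1, "name": "Alice", "city": "Oslo"}      # we fixed the name
theirs = {"id": 1, "name": "alice", "city": "Bergen"}  # they changed the city

merged, conflicts = merge_rows(base, ours, theirs)
print(merged)     # {'id': 1, 'name': 'Alice', 'city': 'Bergen'}
print(conflicts)  # []
```

A line-based tool would treat this row as one line edited on both sides and report a conflict. Comparing cell by cell, the two edits touch different columns and merge cleanly.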

Features of Dolt

Before you start using Dolt, it helps to have a clear understanding of the features that come with it. Here are the main features that Dolt offers as of now.

  • Checkpoints

Dolt has no mutable database files, so it does not take explicit checkpoints. Instead, Dolt keeps a manifest that stores pointers to the table files that are currently active. The manifest is updated automatically as the database is mutated.

  • Compression 

Dolt uses Snappy, an open-source compression library that favors speed over compressed size. All chunks are compressed with Snappy before they are stored, and they are decompressed when they are read back into the block cache. All data is decompressed before queries process it.
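The compress-on-write, decompress-on-read pattern can be sketched in a few lines. Snappy's Python bindings are a third-party package, so this sketch swaps in `zlib` from the standard library at its fastest level purely as a stand-in. The `store` and `block_cache` dictionaries and both function names are hypothetical and only model the idea.

```python
import zlib

store = {}        # chunk id -> compressed bytes (stands in for disk)
block_cache = {}  # chunk id -> decompressed bytes (stands in for the block cache)


def put_chunk(chunk_id, data):
    """Compress a chunk before storing it. zlib level 1 favors speed, standing in for Snappy."""
    store[chunk_id] = zlib.compress(data, 1)


def get_chunk(chunk_id):
    """Decompress a chunk the first time it is read; later reads hit the cache."""
    if chunk_id not in block_cache:
        block_cache[chunk_id] = zlib.decompress(store[chunk_id])
    return block_cache[chunk_id]


data = b"id,name\n" + b"1,alice\n" * 500
put_chunk("chunk-1", data)
print(len(data), len(store["chunk-1"]))  # original size vs. stored size
print(get_chunk("chunk-1") == data)      # True
```

Anything that processes the data only ever sees what `get_chunk` returns, so it always works on decompressed bytes.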

  • Concurrency control 

Dolt does not support transactions. Concurrent SQL sessions within the same Dolt instance read committed data. An auto-commit also takes place after each SQL statement that is executed.

  • Data model and foreign keys 

Dolt emulates MySQL as its data model, so developers who are familiar with MySQL can get their work done with Dolt right away. Likewise, Dolt emulates the foreign key support that MySQL provides.

  • Indexes

Dolt supports a B-tree-like index structure for the primary keys of a table. It also supports configurable single-column or multi-column secondary indexes that use the same data structure.
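The relationship between a primary-key index and secondary indexes can be sketched with a sorted list searched by bisection. This is a stand-in for a B-tree, not a B-tree itself, and nothing here reflects Dolt's internals. The class `SortedIndex` and the sample table are hypothetical.

```python
import bisect


class SortedIndex:
    """Sorted (key, value) entries searched by bisection; a stand-in for a B-tree."""

    def __init__(self):
        self._entries = []  # kept sorted at all times

    def insert(self, key, value):
        bisect.insort(self._entries, (key, value))

    def lookup(self, key):
        """Return every value stored under key, in sorted order."""
        i = bisect.bisect_left(self._entries, (key,))
        found = []
        while i < len(self._entries) and self._entries[i][0] == key:
            found.append(self._entries[i][1])
            i += 1
        return found


rows = [
    (2, {"name": "bob", "city": "Oslo"}),
    (1, {"name": "alice", "city": "Oslo"}),
    (3, {"name": "carol", "city": "Bergen"}),
]

primary = SortedIndex()       # primary key -> row
by_city = SortedIndex()       # single-column secondary index -> primary key
by_city_name = SortedIndex()  # multi-column secondary index -> primary key

for pk, row in rows:
    primary.insert(pk, row)
    by_city.insert(row["city"], pk)
    by_city_name.insert((row["city"], row["name"]), pk)

print(by_city.lookup("Oslo"))                # [1, 2]
print(by_city_name.lookup(("Oslo", "bob")))  # [2]
print(primary.lookup(2))                     # [{'name': 'bob', 'city': 'Oslo'}]
```

The secondary indexes store primary keys rather than rows, so a lookup by city finds the keys first and then fetches the rows through the primary index.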

  • Isolation levels 

Dolt does not support transactions. This is a consequence of the Dolt model: concurrent edits take place in separate branches, and those changes are eventually merged through explicit user actions.

  • Logging 

Dolt does not support logging as of now. It still ensures the durability of data: new data is written as new table files containing the new chunks, and every table file is written to disk before the manifest that references it is updated.
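The write ordering described above (table file first, manifest second) can be sketched as follows. This is a minimal illustration under assumptions, not Dolt's storage code: the JSON manifest format, the file names, and both function names are hypothetical, and the rename via `os.replace` is this sketch's own choice for swapping in the new manifest in a single step.

```python
import json
import os
import tempfile


def write_durably(path, data):
    """Write bytes to path and force them to disk before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def add_table_file(repo_dir, name, chunks):
    """Write a new table file first, then publish it through the manifest."""
    # Step 1: the new table file reaches disk before anything points at it.
    write_durably(os.path.join(repo_dir, name), b"".join(chunks))

    # Step 2: only now update the manifest to reference the new file.
    manifest_path = os.path.join(repo_dir, "manifest.json")  # hypothetical format
    active = []
    if os.path.exists(manifest_path):
        with open(manifest_path) as f:
            active = json.load(f)
    active.append(name)
    tmp_path = manifest_path + ".tmp"
    write_durably(tmp_path, json.dumps(active).encode())
    os.replace(tmp_path, manifest_path)  # swap in the new manifest in one step
    return active


with tempfile.TemporaryDirectory() as repo_dir:
    add_table_file(repo_dir, "table-0001", [b"chunk-a", b"chunk-b"])
    print(add_table_file(repo_dir, "table-0002", [b"chunk-c"]))
    # ['table-0001', 'table-0002']
```

If the process stops between the two steps, the manifest still lists only files that are fully on disk. The half-published table file is simply never referenced, which is why this ordering can provide durability without a log.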

  • Execution of queries

Dolt supports iterator-based query execution, without any intra-query parallelism.
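Iterator-based execution means that each operator in a query plan pulls rows one at a time from the operator below it. Python generators make the idea easy to show. The operators `scan`, `select`, and `project` below are hypothetical names for this sketch and are not taken from Dolt's engine.

```python
def scan(table):
    """Leaf operator: yield the rows of a table one by one."""
    for row in table:
        yield row


def select(child, predicate):
    """Filter operator: pass on only the rows that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row


def project(child, columns):
    """Projection operator: keep only the requested columns."""
    for row in child:
        yield {col: row[col] for col in columns}


people = [
    {"id": 1, "name": "alice", "age": 34},
    {"id": 2, "name": "bob", "age": 27},
    {"id": 3, "name": "carol", "age": 41},
]

# Roughly: SELECT name FROM people WHERE age > 30
plan = project(select(scan(people), lambda r: r["age"] > 30), ["name"])
print(list(plan))  # [{'name': 'alice'}, {'name': 'carol'}]
```

Each row flows through the whole chain before the next one is requested, all in a single thread, which matches the lack of intra-query parallelism.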

  • Query interface

Dolt offers a command-line interface that closely resembles Git's. SQL queries can be run from the command line or over the MySQL wire protocol. Dolt can also import and export data as CSV files.

  • Storage model and storage architecture

Dolt uses a disk-based storage architecture and can store datasets of up to around 100GB. It uses the N-ary storage model with clustered primary keys. The complete database is content-addressed, much like a Merkle tree. A Merkle tree is a hash-based data structure that generalizes the hash list.
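Content addressing and the Merkle tree idea can be sketched in a few lines: every chunk is identified by the hash of its content, and parent nodes are identified by the hash of their children's addresses. The choice of SHA-256 and the pairwise tree shape are assumptions of this sketch, as are the function names. The text above does not say which hash or tree layout Dolt uses.

```python
import hashlib


def chunk_address(data):
    """The address of a chunk is the hash of its content (SHA-256 assumed here)."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(chunks):
    """Hash the leaves, then hash pairs of addresses upward until one root remains."""
    if not chunks:
        raise ValueError("need at least one chunk")
    level = [chunk_address(c) for c in chunks]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]  # the last node of an odd level is hashed alone
            parents.append(chunk_address("".join(pair).encode()))
        level = parents
    return level[0]


v1 = [b"row 1", b"row 2", b"row 3"]
v2 = [b"row 1", b"row 2 (edited)", b"row 3"]

print(merkle_root(v1) == merkle_root(list(v1)))          # True: same content, same root
print(merkle_root(v1) == merkle_root(v2))                # False: one edit changes the root
print(chunk_address(v1[0]) == chunk_address(v2[0]))      # True: unchanged chunks keep their address
```

Because unchanged chunks keep the same address, two versions of a database can share most of their storage, and comparing two root hashes is enough to tell whether anything differs.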

  • System Architecture 

Dolt is not distributed at the system level. Instead, it is designed to distribute the same database to multiple locations. Each copy can be worked on in isolation, and edits that originate in one location can be pushed to or pulled into another.

Final words

Now you have a clear understanding of what Dolt is all about. If this database management system fits your needs, go ahead and try it, and experience its unique features for yourself.
