Open Collective

SirixDB

Create an evolutionary, immutable, accumulate-only database system

Contribute


Become a financial contributor.

Financial Contributions

Recurring contribution
Backer

Become a backer for $5.00 per month and support us

Starts at
$5 USD / month

Recurring contribution
Sponsor

Become a sponsor for $100.00 per month and support us

Starts at
$100 USD / month

Custom contribution
Donation
Make a custom one-time or recurring contribution.

Top financial contributors

Organizations

1
GitHub Sponsors

$20 USD since Nov 2022

Individuals

1
Johannes Lichtenberger

$20 USD since Dec 2021

SirixDB is all of us

Our contributors 3

Thank you for supporting SirixDB.

Budget


Transparent and open finances.

Credit from GitHub Sponsors to SirixDB

+$20.00 USD
Completed
Added funds #594314
+$20.00 USD
Completed
Contribution #503541

sirix.io domain

from Johannes Lichtenberger to SirixDB
-$111.00 USD
Approved
Reimbursement #57630
Today’s balance

$34.51 USD

Total raised

$34.51 USD

Total disbursed

--.-- USD

Estimated annual budget

--.-- USD

Connect


Let’s get the ball rolling!

Conversations

Let’s get the discussion going! This is a space for the community to converse, ask questions, say thank you, and get things done together.

Let us know what you think

Published on December 6, 2021 by Johannes Lichtenberger

Do you intend to use SirixDB? Do you have feature requests, or are you having a hard time setting SirixDB up or embedding the library in a JVM-based project?

About


SirixDB started as a fork of Treetank, a university project at the University of Konstanz. Dr. Marc Kramis began building the first version in 2006 with a few students for his Ph.D. thesis, supervised by Dr. Marcel Waldvogel of the Distributed Systems group at the University of Konstanz.

As the project was nearing its end, I (Johannes Lichtenberger, who began working on Treetank early on) forked it and kept working on SirixDB in my spare time. At first, I introduced diffing capabilities between revisions. I also created interactive visualizations of the differences between revisions for my master's thesis. Next, I began to work on a binding for Brackit(.org), a query engine based on XQuery and JSONiq that targets different storage engines and ships with common query optimizations. Custom optimizations for a storage backend can be added quickly through AST rewrites. After that, I built several JSON layers to store binary JSON data and used some ideas from the Adaptive Radix Tree (ART) and Hash Array Mapped Tries (HAMT) to reduce the storage cost of pages with a lot of null references.

Furthermore, I've built a path summary and several custom index structures (name/field indexes, path indexes, content-and-structure indexes, and value indexes). SirixDB stores these indexes in the leaf nodes of subtrees of the RevisionRootPage. Database pages that do not change between revisions are not copied; the new revision simply references them.
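To illustrate how unchanged pages can be referenced rather than copied, here is a minimal sketch in Java. It is not SirixDB's actual code: the classes `Page` and `Revision`, the integer page keys, and the flat map are hypothetical simplifications (SirixDB uses tries of pages, not a single map). It only shows the idea that a new revision holds references to the same page objects as its predecessor for everything that did not change.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable page; not a SirixDB class.
final class Page {
    final String content;

    Page(String content) {
        this.content = content;
    }
}

// Hypothetical revision: an immutable mapping from page key to page.
final class Revision {
    private final Map<Integer, Page> pages;

    Revision(Map<Integer, Page> pages) {
        this.pages = Map.copyOf(pages); // unmodifiable snapshot of the references
    }

    Page page(int key) {
        return pages.get(key);
    }

    // Creates the next revision. Only the references are copied;
    // pages that are not in 'changed' are shared with this revision.
    Revision commit(Map<Integer, Page> changed) {
        Map<Integer, Page> next = new HashMap<>(pages);
        next.putAll(changed);
        return new Revision(next);
    }
}
```

With this sketch, committing a change to one page leaves the older revision fully readable, and every untouched page is the very same object in both revisions.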

SirixDB allows the creation of several resources in a database (a collection of resources). It currently supports importing XML and JSON files and stores them in a huge persistent, durable tree. An UberPage is the root of the resource tree and thus the main entry point. Underneath, SirixDB indexes revisions in a trie. A RevisionRootPage denotes the entry point into a specific revision of a resource. A second file, the offset file, stores all revision timestamps together with each revision's offset into the main log file. This supports an in-memory binary search on the timestamps to fetch the matching RevisionRootPage from the log file. SirixDB only ever appends data. A commit issues a postorder traversal of the in-memory transaction log and writes page fragments to durable storage. In each trie, the last level of inner nodes references the leaf data pages, keeping at most a predefined number of page fragments per leaf page. A page fragment consists of the nodes changed in the current revision plus the nodes that fall out of a sliding window. Thus, SirixDB does not simply copy a whole page on write (COW) but stores batched, fine-grained writes. Modern byte-addressable durable memory may be the best fit in the future to support small random reads of the page fragments in parallel.
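The timestamp lookup described above can be sketched as follows. This is an illustration under assumptions, not SirixDB's implementation: the class `RevisionOffsets` and its methods are hypothetical, the offset file is assumed to be loaded into two parallel arrays indexed by revision number, and commit timestamps are assumed to be strictly ascending. The only library call used is `java.util.Arrays.binarySearch`.

```java
import java.util.Arrays;

// Hypothetical in-memory view of the offset file:
// one (commit timestamp, log-file offset) pair per revision.
final class RevisionOffsets {
    private final long[] commitTimestamps; // strictly ascending; index = revision number
    private final long[] logFileOffsets;   // offset of each RevisionRootPage in the log file

    RevisionOffsets(long[] commitTimestamps, long[] logFileOffsets) {
        if (commitTimestamps.length != logFileOffsets.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        this.commitTimestamps = commitTimestamps.clone();
        this.logFileOffsets = logFileOffsets.clone();
    }

    // Returns the number of the latest revision committed at or before
    // the given timestamp, or -1 if no such revision exists.
    int revisionAt(long timestamp) {
        int idx = Arrays.binarySearch(commitTimestamps, timestamp);
        if (idx >= 0) {
            return idx; // exact match
        }
        int insertionPoint = -idx - 1; // first index with a larger timestamp
        return insertionPoint - 1;
    }

    // Returns the log-file offset recorded for the given revision.
    long offsetOf(int revision) {
        return logFileOffsets[revision];
    }
}
```

A caller would first resolve a point in time to a revision number with `revisionAt` and then read the RevisionRootPage at `offsetOf(revision)`. Returning the latest revision at or before the timestamp is a design choice of this sketch; other policies, such as picking the closest revision, are equally possible.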

Since Hacktoberfest 2019, Moshe Uminer has worked mainly on REST API clients and a GUI web frontend for SirixDB, and he quickly became a core team member :-)

Our team