MongoDB Multi-Statement Transactions? Yes We Can!

Posted On April 3, 2013 | By Zardosht Kasheff | 6 comments

Earlier, I talked about the transactional semantics we are introducing to MongoDB. As I hinted at the end of the post, we are actually doing more. We are introducing multi-statement transactions. That’s right, multiple queries, updates, deletes, and inserts will be able to run inside of a single transaction. We are working on the details of the semantics as we develop our beta, but at a high level, think of it as having the same semantics as TokuDB and InnoDB’s multi-statement transactions in MySQL.

So how will it work? We introduce three new commands:

db.runCommand({"beginTransaction": 1, "isolation": "mvcc"})

This begins a transaction with the isolation level of MVCC, which means queries will use a snapshot of the system. This is essentially the same as “repeatable-read” in MySQL. Isolation levels of “serializable” and “readUncommitted” will also be supported.

db.runCommand("commitTransaction")
db.runCommand("rollbackTransaction")

These commands either commit or roll back the current transaction.
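As a rough sketch of how the three commands might be combined, the helper below begins a transaction, runs a callback, and commits, rolling back if the callback throws. The name `runInTransaction` is hypothetical and not part of the shell or of any driver. The sketch assumes that `db` is a shell-style handle bound to a single connection, that `runCommand` is synchronous and returns a document with an `ok` field, and that the begin command takes the usual `{"beginTransaction": 1, ...}` document form. Since the semantics are still being worked out, treat it as an illustration only.

```javascript
// Hypothetical helper; not part of the mongo shell or any driver.
// Assumes db.runCommand is synchronous and returns { ok: 1 } on success.
function runInTransaction(db, isolation, work) {
  var begun = db.runCommand({ beginTransaction: 1, isolation: isolation });
  if (!begun || !begun.ok) {
    throw new Error("beginTransaction failed: " + JSON.stringify(begun));
  }

  var result;
  try {
    // Every statement issued by the callback runs inside the transaction.
    result = work(db);
  } catch (err) {
    // Undo everything done since beginTransaction, then re-raise.
    db.runCommand("rollbackTransaction");
    throw err;
  }

  var committed = db.runCommand("commitTransaction");
  if (!committed || !committed.ok) {
    throw new Error("commitTransaction failed: " + JSON.stringify(committed));
  }
  return result;
}
```

In the shell this might be called as `runInTransaction(db, "mvcc", function (db) { /* inserts, updates, queries */ })`, so that a thrown error never leaves a transaction open on the connection.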

Here is a screenshot of a transaction that I started and rolled back.

Transactions will operate over a single connection. That is, after a “beginTransaction” command has been applied over a connection, all work on that connection will belong to a single transaction until that same connection sends a commit or rollback command. So, any driver that uses a pool of connections to handle requests must be careful using multi-statement transactions. We have yet to investigate drivers to see what it would take for them to work with this model. We expect some to just work, and others to require care or changes.
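To make the pooling hazard concrete, here is a toy model in plain JavaScript. Nothing in it is a real driver API: `ToyConnection`, `ToyPool`, `runUnpinned`, and `withPinnedConnection` are hypothetical names, and each "connection" simply remembers whether it has an open transaction. It only illustrates why the begin, the statements, and the commit must all travel over the same connection.

```javascript
// Toy model only: these classes do not correspond to any real driver.
function ToyConnection(id) {
  this.id = id;
  this.inTransaction = false;
  this.log = []; // names of the commands this connection accepted
}
ToyConnection.prototype.runCommand = function (cmd) {
  var name = typeof cmd === "string" ? cmd : Object.keys(cmd)[0];
  if (name === "beginTransaction") {
    this.inTransaction = true;
  } else if (name === "commitTransaction" || name === "rollbackTransaction") {
    if (!this.inTransaction) {
      return { ok: 0, errmsg: "no transaction on this connection" };
    }
    this.inTransaction = false;
  }
  this.log.push(name);
  return { ok: 1 };
};

function ToyPool(size) {
  this.idle = [];
  for (var i = 0; i < size; i++) this.idle.push(new ToyConnection(i));
}
// Checkout takes the connection at the front; release puts it at the back,
// so back-to-back checkouts rotate through the pool.
ToyPool.prototype.acquire = function () { return this.idle.shift(); };
ToyPool.prototype.release = function (conn) { this.idle.push(conn); };

// Unsafe for transactions: each command may land on a different connection.
function runUnpinned(pool, cmd) {
  var conn = pool.acquire();
  try { return conn.runCommand(cmd); } finally { pool.release(conn); }
}

// Safe: hold one connection for the whole transaction, then return it.
function withPinnedConnection(pool, work) {
  var conn = pool.acquire();
  try { return work(conn); } finally { pool.release(conn); }
}
```

With a pool of two connections, `runUnpinned(pool, {beginTransaction: 1})` followed by `runUnpinned(pool, "commitTransaction")` sends the commit to a connection that never began a transaction, while the first connection is left with a transaction still open. Pinning avoids both problems.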

Do you want to participate in the process of bringing full transactions to MongoDB? We’re looking for MongoDB experts to test our build on your real-world workloads. Evaluator feedback will be used in creating the product road map. Please email me at zardosht@tokutek.com if interested.

6 thoughts

  1. Hi. Most drivers have features to pin the current thread to a socket in the driver’s pool, with a method called requestStart() or start_request() depending on the language. More info for the Python driver here:

    http://emptysquare.net/blog/requests-in-python-and-mongodb/

    We need this for other sequences of commands that require a single connection, like authenticate or copyDatabase.

    How do you intend to support transactions in a sharded cluster? And are operations done in a transaction visible on replica-set secondaries before they’re committed?

    1. zardosht says:

      That is great to hear. We expect and hope for this to work with most drivers. Users just need to be careful not to accidentally run multi-statement transactions on drivers that use a connection pool to serve requests.

      As for sharding, we realize that sharded setups are an important use case and are currently digging into how it will work with fractal trees. As a result, we can’t yet comment on transactions on a sharded cluster.

      As far as replica sets go, this is also in development, but the answer is no. Transactions will not be visible on secondaries until they are committed. Here is how it will work at a high level:
      – a transaction on a primary will store its operations in a local buffer while the transaction is in progress
      – before the transaction commits, the operations will be written to the oplog using the same transaction that does all of the other work
      – once the transaction commits, the data written to the oplog may be replicated to secondaries.
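The three steps above can be pictured with a small in-memory model. This is only a sketch of the ordering described, not our implementation: `ToyPrimary`, its methods, `visibleToSecondary`, and the array standing in for the oplog are all hypothetical, and the model tracks just one transaction at a time.

```javascript
// Toy model of the ordering described above; not real replication code.
function ToyPrimary() {
  this.oplog = [];     // only entries in here can be replicated
  this.pending = null; // local buffer for the transaction in progress
}
ToyPrimary.prototype.begin = function () {
  if (this.pending !== null) throw new Error("transaction already in progress");
  this.pending = [];
};
ToyPrimary.prototype.write = function (op) {
  if (this.pending === null) throw new Error("no transaction in progress");
  this.pending.push(op); // step 1: buffer the operation locally
};
ToyPrimary.prototype.commit = function () {
  if (this.pending === null) throw new Error("no transaction in progress");
  // step 2: the buffered operations reach the oplog as part of the commit
  Array.prototype.push.apply(this.oplog, this.pending);
  this.pending = null;
};
ToyPrimary.prototype.rollback = function () {
  // Buffered operations are discarded and never reach the oplog.
  this.pending = null;
};

// step 3: a secondary only ever sees what is in the oplog
function visibleToSecondary(primary) {
  return primary.oplog.slice();
}
```

In this model, an operation written inside an open transaction does not appear in `visibleToSecondary(primary)` until `commit()` runs, and never appears if the transaction is rolled back.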

  2. Deepak says:

    I have created a tree structure in MongoDB (it is not very large). I am now searching 1 million records sequentially against that tree, with updates as well, and it takes more than 5 hours on a single-node cluster. How can I make this faster? Would adding more nodes to the cluster help? Also, when I connect to the same mongod (--port 270217), is the processing distributed across the cluster or not?

    1. Zardosht Kasheff says:

      I am not sure I understand. Are you using TokuMX or MongoDB? If you are using TokuMX, please email our Google group, tokumx-user@googlegroups.com.

  3. Samuel says:

    Can these new and shiny transactions be nested?

    So, can you run something like this?


    begin transaction // 0
    insert
    begin transaction // 1
    insert
    rollback // 1 - remove second insert
    commit // 0 - will persist first insert

    1. Zardosht Kasheff says:

      Unfortunately not. There is no nesting. Only one level of transactions works. If you run that second “begin transaction”, you should get an error.
