'How to simulate "multi-versioning" regarding database MVCC in JavaScript?
Here is a demo which demonstrates the problem with not having transaction "locking". It sort of simulates async / concurrency using setTimeout. I have never dealt with concurrency in languages like C, Go, or Rust, so I am not really sure how it works in implementation detail, but I am trying to grasp the concept of MVCC.
const db = {
table1: {
records: [
{ id: 1, name: 'foo', other: 'hello' },
{ id: 2, name: 'bar', other: 'world' },
]
}
}
function readTable1(id) {
return db.table1.records.find(x => x.id === id)
}
function writeTable1(id) {
const record = readTable1(id)
return new Promise((res, rej) => {
console.log('transaction 1 start')
setTimeout(() => {
record.other = 'qwerty'
setTimeout(() => {
record.name = 'asdf'
console.log('transaction 1 done')
res()
}, 1000)
}, 1000)
})
}
function wait(ms) {
return new Promise((res) => setTimeout(res, ms))
}
async function test1() {
writeTable1(1)
console.log(readTable1(1))
await wait(1100)
console.log(readTable1(1))
await wait(2200)
console.log(readTable1(1))
}
test1()
It logs
transaction 1 start
{ id: 1, name: 'foo', other: 'hello' } // read
{ id: 1, name: 'foo', other: 'qwerty' } // read
transaction 1 done
{ id: 1, name: 'asdf', other: 'qwerty' } // read
In the middle while the transaction is processing the record, it changes the real record which can be concurrently read. There are no locks on it, or however MVCC does it without locks (using multiple versions of records). I next try to implement how I think MVCC works, with hopes that you can correct my understanding. Here is that.
const db = {
table1: {
records: [
[{ id: 1, name: 'foo', other: 'hello' }],
[{ id: 2, name: 'bar', other: 'world' }],
]
}
}
function readTable1(id) {
const idx = db.table1.records.findIndex(x => x[0].id === id)
return [idx, db.table1.records[idx][0]]
}
// this is a long transaction.
function writeTable1(id) {
const [idx, record] = readTable1(id)
// create a new version of record for transaction to act on.
const newRecordVersion = {}
Object.keys(record).forEach(key => newRecordVersion[key] = record[key])
db.table1.records[idx].push(newRecordVersion)
return new Promise((res, rej) => {
console.log('transaction 2 start')
setTimeout(() => {
newRecordVersion.other = 'qwerty'
setTimeout(() => {
newRecordVersion.name = 'asdf'
console.log('transaction 2 done')
// now "commit" the changes
commit()
res();
}, 1000)
}, 1000)
})
function commit() {
db.table1.records[idx].shift()
}
}
function wait(ms) {
return new Promise((res) => setTimeout(res, ms))
}
async function test1() {
writeTable1(1)
console.log(readTable1(1)[1])
await wait(1100)
console.log(readTable1(1)[1])
await wait(2200)
console.log(readTable1(1)[1])
console.log(db.table1.records)
}
test1()
That outputs this, which seems correct.
transaction 2 start
{ id: 1, name: 'foo', other: 'hello' }
{ id: 1, name: 'foo', other: 'hello' }
transaction 2 done
{ id: 1, name: 'asdf', other: 'qwerty' }
[
[ { id: 1, name: 'asdf', other: 'qwerty' } ],
[ { id: 2, name: 'bar', other: 'world' } ]
]
Is this correct, generally how it works? Mainly, how many versions per record are created in a real implementation? Can there be more than 2 versions at a time? If so, in what situations does that occur generally speaking? And how do the timestamps work? I read about the timestamps on the wiki page, but it doesn't really register to me how to implement it. Also the incrementing transaction IDs. So basically how those 3 pieces fit together (versioning, timestamps, and transaction IDs).
I am looking for some sort of simulation of the timestamps and versioning in JavaScript, so I can make sure I understand the general concepts at a high level, yet at a sort of rough approximation of an implementation level. Just knowing what MVCC is and reading a few papers isn't enough in the weeds to know how to implement it.
In my example there will only ever be 2 versions of a record during a transaction. I am not sure if there are cases where you would need more than that. And I am not sure how to plug in the timestamps.
Solution 1:[1]
Short answer: Multiversion concurrency control in "databases" involves several different things; different vendors implement each of these things in several different ways.
- Here's a list of databases using MVCC (both RDBMs and No-SQL DBs), and which versions first supported MVCC: https://en.wikipedia.org/wiki/List_of_databases_using_MVCC
- Here's a good response for MSSQL (Microsoft's enterprise RDBMS database):
Does SQL Server really implement MVCC anywhere
Yes, since SQL Server 2005.
The SQL Server terminology is "row-versioning isolation levels". See the product documentation tree starting at Locking and Row Versioning. Note in particular that there are two separate "MVCC" implementations, read committed isolation using row versioning (RCSI) and snapshot isolation (SI).
and how does that reconcile with the idea of with (tablock, holdlock) if it does?
Using that combination of hints serializes access to whole table(s). It is the least concurrent option available, so using these hints should be very rare. Whether a particular use could be replaced with RCSI or SI isolation depends on the specific circumstances. You could ask a follow-up question with a specific example if you want us to address that aspect in detail.
You might also like to read my series of articles on SQL Server isolation levels.
Here is another good link: Well-known Databases Use Different Approaches for MVCC. It discusses things like:
- rollback segments (Oracle)
- row versioning (Microsoft)
Finally, here is a good paper from Microsoft Research:
https://www.microsoft.com/en-us/research/wp-content/uploads/2011/12/MVCC-published-revised.pdf
It's a big topic, with no short, simple "one size fits all" reply.
As far as "simulating MVCC in JavaScript":
- Javascript is inherently single-threaded; there is no concept of "threads" or "locking" in the language itself.
- "Promises" (as you're using) are an excellent way to "order" asynchronous callbacks
- We'd need more details on exactly what you're trying to accomplish.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
