The logging code that writes out transactions to disk needs to write out
the byte array that we actually use. The code is less hairy without the
generics, so drop them.
It was a legitimate error: the transaction doesn't have to be active
when commit() is called on it, and the right behavior in that case
is to return a TxTerminated error.
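A minimal sketch of that behavior, assuming simplified stand-in types
(TxState, DbError and Tx here are illustrative, not the actual definitions
in the codebase; only the TxTerminated name comes from this message):

```rust
/// Hypothetical transaction states, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TxState {
    Active,
    Committed,
    Aborted,
}

/// Hypothetical error type; only the TxTerminated variant is named above.
#[derive(Debug, PartialEq, Eq)]
pub enum DbError {
    TxTerminated,
}

/// Hypothetical, stripped-down transaction.
pub struct Tx {
    pub state: TxState,
}

impl Tx {
    /// Commits an active transaction, or reports that it has already ended.
    pub fn commit(&mut self) -> Result<(), DbError> {
        match self.state {
            TxState::Active => {
                self.state = TxState::Committed;
                Ok(())
            }
            // Already committed or aborted: an expected condition that is
            // reported to the caller instead of being treated as a bug.
            TxState::Committed | TxState::Aborted => Err(DbError::TxTerminated),
        }
    }
}
```

Calling commit() a second time on the same Tx then yields
Err(DbError::TxTerminated) rather than a panic.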
Fixes https://github.com/penberg/tihku/issues/59
Using insert() was a violation of our API, kind of, because
inserts are not expected to be called twice on the same id.
Instead, update or upsert should delete the version first,
and that's what's done in this patch.
At the same time, write-write conflict detection needed to be
implemented, because we started hitting it with rollback().
Finally, garbage collection is modified to actually work
and garbage-collect row versions. Without it, the number of tracked
row versions very quickly gets out of hand.
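As a rough illustration of the update path and the write-write check
described in this message, here is a sketch under simplifying assumptions:
Stamp, RowVersion, DbError, delete() and update() are hypothetical names,
a single row's versions live in a plain Vec, and commit, rollback and
garbage collection are left out.

```rust
/// Hypothetical marker: either a commit timestamp or the id of a
/// transaction that is still in progress.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stamp {
    Committed(u64),
    Pending(u64),
}

#[derive(Debug, PartialEq, Eq)]
pub enum DbError {
    WriteWriteConflict,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RowVersion {
    pub begin: Stamp,
    pub end: Option<Stamp>, // None while the version is still live
    pub data: String,
}

/// True if `stamp` belongs to an in-progress transaction other than `tx_id`.
fn owned_by_other(stamp: Stamp, tx_id: u64) -> bool {
    matches!(stamp, Stamp::Pending(other) if other != tx_id)
}

/// Ends the live version on behalf of `tx_id`.
/// Returns Ok(true) if a version was ended, Ok(false) if none was live.
pub fn delete(versions: &mut [RowVersion], tx_id: u64) -> Result<bool, DbError> {
    for v in versions.iter_mut().rev() {
        // Another in-progress transaction created this version.
        if owned_by_other(v.begin, tx_id) {
            return Err(DbError::WriteWriteConflict);
        }
        match v.end {
            None => {
                v.end = Some(Stamp::Pending(tx_id));
                return Ok(true);
            }
            // Another in-progress transaction is already deleting it.
            Some(end) if owned_by_other(end, tx_id) => {
                return Err(DbError::WriteWriteConflict);
            }
            // Ended by a committed transaction or by us: keep looking.
            Some(_) => {}
        }
    }
    Ok(false)
}

/// Update expressed as "delete the current version, then add a new one",
/// so no insert ever targets a row id that still has a live version.
pub fn update(versions: &mut Vec<RowVersion>, tx_id: u64, data: &str) -> Result<(), DbError> {
    delete(versions, tx_id)?;
    versions.push(RowVersion {
        begin: Stamp::Pending(tx_id),
        end: None,
        data: data.to_string(),
    });
    Ok(())
}
```

In this sketch a transaction may update its own uncommitted version
repeatedly, while a second transaction touching the same row gets
WriteWriteConflict and leaves the version list unchanged.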
The previous commit was incorrect in two ways:
1. It *only* worked if the version was either pushed as the most
recent or 1 behind the most recent - that's fixed.
2. Comparing row versions incorrectly compared either timestamps
or transaction ids, while we *need* to only compare timestamps.
That's done by looking up the transaction and extracting its
timestamp - potentially expensive, and maybe we need to rework
the algorithm and/or consult the Hekaton paper.
For the time being, we still assume that the row versions vector
is *nearly* sorted, so we just perform a linear reverse search
and insert the version at an appropriate place.
During concurrency tests, the insert position was off by at most 1,
and as long as we can show empirically that it stays below a reasonable
constant, we're fine. Otherwise we should consider switching
to either a data structure that keeps elements ordered,
or at least a list that gives us constant-time insertion.
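The reverse linear search can be sketched as follows. This is a
simplified illustration: RowVersion and insert_version() are hypothetical,
and begin_ts is assumed to be an already-resolved timestamp, whereas the
real code has to look up the owning transaction to obtain it.

```rust
/// Hypothetical row version that carries its begin timestamp directly.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RowVersion {
    pub begin_ts: u64,
    pub data: Vec<u8>,
}

/// Inserts `new` so that `versions` stays sorted by `begin_ts`.
/// The scan starts at the back because a new version almost always
/// belongs at, or very close to, the end of the vector.
/// Returns the index at which the version was placed.
pub fn insert_version(versions: &mut Vec<RowVersion>, new: RowVersion) -> usize {
    let mut pos = versions.len();
    // Step backwards past every version that is newer than `new`.
    while pos > 0 && versions[pos - 1].begin_ts > new.begin_ts {
        pos -= 1;
    }
    versions.insert(pos, new);
    pos
}
```

The search itself costs as many steps as the version is out of place, so
it stays cheap while the vector is nearly sorted. Vec::insert still shifts
the elements after the insert position, which is one reason the
alternative data structures mentioned above may be worth a look.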
Before this commit, deadlocks were possible (and detected),
because some functions took row_versions lock first, and then
individual transaction locks, while other functions took the locks
in opposite order.
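One way to rule out that kind of deadlock is to make every function take
the locks in the same order. The sketch below is only an illustration:
Db, its fields and its methods are hypothetical, and a single counter
stands in for the per-transaction state.

```rust
use std::sync::Mutex;

/// Hypothetical store with one lock for the row versions and one for
/// the state of a single transaction.
pub struct Db {
    row_versions: Mutex<Vec<u64>>, // begin timestamps of tracked versions
    tx_writes: Mutex<u64>,         // number of versions the transaction wrote
}

impl Db {
    pub fn new() -> Self {
        Db {
            row_versions: Mutex::new(Vec::new()),
            tx_writes: Mutex::new(0),
        }
    }

    /// Lock order: row_versions first, then the transaction.
    pub fn insert(&self, ts: u64) {
        let mut versions = self.row_versions.lock().unwrap();
        let mut writes = self.tx_writes.lock().unwrap();
        versions.push(ts);
        *writes += 1;
    }

    /// Same lock order as insert(), so the two can never wait on each
    /// other in a cycle.
    pub fn rollback(&self) {
        let mut versions = self.row_versions.lock().unwrap();
        let mut writes = self.tx_writes.lock().unwrap();
        versions.clear();
        *writes = 0;
    }

    /// Reads both values under the same lock order.
    pub fn counts(&self) -> (usize, u64) {
        let versions = self.row_versions.lock().unwrap();
        let writes = self.tx_writes.lock().unwrap();
        (versions.len(), *writes)
    }
}
```

Because both locks are always held together and acquired in one fixed
order, concurrent insert() and rollback() calls serialize instead of
deadlocking, and counts() always sees a consistent pair.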
Without mutexes, it makes no sense anymore to use shuttle.
Instead, the test cases just spawn OS threads.
Also, a case with overlapping ids is added, to test whether
transactions read their own writes within the same transaction.
And specifically, the number of things we don't have implemented
to even think of that. It's mostly about tracking commit dependencies
which allow speculative reads/ignores of certain versions,
as well as making sure that in the commit phase, we validate
visibility of all versions read, as well as that our scans
took into account all data. If some version appeared after the transaction
began, and it was not taken into account during its scans, it is considered
a "phantom", and it invalidates the transaction if we strive for
serializability.
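To make the phantom rule concrete, here is a small sketch of such a
commit-time check. Nothing like this is implemented yet; Version, Scan
and has_phantom() are hypothetical, a scan is modeled as an inclusive key
range plus the ids of the versions it saw, and commit dependencies are
ignored.

```rust
/// Hypothetical committed row version.
pub struct Version {
    pub id: u64,
    pub begin_ts: u64,
    pub key: u64,
}

/// Hypothetical record of one scan performed by a transaction.
pub struct Scan {
    pub lo: u64,        // inclusive lower bound of the scanned keys
    pub hi: u64,        // inclusive upper bound of the scanned keys
    pub seen: Vec<u64>, // ids of the versions the scan took into account
}

/// Returns true if some version appeared after the transaction began,
/// falls inside the scanned key range, and was not seen by the scan.
pub fn has_phantom(scan: &Scan, versions: &[Version], tx_begin_ts: u64) -> bool {
    versions.iter().any(|v| {
        v.begin_ts > tx_begin_ts
            && (scan.lo..=scan.hi).contains(&v.key)
            && !scan.seen.contains(&v.id)
    })
}
```

A commit that strives for serializability would run a check of this kind
for every scan the transaction performed and abort if any of them
reports a phantom.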