[Bps-public-commit] r15621 - Prophet/branches/Prophet-trunk/doc
jesse at bestpractical.com
Thu Aug 28 21:14:46 EDT 2008
Author: jesse
Date: Thu Aug 28 21:14:39 2008
New Revision: 15621
Modified:
Prophet/branches/Prophet-trunk/doc/foreign-replicas
Log:
Start to clean up the few facts from the foreign replica irc log
Modified: Prophet/branches/Prophet-trunk/doc/foreign-replicas
==============================================================================
--- Prophet/branches/Prophet-trunk/doc/foreign-replicas (original)
+++ Prophet/branches/Prophet-trunk/doc/foreign-replicas Thu Aug 28 21:14:39 2008
@@ -1,89 +1,56 @@
+
+=head1 Resolutions
+
+Resolutions are stored in a separate database because they're supposed to be propagated _like_ regular changesets but are always sent before the regular changesets.
+
+=head1 Native Replicas
+
+=head2 Merge tickets
+
+=head1 Foreign Replicas
+
A foreign replica is a (possibly read-only) data store that is not a Prophet
replica (such as RT or Twitter). A Prophet replica can act as a gateway to a
foreign replica.
Because we can't store arbitrary metadata in foreign replicas, we do not yet
support arbitrary topology sync of foreign replicas. A single Prophet-backed
-replica must act as a gateway to a foreign replica.
+replica must act as a gateway to a foreign replica. It would be great if Prophet could, in a general way, allow arbitrary topology sync of foreign replicas, but that was never a goal.
+
+Foreign replicas never talk directly with each other. Their communications are always mediated by a Prophet replica.
+The design was never meant to allow multiple replicas to gateway transactions between a pair of foreign replicas.
+
+Foreign replicas aren't really full-fledged replicas; they piggyback on another replica on the proxying host to store metadata about merges, conflicts and local id (luid) to uuid mappings. When working with foreign replicas, the local state handle tracks data on behalf of the foreign database using merge tickets. Our merge tickets work like svk's: they're a high-water mark of "the most recent transaction merged from some other replica", keyed by the replica uuid of that other replica. Prophet always merges all transactions from a replica sequentially.
+
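As a rough illustration of the high-water-mark idea described above, here is a minimal sketch in Python. It is not Prophet's API: the C<Replica> class, its C<changesets> list and the C<pull_from> method are hypothetical names made up for this example, and changesets are assumed to be numbered sequentially from 1. Only the name C<last_changeset_from_source> is borrowed from the log output quoted in this document.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for a replica; not Prophet's real class."""
    uuid: str
    # Changesets originated by this replica, in sequence order (sequence 1 is index 0).
    changesets: list = field(default_factory=list)
    # Merge tickets: source replica uuid -> highest sequence number merged from it.
    merge_tickets: dict = field(default_factory=dict)

    def last_changeset_from_source(self, source_uuid):
        # 0 means nothing from that source has been merged yet.
        return self.merge_tickets.get(source_uuid, 0)

    def pull_from(self, source):
        """Merge every unseen changeset from `source`, in order, then advance the ticket."""
        seen = self.last_changeset_from_source(source.uuid)
        new = source.changesets[seen:]
        self.merge_tickets[source.uuid] = seen + len(new)
        return new


hiveminder = Replica("HIVEMINDER-UUID", ["hm-1", "hm-2", "hm-3"])
bob = Replica("BOB-UUID")
first_pull = bob.pull_from(hiveminder)    # all three changesets are new to Bob
second_pull = bob.pull_from(hiveminder)   # nothing above the high-water mark
```

Because changesets are always merged sequentially, a single integer per source replica is enough to decide what still needs to be sent; the sketch ignores conflicts and resolutions entirely.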
+So when Bob is pushing to a foreign replica, we use metadata stored in Bob's replica to interact with the foreign replica. _merge_ticket records are an example of this. However, when you do a push to a foreign replica, it should be storing that transaction as merged
+(see App::SD::ForeignReplica::record_pushed_transaction).
+
+The failing test: Bob pulls a task from HM (Hiveminder) and then pushes to RT. RT never gets the HM task.
+
+The specific problem I'm seeing is that when Bob pushes to RT, RT needs
+to know what the high-water mark from Hiveminder is. Because RT
+doesn't have a full replica, it ends up accidentally using Bob's
+merge tickets, as exemplified by these two adjacent lines in the logfile:
+ Checking metadata in BOB-UUID: (_merge_tickets, HIVEMINDER-UUID, last-changeset) -> 3
+ RT-UUID's last_changeset_from_source(HIVEMINDER-UUID) -> 3
+
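One reading of those two log lines can be sketched in Python as follows. This only illustrates the symptom, not Prophet's code and not a proposed fix: the function C<changesets_to_push>, the dictionaries, and in particular the separate C<rt_pushed> bookkeeping are hypothetical, and sequence numbers are assumed to start at 1.

```python
def changesets_to_push(source_changesets, high_water_mark):
    """Return the changesets whose (1-based) sequence number is above the mark."""
    return source_changesets[high_water_mark:]


hm_changesets = ["hm-1", "hm-2", "hm-3"]

# Metadata kept in Bob's replica: Bob has already pulled all three Hiveminder changesets.
bob_merge_tickets = {"HIVEMINDER-UUID": 3}

# Hypothetical separate bookkeeping for RT: nothing from Hiveminder has reached RT yet.
rt_pushed = {}

# Symptom from the log: RT's high-water mark is answered from Bob's merge tickets,
# so every Hiveminder changeset looks as if RT already had it.
shared_lookup = changesets_to_push(
    hm_changesets, bob_merge_tickets.get("HIVEMINDER-UUID", 0)
)

# With bookkeeping that is specific to RT, the same changesets are still pending.
separate_lookup = changesets_to_push(
    hm_changesets, rt_pushed.get("HIVEMINDER-UUID", 0)
)
```

In this toy model the shared lookup yields nothing to push, which matches the observation that RT never gets the HM task.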
-<Sartak> still working out how foreign replicas work exactly, and why RT thinks it already received a changeset it hasn't
-<obra> ok. so it's not RT that thinks so
-<obra> but the local state handle that tracks data on behalf of RT
-<obra> note that our merge records work like svk's. they're a high-water mark
-<Sartak> right
-<obra> doc the flows as you reverse engineer htem?
-<obra> I'm happy to answer questions as they come up
-<Sartak> hmmmm
-<Sartak> it looks like the problem is that foreign replicas aren't really full-fledged replicas, they piggyback on another replica
-<Sartak> bob pulls a change from hiveminder, and tries to push it to RT
-<Sartak> Prophet asks what new changes Bob would get from Hiveminder, NOT what new changes RT would get from Hiveminder.. I'm not sure though
-<Sartak> uuid hell, heh
-<Sartak> basically I think when we push to RT, it uses Bob's "import state file" instead of one specifically for RT
-<obra> you're using terms that aren't quite the terms I'd expect.
-<obra> can you recast in precise terminology (in terms of replicas, changesets, rt transactions, state databases, etc?
-<obra> since this stuff is complex and we have a lot of possibly-confusing terms
-<Sartak> ok
-<Sartak> bob is pushing to RT
-<Sartak> whenever we merge two replicas, we need to check the "high water mark", which is a _merge_ticket record, to see which changesets the source needs to give to the target
-<obra> right.
-<Sartak> because foreign replicas are not full-fledged replicas, obviously, Prophet stores metadata in other, full-fledged Prophet replicas
-<Sartak> so when bob is pushing to RT, we use metadata stored in bob's replica to interact with RT. _merge_ticket records are an example of this
-<Sartak> however
-<obra> when you do a push to a foreign replica, it should be storing that transaction as merged
-<obra> App::SD::ForeignReplica::record_pushed_transaction
-<Sartak> right
-<Sartak> hmm
-<Sartak> there are no files with name or content matching prophet-txn-source :/
-<Sartak> the specific problem I'm seeing is when bob pushes to RT, RT needs to know what the high water mark from Hiveminder is. because RT doesn't have a full replica, it ends up accidentally using Bob's merge tickets
-<Sartak> exemplified by these two adjacent lines in my logfile:
-<Sartak> Checking metadata in BOB-UUID: (_merge_tickets, HIVEMINDER-UUID, last-changeset) -> 3
-<Sartak> RT-UUID's last_changeset_from_source(HIVEMINDER-UUID) -> 3
-<obra> I think there's something that you don't have quite right there.
-<obra> or maybe not
-<obra> foreign replicas never talk directly.
-<obra> and the design wasn't such that you could have multiple parties gatewaying between two foreign replicas
-<obra> so, rt only needs to know about bob's transaction numbers
-<obra> s/transaction umbers/merge tickets/
-<obra> and hivemindero nly needs to know bob's merge tickets
-<obra> I'd be thrilled if we could, in a general way, allow arbitrary topology sync of foreign replicas. but it was not ever a goal
-<Sartak> OK, that makes sense
-<Sartak> the test that's failing is Alice pulls a task from HM. Bob pulls from Alice and pushes to RT. RT never gets the HM task
-<Sartak> is the design such that that only one replica can be a gateway to foreign replicas?
-<Sartak> I wonder if it'd still pass if we took alice out of the mix..
-<obra> no. that test should totally work
-<obra> at least it's DESIGNED to work
-<obra> i wonder if we have a fencepost error
-<obra> how goes?
-<Sartak> the one Prophet replica syncing between two foreign replica tests fail too
-<obra> yay
-<obra> much easier to debug
-<Sartak> :)
-<obra> seems like a good opportunity for improving our debug logging
-<Sartak> hmm, I'm fairly certain my description of the problem still holds
-<Sartak> the same checks are made when bob pulls from HM as the checks that are made when bob pushes an HM change to RT
-<obra> exact same?
-<obra> is this a case of a missing uuid in the name of an identifier?
-<obra> could the code be refactored to be clearer so the mistake becomes more glaring?
-<obra> I know that code was written quickly
-<obra> and for a while the hm code was goto &::Replica::RT::*
<Sartak> I think state_handle should be an entirely separate replica, just as resolutions are
<obra> But it should never be propagated.
-<obra> resolutions are seperate because they're supposed to be propagated _like_ regular chnagesets but in a seperate stage
+
<Sartak> can't it be a replica we just don't propagate?
<obra> so far, your description doesn't give me any reason to think that ending up with an explicitly separate state database would improve anything. and it would add more moving parts.
<Sartak> we're being bitten by reusing the Prophet replica's records
<Sartak> if the foreign replica had its own replica, then there would be no overlap and this issue would just go away
<Sartak> the foreign replica is using the real replica's _merge_ticket records
-<obra> ok. that's wrong
-<obra> and I _believe_ that our state handle stuff should entirely replace the need to even use those
-<obra> merge tickets are "most recent changeset seen from replica ABC"
-<obra> those are generally useful to propagate around.
-<Sartak> yes
+<obra> I _believe_ that our state handle stuff should entirely replace the need to even use those
+<obra> merge tickets are "most recent changeset seen from replica ABC". those are generally useful to propagate around.
<obra> except in the case of the foreign replica where it only ever matters what the most recent local changeset we've pushed to the foreign replica is
<obra> (pulling from an FR should, I believe, use regular merge tickets)
-<Sartak> I agree with both statements
-<obra> Prophet::ForeignReplica should probably be subclassing the bits of code that deal with MergeTickets.
-<obra> will you log this discussion into a "notes about foreign replicas"
-<obra> right next to your explanations about merge algoritums with yuval?
-<obra> also, apparently "merge tickets" is a horrible name that confuses people
-<obra> it may want renaming
+
+
+=head1 Open issues
+
+Prophet::ForeignReplica should probably be subclassing the bits of code that deal with MergeTickets.
+
+Also, apparently "merge tickets" is a horrible name that confuses people; it may want renaming.