Skip to content
Snippets Groups Projects
Commit 741988c9 authored by Manish R Jain's avatar Manish R Jain
Browse files

Final draft of MVP before coding begins

parent 67856de0
No related branches found
No related tags found
No related merge requests found
...@@ -8,7 +8,7 @@ low. ...@@ -8,7 +8,7 @@ low.
# MVP Design # MVP Design
This is an MVP design doc. This would only contain part of the functionality, This is an MVP design doc. This would only contain part of the functionality,
which can be pushed out of the door within a couple of months. This version which can be pushed out of the door within a month. This version
would not enforce strong consistency, and might not be as distributed. Also, would not enforce strong consistency, and might not be as distributed. Also,
shard movement from dead machines might not make a cut in this version. shard movement from dead machines might not make a cut in this version.
...@@ -29,19 +29,27 @@ type DirectedEdge struct { ...@@ -29,19 +29,27 @@ type DirectedEdge struct {
## Technologies Used ## Technologies Used
- Use [RocksDB](http://rocksdb.org/) for storing original data and posting lists. - Use [RocksDB](http://rocksdb.org/) for storing original data and posting lists.
- Use [Cap'n Proto](https://capnproto.org/) for in-memory and on-disk representation, - Use [Cap'n Proto](https://capnproto.org/) for in-memory and on-disk representation,
- [RAFT via CoreOS](https://github.com/coreos/etcd/tree/master/raft) - For this version, stick to doing everything on a single server. Possibly still
- Use [tcp](http://golang.org/pkg/net/) for inter machine communication. using TCP layer, to avoid complexities later.
Ref: [experiment](https://github.com/dgraph-io/experiments/tree/master/vrpc) - Possibly use a simple go mutex library for txn locking.
- Use UUID as entity id.
- Provide sorting in this version.
## Concepts / Technologies Skipped ## Concepts / Technologies Skipped
- Strong consistency on the likes of Spanner / Cockroach DB. For this version, - Distributed transactions the likes of Spanner / Cockroach DB.
we'll skip even eventual consistency. But, note that it is the long term plan For this version, the idea is to use mutex locks within a single process.
But, note that it is the long term plan
to make this a proper database, supporting strongly consistent writes. to make this a proper database, supporting strongly consistent writes.
- [RAFT via CoreOS](https://github.com/coreos/etcd/tree/master/raft)
- [MultiRaft by Cockroachdb](http://www.cockroachlabs.com/blog/scaling-raft/) would - [MultiRaft by Cockroachdb](http://www.cockroachlabs.com/blog/scaling-raft/) would
be better aligned to support all the shards handled by one server. But it's also be better aligned to support all the shards handled by one server. But it's also
more complex, and will have to wait for later versions. more complex, and will have to wait for later versions.
- Use [tcp](http://golang.org/pkg/net/) for inter machine communication.
Ref: [experiment](https://github.com/dgraph-io/experiments/tree/master/vrpc)
- No automatic shard movement from one server to another. - No automatic shard movement from one server to another.
- No distributed Posting Lists. One complete PL = one shard. - No distributed Posting Lists. One complete PL = one shard.
- Graph languages, like Facebook's GraphQL. For this version, just use some internal lingo
as the mode of communication.
## Terminology ## Terminology
...@@ -51,9 +59,48 @@ Entity | Item being described | [link](https://en.wikipedia.org/wiki/Entity%E2%8 ...@@ -51,9 +59,48 @@ Entity | Item being described | [link](https://en.wikipedia.org/wiki/Entity%E2%8
Attribute | A conceptual id (not necessary UUID) from a finite set of properties | [link](https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model) Attribute | A conceptual id (not necessary UUID) from a finite set of properties | [link](https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model)
Value | value of the attribute | Value | value of the attribute |
ValueId | If value is an id of another entity, `ValueId` is used to store that | ValueId | If value is an id of another entity, `ValueId` is used to store that |
Posting List | A map with key = (Entity, Attribute) and value = (list of Values). This is how underlying data gets stored. | Posting List | All the information related to one Attribute. A map with key = (Attribute, Entity) and value = (list of Values). This is how underlying data gets stored. |
Shard | Most granular data chunk that would be served by one server. In MVP, one shard = one complete posting list. | Shard | Most granular data chunk that would be served by one server. In MVP, one shard = one complete posting list. In next versions, one shard might be a chunk of a posting list.|
Server | Machine connected to network, with local RAM and disk. Each server can serve multiple shards. | Server | Machine connected to network, with local RAM and disk. Each server can serve multiple shards. |
Raft | Each shard would have multipe copies in different servers to distribute query load and improve availability. RAFT is the consensus algorithm used to determine leader among all copies of a shard. | [link](https://github.com/coreos/etcd/tree/master/raft) Raft | Each shard would have multipe copies in different servers to distribute query load and improve availability. RAFT is the consensus algorithm used to determine leader among all copies of a shard. | [link](https://github.com/coreos/etcd/tree/master/raft)
Replica | Replica is defined as a non-leading copy of the shard after RAFT election, with the leading copy is defined as shard. | Replica | Replica is defined as a non-leading copy of the shard after RAFT election, with the leading copy is defined as shard. |
## Simple design
- Have a posting list per Attribute.
- Store it in Rocks DB as (Attribute, Entity) -> (List of Values).
- One complete posting list = one shard.
- One mutex lock per shard.
- One single server, serving all shards.
## Write
- Convert the query into individual instructions with posting lists.
- Acquire write locks over all the posting lists.
- Do the reads.
- Do the writes.
- Release write locks.
## Read
- Figure out which posting list the data needs to be read from.
- Run the seeks, then repeat above step
- Consolidate and return.
## Sorting
- Would need to be done via bucketing of values.
- Sort attributes would have to be pre-defined to ensure correct posting
lists get generated.
## Goal
Doing this MVP would provide us with a working graph database, and
cement some of the basic concepts involved:
- Data storage and retrieval mechanisms (RocksDB, CapnProto).
- List intersections.
- List alterations to insert new items.
This design doc is deliberately kept simple and stupid.
At this stage, it's more important to pick a few concepts,
make progress on them and improve learning; instead of a multi-front
approch on a big and complicated design doc, which almost always changes
as we learn more. This way, we reduce the number of concepts we're working
on at the same time, and build up to the more complicated concepts that
are required in a full fledged strongly consistent and higly performant
graph databsae.
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment