Mirror / dgraph
Commit 0b03af34, authored 9 years ago by Manish R Jain

almost finalized presentation

parent e88fb49c
Showing 1 changed file with 48 additions and 12 deletions.

present/sydney5mins/g.slide (+48 -12)
@@ -12,9 +12,11 @@ https://mrjn.xyz
 * What is Graph
-- Abstract data type to represent mathematical graph concepts.
+- Abstract data type to represent relationships between objects.
-- Made up of Entities and Edges. Example edge in triple format:
+- Made up of Entities and Edges. Directed edge in triple format:
-[Tom Hanks]--married-->[Rita Wilson]
+Entity --attribute--> Entity/Value
+[Tom Hanks] --married_to-> [Rita Wilson]
+[Tom Hanks] --born_on----> [July 9, 1956]
 - Popular graphs: Facebook Social Graph, Google Knowledge Graph.
 .image graph.png
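
The triple format added in this hunk can be illustrated with a short Python sketch. It is not code from the commit: the `Edge` type and the `edges_from` and `render` helpers are hypothetical names chosen for the example, and values are kept as plain strings.

```python
from typing import List, NamedTuple


class Edge(NamedTuple):
    """One directed edge: Entity --attribute--> Entity/Value (hypothetical type)."""
    entity: str
    attribute: str
    value: str  # either another entity or a literal value


# The two example edges from the slide.
EDGES = [
    Edge("Tom Hanks", "married_to", "Rita Wilson"),
    Edge("Tom Hanks", "born_on", "July 9, 1956"),
]


def edges_from(entity: str, edges: List[Edge]) -> List[Edge]:
    """Return every edge that starts at the given entity."""
    return [e for e in edges if e.entity == entity]


def render(edge: Edge) -> str:
    """Format an edge in the slide's arrow notation."""
    return f"[{edge.entity}] --{edge.attribute}--> [{edge.value}]"
```

Because edges are directed, `edges_from("Rita Wilson", EDGES)` is empty here: nothing in the sample data starts at that entity.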
@@ -25,25 +27,59 @@ DGraph is a distributed, low-latency graph serving system.
 - *Low*Latency*: Minimize the latency of query execution.
-Linear time complexity, based on complexity/depth of query, not results.
+Linear time complexity, based on attributes/depth of query, not number of results.
 Minimize the number of network calls required to run the query.
 Meant to be run in production, serving real time user queries.
 - *Distributed*: Automatically distribute data to and serve from provided servers.
 Handle shard splits, shard merges, and shard movement.
+- *Highly*Availability*: Automatic data replication and failover.
 - *Resilience*: Automatically handle server failures, and reassignment to healthy servers.
-* Low Latency
+* Implementation
+Most interesting challenge in all of this.
+- Use Flatbuffers for on-disk, in-memory and over-network representation.
+- Entities assigned `uint64` uids for optimized representation.
+- Use RocksDB to store data internally in posting list format.
+Optimized for seeks: ram, ssd, disk
+Used at Facebook, CockroachDB.
+- Posting List = all directed edges from a given attribute
+[attribute, entity] -> [sorted list of entities / value]
+- Generally one complete posting list would be served by a server.
+- If posting list is too _big_, chunk it into shards.
+- Shard is the most granular data to be served or moved around.
+- Server can serve many shards.
+- Each shard replicated across at least 3 different servers.
+* Example: Names of Friends of Friends of ME
+- GraphQL query received. Parse into internal query rep.
-* MVP
+[Network call]
+- Pick server serving posting list `friend`.
+- Seek to `friend, me`. Get a list of friends uids, and return.
+[Network call]
+- Send all uids again. For each friend uid_i, seek to `friend, uid_i`
+- Get lists of lists of uids. Merge them into one big list, and return.
+[Network call]
+- Pick server serving posting list `name`.
+- For each uid_i, seek to `name,`uid_i`, and return.
+- 3 RT network calls in total
+- Network calls: O(m) where m = depth + attributes in query
+- RocksDB Seeks: O(n) where n = total results.
+* Minimum Viable Product
 - Planning to launch in mid-November.
-- Non-distributed, run on only one server.
+- Non-distributed. Runs on only one server.
-- Support GraphQL w/ JSON response.
-- Focus on low-latency.
+- Low-latency.
+- Support a subset of GraphQL. Responses in JSON.
-- Launch with benchmarks against Neo4J.
+- Launch with performance comparisons against popular Neo4J.
+- Possibly provide sorting.
+Do try it out!
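
The posting-list layout and the friends-of-friends walkthrough added in this hunk can be sketched as a toy Python model. This is a minimal sketch under stated assumptions, not the project's implementation: `PostingStore`, `build_example_store`, and `friends_of_friends_names` are hypothetical names, an in-memory dict stands in for RocksDB, and "network calls" and "seeks" are only counted, not performed.

```python
import bisect
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, int]  # (attribute, entity uid), as on the slide


class PostingStore:
    """Toy stand-in for posting lists: (attribute, entity) -> sorted uids or a value."""

    def __init__(self) -> None:
        self.uid_lists: Dict[Key, List[int]] = {}
        self.values: Dict[Key, str] = {}
        self.seeks = 0  # counts key lookups, standing in for RocksDB seeks

    def add_edge(self, attribute: str, entity: int, target: int) -> None:
        """Insert target into the posting list, keeping it sorted and unique."""
        plist = self.uid_lists.setdefault((attribute, entity), [])
        if target not in plist:
            bisect.insort(plist, target)

    def set_value(self, attribute: str, entity: int, value: str) -> None:
        self.values[(attribute, entity)] = value

    def seek_uids(self, attribute: str, entity: int) -> List[int]:
        self.seeks += 1
        return list(self.uid_lists.get((attribute, entity), []))

    def seek_value(self, attribute: str, entity: int) -> Optional[str]:
        self.seeks += 1
        return self.values.get((attribute, entity))


def friends_of_friends_names(store: PostingStore, me: int) -> Tuple[List[str], int]:
    """Follow the slide's three steps; returns (names, simulated network calls)."""
    network_calls = 0

    # Call 1: on the server holding `friend`, seek to (friend, me).
    network_calls += 1
    friends = store.seek_uids("friend", me)

    # Call 2: send the friend uids back, seek each one, merge into one list.
    network_calls += 1
    merged = set()
    for uid in friends:
        merged.update(store.seek_uids("friend", uid))
    fof = sorted(merged)

    # Call 3: on the server holding `name`, seek to (name, uid) for each result.
    network_calls += 1
    found = (store.seek_value("name", uid) for uid in fof)
    names = [n for n in found if n is not None]
    return names, network_calls


def build_example_store() -> PostingStore:
    """Sample data (made up for this sketch): uid 1 is 'me'."""
    store = PostingStore()
    for src, dst in [(1, 2), (1, 3), (2, 4), (3, 4), (3, 5)]:
        store.add_edge("friend", src, dst)
    store.set_value("name", 4, "Dana")
    store.set_value("name", 5, "Eli")
    return store
```

With the sample store, `friends_of_friends_names(build_example_store(), 1)` uses three simulated network calls regardless of how many friends are involved, while the seek counter grows with the number of uids visited. That matches the slide's split between O(m) network calls and O(n) seeks.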