Using Apache Spark for Graph Computation with Neo4j

Scale
06/01/2015 - 14:30 to 15:30
Stage 4 / Open Stage
Workshop (60 min)
Intermediate

Session abstract: 

This talk will introduce Mazerunner, an integration of Apache Spark with Neo4j that can be used to offload expensive global graph compute algorithms to a scalable cluster. Mazerunner uses Docker-based orchestration to set up the different components. It reads from and writes to a running Neo4j instance transactionally, exchanging data with the Spark cluster through a persistent queue. We will discuss the general architecture, walk through the setup, and demo two graph algorithms, PageRank and Betweenness Centrality, on the DBpedia dataset.
