Graph databases are a new type of database that stores data as a big funky web, instead of boring old rows and tables. Your “records” are stored in things called “nodes”, and nodes can be connected to each other by “edges”. Once you have all your data and connections in place, you can start to pull NEW data out of it by examining the relationships between things. Think about how Facebook recommends friends-of-friends, or how Amazon recommends things-you-might-also-like.
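If the node-and-edge idea feels abstract, here’s a rough sketch of it in plain Python, no database involved. The people and friendships are made up for illustration, and `friends_of_friends` is just a name I picked, not anything from a real library. It shows how a “recommendation” falls out of walking two edges away from a node.

```python
# Toy graph: each key is a node (a person), and each set holds the
# nodes it shares an edge with. All names here are made up.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}


def friends_of_friends(graph, person):
    """Return people two hops from `person` who are not already direct friends."""
    direct = graph.get(person, set())
    suggestions = set()
    for friend in direct:
        # Follow a second edge out from each direct friend.
        suggestions |= graph.get(friend, set())
    # Drop the person themselves and anyone they already know.
    return suggestions - direct - {person}


print(sorted(friends_of_friends(friends, "alice")))  # ['dave', 'erin']
```

A real graph database does this kind of edge-walking for you, and a lot faster, but the idea is the same.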
One of the most popular graph databases out there, and the one I’ve been playing with, is Neo4j. I think it’s the most popular because it’s documented inside and out, which makes it really easy to work with. One of the ways to examine data in Neo4j is with its Cypher query language, which is like SQL for graph databases. It’s really powerful, and once you figure it out you can do some really cool stuff with it.
I’m gonna try to walk you through setting up Neo4j and doing some tricks with it. One of the example datasets on the Neo4j site is an IMDB database. So you’ve got this fancy new toy that can find connections between things, and a movie database. What’s the first thing you want to do? Write a Kevin Baconator, right? So that’s what we’re gonna do: set up Neo4j and find the links from any actor to Kevin Bacon.
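To make the goal concrete before any Neo4j is involved, here’s a minimal sketch of what a Kevin Baconator does under the hood: a breadth-first search over a graph where actors and movies are both nodes. This is plain Python, not Neo4j or Cypher. The cast list is a tiny hand-typed sample (nowhere near the full casts, and not the Neo4j dataset), and `build_graph` and `bacon_path` are names I made up for this example.

```python
from collections import deque

# Tiny hand-typed sample: movie -> a few of its actors (partial casts).
cast = {
    "Apollo 13": ["Kevin Bacon", "Tom Hanks", "Bill Paxton"],
    "Forrest Gump": ["Tom Hanks", "Robin Wright", "Gary Sinise"],
    "The Princess Bride": ["Robin Wright", "Cary Elwes"],
}


def build_graph(cast):
    """Turn the cast list into an undirected graph of actor and movie nodes."""
    graph = {}
    for movie, actors in cast.items():
        for actor in actors:
            # One edge per "acted in" relationship, stored in both directions.
            graph.setdefault(movie, set()).add(actor)
            graph.setdefault(actor, set()).add(movie)
    return graph


def bacon_path(graph, start, target="Kevin Bacon"):
    """Breadth-first search: return a shortest actor/movie chain, or None."""
    if start not in graph or target not in graph:
        return None
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the breadcrumbs back to the start, then reverse them.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


graph = build_graph(cast)
path = bacon_path(graph, "Cary Elwes")
print(" -> ".join(path))
# Every other hop is a movie, so the Bacon number is half the edge count.
print("Bacon number:", (len(path) - 1) // 2)
```

With this sample, the chain runs from Cary Elwes through The Princess Bride, Robin Wright, Forrest Gump, Tom Hanks and Apollo 13 to Kevin Bacon, for a Bacon number of 3. The whole point of using a graph database is that you don’t have to hand-roll this search yourself.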