5 Things about Property Graphs using Neo4j and Cypher


This post lays out five important characteristics of Property Graphs.

1. Graphs

A graph is simply a collection of nodes and the relationships that connect them. As Neo Technology, the creators of Neo4j, like to point out, graphs are everywhere. Graphs are used to model domains ranging from the relationships between individual people (social network analysis) to metabolic pathways. An interesting demonstration of modeling metabolic pathways can be found here.

Graphs take many forms, but the most ubiquitous is the Property Graph, which adds the following characteristics to a simple graph:

  • Nodes have a unique identifier.
  • Nodes have key-value pair properties.
  • Relationships have a unique identifier.
  • Relationships have a type. For instance, Tom “likes” Cindy or Jack “IS_THE_BOSS_OF” Bob.
  • Relationships are directed, i.e., they have an orientation. For example, if Tom follows Jack on Twitter, the relationship would be defined as pointing from Tom to Jack.
  • Relationships have key-value properties.
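
The characteristics above can be sketched as a small in-memory data structure. This is only an illustration in plain Python, not how Neo4j stores a graph; the class names (Node, Relationship, PropertyGraph), the FOLLOWS type name, and the platform property are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Node:
    id: int                                          # unique identifier
    properties: dict = field(default_factory=dict)   # key-value pairs


@dataclass
class Relationship:
    id: int                                          # unique identifier
    type: str                                        # e.g. "FOLLOWS"
    start: int                                       # id of the node it points from
    end: int                                         # id of the node it points to
    properties: dict = field(default_factory=dict)   # key-value pairs


class PropertyGraph:
    """Hypothetical minimal container for nodes and directed relationships."""

    def __init__(self):
        self.nodes = {}
        self.relationships = {}
        self._node_ids = count()   # hands out 0, 1, 2, ...
        self._rel_ids = count()

    def add_node(self, **properties):
        node = Node(next(self._node_ids), properties)
        self.nodes[node.id] = node
        return node

    def add_relationship(self, start, rel_type, end, **properties):
        rel = Relationship(next(self._rel_ids), rel_type, start.id, end.id, properties)
        self.relationships[rel.id] = rel
        return rel


graph = PropertyGraph()
tom = graph.add_node(name="Tom")
jack = graph.add_node(name="Jack")
# Tom follows Jack: the relationship points from Tom to Jack.
follows = graph.add_relationship(tom, "FOLLOWS", jack, platform="Twitter")
```

Every item in the list maps to a field: identifiers on both nodes and relationships, a type and a direction (start and end) on relationships, and a properties dictionary on each.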

We will be using Neo4j to explore Property Graphs.

To get started with Neo4j, go here and follow the instructions for your operating system to install it. Once done, type the below and then open your browser to the web admin screen.

You’ll see a screen that looks like:

[Screenshot: Neo4j web admin]

Click on the Power Tool Console window. This gives you a shell into the Neo4j Graph Database. Most of this post centers around working in this tab using Cypher, Neo4j’s graph manipulation language.

Let’s model the cinematic Avengers as a graph. Yes, I’m nerdy enough to distinguish between the cinematic and comic book Avengers. Shocking, I’m certain.

Avengers Simple Graph

2. Nodes

Nodes represent entities. Entities can be anything from Molecular Substrates to People. Essentially, they are any concepts or objects that you wish to depict as having relationships. In the graph we are modeling, the nodes are the Heroes of the Avengers. Remember that this is a Property Graph and the nodes will have a unique identifier and key-value properties.

To create the nodes, we will use the Cypher CREATE command.

Now let’s add a node:

Once that command is executed, you should see:

Node[0] is the unique identifier, and the key-value properties are {name:"Phil Coulson",description:"Agent of Shield"}.

Add the rest of the Avengers’ heroes using:

To see all the nodes you have created use the following:

3. Relationships

Relationships are the connections between nodes. Since this is a Property Graph, each relationship has a unique identifier, a type, a direction, and key-value properties.

Let’s create our first relationship using the Cypher CREATE command. In the Power Tool Console type:

The output should look similar to:

This establishes a “WORKS_FOR” relationship from Phil Coulson to Director Fury.

As noted above, relationships can also have properties. Here is an example:

It establishes a relationship between Iron Man and Phil Coulson that carries one property, with the key “basis” and the value “respect”.

Multiple relationships can be added to a node. This creates a two-way relationship between Pepper Potts and Iron Man.

These two statements create a relationship from Pepper Potts to Iron Man and another from Iron Man to Pepper Potts. When we look at traversing a graph below, you’ll see that the direction of a relationship has a profound impact on what a query returns.
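
To make the point about direction concrete, here is a minimal Python sketch (not Cypher) in which a two-way connection is simply two separate directed relationships. The relationship type name CARES_FOR and the helper functions are placeholders invented for this sketch; they are not taken from the original statements.

```python
# Each relationship is a (start, type, end) tuple; direction is start -> end.
# "CARES_FOR" is a placeholder type name, assumed for illustration only.
relationships = [
    ("Pepper Potts", "CARES_FOR", "Iron Man"),
    ("Iron Man", "CARES_FOR", "Pepper Potts"),
]


def outgoing(node, rels):
    """Return the nodes reachable from `node` in one directed step."""
    return [end for start, _type, end in rels if start == node]


def is_two_way(a, b, rels):
    """True only if there is a relationship a -> b and another b -> a."""
    return b in outgoing(a, rels) and a in outgoing(b, rels)


# With both relationships present, the connection is two-way.
two_way = is_two_way("Pepper Potts", "Iron Man", relationships)
# With only the first relationship, it is one-way.
one_way = is_two_way("Pepper Potts", "Iron Man", relationships[:1])
```

Dropping either tuple leaves a relationship that can only be followed in one direction.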

Beyond the relationships below, I’m not going to create all the relationships because I’m lazy.

To see all the relationships you have created use the following:

Here is a visualization of how our graph stands:

Avengers Relationships Graph

4. Traversal

The most interesting thing to do with a graph is to traverse it, which means walking the nodes via their directed relationships. You can visit all the nodes or limit the traversal based on the value of properties. There is much to read and cover regarding traversals. Most of it is beyond the scope of this post, but you can start here.

The below query walks the graph and discovers the shortest path between Iron Man and Director Fury.

It’s easy to guess the outcome looking at the visualization above. It should resemble:

One thing to note: because the relationships are directed, reversing the query returns nothing. Try it out:

The outcome is empty.

Psst…if you leave out the directional arrow it works:
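
The three cases (forward, reversed, and ignoring direction) can also be illustrated outside Neo4j. The sketch below is a plain Python breadth-first search, not the Cypher query and not Neo4j's own shortest-path implementation. The edge list covers only the relationships described in this post, and the direction of the Iron Man and Phil Coulson relationship is an assumption.

```python
from collections import deque

# (start, end) pairs; each relationship points from start to end.
edges = [
    ("Phil Coulson", "Director Fury"),  # WORKS_FOR, as described above
    ("Iron Man", "Phil Coulson"),       # direction assumed for this sketch
    ("Pepper Potts", "Iron Man"),
    ("Iron Man", "Pepper Potts"),
]


def shortest_path(edges, start, goal, directed=True):
    """Breadth-first search; returns a list of node names or None."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        if not directed:
            # Ignoring direction means every relationship can be walked both ways.
            neighbors.setdefault(b, []).append(a)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # no path exists


forward = shortest_path(edges, "Iron Man", "Director Fury")
reverse = shortest_path(edges, "Director Fury", "Iron Man")
undirected = shortest_path(edges, "Director Fury", "Iron Man", directed=False)
```

Here forward finds a path through Phil Coulson, reverse is None because Director Fury has no outgoing relationships, and undirected finds the same path walked backwards.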

5. Centrality

Centrality is an important metric of a node that indicates the power or influence it wields. One of the most common measures of centrality is the number of relationships that a node has. This metric is known as degree centrality.

Cypher makes it easy to determine node centrality. Run the below in the Power Tool Console.

And you should see:

Not surprisingly, given my laziness in defining relationships, Phil Coulson has the greatest degree centrality. There are many other methods for determining centrality but they are beyond the scope of this post. You can find more here.
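
Degree centrality as described in this section amounts to counting relationships per node. The following Python sketch shows that counting step only; it is not the Cypher query. The full relationship list from this post is not reproduced, so the Thor and Captain America entries are hypothetical stand-ins chosen to make Phil Coulson the hub.

```python
from collections import Counter

# (start, end) pairs. Only the first two come from relationships described
# in this post; the others are hypothetical examples.
relationships = [
    ("Phil Coulson", "Director Fury"),
    ("Iron Man", "Phil Coulson"),
    ("Thor", "Phil Coulson"),             # hypothetical
    ("Captain America", "Phil Coulson"),  # hypothetical
]


def degree_centrality(rels):
    """Count every relationship touching a node, incoming or outgoing."""
    degree = Counter()
    for start, end in rels:
        degree[start] += 1
        degree[end] += 1
    return degree


degree = degree_centrality(relationships)
top_node, top_degree = degree.most_common(1)[0]
```

With this sample data, top_node is Phil Coulson with a degree of 4. Counting only incoming or only outgoing relationships would give in-degree or out-degree instead.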

All Things Open

I expanded on these topics and more at All Things Open. It was a fantastic conference. If you get a chance, attend next year. You won’t regret it. Also, its sister conference, POSSCON, is equally rewarding.
