Vindya Hemali

An Insight into Working with Neo4j Graph Databases

Neo4j is a popular graph-style database application that is widely used by engineers around the world to store, organise and visualise data. This Tech How-To Guide, produced by expert developers at Mitra Innovation, explains how to work with Neo4j Graph Databases.

Neo4j allows organisations to visualise data and relationships between datasets through an intuitive, networked representation of information. Neo4j works especially well with complex relationships and highly connected data such as hierarchical data and recursive relationships.

Graphs

Typically, a graph is made up of three components: Nodes, Edges and Properties. Nodes represent entities, Edges represent the relationships between them, and Properties hold the details of each.

Here is an example:

Imagine a simple relationship between two Nodes, namely ‘Jack’ and ‘Jill’.

  • Jack is a ‘Friend’ of Jill
    • Where ‘Jack’ and ‘Jill’ are Nodes
    • ‘Friend’ is an edge from ‘Jack’, leading to ‘Jill’
    • ‘Name: Jack’ and ‘Name: Jill’ are properties of their respective Nodes.

(Fig 1: a simple relationship between Nodes)
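
The Jack and Jill example can also be sketched in a few lines of plain Python. This is only an illustrative in-memory model, not Neo4j code: the class names ‘Node’ and ‘Edge’ and the helper ‘describe’ are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Properties are stored as key-value pairs.
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    # An edge leads from a start node to an end node and has a kind.
    start: Node
    kind: str
    end: Node
    properties: dict = field(default_factory=dict)


jack = Node({"Name": "Jack"})
jill = Node({"Name": "Jill"})
friend = Edge(jack, "Friend", jill)


def describe(edge: Edge) -> str:
    """Render an edge as 'start -[kind]-> end' using the Name property."""
    return f"{edge.start.properties['Name']} -[{edge.kind}]-> {edge.end.properties['Name']}"


print(describe(friend))
```

The edge has a direction (from ‘Jack’ to ‘Jill’), and both nodes and edges can carry their own properties.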

 

Graph Databases can be used to create, retrieve, visualise and display information from the more complex databases that power large online systems such as e-commerce and e-learning platforms, publishing sites, social networks and many others. Graph Database tools such as Neo4j allow service platforms to quickly compare user behaviours and relationships and instantly deliver personalised recommendations to users, which is a driving feature of modern e-commerce sites and adtech platforms. Graph Databases also allow social networks to effectively recommend ‘friends’ with similar areas of interest to users.

Graph Databases, as a concept, move on from traditional relational databases such as MySQL, Oracle and PostgreSQL, where data is stored in the form of tables, rows and columns, to a modern, whiteboard-style representation of data. Moving forward from ‘Relational Databases’ to ‘Not Only Relational Database Management Systems’ has opened up avenues for developers to create innovative projects to visualise information and assess its worth in terms of customer and user planning.

(Table 1: A comparison of Relational Database Management Systems vs Graph Databases)

 

Using the Cypher Query Language (CQL), graph databases such as Neo4j allow engineers to quickly sift through large datasets, retrieve relationships between independent records and visualise interconnected networks of relationships between recorded entities.

Popularly acknowledged as one of the most effective Graph Database tools around, Neo4j provides a number of essential tools to assist developers.

The main features of Neo4j are as follows:

  1. ACID transactions for data integrity (Atomicity, Consistency, Isolation, Durability)
  2. Flexible Schema
  3. High Performance Query Execution
  4. Cypher Query Language
  5. Scale and Performance
  6. Advanced Causal Clustering
  7. Built-in Tooling and Visualisation
  8. Drivers for popular languages and frameworks
  9. Seamless data import
  10. Cloud-ready deployment
  11. Elastic Scalability
  12. In-Memory Page Cache
  13. Hot Backups

Neo4j also lists a number of impressive case studies in areas of engineering such as:

 

Neo4j provides a free online demo, which can be trialled before downloading the application.

 

Setting up Neo4j on Windows – a ‘How-to-guide’

Neo4j can be installed on Windows using either the desktop installer or the zip archive (run as a console application or as a Windows service). To use the desktop installer, follow the instructions below:

(Fig 2: Screenshot of Neo4j download page. We downloaded the ‘Community Edition’)

 

  • Select the version that is compatible with your operating system.

(Fig 3: Screenshot – download compatible version)

 

  • Double click on the downloaded file in the local disk to start the installation.

(Fig 4: Specify the installation location and click on ‘Next’ to proceed with the installation)

 

  • Accept the license agreement and proceed with the installation. After a successful installation, a shortcut to Neo4j will be available on the Start menu.

(Fig 5: Changing the database location)

 

  • The database location may be changed in the installation wizard. Once complete, start the server.

(Fig 6: Screenshot – Neo4j is ready for use via http://localhost:7474/)

 

  • The default database directory is populated and can be seen in the chosen location.
  • You may start working on Neo4j via http://localhost:7474/
  • Log in to the Neo4j server using the default username ‘neo4j’ and password ‘neo4j’. You will be prompted to set a new password on first login.

(Fig 7: Neo4j Community Edition after installation and ready to use)

 

How to Set Up Neo4j as an admin console application

(Fig 8: Extract the downloaded zip file and go to the Bin Directory)

 

  • In an admin console (a command prompt run as administrator), go to the bin directory and start the server using the command ‘neo4j console’.

(Fig 9: Starting Neo4j Server using the command ‘etc\bin>neo4j console’)

How to Set Up Neo4j as a Windows service
  • Extract the zip file.
  • Open an admin console so that Neo4j can be installed and started as a Windows service.
  • Go to the bin folder and run ‘neo4j install-service’, followed by ‘neo4j start’.

 

How to Set Up Neo4j in Ubuntu
  • Download the required Neo4j tar file for Linux from https://neo4j.com/download/other-releases/
  • Open up your terminal/shell.
  • Extract the contents of the archive using: tar -xf <filecode>. For example: tar -xf neo4j-enterprise-2.3.1-unix.tar.gz. The top-level directory of the extracted archive is referred to as NEO4J_HOME.
  • Run Neo4j using $NEO4J_HOME/bin/neo4j console. Instead of ‘neo4j console’, you can use ‘neo4j start’ to start the server process in the background.
  • Visit http://localhost:7474 in your web browser.
  • Change the password for the Neo4j account.

 

How to Set Up Neo4j as a service in Ubuntu
  • Switch to root mode in the terminal.
  • Add the repository key to the keychain:
  • # wget --no-check-certificate -O - https://debian.neo4j.org/neotechnology.gpg.key | apt-key add -
  • Add the repository to the list of apt sources:
  • # echo 'deb http://debian.neo4j.org/repo stable/' > /etc/apt/sources.list.d/neo4j.list
  • Finally, update the repository information and install Neo4j:
  • # apt update
  • # apt install neo4j
  • The server should start automatically and should also restart at boot. If necessary, the server can be stopped with: # service neo4j stop
  • It can be started again with: # service neo4j start
  • The database server can now be accessed via http://localhost:7474/browser/.

 

Understanding the main building blocks of Neo4j

Node: Nodes are the fundamental units of Graph Databases. Each Node contains properties as key-value pairs. In the example below, the Node ‘Employee’ contains the properties (empId: 1, name: Namal, age: 35) as key-value pairs.

(Fig 10: A node representing a record on Neo4j Graph Database)

 

Relationship: The links between Nodes are called ‘Edges’ or ‘Relationships’ and are fundamental building blocks of Graph Databases. A Relationship connects two Nodes, or connects a Node to itself in the case of a recursive relationship. A Relationship may have properties as key-value pairs, much like the attributes of an associative entity in a ‘many-to-many’ relationship in a relational model.

An example is given below:

(Fig 11: A relationship with properties, between ‘employee’ and ‘Project’ nodes)

 

Here, the ‘EMP_WORKS_FOR_PRO’ Relationship has an additional property, ‘workHour’. The property belongs to the Relationship rather than to either Node, because each employee can work on many projects and each project can have many employees, so the value is specific to each ‘Employee’ and ‘Project’ pair.
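
To see why ‘workHour’ sits on the Relationship, here is a minimal in-memory sketch in plain Python (not Neo4j code). The project names, the hour values and the helper ‘hours_by_project’ are hypothetical and chosen only for illustration.

```python
# Nodes: property maps, keyed by a local identifier used only in this sketch.
employees = {"e1": {"empId": 1, "name": "Namal", "age": 35}}
projects = {
    "p1": {"proName": "Alpha"},  # hypothetical project
    "p2": {"proName": "Beta"},   # hypothetical project
}

# Relationships: (start node, type, end node, properties).
# workHour differs per employee/project pair, so it lives on the relationship.
works_for = [
    ("e1", "EMP_WORKS_FOR_PRO", "p1", {"workHour": 20}),
    ("e1", "EMP_WORKS_FOR_PRO", "p2", {"workHour": 15}),
]


def hours_by_project(emp_key):
    """Return {project name: workHour} for one employee."""
    return {
        projects[end]["proName"]: props["workHour"]
        for start, _rel_type, end, props in works_for
        if start == emp_key
    }


print(hours_by_project("e1"))
```

Storing ‘workHour’ on either Node would force a single value per employee or per project, which cannot describe a many-to-many arrangement.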

Important: In addition to the above, Neo4j automatically generates a unique numeric ‘id’ for each Node and each Relationship.

Labels: Labels associate a common name with a set of Nodes. A Node can have one or more Labels. We can add new Labels to existing Nodes, and we may also remove existing Labels from them. A Relationship, by contrast, has exactly one type (such as ‘EMP_WORKS_FOR_PRO’), which is set when the Relationship is created.

Database Information

Clicking on the top left ‘database’ icon reveals database information. It will display brief information about the Node labels, Relationships, Keys, and how many Nodes and Relationships there are.

(Fig 12: Database information can be viewed by clicking on the top left ‘database’ icon)

 

Neo4j Data Browser

The Neo4j data browser can be accessed via ‘http://localhost:7474/browser/’. It is used to execute CQL commands and view their output. The top of the interface displays the CQL editor, marked with a ‘$’ sign. Queries can be executed by pressing the ‘Enter’ key or clicking the execute button, and the results are displayed in the view located below the editor.

(Fig 13: CQL command console on Neo4j)

 

  • The ‘Rows’ view displays the properties of the selected type of Nodes.

(Fig 14: The ‘Rows’ view displays properties for selected types of nodes)

 

  • The ‘Text’ view displays the text structure of Nodes.

(Fig 15: The ‘Text’ mode displays text structure of nodes)

  • The ‘Code’ view shows the full JSON generated by the query over the Bolt connection. The graph is generated from this JSON.

(Fig 16: ‘Code’ view displays full json code generated by query via bolt connection. This bit of json generates the graph)

 

Quick controls

The buttons at the top right of the view panel provide the Fullscreen, Collapse, Cancel and Export options. Query results may be exported as CSV, JSON or other available formats.

 

Colour coding your Nodes

Node colours and sizes can be changed by clicking on the Node name at the top of the view panel.

(Fig 17: Colour coding and resizing nodes)

 

Neo4j with Cypher Query Language (CQL)

Neo4j may be queried using the Cypher Query Language (CQL). CQL is a declarative pattern-matching language with an SQL-like syntax that is relatively simple and human-readable.

 

Neo4j CQL clauses

(Table 2: CQL Read Clauses)

(Table 3: CQL Write Clauses)

(Table 4: CQL General Clauses)

(Table 5: CQL Functions)

Neo4j CQL Data Types

CQL uses a set of data types similar to those of Java. You will need these data types when defining properties.

(Table 6: CQL Data Types and uses)

 

CQL operators, which are usually used as filters in ‘WHERE’ clauses, are given below.

Here is an example: WHERE n.age = 35

(Table 7: CQL Operator types)
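
The example filter on n.age = 35 can be read as a predicate that is tested against every candidate Node. The plain Python sketch below mimics that behaviour on an in-memory list. It is not Neo4j code, and the second and third records are hypothetical.

```python
# Candidate nodes, each represented by its property map.
nodes = [
    {"name": "Namal", "age": 35},
    {"name": "Kamal", "age": 28},  # hypothetical record
    {"name": "Nimal"},             # hypothetical record with no 'age' property
]

# Keep only the nodes whose 'age' equals 35. A node without the property
# does not pass the filter, much as a missing property does not satisfy
# an equality test in a WHERE clause.
matched = [n for n in nodes if n.get("age") == 35]

print([n["name"] for n in matched])
```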

 

CRUD Operations

This section will help you to understand the basic Create, Read, Update and Delete operations using CQL on Neo4j.

Create

  • Creating a node:
    • $ CREATE (n)

(Fig 18: Creating a node using CQL)

 

  • Creating a Node with a Label:
    • $ CREATE (node1:Test)

(Fig 19: Creating a Node with a Label using CQL language on Neo4J)

 

  • Creating multiple Nodes with unique Labels simultaneously:
    • $ CREATE (node1:Test), (node2:Test2), (node3:Test3)

(Fig 20: Using CQL to create multiple Nodes with unique Labels simultaneously)

 

  • Using the verification clause ‘RETURN’ to see results:
    • $ MATCH (n:Employee) RETURN n LIMIT 25

(Fig 21: Using CQL commands to query a result)

 

  • Creating Nodes with Properties:
    • $ CREATE (node1:Test {nodeId: 2, nodeName: 'sample', nodeDescription: 'testing'}) RETURN node1

(Fig 22: creating Nodes with Properties using CQL)
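
When such statements are produced from application code, the property values are usually passed as query parameters rather than typed into the statement. The sketch below, in plain Python, builds the statement text with ‘$name’ placeholders from a dictionary. The helper ‘create_statement’ is hypothetical and is not part of Neo4j or its drivers; it only produces a string and does not contact a server.

```python
def create_statement(variable, label, properties):
    """Build a CREATE statement with one $placeholder per property key.

    The variable, label and property keys are inserted into the text as-is,
    so they are assumed to be trusted identifiers; only the property values
    are meant to be supplied separately as parameters.
    """
    placeholders = ", ".join(f"{key}: ${key}" for key in properties)
    return f"CREATE ({variable}:{label} {{{placeholders}}}) RETURN {variable}"


params = {"nodeId": 2, "nodeName": "sample", "nodeDescription": "testing"}
statement = create_statement("node1", "Test", params)

print(statement)
```

The resulting text and the ‘params’ dictionary would then be handed together to whichever client library is in use.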

 

  • Setting Properties when creating:
    • $ CREATE (node1:Test) SET node1.name = 'test' RETURN node1

(Fig 23: setting Properties when creating Nodes on Neo4j using CQL)

 

  • Creating a Relationship:
    • $ CREATE (emp:Employee), (pro:Project), (emp)-[ew:EMP_WORKS_FOR_PRO]->(pro) RETURN emp, pro

(Fig 24: creating a Relationship between Nodes using CQL on Neo4j)

Read operations

The main read operation is the ‘MATCH’ clause, which is comparable to an SQL ‘SELECT’ with inner joins.

  • Query to return employees who work for projects:
    • $ MATCH (emp:Employee)-[ew:EMP_WORKS_FOR_PRO]->(pro:Project) RETURN emp, pro

(Fig 25: Using a CQL query to view employees who work in Projects in Neo4j)
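
The comparison with inner joins can be illustrated with a small in-memory sketch in plain Python (not Neo4j code). The second employee and the project name are hypothetical and included only to show which rows survive.

```python
# Employee and Project nodes, keyed by a local identifier.
employees = {1: "Namal", 2: "Kamal"}  # Kamal is hypothetical and has no project
projects = {10: "Alpha"}              # hypothetical project

# EMP_WORKS_FOR_PRO relationships as (employee id, project id) pairs.
works_for = [(1, 10)]

# Like an inner join, only employee/project pairs that are actually
# connected by a relationship appear in the result.
pairs = [
    (employees[emp_id], projects[pro_id])
    for emp_id, pro_id in works_for
    if emp_id in employees and pro_id in projects
]

print(pairs)
```

The employee with no relationship is absent from the result, just as a row with no matching partner is dropped by an inner join.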

 

  • Matching a specific Node:
    • $ MATCH (emp:Employee {name: 'Namal'}) RETURN emp

(Fig 26: Using CQL to view a specific Node in Neo4j)

 

Update operations
  • Values of Properties can be updated using the SET operator
  • Updating a given Property of a given Node:
    • $ MATCH (emp:Employee {name: 'Namal'}) SET emp.name = 'amal' RETURN emp

(Fig 27: Updating a Node using CQL on Neo4j)

 

Delete operations
  • Deleting a Node which has no relationship:
    • $ MATCH (n:Test {nodeId: 2}) DELETE n

(Fig 28: Deleting a non related Node using CQL on Neo4j)

 

  • Deleting a relationship:
    • $ MATCH (emp:Employee)-[ew:EMP_WORKS_FOR_PRO]->(pro:Project) DELETE ew

(Fig 29: Deleting a Relationship using CQL)

 

  • To delete a Node which is connected to another Node, use the ‘DETACH DELETE’ clause, which removes the Node together with its Relationships.
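
The difference between a plain delete and a detaching delete can be sketched with a small in-memory model in plain Python. This is an illustration of the rule only, not Neo4j code: the ‘Graph’ class and its method names are assumptions made for this sketch.

```python
class Graph:
    """A toy graph holding node names and (start, type, end) relationships."""

    def __init__(self):
        self.nodes = set()
        self.relationships = set()

    def delete(self, node):
        # A plain delete is refused while the node still has relationships.
        if any(node in (start, end) for start, _type, end in self.relationships):
            raise ValueError(f"{node} still has relationships")
        self.nodes.discard(node)

    def detach_delete(self, node):
        # A detaching delete removes the node's relationships first.
        self.relationships = {
            (start, rel_type, end)
            for start, rel_type, end in self.relationships
            if node not in (start, end)
        }
        self.nodes.discard(node)


graph = Graph()
graph.nodes.update({"emp", "pro"})
graph.relationships.add(("emp", "EMP_WORKS_FOR_PRO", "pro"))

try:
    graph.delete("emp")
    plain_delete_blocked = False
except ValueError:
    plain_delete_blocked = True

graph.detach_delete("emp")
print(plain_delete_blocked, graph.nodes, graph.relationships)
```

The plain delete fails while the relationship exists, whereas the detaching delete leaves only the unconnected ‘pro’ node behind.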

 

Summary

In this article we have attempted to explain the fundamentals of Neo4j so that readers have enough understanding to start practising with it. We hope this has been useful. If you have any questions, let us know in the comments section.

Also, check out these links for more information about Neo4j:

https://neo4j.com/developer/example-project/

https://neo4j.com/news/facebookgroup-demo/

Vindya Hemali

Software Engineer | Mitra Innovation