Thursday, 31 August 2017

How to Deploy Neo4j on Microsoft Azure, a Step-by-Step Guide

Learn more about deploying Neo4j on Microsoft Azure in this guide for graph databases in the cloud

In the first post of this series covering Neo4j on Microsoft Azure, we announced the availability of Neo4j in the Azure Marketplace. Today, we’ll go into more detail on how to deploy Neo4j in your Azure environment.

Step 1: Sign-Up for an Azure Account


To deploy Neo4j from the Azure Marketplace, you need an Azure account. At the time of this writing, Azure is running a $200 promotion for signing up for a free account.

The Azure Marketplace account sign-up page

Step 2: Launch the Neo4j Enterprise Edition Template


Once you are logged in to your Microsoft Azure account, you can search for the Neo4j Enterprise Edition template by Neo Technology, Inc. Clicking on “Continue” will take you to your Azure portal.

Neo4j Enterprise Edition in the Microsoft Azure Marketplace


The first page you will see explains the licensing model (paid subscription) and provides useful links to training videos, the Neo4j website and the topology. Here we use a model called Bring Your Own License (BYOL), which means you need to contact Neo Technology and purchase a license to use Neo4j.

If you do not have an existing agreement with Neo4j, a 30-day trial period will commence. You will need to remove and/or decommission your Neo4j instances after the 30-day period, or you will need to contact us to purchase a Neo4j license or extend your trial period.

As discussed in my earlier blog post, the currently offered template will provision a High Availability (HA) cluster with a minimum of three Azure virtual machines (VMs).

Later this year, we will introduce support for Causal Clustering, which adds options for ultra-large clusters and a wider range of cluster topologies.

The Neo4j High Availability (HA) template in the Azure Marketplace

Step 3: Configure Basic Settings


On the first page of the template, you create an account for accessing the provisioned compute instances (the VMs that will be running Neo4j) and select the region where you want the cloud resources deployed.

Microsoft Azure basic settings configuration

Step 4: Configure Neo4j Settings


On the second page, you get an opportunity to customize your Neo4j deployment. Here you can choose which version of Neo4j to deploy.

You need to provide an initial password for the DBMS admin user called neo4j. Additionally, you can upload an SSL certificate to ensure all data in transit (if accessing via the Bolt protocol remotely) is encrypted with your certificates; otherwise a self-generated certificate will be used.

The template allows you to choose the size of the cluster (number of instances), with a minimum of three VMs. For scaling reads or higher availability, you can elect to run larger clusters.

Next, you need to select the Azure VM size on which to run Neo4j. I typically choose a DS2 v2 machine since it has 7GB of RAM, which is usually enough for the datasets I work with. If you are new to Azure, note that you may get an error notifying you that you have exceeded the maximum number of cores allowed to be provisioned (for me it was 10). Contact Azure support if you need this limitation removed.

When considering production use, evaluate the size of your dataset and provision a machine that has at least 30% more RAM than the dataset size to ensure all data can be loaded and cached in memory. A VM with 8-16 GB of RAM can handle graphs with hundreds of millions of primitives, and a VM with 16-64 GB can handle billions of primitives. We recommend VMs with SSDs for better performance with much larger graphs on less RAM.
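
The 30% headroom guideline above can be turned into a quick back-of-the-envelope check. The following is only a minimal sketch: the function name `recommended_ram_gb` and the 5 GB example dataset are hypothetical and are not part of the Azure template or any Neo4j tooling.

```python
import math


def recommended_ram_gb(dataset_gb: float, headroom_pct: int = 30) -> int:
    """Return the smallest whole number of GB that is at least
    `headroom_pct` percent larger than the dataset size.

    Hypothetical helper for illustration only.
    """
    if dataset_gb < 0:
        raise ValueError("dataset_gb must be non-negative")
    return math.ceil(dataset_gb * (100 + headroom_pct) / 100)


# Hypothetical example: a 5 GB dataset needs at least 6.5 GB, rounded up to 7.
print(recommended_ram_gb(5))
```

Treat the result as a lower bound when comparing VM sizes; the right amount of RAM still depends on your workload.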

Neo4j settings for Microsoft Azure


Finally, you need to define a virtual network. The default is 10.0.0.0/24. You have flexibility in specifying the subnet for clustered VMs (the default uses the entire virtual network range). Here you can also define a public IP endpoint. If you choose “None,” then the Neo4j cluster will be deployed without a load balancer.
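
If you plan to change the default range, it can help to check your numbers before entering them in the template. Here is a small sketch using Python's standard `ipaddress` module; the narrower `10.0.0.0/28` subnet is a hypothetical value chosen for illustration, not something the template suggests.

```python
import ipaddress

# Default virtual network range mentioned in the template.
vnet = ipaddress.ip_network("10.0.0.0/24")

print(vnet.num_addresses)  # total number of addresses in the range
print(vnet[0], vnet[-1])   # first and last address of the range

# Hypothetical narrower subnet for the clustered VMs (assumption for illustration).
subnet = ipaddress.ip_network("10.0.0.0/28")

# A subnet must fall entirely inside the virtual network range.
print(subnet.subnet_of(vnet))
```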

After clicking “OK,” the Azure Marketplace will run an automatic validation of all your inputs, and if everything is acceptable, you will have the option to download and save this configuration for future use.

Neo4j automatic validation on Microsoft Azure

Step 5: Agree to the License Agreement and Deploy Neo4j


The last page will review the licensing agreement and requires a valid Neo4j license to continue. Please note that you will need to remove and/or decommission your Neo4j instances after 30 days or contact us to purchase a Neo4j license or to extend your trial period.

Clicking on “Purchase” will start the deployment and provisioning process.

License agreement for Neo4j Enterprise Edition on the Microsoft Azure Marketplace


That’s it! Now Azure will provision all required resources defined in the template: VMs, private and public IP addresses, load balancers, etc.

The Microsoft Azure deployment dashboard


Once the template has been successfully deployed, you can find the public IP address by clicking on the “Public IP” resource. This is the address where you can connect to Neo4j.

Finding a public IP address for Neo4j on Microsoft Azure


To access the Neo4j Browser, navigate to http://{configured public ip address}:7474/. For the Bolt URI use: bolt://{configured public ip address}:7687.
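
As a convenience, the two addresses above can be built from the public IP in a few lines. This is just a sketch: the helper name `neo4j_endpoints` is hypothetical, and `203.0.113.10` is a placeholder from the documentation address range, not a real deployment.

```python
def neo4j_endpoints(public_ip: str) -> dict:
    """Build the Neo4j Browser URL and Bolt URI for a given public IP.

    Hypothetical helper; the ports (7474 and 7687) are the ones
    given in this guide.
    """
    return {
        "browser": f"http://{public_ip}:7474/",
        "bolt": f"bolt://{public_ip}:7687",
    }


# Placeholder address; replace it with the public IP from your deployment.
endpoints = neo4j_endpoints("203.0.113.10")
print(endpoints["browser"])
print(endpoints["bolt"])
```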

And lastly, SSH is available on port 22000 + instance_id. For a three-instance cluster, the open SSH ports would be 22000, 22001 and 22002.
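
The port rule is easy to compute for any cluster size. The sketch below assumes instance IDs start at 0, which matches the three-instance example; the function name `ssh_ports` is hypothetical.

```python
SSH_BASE_PORT = 22000  # base port given in this guide


def ssh_ports(cluster_size: int) -> list:
    """Return the SSH port for each instance, assuming instance IDs
    run from 0 to cluster_size - 1 (an assumption for illustration)."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return [SSH_BASE_PORT + instance_id for instance_id in range(cluster_size)]


print(ssh_ports(3))  # [22000, 22001, 22002]
```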


Want to take your Neo4j skills up a notch? Take our advanced online training class, Neo4j in Production, and learn how to scale the world’s leading graph database to unprecedented levels.
