Have you ever wanted an ultra-fast in-memory database and a real-time visualisation tool where you can create charts from your SQL query results?
Now you can! It can all be done with open source software, and setting it up takes minimal effort.
In a nutshell:
- Install Docker (https://www.docker.com)
- Download a Docker image that contains Spark + Zeppelin
- Execute the Zeppelin example
Install Docker
If you work in IT and do not know what Docker is, I recommend finding out on the Docker website (https://www.docker.com).
The Docker installation process is very straightforward. If you are using a Mac or Windows 10, I recommend installing the beta version, which is much more efficient. On Windows, the beta version requires Windows 10.
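Before moving on, it can help to confirm that the installation worked. The snippet below is only a minimal sketch: check_docker is a hypothetical helper name I made up for this post, and it relies on nothing more than the standard docker --version and docker info commands.

```shell
# check_docker is a hypothetical helper (not part of Docker itself).
# It reports whether the docker CLI is on PATH and whether the daemon answers.
check_docker() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker CLI not found on PATH"
    return 1
  fi
  # Print the installed client version.
  docker --version
  # "docker info" fails when the daemon is not running or not reachable.
  if docker info >/dev/null 2>&1; then
    echo "docker daemon is running"
  else
    echo "docker CLI found, but the daemon is not reachable"
    return 2
  fi
}

check_docker || echo "Docker is not ready yet (exit code $?)"
```

If the last line prints a "not ready" message, start Docker and run the check again before pulling the image.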
Download the Docker image
Once you have Docker installed, download the image from: https://github.com/dylanmei/docker-zeppelin
As mentioned on the web page, you only need these two commands:
docker pull dylanmei/zeppelin
docker run --rm --name zeppelin -p 8080:8080 dylanmei/zeppelin
The first command can take three minutes or more, depending on your internet connection speed. In my case, it took four minutes.
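The run command maps port 8080 of the container to port 8080 on your machine, and Zeppelin needs a little time to start before that port answers. If you would rather not keep refreshing the browser, a small polling loop can tell you when it is ready. This is a sketch under assumptions: wait_for_url is a hypothetical helper name, the attempt count and delay are arbitrary defaults I picked, and it assumes curl is installed.

```shell
# wait_for_url is a hypothetical helper: poll a URL until it answers
# successfully or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  url="$1"
  attempts="${2:-30}"   # assumed default: 30 tries
  delay="${3:-2}"       # assumed default: 2 seconds between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each try.
    if curl -fsS --max-time 5 -o /dev/null "$url" 2>/dev/null; then
      echo "up after $i attempt(s): $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "no response from $url after $attempts attempt(s)"
  return 1
}
```

In a second terminal, while the container is starting, you would call it as: wait_for_url http://127.0.0.1:8080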
Execute the Zeppelin example
After executing the second docker command, which runs the image, you will see something like this:
Open your browser to this address: http://127.0.0.1:8080. Use Chrome or Firefox.
And you should see something like this:
Click on the Zeppelin Tutorial note.
Click Save; this gives the notebook the dependencies it needs.
Then click the "play" icon for the first part, or "paragraph" as it is called in Zeppelin.
As you can see, this will create a table called "bank" (1) from a text file located in (2).
Now you can start to explore the data:
Feel free to execute any of the charts, or alter the SQL statements to start exploring the data you just loaded.
In my next post I will write about using more advanced features.




