Spark on Mesos

A new recipe

A recipe for a nice blend of Spark on Mesos. Preparation time: 10 minutes, using the recipe below. Afterwards, put it in a hot stove for 20 minutes and enjoy! šŸ˜›


This blog post covers the use of Spark in combination with Mesos and aims to explain what Mesos is and how it differs from the commonly used YARN resource manager.

This blog post will answer the following questions:


  • What is Mesos?
  • How does it work?
  • What are the differences with respect to YARN?
  • What are the steps to set it up and run a test Spark job?

What is Mesos

Mesos is a distributed systems kernel that provides APIs to applications (such as Hadoop, Spark and others) for resource management and scheduling across cloud environments and datacenters.

YARN (Map-Reduce version 2) vs Mesos

Mesos is similar to other resource managers and job schedulers, such as Map-Reduce version 1 and YARN (Map-Reduce version 2, Yet Another Resource Negotiator), in that it manages cluster resources and schedules jobs. However, there are some notable differences:

  • Mesos uses resource offers, i.e. it abstracts resource allocation and offers a chunk of the available resources to a framework to run its job.
  • Mesos is more performant as its core does not require a JVM, avoiding the (memory) overhead that comes with it.
  • Mesos is more flexible in that it abstracts at a higher level and therefore supports a lot of different applications (frameworks) such as Hadoop, Spark, etc., with the possibility to define your own custom application (framework).
  • YARN is much more mature (currently at version 2.7.2, while Mesos is at 0.28.2), but much less cleanly separated from Hadoop than Mesos;
  • YARN is limited to Hadoop frameworks / applications, and can therefore only run Map-Reduce and YARN workloads, while Mesos is able to run any application built on one of its frameworks.

For more information about the Mesos architecture, see:


In order to set up all the parts to run Spark on Mesos, we perform the following steps.

Add the key

First we add the package key and determine the distribution and codename:

sudo apt-key adv --keyserver --recv E56151BF
DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
CODENAME=$(lsb_release -cs)
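To see what will end up in the repository line, we can echo the two variables. A minimal sketch with fixed stand-in values (the real values come from lsb_release and depend on your system; "Ubuntu"/"trusty" are just example inputs):

```shell
# Stand-ins for lsb_release output; on e.g. Ubuntu 14.04 the real
# commands yield "Ubuntu" and "trusty" respectively
DISTRO=$(echo "Ubuntu" | tr '[:upper:]' '[:lower:]')
CODENAME="trusty"
echo "distro=${DISTRO} codename=${CODENAME}"
# distro=ubuntu codename=trusty
```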

Add the repository

Add the repo using:

echo "deb${DISTRO} ${CODENAME} main" | \
sudo tee /etc/apt/sources.list.d/mesosphere.list
sudo apt-get -y update
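With the repository in place, the Mesos package itself still needs to be installed. A sketch, assuming the Mesosphere packaging (which on Ubuntu also pulls in ZooKeeper as a dependency):

```shell
# Install Mesos from the Mesosphere repository added above
sudo apt-get -y install mesos
```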


Configure ZooKeeper

Configure and start ZooKeeper using the steps found at:

Start Mesos

Start Mesos using:

sudo service mesos-slave start
sudo service mesos-master start
ps aux | grep mesos

The output now shows both the slave and the master running.
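The master and slave find each other through ZooKeeper. With the Mesosphere packages, the connection string they both read is assumed to live in /etc/mesos/zk (the default location for these packages):

```shell
# ZooKeeper URL used by both mesos-master and mesos-slave
# (file location per the Mesosphere packaging)
cat /etc/mesos/zk
# typically: zk://localhost:2181/mesos
```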

Launch the ZooKeeper client

We launch the ZooKeeper client using:

sudo sh /<path_to_zookeeper>/bin/

The path is typically:


List Mesos znodes

We list the Mesos znode(s) using:

ls /mesos

Get contents

We get the contents using:

get /mesos/json.info_0000000000 # or higher number

Mesos UI

The Mesos user interface should now be running at:

In their respective tabs we verify that at least one slave is available. We can also determine the version of Mesos; for this tutorial we used version 0.28.2.
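The same information shown in the UI can also be fetched from the master's JSON state endpoint, which is handy for scripting. A sketch, assuming the master runs on localhost with the default port 5050:

```shell
# Query the master's state and pull out the version field
curl -s http://localhost:5050/master/state.json | grep -o '"version":"[^"]*"'
```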

Add Hadoop to the mix

In order to host Spark somewhere reachable for all components, do:


And extract it using:

tar -zxvf hadoop-2.6.4.tar.gz
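Before files can be put on HDFS, the filesystem needs a minimal configuration and a running NameNode and DataNode. A single-node sketch; the fs.defaultFS host and port are assumptions for a local setup:

```shell
# Minimal single-node HDFS configuration
cat > hadoop-2.6.4/etc/hadoop/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

# Format the namenode once, then start HDFS
hadoop-2.6.4/bin/hdfs namenode -format
hadoop-2.6.4/sbin/start-dfs.sh
```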

Add the Spark framework

We download the latest stable release of Spark, which at the time of writing is version 1.6.1, using:


And extract it using:

tar -zxvf spark-1.6.1.tgz

Host Spark on HDFS

To host Spark on HDFS and make it available through a URI that can be provided to all the components of this yummy recipe, we perform:

./<path_to_hadoop_bin_folder>/hadoop fs -put <path_to>/spark-1.6.1.tgz <optional_path_on_hdfs>

Build custom Spark

We need to build a custom Spark distribution for Mesos (without YARN, which is included by default) using:

./<path_to_spark>/ --tgz

Spark shell with Mesos

Now we use Spark with Mesos via the spark-shell:

./<path_to_spark>/bin/spark-shell --master mesos://host:5050
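The Mesos executors need to know where to fetch the Spark distribution we hosted on HDFS; this is what the spark.executor.uri property is for. A sketch where the HDFS host, port and path are placeholders matching the earlier steps:

```shell
./<path_to_spark>/bin/spark-shell --master mesos://host:5050 \
  --conf spark.executor.uri=hdfs://localhost:9000/<optional_path_on_hdfs>/spark-1.6.1.tgz
```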

And submit a sample Spark job to Mesos:


./<path_to_spark>/bin/run-example SparkPi 10

This in turn uses spark-submit.
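The run-example script is a thin wrapper; the equivalent direct spark-submit call looks roughly like this (the examples jar name depends on how Spark was built, so treat it as an assumption):

```shell
./<path_to_spark>/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://host:5050 \
  <path_to_spark>/lib/spark-examples-1.6.1-hadoop2.6.0.jar 10
```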


The output of the Spark job is shown here:



Shots fired!


Mesos running the Spark job

Spark running the SparkPi job using Mesos

Some sources

The Mesos binaries and setup steps can be found at the downloads page: select Apache Mesos, then the get started button.


So there you have it, a nice blend of Spark on Mesos! Enjoy!