Hadoop Ozone part 2: tutorial and getting started of its features

The releases of Hadoop Ozone come with a handy docker-compose file to try out Ozone. The below instructions provide details on how to use it. You can also use the Katacoda training sandbox which provides a configured running environment.

This article is the second part of a three-part series presenting Ozone, demonstrating its usage and proposing an advanced replication strategy using Copyset.

At the time of writing, the latest release is 0.3.0-alpha. It still lacks security features (Kerberos and any kind of ACL) but it already implemented the S3 protocol. Let’s try it out!

Download and install

Download and extract the latest release here.

curl -LO http://mirror.ibcp.fr/pub/apache/hadoop/ozone/ozone-0.3.0-alpha/hadoop-ozone-0.3.0-alpha.tar.gz
tar xzf hadoop-ozone-0.3.0-alpha.tar.gz
cd ozone-0.3.0-alpha

The easiest way to boot up your first Ozone cluster is to use the docker-compose file provided in the release.

cd compose/ozones3
docker-compose up -d

This will start an Ozone Manager, a Storage Container Manager and a DataNode

Using the Ozone CLI

The Ozone CLI is very convenient to start playing with our cluster. We can use it from container since it provides an already configured environment (namely a ozone-site.xml file)

docker-compose exec datanode bash
# Create a volume name myvolume, for user homer with a quota of 1TB
ozone sh volume create --quota=1TB --user=homer /myvolume
# Create a bucket in that volume name mybucket
ozone sh bucket create /myvolume/mybucket
# Create a file and place under the myfile
echo 'Ozone is great!' > /tmp/ozone.txt
ozone sh key put /myvolume/mybucket/ozone.txt /tmp/ozone.txt
# Get our file
ozone sh key get /myvolume/mybucket/ozone.txt /tmp/getozone.txt
cat /tmp/getozone.txt
## Ozone is great!

As you could see, the Ozone CLI is very intuitive to use. Just a combination of ozone sh object action url. By default Ozone CLI uses the RPC protocol to connect with the local Ozone Manager (in ozone-site.xml).

Using HTTP and REST

The Ozone Manager is exposing its HTTP server on port 9874 and its RPC server on port 9862. The same operations can be made on both protocols. A WebUI is exposed by the HTTP server which you can browse on http://localhost:9874 (but there is not much to see yet).

To be more explicit, you can switch between the RPC and HTTP protocol by changing the scheme and port of you file URI:

# Get a file trough the Ozone CLI in RPC format
ozone sh key get o3://ozoneManager:9862/myvolume/mybucket/ozone.txt /tmp/getozone.txt
# Get a file through the Ozone CLI using http / REST
ozone sh key get \
  http://ozoneManager:9874/myvolume/mybucket/ozone.txt \
  /tmp/getozone.txt
# Get a file using REST directly (with wget)
curl -i \
  -H "x-ozone-user: homer" \
  -H "x-ozone-version: v1" \
  -H "Date: Mon, 22 Apr 2019 18:23:30 GMT" \
  -H "Authorization:OZONE" \
  "http://datanode:9880/myvolume/mybucket/ozone.txt"
#HTTP/1.1 200 OK
#Content-Type: application/octet-stream
#x-ozone-server-name: a037e9952722
#x-ozone-request-id: 4104582c-bee4-471f-b70f-f862a72d1953
#Date: Mon, 22 Apr 2019 20:50:41 GMT
#Content-Length: 20
#Connection: keep-alive
#
#Ozone is great!

Note, the Ozone CLI connect to the Ozone Manager HTTP port while the HTTP endpoint, for example using wget or curl, connect to the DataNode port. Why?

The REST API is only deployed on the DataNodes, exposing the namespace of the only Ozone Manager defined in its configuration. The Ozone Manager has a /serviceList endpoint which lists all the DataNodes in the cluster. The CLI randomly choses one DataNode to process the request. The DataNode simply connect to its Ozone Manager using RPC format… Weird behaviour but it is still in beta.

Let’s see what happens if we have multiple Ozone Manager running against the same DataNodes:

docker-compose scale ozoneManager=2
docker-compose exec datanode bash
# Create the volume /myvolume on both Ozone Manager
ozone sh volume create o3://ozoneManager1:9862/myvolume
ozone sh volume create o3://ozoneManager2:9862/myvolume
# It worked! Now delete it using http on the first one
ozone sh volume delete http://ozoneManager1:9874/myvolume
# Great! Now delete the bucket on the second one, should work
ozone sh volume delete http://ozoneManager2:9874/myvolume
## Exception org.apache.hadoop.ozone.client.rest.OzoneException: Delete Volume failed, error:VOLUME_NOT_FOUND

What happened? In both delete commands, it was the same DataNode who was asked to delete the volume. The DataNode only knows the first Ozone Manager. On the first Ozone Manager the volume was already deleted and raised the exception.

This behaviour is fixed when using the S3 protocol, which is easy to use and connects to the Ozone Manager using the S3 Gateway component. In short: do not use REST, it is buggy and it does not look like it will be fixed anytime soon: use S3.

Using S3

S3 is a bit complicated to use with Ozone: S3 has no notion of volumes. A mapping has to be made between an S3 bucket and an Ozone bucket, for example “/myvolume/mybucket”.

# For this we will need at least 3 datanodes
docker-compose scale datanode=3
# Create the S3 bucket name mys3bucket
aws s3api --endpoint-url http://localhost:9878 create-bucket --bucket=mys3bucket
# Place a file under the bucket
aws s3 cp --endpoint-url http://localhost:9878 ozone.txt s3://mys3bucket/ozone.txt
# Get the file 
aws s3 cp --endpoint-url http://localhost:9878 s3://mys3bucket/ozone.txt ozone.txt

Working great so far! Now here comes the tricky part. Since S3 has no notion of volumes, how to gain access to a file using the Ozone CLI. As of right now, the mapping is made as follow:

Ozone volume	Ozone bucket	S3 bucket
myvolume	mybucket	N/A
s3$AccessKeyLowercase	mys3bucket	mys3bucket (and $AccessKey configured)

The access key is the identification key used by every AWS protocol request. In the next snippet the Access Key configured in my ~/.aws/crendentials is “myAccessKey”. There is no need of a secret key since security is not yet implemented.

aws s3api --endpoint-url http://localhost:9878 create-bucket --bucket=mynewbucket
aws s3 cp --endpoint-url http://localhost:9878 ozone.txt s3://mynewbucket/ozone.txt
# And get the file using ozone cli
ozone sh key get /s3myaccesskey/mynewbucket/ozone.txt /tmp/s3ozone.txt

Using OzoneFS

OzoneFS is the Hadoop compatible Filesystem provided by Ozone. Let’s use it on the hdfs CLI. We will see that with no code change, we will be able to use the tool as we would on a HDFS Cluster. All this can be tested using the compose file under compose/ozonefs

First, we will add a new filesystem type to the core-site.xml:

<property>
  <name>fs.o3fs.impl</name>
  <value>org.apache.hadoop.fs.ozone.OzoneFileSystem</value>
</property>
<property>
  <name>fs.defaultFS</name>
  <value>o3fs://localhost:9864/volume/bucket</value>
</property>

We then need to add the ozone file system “.jar” file to the HADOOP_CLASSPATH.

export HADOOP_CLASSPATH=/opt/ozone/share/hadoop/ozonefs/hadoop-ozone-filesystem.jar:$HADOOP_CLASSPATH

Using the compose file:

cd compose/ozonefs
docker-compose up -d
sleep 10
docker-compose exec datanode ozone sh volume create /volume
docker-compose exec datanode ozone sh bucket create /volume/bucket
docker-compose exec hadooplast bash
echo "Hello Ozone from HDFS!" > /tmp/hdfsOzone.txt
hdfs dfs put /tmp/hdfsOzone.txt /hdfsOzone.txt
hdfs dfs ls /
exit
docker-compose exec datanode ozone sh keys list /volume/bucket

Dive on the inside with `scmcli`

Last but not least, the Ozone CLI contains a storage container command which happened to be very useful to see what’s happening ‘on the inside’.

To see it in action, let’s follow these instructions: place a large file (1.2GB) on a cluster with a container size set to 1GB, a replication factor set to 1 and a provision batch size set to 1 (it will only allocate containers one at time).

Once done:

Ozone sh key info /myvolume/mybucket/mylargekey
{
  "version" : 0,
  "md5hash" : null,
  "createdOn" : "Tue, 23 Apr 2019 14:48:57 GMT",
  "modifiedOn" : "Tue, 23 Apr 2019 14:49:09 GMT",
  "size" : 1200000000,
  "keyName" : "mylargekey",
  "type" : null,
  "keyLocations" : [ {
    "containerID" : 1,
    "localID" : 101976043538546689,
    "length" : 134217728,
    "offset" : 0
  }, {
    "containerID" : 1,
    "localID" : 101976043642945538,
    "length" : 268435456,
    "offset" : 0
  }, {
    "containerID" : 1,
    "localID" : 101976043780112387,
    "length" : 268435456,
    "offset" : 0
  }, {
    "containerID" : 1,
    "localID" : 101976044047564804,
    "length" : 268435456,
    "offset" : 0
  }, {
    "containerID" : 2,
    "localID" : 101976044178833413,
    "length" : 260475904,
    "offset" : 0
  } ]
}

We can see that 2 containers were allocated. Each chunks tries to be as large as one block in our block storage (256MB).

ozone scmcli list -s 0
{
  "state" : "OPEN",
  "replicationFactor" : "ONE",
  "replicationType" : "RATIS",
  "allocatedBytes" : 939524104,
  "usedBytes" : 939524104,
  "numberOfKeys" : 5,
  "lastUsed" : 112932039,
  "stateEnterTime" : 112782013,
  "owner" : "9a24f5f1-8f2e-47a6-938e-320b6e693f95",
  "containerID" : 1,
  "deleteTransactionId" : 0,
  "containerOpen" : true
}
{
  "state" : "OPEN",
  "replicationFactor" : "ONE",
  "replicationType" : "RATIS",
  "allocatedBytes" : 260475904,
  "usedBytes" : 260475904,
  "numberOfKeys" : 1,
  "lastUsed" : 112932039,
  "stateEnterTime" : 112862371,
  "owner" : "9a24f5f1-8f2e-47a6-938e-320b6e693f95",
  "containerID" : 2,
  "deleteTransactionId" : 0,
  "containerOpen" : true
}

The containers are not full yet. Let’s place another file and look at the containers state:

ozone scmcli list -s 0
{
  "state" : "CLOSED",
  "replicationFactor" : "ONE",
  "replicationType" : "RATIS",
  "allocatedBytes" : 1073741832,
  "usedBytes" : 1073741832,
  "numberOfKeys" : 6,
  "lastUsed" : 114971098,
  "stateEnterTime" : 112782013,
  "owner" : "9a24f5f1-8f2e-47a6-938e-320b6e693f95",
  "containerID" : 1,
  "deleteTransactionId" : 0,
  "containerOpen" : false
}
{
  "state" : "CLOSED",
  "replicationFactor" : "ONE",
  "replicationType" : "RATIS",
  "allocatedBytes" : 1326258176,
  "usedBytes" : 1326258176,
  "numberOfKeys" : 5,
  "lastUsed" : 114971098,
  "stateEnterTime" : 112862371,
  "owner" : "9a24f5f1-8f2e-47a6-938e-320b6e693f95",
  "containerID" : 2,
  "deleteTransactionId" : 0,
  "containerOpen" : false
}

The algorithm on the pipeline decided to over-allocate on one container, and then close it instead of creating a third container.

If we had set the replication factor to three, we would still have seen 2 containers. The container is itself replicated across DataNodes. Here the replication type is Ratis meaning it would replicate on three random DataNodes and the RATIS algorithm would guarantee the consistency of the data.

Conclusion

We have seen how to deploy a simple Ozone cluster and what Ozone is capable of. We have also discovered some limitations that are probably due to the fact that Ozone is still in development. Lastly we saw some tools that allows us to see what’s happening on the inside of an Ozone cluster.

On the last part of this post we will discuss more advanced replication strategy that will be available in Ozone.

Share this article