We succeeded in building a multi-node Hadoop cluster using only LXD Linux containers,
i.e. all Name/Data Nodes were built inside four LXD containers (one Name Node, three Data Nodes) hosted in a single Azure Ubuntu VM. Here are the major points of the implementation.
For scalability, I've chosen a standard Azure Ubuntu VM (8-core CPU, 14 GB RAM, 4x250 GB data disks) with a ZFS-RAID file system, i.e. I can scale up the Azure VM hardware at any time. To scale up the storage, I can dynamically add Azure storage disks on demand and add them to the ZFS pool, which immediately makes the additional space available to all running containers. Cool, huh? :)
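As a rough sketch of what that looks like (sdg here is just a placeholder device name for a newly attached Azure data disk), growing the pool is a one-liner:
#Hypothetical example: sdg is the newly attached Azure data disk
sudo zpool add ZFS-LXD-Pool sdg
sudo zpool list ZFS-LXD-Pool #the extra capacity is visible immediately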
Also, with the ZFS file system you can clone or snapshot your file system at any moment. You can also hot-add additional storage at any time, and it is immediately visible to the OS while it is running. ZFS provides both RAID-0-style striping (which I've chosen) and RAID-Z provisioning.
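For example, taking a snapshot and a writable clone of the LXD dataset looks roughly like this; the snapshot name, clone name and clone mountpoint are just placeholders:
sudo zfs snapshot ZFS-LXD-Pool/LXD/var/lib@checkpoint-1
sudo zfs clone -o mountpoint=/mnt/lxd-clone ZFS-LXD-Pool/LXD/var/lib@checkpoint-1 ZFS-LXD-Pool/LXD/var-lib-clone
sudo zfs list -t snapshot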
1. Provisioned the Ubuntu VM in Azure and installed the LXDE and RDP packages to get a remote desktop on my local machine
2. Installed the ZFS tools and created a new ZFS pool containing the 4x250 GB disks (1 TB)
4. Mounted the ZFS pool so that the LXD directories reside inside ZFS
5. Installed LXD and updated the networking settings (changed the LAN subnet IPs)
6. Pulled the Ubuntu Trusty AMD64 image from the LXD image repository
7. Created the first LXD container (HDNameNode-1) and configured a Hadoop single-node cluster in it
8. Snapshotted the ZFS file system to keep my Hadoop single-node cluster for later retrieval if required
9. Updated the HDNameNode-1 container with the multi-node configuration that every node in the cluster shares (see the configuration sketch after this list).
(I've used tutorials 1 & 2. Tutorial 2 targets Hadoop 1.x, so I had to rely on tutorial 1 for the Hadoop 2.x multi-node configuration using YARN.)
10. Cloned the HDNameNode-1 container using the LXD copy utility to get three more containers that act as Data Nodes
(HDDataNode-1, HDDataNode-2, HDDataNode-3)
11. Updated the IP and network settings for the Name/Data Nodes as desired
12. Updated the HDNameNode-1 container with the Name-Node-specific multi-node configuration
13. Updated all Data Node containers with the Data-Node-specific multi-node configuration
14. Restarted all containers and started Hadoop on the Name Node, which in turn brought up the Data Nodes
15. Now I'm running a 4-node Hadoop cluster
16. I can clone any Data Node at any time to add more Data Nodes in the future
17. Snapshotted my ZFS pool for disaster recovery
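To make steps 9, 12 and 13 a bit more concrete, here's a rough sketch of the Hadoop 2.x settings involved. The hostnames match my containers, but the HDFS port and the exact property set are assumptions based on the usual tutorials, so adjust them to your own setup:
---core-site.xml (all nodes): point HDFS at the Name Node
<property><name>fs.defaultFS</name><value>hdfs://HDNameNode-1:9000</value></property>
---hdfs-site.xml (all nodes): replicate blocks across the 3 Data Nodes
<property><name>dfs.replication</name><value>3</value></property>
---yarn-site.xml (all nodes): point Node Managers at the Resource Manager
<property><name>yarn.resourcemanager.hostname</name><value>HDNameNode-1</value></property>
<property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
---mapred-site.xml (all nodes): run MapReduce on YARN
<property><name>mapreduce.framework.name</name><value>yarn</value></property>
---slaves file on the Name Node: one Data Node hostname per line
HDDataNode-1
HDDataNode-2
HDDataNode-3
---Start the cluster from the Name Node (step 14)
start-dfs.sh
start-yarn.sh
jps #check the daemons on each node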
A few commands are listed below for your reference:
---Add ZFS package
sudo add-apt-repository ppa:zfs-native/stable
sudo apt-get update
sudo apt-get install ubuntu-zfs
---Create ZFS Pool and Mount for LXD directories
sudo zpool create -f ZFS-LXD-Pool sdc sdd sde sdf -m none
sudo zfs create -p -o mountpoint=/var/lib/lxd ZFS-LXD-Pool/LXD/var/lib
sudo zfs create -p -o mountpoint=/var/log/lxd ZFS-LXD-Pool/LXD/var/log
sudo zfs create -p -o mountpoint=/usr/lib/lxd ZFS-LXD-Pool/LXD/usr/lib
---Install LXD and update network settings for containers
sudo add-apt-repository ppa:ubuntu-lxc/lxd-stable
sudo apt-get update
sudo apt-get install lxd
#To update the subnet to 192.168.1.0 and the DHCP leases
sudo vi /etc/default/lxc-net
sudo service lxc-net restart
---Add Ubuntu Trusty AMD64 container image and create the first container, to set up the single-node cluster
lxc remote add images images.linuxcontainers.org
lxc image copy images:/ubuntu/trusty/amd64 local: --alias=Trusty64
lxc launch Trusty64 HDNameNode-1
lxc stop HDNameNode-1
zfs snapshot ZFS-LXD-Pool/LXD/var/lib@Lxd-Base-Install-With-Trusty64
zfs list -t snapshot
***Configure Hadoop Single Node Cluster on the first container and snapshot it
**Change IP to 192.168.1.2
lxc start HDNameNode-1
lxc exec HDNameNode-1 /bin/bash
lxc stop HDNameNode-1
zfs snapshot ZFS-LXD-Pool/LXD/var/lib@Hadoop-Single-Node-Cluster
zfs list -t snapshot
***Update Name Node configuration to have Multi-Node-Cluster-Settings
-----Clone Name Node to 3 Data Nodes
lxc copy HDNameNode-1 HDDataNode-1
lxc copy HDNameNode-1 HDDataNode-2
lxc copy HDNameNode-1 HDDataNode-3
#Change IPs to 192.168.1.3, 192.168.1.4 and 192.168.1.5
***Update all Data Node configurations to have Multi-Node-Cluster-Settings
---Snapshot the Multi Node Cluster
sudo zfs snapshot -r ZFS-LXD-Pool@Hadoop-Multi-Node4-Cluster
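If I ever need to restore, a rollback of the container dataset to that snapshot looks roughly like this (just a sketch: zfs rollback only reverts to the most recent snapshot unless newer ones are destroyed, and the containers should be stopped first):
#Stop the containers before rolling back the dataset they live on
lxc stop HDNameNode-1 HDDataNode-1 HDDataNode-2 HDDataNode-3
sudo zfs rollback ZFS-LXD-Pool/LXD/var/lib@Hadoop-Multi-Node4-Cluster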
Hi Abraham. I am trying to implement a Hadoop cluster on LXD containers using your blog. Just needed to know how you set up static IPs for the containers so that they are accessible over the LAN.
I've used a static IP for each container, by modifying /etc/network/interfaces and /etc/rc.local.
Then I used a software bridge (br0) on the host machine as the parent network device for all containers, so that every container's virtual network interface is attached to this host bridge. Finally, I bridged the host's physical Ethernet interface to this bridge, so that packets arriving at the bridge are routed across the LAN.
You can set the network parent device by modifying the default LXD profile. Let me know if you need any clarifications.
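Roughly, the relevant pieces look like this; 192.168.1.2 is the Name Node address from my setup, while the gateway and interface names are assumptions you'd adapt to your own LAN:
#Inside each container: /etc/network/interfaces (example for HDNameNode-1)
auto eth0
iface eth0 inet static
    address 192.168.1.2
    netmask 255.255.255.0
    gateway 192.168.1.1
#On the host: edit the default profile so container NICs attach to br0
lxc profile edit default #set the eth0 nic device to nictype: bridged, parent: br0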
Hi Abraham,
We're working on automating the ZFS/LXD/big data parts in Ubuntu with Juju. Have you checked it out? We've got steps 4-14 pretty much all automated now and would like to see what you think of our solution. Ping me if you're interested in working together!
Great to know! I'm eager to learn about your implementation, though I'm new to Juju.
Abraham,
Per Jorge's comment, check out:
https://jujucharms.com/hadoop-processing/
It'd be great to hear what you think, and whether it helps you get to the science faster. We hang out in #juju on Freenode if you have any questions.