Hadoop and OpenStack

Savanna was recently renamed Sahara because of possible trademark problems (Source). The code is already in Havana, but new features are coming in Icehouse, the next OpenStack release, which is due around April 17th and is currently in release-candidate mode.

So what is the big deal with Sahara? It ties in well with the environment that the company I work for has deployed, and it is also the next big thing that is already here. The goal is to get to “Analytics as a Service,” where you can connect any OS you want, either Linux or Windows, and use multiple scripting languages rather than just Hive and a few others. That would make it easy to deploy and easier for the mainstream to consume. Some of the integration with OpenStack Swift, such as caching Swift data on HDFS, is a start, but Sahara has to become more tightly integrated to gain widespread adoption.

Why are OpenStack and Hadoop the right mix?

  1. Hadoop provides a shared platform that can scale out.
  2. OpenStack is operationally agile and supports scale-out.
  3. Together they combine two of the most active open source communities.
  4. The combination attracts major ecosystem players that can increase adoption.

This new release of Sahara has some welcome features. Templates for both node groups and clusters are awesome, as is cluster scaling by adding and removing nodes. There is interoperability with different Hadoop distributions, with new plugins for specific distributions (Hortonworks and vanilla Apache). The new release also enhances the API for running MapReduce jobs without exposing details of the infrastructure. There is new network configuration support with Neutron as well.
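To make the template and scaling idea a bit more concrete, here is a rough sketch in plain Python. It does not call Sahara or any OpenStack client. The class names, field names, flavor names, and process names are all made up for illustration. It only models the concept: node group templates describe a kind of node, a cluster combines them with counts, and scaling changes a count.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class NodeGroupTemplate:
    """Hypothetical model of a node group template (not Sahara's schema)."""
    name: str
    flavor: str                  # assumed flavor name, e.g. "m1.large"
    processes: Tuple[str, ...]   # Hadoop daemons this kind of node runs


@dataclass
class Cluster:
    """Hypothetical model of a cluster built from node group templates."""
    name: str
    plugin: str                            # e.g. "vanilla" or "hdp" (assumed labels)
    templates: Dict[str, NodeGroupTemplate]
    counts: Dict[str, int]                 # node group name -> number of nodes

    def scale(self, group: str, count: int) -> None:
        """Set the number of nodes in one node group (add or remove nodes)."""
        if group not in self.templates:
            raise KeyError(f"unknown node group: {group}")
        if count < 0:
            raise ValueError("node count cannot be negative")
        self.counts[group] = count

    def total_nodes(self) -> int:
        """Return the total number of nodes across all node groups."""
        return sum(self.counts.values())


# Two reusable node group templates.
master = NodeGroupTemplate("master", "m1.medium", ("namenode", "jobtracker"))
worker = NodeGroupTemplate("worker", "m1.large", ("datanode", "tasktracker"))

# A cluster that starts with one master and three workers.
cluster = Cluster(
    name="demo",
    plugin="vanilla",
    templates={"master": master, "worker": worker},
    counts={"master": 1, "worker": 3},
)

# Scale out the workers from three nodes to five.
cluster.scale("worker", 5)
```

The point of the template approach is that the description of a node (flavor, processes) is written once and reused, so growing or shrinking the cluster is just a change to a number rather than a new deployment.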

Why am I writing about this? Internap has many deployments for various customers but does not always advertise its capabilities. With some of the new OpenStack enhancements and a new internal developer direction, this is going to change. According to a new report from Frost & Sullivan, Internap is the bare-metal cloud leader, and bare-metal servers are perfectly suited to Hadoop and Big Data needs.

