Qubole: Elemental to Big Data

Joydeep Sen Sarma, CTO and Co-Founder

Passion is one of those intangibles that drives a start-up, carries it through good times and bad, and ultimately dictates its success. Such is the story of Qubole, a Bangalore-based start-up (headquartered in Santa Clara). Passionate about making data-driven insights easily accessible to anyone, Qubole, the leading cloud-agnostic big-data-as-a-service provider, has created the industry’s first autonomous cloud-based data platform, one that self-manages, self-optimizes and learns to improve automatically, and as a result delivers unbeatable agility, flexibility and TCO. The platform has consistently delivered ‘workload continuity’, ‘high performance’, ‘low reliance on cloud services’ and ‘significantly greater cost savings’. By automating the lower-level, repetitive tasks involved in administering and managing big data, Qubole allows engineering teams to be less reactive to such problems and to focus more on driving better business outcomes.

Autonomous Data Management
Known for bringing massive speed and scale to the data platform and for giving business users better self-service access to data, Qubole collects the metadata generated when the platform is used (relating to queries, clusters, users and data) and analyses it using heuristics and machine learning. The team uses these analytics to provide users with AIR: Alerts, Insights and Recommendations that significantly improve cost efficiency, performance and team productivity. Being the only data platform company that, via Agents, can intelligently automate typical data tasks, Qubole enables data engineers and DataOps teams to stay focused on creating business outcomes and avoid tedious, manual, repetitive tasks. “Our platform is built for everyone who uses data, i.e. data engineers, data ops, data analysts and data scientists; it is engineered for multiple open source engines optimized for the cloud, including Spark, Hadoop, Hive and Presto; it is cloud-native, cloud-optimized, and cloud-agnostic,” informs Ashish Thusoo, Co-founder, Qubole.

To this, Joydeep Sen Sarma, Co-founder, Qubole, adds, “That said, our target market consists of three segments: companies that are moving workloads from an existing big data system to a new big data platform, companies that are moving away from traditional data warehousing platforms to new big data platforms, and companies that are building out a new application that requires some kind of big data analytics component, such as a retail company that is building a new e-commerce site and wants to add product recommendations.”

Being a cloud-native, cloud-agnostic, multi-technology, autonomous big data platform, Qubole helps to significantly lower costs, make teams more productive, scale more efficiently, and lower the risks of failure. “We have more cloud features than on-prem vendors that are late to the cloud. Additionally, as opposed to single-cloud and single-technology platforms which are very literal, we have agents which can take action for the user and intelligently make decisions based on the policies the user defines,” mentions Ashish.

People. Presence. Power.
Globally, Qubole is a team of 250+ people, with around 130 of them working in India. Across geographies, the company’s product and engineering functions account for around 120-130 people; the rest of the team comprises solution architects, support, sales, marketing, HR and finance. The team strength in India has doubled every year.

While Qubole has customers all around the globe, a major portion of them are based in North America and Europe. Some of the popular enterprises include Pinterest, Flipboard and Under Armour. “In India, we have around 15 percent of our clients, including popular names such as Ola Cabs, Saavn and Indix. We are also getting a lot of traction from the APAC region with existing customers such as Autodesk and Traveloka, and new ones such as Malaysia Airlines,” says Joydeep.

In the years to come, team Qubole intends to stay focused on its goals: making advancements in serverless computing, data science and deep learning, building faster ad hoc query processing capabilities, extending its deployments to more geographies, and enabling high-security deployments such as those in the healthcare industry.

Key Management
Joydeep Sen Sarma
Joydeep is the CTO and co-founder of Qubole, and also heads the India centre, based out of Bangalore. He was an early engineer at Facebook before starting Qubole, where he conceived and started the Apache Hive project; he is the co-creator of Apache Hive. He is an IIT-Delhi, Pitt, Oracle, NetApp and Yahoo alumnus.

Ashish Thusoo
Ashish Thusoo is CEO and co-founder of the big-data start-up Qubole. Before Qubole, he ran Facebook’s data infrastructure team. He is also the co-creator of Apache Hive and served as the project’s founding vice president at the Apache Software Foundation.

There’s more to the story!
The seeds of Qubole were sown during the founders’ days at Facebook, a time when Joydeep and Ashish were part of the data team there. As was typical in those days, anyone in the company who sought data beyond the small, curated summaries stored in the data warehouse had to make requests to the data team. Although the data team was excellent, fulfilling data requests had become a clear bottleneck. The initial challenge was that the non-scalable infrastructure had hit its limits. Seeing this, they began experimenting with Hadoop, but owing to Hadoop’s lack of user-friendliness, even among engineers, the bottleneck of data requests persisted.

However, considering that SQL was widely used by both engineers and analysts, and was powerful enough for most analytics requirements, the duo decided to create an SQL-based declarative language that would allow engineers to plug in their own scripts and programs when SQL wasn’t adequate. Moreover, it was built to store all of the metadata about Hadoop-based datasets in one place, thus making the programmability of Hadoop available to everyone. That language, of course, was Hive, and the rest is history.

Upon the release of the first version of Hive internally at Facebook, data scientists and data engineers began to access the data they needed directly. Around then, the founders made two critical observations that eventually led them to found Qubole. These observations firmed up their vision of the tremendous opportunities that could stem from completely democratizing data. With the lessons learned from their successes there, and with the goal of not only enabling data platforms with massive speed and scale but also providing better self-service access to data for business users, they launched Qubole in 2013, with the very same product principles of speed, scale and accessibility in analytics.

Since then, Qubole has had an exciting and insightful journey. The company currently processes more than 750 petabytes of data in the cloud each month, with a substantial amount of that data being run through AWS. Its AWS users have seen $140 million in total cost savings with the Qubole Data Service (QDS).