How Freshworks handled its growing database

Data rules the world today. It impacts nearly every major task we work on, and has made its way to the centre of operations in many major companies. Each second, several terabytes of data are being created, and companies are using this data to improve their services or products, and scale their operations. And with the mainstreaming of artificial intelligence and machine learning, data is gaining more prominence than ever before.

Freshworks is no stranger to this phenomenon. From a small team running a single product, we have grown into a multi-product company. Through this journey, we have seen the amount of data around us grow many times over. As we have scaled our operations, we have learnt to manage this data, organize it so we can make the best use of it, and tackle multiple engineering challenges along the way.

So, it was only natural for us to choose data as the theme for the first edition of our SaaS@Scale event, a knowledge-sharing forum we host to discuss our engineering practices with other developers, technical architects, and engineers.

There are many aspects of data we could discuss at a forum like SaaS@Scale, but for starters, we wanted to talk about how we at Freshworks handled data as we scaled our business.

At the event in Bengaluru, Kiran Darisi, senior principal engineer at Freshworks, spoke about how the engineering team at Freshworks scaled its architecture as the company grew.

With the successes of growth came the challenges of handling large databases. Kiran spoke about how the team explored different ways to build a multi-tenant system, the algorithms it used, the automation systems it relied on, and the reasoning behind those choices. He also gave the audience a sneak peek into what the engineering team at Freshworks is currently working on, and its plans for scaling the engineering architecture in the near future.

Kiran’s talk led to conversations about different approaches to scaling databases, and served as a rich source of learning for companies currently in the phase that Freshworks was in a few years ago.

He began by talking about the customer journey in a multi-tenant system. In such a system, you start by creating a basic footprint for the customer: rows, tables, CDN files, routing entries, and so on. Once the customer subscribes to a plan, you provision access to your resources based on the plan they have selected; for instance, a customer on a premium plan gets access to analytics. As the customer base grows, this provisioning process has to scale with it.
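To make that concrete, here is a minimal sketch of what plan-based provisioning in a multi-tenant system can look like. The names below (`PLAN_FEATURES`, `provision_account`, the `db` helper) are hypothetical illustrations, not Freshworks’ actual code.

```python
# Illustrative sketch of plan-based tenant provisioning in a multi-tenant system.
# All names here are hypothetical; `db` stands in for any thin database wrapper.

PLAN_FEATURES = {
    "basic":   {"ticketing"},
    "pro":     {"ticketing", "automations"},
    "premium": {"ticketing", "automations", "analytics"},  # premium unlocks analytics
}

def provision_account(db, account_name: str, plan: str) -> int:
    """Create the basic footprint for a new customer and gate features by plan."""
    # 1. Basic footprint: account row, routing entry, default settings, etc.
    account_id = db.insert("accounts", {"name": account_name, "plan": plan})
    db.insert("routing_entries", {"account_id": account_id,
                                  "subdomain": account_name.lower()})

    # 2. Plan-based access: enable only the features the selected plan includes.
    for feature in PLAN_FEATURES[plan]:
        db.insert("account_features", {"account_id": account_id,
                                       "feature": feature})
    return account_id
```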

At Freshworks, we began this scaling by partitioning our database, making use of MySQL’s built-in partitioning capabilities.    
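As a rough illustration of that first step, the sketch below creates a table hash-partitioned by a tenant key using MySQL’s built-in partitioning. The schema, partition count, and connection details are hypothetical; the grounded parts are MySQL’s `PARTITION BY` syntax and its rule that every unique key must include the partition key.

```python
# A minimal sketch of MySQL's built-in partitioning, hash-partitioned by a
# tenant key. Table and column names are illustrative, not Freshdesk's schema.

import mysql.connector

ddl = """
CREATE TABLE tickets (
    account_id BIGINT NOT NULL,      -- tenant key, also the partition key
    id         BIGINT NOT NULL,
    subject    VARCHAR(255),
    created_at DATETIME NOT NULL,
    PRIMARY KEY (account_id, id)     -- every unique key must include the partition key
)
PARTITION BY HASH(account_id)
PARTITIONS 64
"""

conn = mysql.connector.connect(host="localhost", user="app",
                               password="...", database="helpdesk")
cur = conn.cursor()
cur.execute(ddl)
```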

Kiran also spoke about the challenges his team faced during this scaling process. The partitioning worked well for a while, but the team soon had to tweak its approach as the company added more features to the product and, consequently, more load to the system. Kiran shared simple but critical tips on handling the partitioning algorithm and selecting the partition key.
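One general point behind partition-key selection (a generic illustration, not necessarily the specific tips from the talk): with HASH partitioning, MySQL places a row using the key modulo the partition count, and it can prune partitions only when that key appears in the query’s filter.

```python
# Why the partition key matters: MySQL prunes partitions only when the key
# appears in the WHERE clause. Table and column names are hypothetical.

NUM_PARTITIONS = 64

def partition_for(account_id: int) -> int:
    # MySQL's HASH partitioning places a row in partition (key MOD number of partitions).
    return account_id % NUM_PARTITIONS

# Prunable: the partition key is in the filter, so only one partition is scanned.
good_query = "SELECT * FROM tickets WHERE account_id = %s AND created_at > %s"

# Not prunable: without the partition key, all 64 partitions must be scanned.
bad_query = "SELECT * FROM tickets WHERE created_at > %s"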

He then moved on to share the Freshdesk database sharding story. He took the audience through the journey of making the initial decisions on how customer data should be sharded; managing problems like noisy neighbours, migrations, and failures; and how the team managed the shard directory.
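To give a flavour of directory-based sharding, here is a simplified sketch of a shard directory that maps each account to the shard holding its data. The structures and the least-loaded assignment policy are assumptions for illustration, not a description of Freshdesk’s actual implementation.

```python
# Illustrative sketch of directory-based shard routing. In practice the
# directory would live in a small, replicated store, not an in-memory dict.

SHARD_DIRECTORY = {}  # account_id -> shard name

SHARD_CONFIG = {
    "shard_1": {"host": "db-shard-1.internal", "database": "helpdesk_shard_1"},
    "shard_2": {"host": "db-shard-2.internal", "database": "helpdesk_shard_2"},
}

def assign_shard(account_id: int) -> str:
    """Place a new account on the least-populated shard (simplified policy)."""
    shard = min(SHARD_CONFIG,
                key=lambda s: sum(1 for v in SHARD_DIRECTORY.values() if v == s))
    SHARD_DIRECTORY[account_id] = shard
    return shard

def connection_for(account_id: int) -> dict:
    """Route a request to the shard that holds this account's data."""
    return SHARD_CONFIG[SHARD_DIRECTORY[account_id]]
```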

Arvind Aravamudan, a technical architect and data engineer at Freshworks, then gave the audience an overview of Freshworks’ internal data lake.

Freshworks data lake – Baikal

As Freshworks added more products to its suite and started accumulating more data, different teams within the company started using the rich data to improve their products and processes. For instance, the machine learning and artificial intelligence teams started using the data to make products more efficient and also create Freddy, our omniproduct, omnichannel AI-driven assistant.

As the demand for access to this data piled up, the data engineering team decided, in 2015, to democratise data within the company. The team built a data lake – Baikal – so that anyone in the company could easily query the large dataset for their needs.
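What “democratised” access can look like in practice is ad-hoc SQL over data-lake tables. The sketch below uses Spark SQL purely as an example; the post does not name Baikal’s query engine, and the paths and table names here are made up.

```python
# Minimal sketch of ad-hoc querying over a data lake, using Spark SQL as an
# illustration only; storage paths and table names are hypothetical.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("data-lake-adhoc-query").getOrCreate()

# Register an externally stored dataset (e.g. Parquet files in object storage).
tickets = spark.read.parquet("s3://example-data-lake/tickets/")
tickets.createOrReplaceTempView("tickets")

# Any team can now answer its own questions with plain SQL.
weekly_volume = spark.sql("""
    SELECT date_trunc('week', created_at) AS week, COUNT(*) AS tickets_created
    FROM tickets
    GROUP BY 1
    ORDER BY 1
""")
weekly_volume.show()
```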

Freshworks pulled the project off in just six months, compared to the nearly two years most companies take to build their data lakes. Arvind detailed how Freshworks started the data lake initiative, which components the team focused on, how they designed the integration with other products, and how they ensure the security of all the data.

The idea behind the talk was to offer companies a broad framework that they can work within and build on as they scale their operations, and also to bounce ideas off other developers and technical architects. The objective was more than met: we were inundated with questions from participants and had discussions on various aspects of data, including things that weren’t directly related to our data lake or our scaling journey. In fact, the response was so good that we intend to take the Data@Scale edition to more cities.

“It is a positive trend, given the explosion of software tools and methodologies and the pace at which things are changing, such events give us an industry perspective which refreshes and restructures our thinking. Freshworks is really changing the way we work,” a participant said.

Following the success of the first event on data, we hosted our second SaaS@Scale event on the theme of security. The Security@Scale event was held in Hyderabad in January. Watch this space for a roundup of the Security@Scale event, and for updates on upcoming editions of SaaS@Scale by Freshworks. We’ll share the theme, date, venue, and registration links with you.