How does Freshworks Analytics build its complex Reporting?

Freshworks Analytics is a critical part of the product experience for every Freshworks customer. For perspective on its scale, it recently crossed 5,000,000 widget creations using Analytics that underscores how much users rely on it to make critical data-backed decisions.  

Given the magnitude, it is only imperative that our backend infrastructure is upto the task for a seamless experience. 

In this post, we deep dive into the concept of querying in Analytics and how exactly we make use of it to facilitate a userbase of thousands of teams using Freshworks Analytics at any given time. 

Introduction

For the uninitiated, Freshworks Analytics is the built-in analytics platform across the  Freshworks product suite allowing users to use or build interactive, comprehensive reports like this: 

What does the Analytics architecture look like?

At the core, the Analytics architecture can be separated into two layers: 

  • Data abstraction layer 
  • Querying layer

 

A snapshot of the different layers and data transfer across them before we take a deep dive into each of them.

Data Abstraction Layer

Every Freshworks product has a unique database schema with customized, unique reports built on them. The data behind the reports you see are stored in data warehouses such as Snowflake.

The core of the Analytics application is the ability to store varied complex configurations based on the individual product’s reporting needs and generate queries, on-the-go. The complexity comes when there are multiple dynamic SQL queries rendered based on their configurations on top of the reports. 

So how does a product handle such complex, dynamic queries?

To see that, let’s begin with an example of a report that can be created in Freshworks Analytics.

For the trend analysis report you see above, here’s how the query looks like in the backend:

Let’s now deep dive into how we generated the query.

The Warehouse database representation for the above report would look like this in two tables:

The Ticket Table contains information related to tickets while the Group Table contains details specifically about the group the tickets belong to, such as group name which is not present in the Ticket Table.

Schema1_Table_1 – Ticket Table

Column Data Type
ticket_id BigInt
account_id BigInt
display_id BigInt
group_id BigInt
created_at Timestamp
visible Boolean
fd_bi_deleted Boolean

 

Schema1_Table_2 – Group Table

Column Data Type
ticket_id Bigint
group_id Bigint
name varchar

Freshworks Analytics contains thousands of such tables which can be used to generate a wide range of reports

Let’s now look at how it will be for one of our other metrics, Unresolved Tickets:

For the report you see above, here’s how the query looks like:

Just like how we saw in the previous example, the Ticket Table contains information related to ticket details while the Ticket Events Table contains details specifically about the ticket events such as when the event occurred (even_at)

Schema2_Table_1 – Ticket Table

Column Data Type
ticket_id BigInt
account_id BigInt

 

Schema2_Table_2 – Ticket Events Table

Column Data Type
ticket_id BigInt
account_id BigInt
ingest_timestamp Timestamp
status_id BigInt
created_at Timestamp
event_at Timestamp
fd_bi_deleted Boolean
visible Boolean

 

Here are sections that go into building these queries:

  • Select columns ( columns seen in tabular, names of metric etc)
  • Where conditions ( filters in the report)
  • Group by ( group by in report)
  • Order by ( order in which results needs to be displayed – ascending or descending)
  • Join conditions ( relationship with other tables)

The columns of the above tables like Table_1, Table_2  are configured as select fields (tabular configurations), filters, group by, associations, join conditions, different types of metrics, mapping names etc in a MYSQL Database. Using these configurations , the query is constructed.

Configuration Examples:

Configuration Table/Module Columns/Attributes
Metric Tickets Unresolved Ticket
Select column Tickets, Tickets ticket_id
Filter Tickets created_at,visible,fd_bi_deleted,account_id.
Group name
Tickets event_at,ticket_id,account_id,status_id
Tickets Events Account_id, visible, fd_bi_deleted
Group by Tickets created_at
Associations and Join conditions Tickets and Group Account_id, ticket_id
Tickets and Tickets Events Account_id, ticket_id

 

Querying Layer

Having seen how Analytics stores multiple product configurations, let us take a look at our querying layer. Simply put, the answer to the most complicated question of how querying happens for Freshworks Analytics, is Jooq.

Our road to JOOQ

Traditionally, applications built using Java use Object-Relational Mapping (ORM) based Database models. Though Freshworks Analytics is a Java microservice, we use Jooq for its data abstraction. 

Why?

By treading the ORM path for the above examples, we’d need entities like Table1, Table2, Table3 classes etc. For a multi-product use-case, we’d need thousands of dynamic entities to be created on-the-go and such huge objects have to be managed to process these queries which is where Jooq comes in removing the complexities.

Jooq, an open-source software supports multiple SQL Dialects like POSTGRES, MYSQL, DERBY etc. The warehouses currently supported by Freshworks are Redshift and Snowflake which follow the POSTGRES dialect. This dialect is set up during configuration. Jooq reference can be found here – https://www.jooq.org/doc/latest/manual/sql-building/ 

JOOQ ships with its own DSL (or Domain Specific Language) that emulates SQL in Java. This means, that you can write SQL statements almost as if Java natively supported it, just like .NET’s C# does with LINQ to SQL. 

Enhancing Jooq capabilities for Freshworks Analytics

Here are some code snippets of how Jooq has simplified the Query Building component of Analytics Application.

Jooq Usages

Other operations supported by JOOQ:

Operation Name Description In Query Usage
addSelect Adding fields to select statement selectQuery.addSelect(field)
setDistinct Add distinct to select selectQuery.setDistinct(true)
addFrom Add from table name selectQuery.addFrom(tableName)
addJoin Execute join query with condition selectQuery.addJoin(joinTable, JoinType.LEFT_OUTER_JOIN, joinCondition)
DSL.when() Add case conditions to query DSL.when(conditions, field)

An example to generate a Jooq Query:

Here is the JOOQ code snippet for generating the above query:

NOTE: All the product configurations have been assigned as constants

Conclusion

We saw how our inbuilt platform solves the complex problem of handling huge volumes of data by storing different schema definitions in an effective way. The querying problem is solved by JOOQ which seamlessly creates queries using the stored product configurations.

For a robust data analysis engine like Freshworks Analytics which serves thousands of customers across the globe, a strong foundation like this helps support the current volume and also scale as required in the future.