Alessio Nobile, Head of Technology at CustomerGauge, discusses how CustomerGauge has been able to scale so fast. A combination of smart thinking, teamwork and leveraging the best of Amazon Web Services has helped to scale the company and bring the major benefits of speed and security to clients.
How should development teams cope with rapid growth? It’s an important question for any company. You can’t stop daily operations while you wait for new developments. At CustomerGauge we often talk about having to change the wheels while the car is speeding along!
CustomerGauge’s rapid growth path has meant we have been continuously adapting and improving our infrastructure in order to handle the load and stay flexible in feature development.
To set the scene, CustomerGauge is a SaaS company that helps organizations improve their customer retention through best-in-class features such as firefighting, text analytics, account success and a powerful reporting suite, to name a few.
We measure what customers think about our B2B clients and help those clients identify the risk of customer churn, or opportunities to grow. Then we automatically dispatch communications to the right people in the organization to help, and present results back to managers and ultimately to the customer, to improve retention by more than 10%. We base the metrics on the Net Promoter® Score (NPS®), the open-source, reliable KPI, and we benchmark numbers against other clients to help companies grow.
We work with companies such as H&R Block, Philips, Nilfisk, Electrolux and Tommy Hilfiger, and we have built a system able to interface with their extended and sometimes special needs. Over the years the CustomerGauge system has filled out to become an enterprise platform.
CustomerGauge, like many other enterprise systems, has multiple technical challenges:
- Collecting and validating data
- Very flexible business rules on emailing and surveying
- Real time reporting
- Self-service setup
Implementing technical solutions able to support these points also poses challenges from a business and operational point of view: costs, security and compliance, maintainability, and scaling, not just in terms of computing power but also in terms of the engineers and operations people who need to deal with the system. Our story is about how we managed to design software and architecture able to meet those challenges at scale.
CustomerGauge has been one of the fastest growing tech companies in the Netherlands. Last year we were nominated for the Deloitte Technology Fast 50. Furthermore, CustomerGauge achieved this growth on bootstrapped finance, not on external funding. Only in December 2015 did we close a first round of funding, which will boost the growth even more.
Our home base of Amsterdam is a great place to find talent and other companies to network with.
When I joined CustomerGauge we had a one-room office - and the software architecture was just as basic! It was simple and hosted on three servers; the reporting application and the surveys lived in the same code base. It was a very rudimentary software and server architecture, built by Adam and Roy before I joined.
The system had no redundancy, no auto-scaling. We didn’t even have a version control system. Code was deployed directly to production by SFTP uploads.
Clearly, there was a lot of work to be done at all levels.
I started to work methodically through the software architecture, security, scalability, server infrastructure, version control, deployment, logs aggregation, monitoring tools and product.
From 2011 to today this is the kind of scaling we have experienced so far:
- From 3 to 80+ server instances in autoscaling
- From no repository to 34 Git repositories
- From 1 to 3 clusters distributed in 3 regions
- From 500K to 10M+ surveys per month
This is what it looks like today.
HOW DID WE DO IT?
From 3 to 80+ server instances in autoscaling
The first thing we had to do was rewrite most of the old code base, with the goal of decoupling the application from the survey. The main idea was that surveys would run as a standalone web service, able to render itself from instructions it gets from an API.
It looks simple, doesn’t it? In reality it comes with a lot of challenges.
Working with enterprise-level companies means we need a single setup able to fit the needs of different countries and business units, and every survey and email setup has to meet this requirement. Potentially every survey and every email can be totally different from the others. There are a ton of business rules applied, and the challenge was to build an API able to deal with all of them.
In short, we were starting to move to a micro-services architecture. Each node should execute a specific task and have the ability to scale when needed.
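To make the idea of a survey that renders itself from API instructions more concrete, here is a minimal sketch in Python. It is an illustration only, not our actual API: the instruction format, the field names ("defaults", "rules", "match", "set") and both functions are assumptions made up for this example.

```python
from typing import Any

# Hypothetical instructions, shaped the way a survey API *might* return them.
# Every field name here is an assumption for illustration.
INSTRUCTIONS: dict[str, Any] = {
    "defaults": {
        "language": "en",
        "question": "How likely are you to recommend us?",
        "scale_max": 10,
    },
    "rules": [
        # Rules are applied in order; later matching rules win.
        {"match": {"country": "NL"},
         "set": {"language": "nl",
                 "question": "Hoe waarschijnlijk is het dat u ons aanbeveelt?"}},
        {"match": {"country": "NL", "business_unit": "wholesale"},
         "set": {"sender": "wholesale-team"}},
    ],
}


def resolve_settings(instructions: dict[str, Any],
                     country: str, business_unit: str) -> dict[str, Any]:
    """Start from the defaults and apply every rule that matches the context."""
    context = {"country": country, "business_unit": business_unit}
    settings = dict(instructions["defaults"])
    for rule in instructions["rules"]:
        if all(context.get(key) == value for key, value in rule["match"].items()):
            settings.update(rule["set"])
    return settings


def render_survey(settings: dict[str, Any]) -> str:
    """Render a one-line text survey; a real service would render HTML."""
    return f'[{settings["language"]}] {settings["question"]} (0-{settings["scale_max"]})'


if __name__ == "__main__":
    print(render_survey(resolve_settings(INSTRUCTIONS, "US", "retail")))
    print(render_survey(resolve_settings(INSTRUCTIONS, "NL", "wholesale")))
```

The point of the sketch is that the rendering code knows nothing about countries or business units. All the variation lives in the instructions, which is what lets one setup serve very different surveys.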
There are good lessons here for all businesses. There are many reasons why you should distribute your infrastructure into micro-services.
Every action or task becomes an API. This gives you a huge amount of flexibility in reusing code, and it can turn your infrastructure into a virtually language-agnostic environment. Once every service can talk to the others through an API, you have the flexibility to replace nodes at any time, no matter what language you use, as long as you keep the response structure. Not to mention how easy it becomes for developers to understand the code and be productive immediately.
It has security benefits too, as you are able to run unit tests on each node, effectively reducing errors in production deployments.
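The "keep the response structure" rule can be sketched as a small contract check. This is a hypothetical example rather than our production code: the field names, the pipe-separated input and both node functions are invented for illustration.

```python
from typing import Any

# Hypothetical contract: the fields every implementation of this node must
# return, and their types. The names are assumptions for this example.
REQUIRED_FIELDS: dict[str, type] = {"survey_id": str, "score": int, "comment": str}


def conforms(response: dict[str, Any]) -> bool:
    """True if the response has every required field with the right type.

    Extra fields are allowed, so a node can grow without breaking callers.
    """
    return all(isinstance(response.get(name), kind)
               for name, kind in REQUIRED_FIELDS.items())


def legacy_node(raw: str) -> dict[str, Any]:
    """Stand-in for an existing service (could be written in any language)."""
    survey_id, score, comment = raw.split("|", 2)
    return {"survey_id": survey_id, "score": int(score), "comment": comment}


def replacement_node(raw: str) -> dict[str, Any]:
    """Stand-in for a rewrite: different internals, same response structure."""
    parts = raw.split("|", 2)
    return {"survey_id": parts[0], "score": int(parts[1]),
            "comment": parts[2], "source": "v2"}


if __name__ == "__main__":
    sample = "s-1|9|Great service"
    for node in (legacy_node, replacement_node):
        print(node.__name__, conforms(node(sample)))
```

As long as a check like this passes, callers do not need to know which implementation sits behind the API, and the same check can run as a unit test on every node before it is deployed.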
POINT OF FAILURE
Your infrastructure will no longer depend on a single service. You distribute the load and task execution across many different nodes, and if you are smart about designing the infrastructure you can reduce single points of failure to almost zero. You can scale every node with ease and establish redundant systems.
Isolating each service means that you can distribute security across different layers. As an example, creating different layers of abstraction between your web application and the database makes it very difficult for hackers to penetrate your database and do malicious things with it.
Every node can scale with more flexibility, and its scaling actions can be triggered by different parameters.
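As a rough illustration of what redundancy buys you, the sketch below tries each replica of a service in turn and only fails when all of them are down. It is a simplified assumption of how a caller could behave, not a description of our load balancing; the function names and the use of ConnectionError are chosen for this example.

```python
from typing import Callable, Optional, Sequence

Replica = Callable[[str], str]


def call_with_failover(replicas: Sequence[Replica], request: str) -> str:
    """Send the request to the first replica that answers.

    A replica that raises ConnectionError is skipped. Only when every
    replica has failed does the caller see an error.
    """
    last_error: Optional[Exception] = None
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            last_error = exc  # remember the failure and try the next replica
    raise RuntimeError(f"all {len(replicas)} replicas failed") from last_error


# Two hypothetical replicas of the same node, for demonstration only.
def unreachable_replica(request: str) -> str:
    raise ConnectionError("node unreachable")


def healthy_replica(request: str) -> str:
    return f"handled {request}"


if __name__ == "__main__":
    print(call_with_failover([unreachable_replica, healthy_replica], "survey-42"))
```

With a single replica, one outage is an outage for the client. With two or more, the same outage becomes a retry that the client never notices.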
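To show how different nodes can scale on different parameters, here is a small sketch of a proportional scaling rule: grow or shrink the instance count by the ratio of the observed metric to its target, within fixed bounds. The policies, metric names and numbers are assumptions for illustration, and the rule is a generic one, not the exact algorithm of any particular cloud autoscaler.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    metric: str          # the parameter this node scales on
    target: float        # desired value of the metric per instance
    min_instances: int
    max_instances: int


def desired_instances(policy: ScalingPolicy, current: int, observed: float) -> int:
    """Proportional rule: scale the count by observed/target, then clamp.

    `observed` is the current per-instance value of the policy's metric.
    """
    if current < 1:
        raise ValueError("current instance count must be at least 1")
    wanted = math.ceil(current * observed / policy.target)
    return max(policy.min_instances, min(policy.max_instances, wanted))


# Hypothetical policies: a web node scales on CPU, a mail node on queue depth.
WEB = ScalingPolicy(metric="cpu_percent", target=50.0, min_instances=2, max_instances=20)
MAILER = ScalingPolicy(metric="queued_emails", target=100.0, min_instances=2, max_instances=10)

if __name__ == "__main__":
    print(desired_instances(WEB, current=4, observed=75.0))     # busy: scale out
    print(desired_instances(MAILER, current=2, observed=30.0))  # quiet: stay at minimum
```

Because each node has its own policy, a spike in outgoing email adds mail workers without touching the web tier, which is where the cost savings of scaling down come from.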
At CustomerGauge we are now at 80+ instances across three different regions, fully autoscaled and redundant. Thanks to this design we have been able to support growth from 500K to 10M+ surveys per month.
So far we have managed to scale in proportion to what our (originally bootstrapped) business could afford. The key learning here is that with micro-services and technologies like Docker you can always scale up and down with extreme ease, while avoiding provisioning resources that fatten your bill without any direct correlation to your scaling needs.
From no repository to 34 Git repositories
As a result of the previous point, we had to create a repository for each node of our new infrastructure. We have about 34 repositories used to generate about 15 nodes.
This has made deployment and fault detection a lot easier. It has also helped development and the distribution of tasks across multiple Scrum teams, because separate repositories reduce the risk of code conflicts.
From 1 to 3 clusters distributed in 3 regions
One of the biggest challenges at CustomerGauge is complying with strict data regulations. We have clients from all over the world, and many of them have very specific requirements we have to respect before they can buy our software.
This could have been a major obstacle, but Amazon made it easier for us. We were able to easily deploy clones of our infrastructure from Europe to Australia and the United States. With a well-organized infrastructure it is a relatively easy task. It still had its challenges, but nothing you can’t do within 8 hours.
For people working in “old school” environments, as many on-premise solutions are, all of that may sound very complex and expensive to achieve.
In reality when working with cloud services everything becomes a lot easier and cheaper.
At CustomerGauge we fully rely on Amazon Web Services; without it we wouldn’t have been able to bring the solutions described above to market in a timely manner.