Site reliability engineering at Systematic
by Systematic September 15, 2021 at 4:28 PM


Alexandru Dejanu, Site Reliability Engineer at Systematic: One of the most significant advantages of being an SRE at Systematic is that the team is technology agnostic, which means that I'm interacting with new frameworks frequently.

Systematic is an international software company with a Danish foundation. The development center in Romania started its activity in 2017 and currently has over 120 employees working on education, healthcare, and defence projects.

The Site Reliability Engineer role is crucial at Systematic since it enables the development teams to achieve better product reliability. Alexandru Dejanu, Site Reliability Engineer at Systematic, tells us what it is like to be part of the Customer Operations team. Learn more from Alex and his experience as an SRE at Systematic.

apiVersion: apps/v1
kind: SRE
  labels:
    app: ALEX DEJANU

At Systematic, I fully embraced the Site Reliability Engineering role, a pretty new paradigm in the IT field (especially in Romania) whose goal is to improve the reliability of systems in production.

Before embarking on the SRE journey, I worked as a DevOps engineer. My main focus was to bridge the gap between development and operations teams by enabling CI/CD and automating different processes. Still, here at Systematic, I discovered that a new challenge lay in front of me.

Taking a step forward as an SRE, I've come to understand some of the main responsibilities of this position by helping both development and operations teams gain full visibility into the complete application lifecycle. Here, I am focused on reducing toil and ensuring the applications' availability while also establishing and monitoring service-level metrics.

Three main categories of activities that a Site Reliability Engineer does at Systematic

Now I am part of the Customer Operations department. I'm working in a multi-project squad, which means that we serve multiple teams across various industry sectors such as library and learning, healthcare, defence, and renewables.

The tech stack is quite diverse, meaning we're working with Kubernetes, OpenShift, Azure, Ansible, Grafana, Prometheus, and so forth. Given the range of industries and the breadth of the technology stack, I can say that no two days are the same.

From a high-level perspective, the main activities are focused on observability and incident response (e.g., postmortems). Observability is not to be confused with monitoring: monitoring handles "predictable" failures, whereas observability provides a way to infer the state of a system. Last but not least, another big part of the work is implementing POCs, capacity management, and incident management.

A day in the life of an SRE and recurring tasks

"Recurring" is quite a strong word. There aren't any intrinsically recurring tasks. We're using the Feature Driven Development process, which is oriented towards speed and efficiency.

One day you could implement a new Prometheus exporter, and the next day you could measure the cost allocation for a K8s cluster. Grafana dashboards are for sure one of our "golden hammers," and at some point, some investigation tasks will require juggling between Lucene's query syntax and PromQL.
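To give a flavor of what a Prometheus exporter involves, here is a minimal sketch in Python that uses only the standard library. It is an illustration, not Systematic's actual tooling: the metric name `queue_depth`, the `collect_samples` function, and the port are hypothetical. The only fixed part is the plain-text exposition format that Prometheus scrapes from a `/metrics` endpoint.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline, as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str, samples: list) -> str:
    """Render one gauge metric in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        if labels:
            label_str = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect_samples() -> list:
    """Hypothetical data source; a real exporter would query the target system."""
    return [({"queue": "email"}, 3.0), ({"queue": "billing"}, 0.0)]


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_gauge(
            "queue_depth", "Jobs waiting in each queue", collect_samples()
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Start the exporter; the port is an arbitrary choice for this sketch."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` and pointing a Prometheus scrape job at the chosen port would expose the gauge. In practice, the official client library is usually a better choice than hand-rendering the format.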

But at the end of the day, an essential detail is taking the DevOps mindset a step forward. I can wholeheartedly say that all the daily tasks aim to achieve better product reliability. And when the team's main values are collaboration and progress, we are confident that this goal will be reached.

The challenging part is finding new ways to measure service reliability while proactively monitoring and optimizing workflows.

Also, one key detail of this role is understanding the importance of Service-Level Objectives, Agreements, and Indicators. I would say that they're a direct measurement of a service's behavior.
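As a rough illustration of how an indicator and an objective relate, the sketch below computes an availability SLI as the ratio of good events to total events and compares it with an SLO target to see how much error budget is left. The function name, the `SloReport` type, and the numbers are assumptions made up for this example, not figures from Systematic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloReport:
    sli: float               # measured ratio of good events, between 0 and 1
    allowed_bad: float       # bad events the SLO tolerates in this window
    budget_remaining: float  # fraction of the error budget still unspent
    slo_met: bool


def evaluate_slo(good_events: int, total_events: int, slo_target: float) -> SloReport:
    """Compare a request-based availability SLI with an SLO target."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= good_events <= total_events:
        raise ValueError("good_events must be between 0 and total_events")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    sli = good_events / total_events
    # The error budget is the share of events allowed to fail under the SLO.
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    budget_remaining = 1.0 - actual_bad / allowed_bad
    return SloReport(sli, allowed_bad, budget_remaining, sli >= slo_target)


# Hypothetical window: 1,000,000 requests, 500 failures, 99.9% target.
report = evaluate_slo(good_events=999_500, total_events=1_000_000, slo_target=0.999)
print(f"SLI={report.sli:.4%} met={report.slo_met} budget left={report.budget_remaining:.0%}")
```

A negative `budget_remaining` means the service has failed more often than the objective allows for the window. An SLA would typically attach contractual consequences to a threshold like this.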

Keeping up to date with the SRE key trends

One of the most significant advantages of being an SRE at Systematic is that the team is technology agnostic, which means that I'm interacting with new frameworks quite frequently. In one project, you could work in a setup consisting of Terraform with Azure, and in another, Ansible with OpenShift.

I tend to read different articles and blog posts, such as those from Red Hat. I'm also part of various communities such as Stack Overflow and GitKraken, which certainly helps with staying up to date on multiple topics. Sometimes I give my two cents on different subjects on platforms such as Medium and Stack Overflow. Here you can read more about my opinionated views regarding some tech topics I'm interested in:

Find out more about Systematic HERE.
