Senior DevOps Engineer

Software Development

Infrastructure Engineering

São Paulo, SP

Remote

Why join us

TRACTIAN is transforming the industrial world by empowering frontline maintenance workers to achieve more. We’ve fused cutting-edge hardware with innovative software into one powerful platform, disrupting legacy systems and delivering smarter, faster solutions for our clients.

Engineering at TRACTIAN


The Engineering team at TRACTIAN builds and operates the cloud-native backbone that powers our industrial IoT platform. We design for massive scale, high reliability, and security across AWS, Azure AKS, and Oracle Cloud (OCI) Kubernetes clusters.

What you'll do


- Own end-to-end delivery pipelines, from GitHub commit to production, running on GitHub Actions, ECS Fargate, AKS, and OCI Kubernetes.
- Evolve our multi-cloud, multi-cluster architecture (AWS + OCI) with zero-trust networking.
- Write and maintain IaC (Terraform + Terragrunt), Helm charts, and Kubernetes operators to automate everything.
- Optimize observability: build dashboards and alerts using the Grafana OSS stack, Prometheus, Loki, Tempo, and Datadog.
- Troubleshoot complex incidents involving microservices, monoliths in containers, and AI workloads on GPU nodes.
- Improve security posture: harden images, manage secrets, enforce policies, and audit compliance.
- Coach other engineers on DevOps best practices and drive continuous improvement.

Responsibilities

  • Apply DevOps practices to increase deployment speed, security, and quality.

  • Architect and run CI/CD workflows in GitHub Actions (matrix builds, reusable workflows, OIDC federation).

  • Design, build, and maintain Terraform/Terragrunt modules for VPCs, subnets, security groups, site-to-site VPNs, and private links.

  • Manage container orchestration on ECS Fargate and Kubernetes (AWS & OCI) with Helm and KEDA.

  • Implement autoscaling, blue-green and canary releases, and cost optimization for GPU and CPU workloads.

  • Diagnose performance bottlenecks across network, compute, storage, and application layers.

  • Maintain high-quality documentation.

Requirements

  • B.S. in Computer Engineering, Information Systems, or equivalent experience.

  • Strong scripting skills (Python, Bash); Go or Rust a plus.

  • Hands-on CI/CD with GitHub Actions and experience running production workloads on AWS (ECS Fargate, S3, RDS, CloudWatch, VPC networking) and Kubernetes (OCI OKE, Helm, Istio, KEDA).

  • IaC expertise with Terraform and Terragrunt in multi-account/multi-cloud setups.

  • Solid networking foundations: VPC design, subnets, routing, VPN/IPSec tunnels, security groups, load balancers.

  • Observability stack experience (Grafana, Prometheus, Loki, Tempo, Datadog).

  • Familiarity with container security, SBOMs, image scanning, secret management, and least-privilege IAM.

  • Excellent problem-solving skills, ownership mindset, and ability to work autonomously within a distributed team.

Compensation

  • Competitive salary and stock options

  • Optional fully funded English / Spanish courses

  • 30 days of paid annual leave

  • Education and courses stipend

  • Employee Giving

  • Earn a trip anywhere in the world every 4 years

  • Day off during the week of your birthday

  • Up to R$1,000/mo for meals and remote work allowance

  • Health plan with national coverage and no copayments

  • Dental Insurance: we help you with dental treatment for a better quality of life.

  • Gympass and Sports Incentive: R$300/mo extra if you practice physical activities

If you want to build a ship, don't organize people to collect wood, assign them tasks, and give orders. Instead, teach them to long for the vast and endless sea.

Antoine de Saint-Exupéry