Job Title

DevOps Engineer

Work Model: Remote

At Hepapi, we help companies build reliable, scalable, and modern technology platforms. We work across DevOps, Cloud, QA, and AI to simplify software delivery, automate operations, and support long-term growth.

We're looking for a Mid-Level DevOps Engineer with 2–4 years of hands-on experience to take ownership of infrastructure, automation, and delivery pipelines across our projects. You'll work independently on meaningful problems while collaborating closely with engineering, operations, and security teams. If you enjoy building robust systems, automating away toil, and improving reliability at scale, this role is for you.

Responsibilities

  • Automate infrastructure provisioning and operational tasks using Infrastructure as Code
  • Architect and manage hybrid environments spanning public cloud (AWS, Azure, and/or GCP) and on-premises infrastructure
  • Operate and optimize containerized workloads using Docker and Kubernetes
  • Manage deployments through GitOps workflows as the source of truth for infrastructure and application delivery
  • Implement monitoring, logging, and alerting to ensure system reliability and performance
  • Troubleshoot production issues and lead incident response and root-cause analysis
  • Partner with development and security teams to embed best practices into the delivery lifecycle
  • Drive improvements in documentation, processes, and engineering standards

Requirements

  • Bachelor's degree in Computer Engineering, Computer Science, Software Engineering, or a related field (or equivalent practical experience)
  • 2–4 years of professional experience in a DevOps, SRE, or cloud infrastructure role
  • Proficiency in English
  • Solid working knowledge of Linux administration, Git, and scripting (Bash, Python, or similar)
  • Hands-on production experience with at least one major cloud platform (AWS, Azure, or GCP)
  • Experience operating and maintaining on-premises infrastructure, including hybrid cloud setups
  • Practical experience with containerization and orchestration (Docker and Kubernetes)
  • Experience with Infrastructure as Code tools such as Terraform or Ansible
  • Experience building and maintaining CI/CD pipelines (e.g., Azure DevOps, Jenkins, GitLab CI, GitHub Actions)
  • Hands-on experience with GitOps practices and tooling (e.g., ArgoCD, Flux)
  • Familiarity with monitoring and observability tooling (e.g., Prometheus, Grafana, EFK)
  • Strong problem-solving skills and the ability to work independently
  • Good communication and teamwork skills

Nice to Have

  • Cloud or Kubernetes certifications (e.g., AWS Solutions Architect Associate/Pro, CKA/CKAD, Azure DevOps Engineer Expert)
  • Experience with secrets management, security hardening, or compliance practices
  • Experience managing self-hosted or air-gapped environments
  • An active GitHub profile, open-source contributions, technical blog posts, or a personal 

What We Offer

  • Remote-first work with flexible hours
  • Work with experienced engineers on real-world, high-impact projects
  • Ownership and autonomy over the systems you build
  • A collaborative and transparent team environment
  • Strong engineering culture with room for experimentation
  • Clear path to grow into senior and lead engineering roles
  • Internal tech talks and shared learning sessions
  • Opportunities to attend local and Europe-based meetups and conferences
  • Home office allowance for your setup
  • Mentoring and paid cloud and platform learning paths
  • Support for AWS and Kubernetes certification exams

Job Description

We are looking for Monitoring & Operations Engineers at Junior, Mid, and Senior levels to operate and monitor hybrid environments spanning AWS, Azure, on-premises infrastructure, Windows/Linux servers, databases, cloud services, and Kubernetes platforms.

This role focuses on 24/7 monitoring, incident detection, first-level troubleshooting, and operational support, working closely with DevOps, SRE, Platform, Infrastructure, and Application development teams to ensure high availability and system reliability.

Responsibilities

  • Monitor cloud, on-premises, and Kubernetes-based systems in a 24/7 shift-based environment
  • Track system health, performance, and availability using:
    • AWS CloudWatch, Azure Monitor
    • Grafana, Prometheus
    • ELK
  • Monitor Windows and Linux servers (CPU, memory, disk, services, events)
  • Monitor Kubernetes clusters (EKS / AKS / On-Prem K8s):
    • Nodes, pods, deployments, services
    • Cluster events and resource usage
  • Analyze alarms and alerts, identify potential root causes, and take first-level actions
  • Escalate incidents to relevant teams with clear technical findings and evidence
  • Perform end-to-end system checks during incidents (infra, application, network, security, platform)
  • Execute operational procedures using runbooks / SOPs
  • Log incidents, events, and actions accurately in ticketing systems
  • Support maintenance, change, and release activities
  • Contribute to improving monitoring coverage, alert quality, and operational processes

Required Skills & Qualifications

Core Technical Skills

  • Experience or strong interest in hybrid environments
    • Cloud (AWS, Azure)
    • On-Prem infrastructure
  • Knowledge of Windows Server and Linux fundamentals
  • Hands-on experience with monitoring & observability tools:
    • CloudWatch, Azure Monitor
    • Grafana, Prometheus
    • ELK
  • Kubernetes monitoring and troubleshooting knowledge
  • Understanding of:
    • Networking basics (DNS, TCP/IP, Load Balancers)
    • Application metrics, logs, and events
  • Ability to distinguish false alerts from real incidents
  • Experience with ticketing and incident management tools (Jira, ServiceNow, Opsgenie, PagerDuty, etc.)

Interested?

Apply here or send your resume to hr@hepapi.com