Backend engineer specializing in Go/Python, cloud-native microservices, APIs, distributed systems, Kubernetes, observability, and GenAI integrations.
Updated on 02.12.2025
Profile
Freelancer / Self-employed
Remote work
Available from: 01.12.2025
Availability: 100%
of which on-site: 100%
Distributed systems
Microservices architecture
Back-End
Go
Python
Cloud Engineer
API design
REST & gRPC
Kubernetes
Cloud-native development
Observability & telemetry
RAG pipelines
DevOps
Azure & AWS
Infrastructure as Code
English
Fluent
German
Proficient

Work locations

Heidelberg (+150 km)
Germany, Switzerland
possible

Projects

1 year 11 months
2024-01 - 2025-11

Observability Engineering (AI/ML)

Observability Engineer (AI/ML)

Designed and implemented an observability and telemetry framework for production LLM workloads (hosted and proxied) in enterprise environments. Focused on backend architecture, metrics pipelines, and standardized telemetry to give engineering and SRE teams insight into performance, reliability, and cost of AI services across multi-cloud deployments.

  • Designed the overall observability architecture for LLM gateways and supporting microservices running on Kubernetes, with a clear separation of concerns for metrics, logging, and tracing.
  • Instrumented services end-to-end using OpenTelemetry to capture traces, latency, token usage, and error profiles for LLM requests and downstream dependencies.
  • Built metrics pipelines backed by Prometheus and integrated dashboards/alerts in Grafana and Promitor to support SLO/SLA tracking, anomaly detection, and incident response.
  • Defined telemetry schemas and APIs for internal consumers (platform, product, SRE teams), enabling consistent querying and aggregation across services and environments.
  • Worked with SRE and platform teams to tune sampling, retention, and alert thresholds to balance observability depth with system performance and operational cost.

Prometheus, Grafana, Kubernetes, Helm, Python, OpenTelemetry, Promitor, Dynatrace, Go, Loki, Jaeger, Tempo, KEDA, ArgoCD, MLflow
SAP
1 year 11 months
2024-01 - 2025-11

AI/ML Platform Engineering

Backend Services Engineer
Built a scalable GenAI platform for enterprise workloads, integrating advanced LLM-as-a-Backend capabilities with support for RAG pipelines, model routing/orchestration, fine-tuning, and content-moderation workflows. Delivered cross-language SDKs, internal CLI tools, and fully automated CI/CD pipelines to streamline AI adoption for product teams while optimizing for cost efficiency, security, and operational reliability.


Responsibilities:

  • Architected and implemented backend microservices in Go and Python for RAG pipelines, model routing, fine-tuning jobs, and content moderation workflows.
  • Designed and documented multi-tenant REST/gRPC APIs to provide consistent access to LLM capabilities for multiple internal products.
  • Implemented RAG workflows, including ingestion, chunking, vectorization, and retrieval, abstracted behind stable service interfaces.
  • Built SDKs and automation tooling in Python and Go to standardize authentication, request patterns, and observability across consuming teams.
  • Containerized services and implemented GitOps-style deployments using ArgoCD and CI pipelines in Jenkins, with environment-specific configuration and rollout strategies.
  • Introduced metrics and logging around request volumes, latency, and model usage to optimize cost and performance of the platform.
LLM as Backend, System Architecture, Back-End, Python, Prompt Engineering, Azure OpenAI, RAG, ArgoCD, Jenkins, GitOps, Go, KServe, Knative, Kubernetes, Inference, Azure DevOps
SAP
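
The RAG workflow named above (ingestion, chunking, vectorization, retrieval behind a stable interface) can be illustrated with a minimal sketch. It is not the platform's code: `chunk_text`, `vectorize`, and `InMemoryIndex` are hypothetical names, and the bag-of-words vector is only a stand-in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def vectorize(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: replaces an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Hypothetical service interface: callers only see ingest() and retrieve()."""

    def __init__(self) -> None:
        self._entries: list[tuple[str, Counter]] = []

    def ingest(self, text: str, size: int = 8, overlap: int = 2) -> int:
        """Chunk and vectorize a document; return the number of chunks stored."""
        chunks = chunk_text(text, size, overlap)
        for chunk in chunks:
            self._entries.append((chunk, vectorize(chunk)))
        return len(chunks)

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        """Return the k chunks most similar to the query."""
        q = vectorize(query)
        ranked = sorted(self._entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]
```

Keeping the interface to `ingest` and `retrieve` is the point of the sketch: the chunking heuristic or the vector store can change without touching consumers.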
10 months
2023-03 - 2023-12

Agentic AI Platform & RAG Excellence Framework

GenAI Engineer

Designed and delivered a production-grade agentic AI platform supporting full-stack development, orchestration, deployment, and observability of LLM-driven autonomous agents across multiple enterprise business units. Implemented standardized blueprints for agent topologies, tool invocation layers, RAG/Indexing pipelines, and LLMOps workflows, ensuring horizontal scalability, fault tolerance, auditability, and regulatory alignment within a highly controlled financial-services environment. Built comprehensive evaluation, governance, and safety frameworks that accelerated organizational adoption of AI copilots and significantly reduced time-to-market for new intelligent-automation workloads.

Responsibilities:
  • Owned full-lifecycle delivery of agentic AI and ML initiatives, from problem scoping, feature engineering, and dataset curation to model development, quantitative/qualitative evaluation, deployment, and post-production monitoring.
  • Architected advanced agent systems (e.g., planner-executor, hierarchical/multi-agent, tool-augmented agents) leveraging memory modules, reflection loops, action-scoring policies, and controllability/safety constraints.
  • Designed and optimized RAG pipelines, including document ingestion, chunking heuristics, embedding generation, vector store configuration, hybrid retrieval, reranking, caching, and evaluation frameworks for precision/recall, hallucination rate, and latency SLAs.
  • Implemented MCP for standardized, permissioned tool integrations and orchestrated heterogeneous LLM workloads using LangChain, LlamaIndex, and custom microservices for tool execution.
  • Established enterprise-grade LLMOps practices including experiment tracking (MLflow/Weights & Biases), dataset/prompt versioning (DVC/Git), CI/CD pipelines (GitHub Actions/Azure DevOps), model registries, workload autoscaling, telemetry, drift detection, and incident-response runbooks.
  • Enforced reliability, safety, and compliance controls through prompt-injection defenses, schema validation, content-moderation pipelines, differential access controls, policy enforcement layers, adversarial/red-teaming evaluations, and pre-production quality gates.

Python, OpenAI, LangChain, Kubernetes, RAG, Chatbot, Go (Golang), Firebase Firestore, LlamaIndex, Semantic Kernel, Vector Databases, Redis, MCP, AWS, FastAPI, asynchronous programming
Remote
11 months
2022-05 - 2023-03

Cloud Orchestration Platform (Go / Distributed Systems)

Senior Software Developer
Contributed to the design and implementation of a custom orchestration system, inspired by Kubernetes, to automate provisioning of hundreds of products across thousands of customers. Focused on distributed control-plane components, APIs, and operators for high scalability and availability.


Responsibilities:

  • Co-designed the control-plane architecture, including components equivalent to an API server, controller manager, and namespace isolation model.
  • Implemented operators and controllers in Go, following the Kubernetes Operator pattern to automate provisioning workflows for different products and teams.
  • Designed and exposed APIs for provisioning requests, status tracking, and lifecycle management, with clear contracts and versioning.
  • Built asynchronous communication and progress streaming to clients using WebSockets, ensuring responsiveness during long-running operations.
  • Integrated Redis and OpenSearch for state management, queuing, and indexing of provisioning data and audit logs.
  • Deployed the orchestration system on Kubernetes in AWS, using S3 for artifact storage, and instrumented the platform with Prometheus and Grafana for metrics and dashboards.
  • Collaborated across teams to onboard new operators, define SLAs, and ensure the system scaled with growing customer and product counts.

Go (Golang), WebSocket, OpenSearch, LocalStack, Redis, Prometheus, Grafana, AWS S3, Kubernetes, RBAC, Distributed Systems, Webhook
SAP
Walldorf
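
The operator and controller work in this project rests on a reconcile loop: compare desired state with observed state and emit the actions that close the gap. The sketch below is a minimal illustration of that idea only. The production controllers were written in Go; `Action`, `reconcile`, and `apply` are hypothetical names, and plain dictionaries stand in for the real resource store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    verb: str  # "create", "update", or "delete"
    name: str  # name of the resource the action targets


def reconcile(desired: dict[str, dict], observed: dict[str, dict]) -> list[Action]:
    """Return the actions needed to move `observed` to `desired`."""
    actions = []
    for name in sorted(desired):
        if name not in observed:
            actions.append(Action("create", name))
        elif observed[name] != desired[name]:
            actions.append(Action("update", name))
    for name in sorted(observed):
        if name not in desired:
            actions.append(Action("delete", name))
    return actions


def apply(
    actions: list[Action], desired: dict[str, dict], observed: dict[str, dict]
) -> dict[str, dict]:
    """Apply actions to a copy of the observed state (stand-in for real provisioning)."""
    result = dict(observed)
    for action in actions:
        if action.verb == "delete":
            result.pop(action.name, None)
        else:
            result[action.name] = desired[action.name]
    return result
```

A reconcile step is idempotent by construction: once the observed state matches the desired state, a second pass returns no actions, which is what makes retries after partial failures safe.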
1 year 8 months
2020-09 - 2022-04

Enablement of SAP Analytics Cloud's SaaS offering

Senior cloud engineer

Developed backend microservices to offer SAP Analytics Cloud as a SaaS product via the Cloud Foundry marketplace. Work focused on service broker APIs, metering, and billing services that integrate with downstream capacity management and system provisioning systems.


  • Designed and implemented the Cloud Foundry service broker, providing lifecycle APIs (provision, bind, update, deprovision) for SAC instances.
  • Developed metering and billing microservices in Go, exposing REST APIs to capture usage, calculate charges, and integrate with internal billing and capacity systems.
  • Introduced an API gateway to standardize authentication, routing, and rate limiting across the broker, metering, and billing endpoints.
  • Used Redis and Postgres to persist service instances, usage records, and billing data with appropriate indexing, constraints, and migration strategies.
  • Instrumented services with Prometheus metrics and logging to monitor availability, latency, and throughput, and to support troubleshooting and capacity planning.
  • Collaborated with product and finance stakeholders to align technical metering models with commercial pricing structures.
Cloud Foundry, Prometheus, API gateway, Redis, Postgres, Grafana, Go
Walldorf
1 year 5 months
2019-04 - 2020-08

Cloud infrastructure for SAP HANA-as-a-Service

DevOps engineer
Built robust cloud infrastructure on AWS for HANA-as-a-Service, automating deployments, upgrades, and multi-region provisioning.
  • Developed Terraform configurations and Ansible playbooks for installation and automated upgrades.
  • Built a lightweight agent to respond to Consul changes, triggering relevant playbooks.
  • Implemented APIs for customer HANA system orders.
  • Designed secure VPC architecture and automated backup/recovery workflows using AWS S3 and Glacier.
  • Reduced deployment time by 40% while maintaining compliance and security standards.
HashiCorp (Terraform, Vault, Consul), Ansible, AWS (VPC, EC2, S3, Glacier, CloudWatch, API Gateway), Cloud Foundry, Python, Go (Golang), Bash
SAP
10 months
2018-07 - 2019-04

Development of an elastic caching microservice

Software Engineer
Developed an elastic caching microservice in Go to accelerate analytical query performance for a multi-tenant analytics platform, implementing context-aware caching with user permissions, roles, and cube dimension metadata.


  • Designed and implemented the caching service in Go, exposing a clear API for cache reads/writes and integrating with upstream analytics components.
  • Modeled cache keys and data structures to account for user roles, permissions, and cube dimensions, ensuring correct and secure cache hits.
  • Implemented cache invalidation strategies (e.g., key-based and pattern-based invalidation) to handle data changes without stale results.
  • Used Redis as the primary cache store and MongoDB for underlying metadata/state where needed.
  • Deployed the service on Kubernetes with appropriate resource limits, autoscaling policies, and rolling upgrade strategies.
  • Instrumented the service with Prometheus metrics (hit rate, latency, error rates, resource usage) to optimize performance and capacity.
Go (Golang), Redis, MongoDB, Kubernetes, Prometheus, 12-factor app
SAP
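
The context-aware keys and the key-based and pattern-based invalidation described above can be sketched as follows. This is an illustration under assumptions, not the service's implementation (which was written in Go on Redis): `cache_key` and `DictCache` are hypothetical names, the `tenant:cube:digest` key layout is assumed, and an in-memory dictionary stands in for the cache store.

```python
import fnmatch
import hashlib
import json


def cache_key(
    tenant: str, cube: str, roles: list[str], dimensions: list[str], query: str
) -> str:
    """Build a deterministic key so that only users with the same roles share entries."""
    context = json.dumps(
        {"roles": sorted(roles), "dims": sorted(dimensions), "query": query},
        sort_keys=True,
    )
    digest = hashlib.sha256(context.encode("utf-8")).hexdigest()[:16]
    # Assumed layout: a readable prefix allows pattern-based invalidation per cube.
    return f"{tenant}:{cube}:{digest}"


class DictCache:
    """In-memory stand-in for the cache store."""

    def __init__(self) -> None:
        self._data: dict[str, object] = {}

    def get(self, key: str):
        return self._data.get(key)

    def set(self, key: str, value) -> None:
        self._data[key] = value

    def invalidate_key(self, key: str) -> int:
        """Key-based invalidation: drop one entry, return how many were removed."""
        if key in self._data:
            del self._data[key]
            return 1
        return 0

    def invalidate_pattern(self, pattern: str) -> int:
        """Pattern-based invalidation: drop every key matching a glob pattern."""
        doomed = [k for k in self._data if fnmatch.fnmatchcase(k, pattern)]
        for k in doomed:
            del self._data[k]
        return len(doomed)
```

Sorting the roles before hashing means the same permission set always maps to the same key, while a different role set can never hit another user's entry. When a cube's data changes, a single pattern such as `acme:sales:*` clears every cached result for that cube.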

Education and training

2014 - 2017
Distributed Software Systems
TU Darmstadt (Germany)
Degree: Master of Science

Position


Software and DevOps engineer with a focus on cloud-native development and LLM application development.

Skills

Top skills

Distributed systems, Microservices architecture, Back-End, Go, Python, Cloud Engineer, API design, REST & gRPC, Kubernetes, Cloud-native development, Observability & telemetry, RAG pipelines, DevOps, Azure & AWS, Infrastructure as Code

Focus areas

AI/ML Platform Engineering & LLMOps
Advanced
Cloud-Native Observability & SRE
Expert
Distributed Systems & Platform Engineering
Advanced

AI/ML Platform Engineering & LLMOps

Deep expertise in building production-grade GenAI platforms and agentic AI systems, with comprehensive experience in LLM deployment, fine-tuning, RAG pipelines, and model orchestration. Specialized in architecting multi-tenant AI infrastructure that balances performance, cost optimization, and enterprise security requirements.


Cloud-Native Observability and SRE

Expert in designing end-to-end observability solutions for distributed systems and AI/ML workloads using OpenTelemetry, Prometheus, and Grafana. Proven ability to instrument complex environments from token-level metrics to infrastructure telemetry, enabling proactive incident management, anomaly detection, and data-driven optimization of high-throughput systems.


Distributed Systems and Platform Engineering

Strong foundation in building scalable, cloud-native platforms with expertise in the Kubernetes ecosystem, control-plane architecture, and microservices orchestration. Skilled in implementing GitOps workflows, CI/CD automation, and infrastructure-as-code practices to deliver reliable, self-service platforms for enterprise-scale deployments.

Areas of responsibility

System Architecture
Advanced
Software Engineering
Expert
DevOps
Advanced
  • Architecture and implementation of enterprise GenAI platforms supporting RAG, fine-tuning, model routing, and content moderation workflows
  • Design and deployment of observability frameworks for AI/ML systems, including distributed tracing, metrics pipelines, and SLO/SLA monitoring
  • Development of autonomous agent systems with tool integration, memory modules, and safety/governance controls
  • Building cloud-native microservices and APIs for multi-tenant SaaS offerings with a focus on scalability and reliability
  • Infrastructure automation using GitOps, CI/CD pipelines, and infrastructure-as-code across AWS and Azure environments
  • Implementation of control-plane architectures for container orchestration and resource provisioning at scale
  • Establishment of LLMOps practices including experiment tracking, model versioning, drift detection, and compliance enforcement
  • Performance optimization through caching strategies, autoscaling policies, and resource utilization monitoring
  • Cross-functional collaboration with MLOps, SRE, and platform engineering teams to accelerate AI adoption
  • Security implementation including RBAC, multi-tenant isolation, and content moderation pipelines

Products / Standards / Experience / Methods

DevOps
Advanced
Software
Expert
AWS
Advanced
OpenAI
Expert
Kubernetes
Advanced
Observability
Advanced
GenAI
Expert
Development
Advanced
Profile
As a freelance software engineer, I deliver tailored, scalable software solutions for enterprise systems. My focus spans software architecture and development, DevOps and LLM-powered AI applications, with strong expertise in building reliable, observable and cost-efficient distributed systems on AWS and Azure.

AI/ML & GenAI
OpenAI, Azure OpenAI, LangChain, LlamaIndex, Semantic Kernel, RAG (Retrieval-Augmented Generation), Prompt Engineering, LLM Fine-tuning, Model Inference, Vector Databases, MLflow, KServe, Knative, MCP (Model Context Protocol), Chatbot Development, Agentic AI Systems


Cloud Platforms & Services

AWS (VPC, EC2, S3, Glacier, CloudWatch, API Gateway), Azure DevOps, Azure Cognitive Services, Cloud Foundry, Multi-cloud Architecture


Container Orchestration & Infrastructure

Kubernetes, Helm, ArgoCD, Argo Workflows, Docker, StatefulSets, DaemonSets, Custom Operators, Control Plane Architecture


Observability & Monitoring

OpenTelemetry, Prometheus, Grafana, Dynatrace, Promitor, Loki, Jaeger, Tempo, Alertmanager, Distributed Tracing, Metrics Engineering, SLO/SLA Monitoring


Programming Languages

Python, Go (Golang), Node.js, Java, Bash


Data Storage & Caching

Redis, MongoDB, PostgreSQL, Firebase Firestore, Vector Databases, OpenSearch, AWS S3


DevOps & Automation

GitOps, Jenkins, Terraform, Ansible, HashiCorp (Vault, Consul), LocalStack, CI/CD Pipelines, GitHub Actions, Infrastructure-as-Code (IaC)


Networking & Communication

REST APIs, WebSocket, API Gateway, RBAC, Service Mesh


Development Practices & Patterns

Microservices Architecture, 12-Factor App Principles, SRE Practices, LLMOps, MLOps, Multi-tenant Design, Distributed Systems, Event-Driven Architecture, KEDA (Kubernetes Event-Driven Autoscaling)


Security & Compliance

RBAC (Role-Based Access Control), Content Moderation, Prompt Injection Defense, Multi-tenant Isolation, Policy Enforcement, Adversarial Testing


Data & ML Tools

DVC (Data Version Control), Weights & Biases, Model Registries, Experiment Tracking, Dataset Versioning

Operating systems

Linux

Programming languages

Go (Golang)
Python
Java
Node.js

Databases

PostgreSQL
MongoDB
Redis
Firebase Firestore
Elasticsearch

