Machine Learning, Data Engineering, Data Scientist, Azure Cloud Environment, NLP, GenAI
Updated on 16.04.2026
Profile
Employee of a service provider
Remote work
Available from: 16.04.2026
Availability: 100%
of which on-site: 30%
Skill profile of a permanently employed staff member of the service provider

Work locations

Germany
possible

Projects

3 months
2026-01 - 2026-03

Advise and enable the enterprise-wide migration from GitLab/Jenkins to GitHub/GitHub Actions, establishing secure DevOps standards, self-service CI/CD building blocks, and an AI-supported first-level support model.

Senior DevOps Integration Engineer

  • Led the end-to-end migration advisory for delivery teams transitioning from GitLab and Jenkins to GitHub and GitHub Actions, including target operating model, rollout planning, and technical enablement. 
  • Defined and implemented reusable GitHub Actions "golden path" workflows (build/test/release) aligned to Git Flow, enabling standardized deployments across Maven- and NPM-based services.
  • Designed IaC patterns and landing-zone conventions using Terraform to provision consistent environments and pipelines within an on-prem Cloud Foundry data center and associated platform services. 
  • Established security-by-default CI/CD controls by integrating Mend, Fortify, and SonarQube scans into pipeline templates, driving compliance-ready automation and traceable quality gates. 
  • Built an AI-assisted support chatbot using Azure AI Foundry to reduce first-level support load, accelerate issue triage, and provide guided answers for common migration and pipeline questions. 
  • Implemented "docs-as-code" governance: maintained operational/runbook documentation in Markdown, automated publishing to Confluence via GitHub Actions, and introduced review workflows for documentation quality.
  • Implemented observability foundations by creating Grafana dashboards for pipeline/platform monitoring and operational KPIs, improving transparency for engineering leadership and customer success stakeholders. 
  • Acted as senior integration point between platform, security, and product teams, facilitating workshops, technical decision records, and stakeholder alignment to accelerate adoption and reduce delivery friction.

Azure AI Foundry Cloud Foundry Terraform GitHub GitHub Actions Git Flow Maven NPM Docker Grafana Confluence Mend Fortify SonarQube
4 months
2025-09 - 2025-12

Design and implementation of an intelligent matching platform (Databricks/AI) to align Engineering Ground Truth with Supplier Data.

Lead Software Architect

  • Architected and implemented an Azure Databricks platform to reconcile PDM (Ground Truth) with SRM (Supplier) data, ensuring data integrity for liability and compliance.
  • Engineered advanced matching logic utilizing Neo4j, Vector Embeddings, and LLMs to align discordant data structures and identify critical components (semiconductors, rare earths).
  • Built automated ELT pipelines (Medallion Architecture) orchestrated via Databricks Asset Bundles for reliable daily data processing.
  • Deployed high-quality data products serving both downstream analytics and a custom frontend hosted on Azure App Services.

Azure App Service Atlassian JIRA / Confluence SQL Python AI Search Neo4j Azure Databricks AKS
6 months
2025-05 - 2025-10

Development of a scalable Data and DevOps infrastructure for the automated identification and utilization of customer-segmented potentials for product launches.

Senior Data Engineer

  • Built pipelines to identify and consistently assign customers with the highest potential for purchasing new products and to support product launches.
  • Established a Databricks / DevOps / Azure infrastructure for a team of 12 employees.
  • Created notebooks in a Medallion architecture to extract, load and transform source data (12 sources).
  • Set up orchestration in Azure Data Factory (ADF).
  • Implemented a process to send customer datasets to Emarsys (CRM system).
  • Introduced monitoring and error handling.
  • Deployed an Azure Data Platform (ADF, ADLS Gen2, Functions, DevOps) and Databricks to implement data pipelines, ETL processes and machine learning workflows.

Atlassian Jira Confluence Azure Databricks ADLS Gen2 ADF Functions DevOps Terraform SQL Server Emarsys
7 months
2025-01 - 2025-07

Design and build a proof of concept

Senior ML Engineer Python Scrum GenAi ...
Senior ML Engineer

  • Engineered a data pipeline to process and filter over 60,000 PubMed abstracts from an initial corpus of 38 million.
  • Architected a multi-stage extraction system utilizing LLMs for relationship extraction, specialized NER models, and a custom model for mapping entities to normalized gene identifiers.
  • Populated a Neo4j graph database with the extracted entities (genes, proteins) and their relationships to serve as the single source of truth.
  • Developed a web-based UI using Databricks Apps, featuring a conversational AI that answers user queries by generating and executing Cypher queries against the knowledge graph in real-time.
  • Implemented a "power-user" mode that exposed the underlying Cypher queries and rendered interactive graph visualizations directly in the UI.

Neo4j Git Databricks Jira Confluence (GitHub Actions)
Python Scrum GenAI RAG LLMs SAFe
4 months
2025-01 - 2025-04

Formulate and formalize the company-wide AI strategy, providing a clear roadmap for AI adoption, governance, and value creation.

AI Strategy Consultant

  • Authored the official, company-wide AI strategy, defining the long-term vision, governance framework, and strategic roadmap for leveraging AI across the organization in alignment with VIG Group directives.

  • Conducted a comprehensive analysis of business processes and led stakeholder workshops to identify, prioritize, and create business cases for high-impact AI initiatives.
  • Established a clear framework for the responsible and ethical use of AI, including guidelines on data privacy, model transparency, and risk management to ensure compliant and secure adoption.
  • Collaborated closely with senior leadership and board advisors to ensure tight integration and synergy between the newly developed AI and Cloud strategies, creating a unified technology vision.

AI Strategy Frameworks AI Governance Responsible AI Risk Assessment Use Case Prioritization Business Case Development Stakeholder Management
1 month
2025-01 - 2025-01

Design and implement a robust evaluation framework for an LLM-based agent system to ensure the accuracy and reliability of information extraction from complex reinsurance contracts.

Senior Data Scientist

  • Architected and built an end-to-end LLM evaluation framework on the Databricks platform to quantitatively measure the performance of an agent-based extraction system for key contractual data points (e.g., inclusions, exclusions, counterparties).
  • Developed interactive Databricks Dashboards and utilized MLflow to track experiment metrics, providing stakeholders with a clear, real-time view of the agent's accuracy and consistency.
  • Executed a proof-of-concept for performance improvement by fine-tuning a Hugging Face model using the QLoRA methodology, demonstrating a significant enhancement in extraction accuracy on domain-specific terminology.
  • The resulting framework provided the basis for a data-driven approach to iteratively improve the LLM agent, ensuring high-quality, reliable outputs for simplifying contract analysis.

Databricks Azure Python PySpark SQL LLMs Hugging Face Transformers QLoRA LLM Evaluation MLflow Delta Lake Unity Catalog Databricks Apps LangChain
2 years
2023-01 - 2024-12

End-to-end architectural responsibility for the company's transition to AI-driven products, leading the technical roadmap from initial PoCs to production-grade Kubernetes deployments.

Lead AI Engineer Python GenAI RAG
Lead AI Engineer

Project 1: 

  • Developed a solution proposal and created a detailed project plan.
  • Designed a pipeline for data preprocessing and provisioning using Azure AI Search.
  • Optimized the existing search system by implementing AI-assisted, relevance-based hybrid search.
  • Integrated generative AI (GPT-4o) with LangGraph to answer user-specific questions based on help documents.
  • Conducted training and onboarding for employees to ensure the effective use and maintenance of the system.

Project 2:

  • Developed a solution proposal and created a detailed project plan.
  • Designed a pipeline for data preprocessing and provisioning using Azure AI Search.
  • Optimized the existing search system by implementing AI-assisted, relevance-based hybrid search.
  • Integrated generative AI (GPT-4o) with LangGraph to answer user-specific questions based on help documents.
  • Conducted training and onboarding for employees to ensure the effective use and maintenance of the system.

Project 3:

  • Designed and built a proof-of-concept data pipeline in Azure Data Factory to process unstructured documents (e.g., publisher data, help articles) following the Medallion (Bronze-Silver-Gold) architecture.
  • Orchestrated data transformation workflows by integrating Azure Functions for lightweight processing and Azure Synapse Notebooks for complex, Spark-based transformations of the raw data.
  • Implemented the final data loading stage into Azure AI Search, ensuring the data was properly structured and indexed for downstream RAG applications.
  • Delivered a comprehensive technical evaluation and cost analysis of the ADF-based solution, which directly informed the strategic decision to leverage the existing Kubernetes cluster with Argo Workflows for the final production implementation.

Project 4:

  • Designed and developed deep learning models based on Recurrent Neural Networks (LSTMs, GRUs) to analyze and predict anomalies in multivariate time-series data from server infrastructure.
  • Engineered a data pipeline to ingest and process real-time server metrics, including CPU utilization, RAM usage, and network traffic, leveraging Azure Monitor and Grafana for data sourcing and visualization.
  • Trained and validated the models, achieving a 90% predictive accuracy in identifying critical failure patterns within a simulated test environment.
  • Delivered a comprehensive proof-of-concept that successfully demonstrated technical feasibility and provided key data for a strategic cost-benefit analysis regarding on-premise infrastructure versus cloud migration.

Azure AI Search AKS Kubernetes Terraform Docker Git Argo Workflows LangChain LangGraph Ragas Transformers Hugging Face Elasticsearch Jira Confluence
Python GenAI RAG
2 years 10 months
2020-03 - 2022-12

Develop and implement a pipeline for automated image retouching

Data Scientist Python Machine Learning Neural Network ...
Data Scientist

  • Evaluated and validated various approaches to automate image retouching, including the development of a concept for neural networks.

  • Transferred knowledge through training and documentation to enable project partners to apply and further develop the system.

  • Trained and optimized Generative Adversarial Networks (GANs) to implement sub-processes of image editing, such as automated image segmentation.

  • Gathered and preprocessed suitable training data in close collaboration with the companies involved in the project.

PyTorch Git Photoshop Lightroom Flask
Python Machine Learning Neural Network HTML JavaScript CSS Generative Adversarial Networks
5 months
2019-10 - 2020-02

Improve the color representation of 3D scanners through calibration and the use of machine learning models.

Data Scientist Python Machine Learning Neural Network ...
Data Scientist

  • Calibrated 3D scanners to ensure a more precise capture of color and depth information.

  • Designed and conducted employee training sessions to facilitate the integration of the new technologies into existing workflows.

  • Developed, trained, and validated machine learning models (including neural networks, XGBoost, and LightGBM) to enhance the color rendering of 3D scans.

XGBoost LightGBM Git TensorFlow/Keras
Python Machine Learning Neural Network R
1 year 10 months
2018-05 - 2020-02

Leverage machine learning to classify and predict the fundamental origin of theoretical physics models from a vast and complex dataset.

Data Scientist Python R Machine Learning
Data Scientist

  • Architected and trained a predictive classification model using boosted decision trees (LightGBM) on a large-scale, highly imbalanced dataset of over 126,000 string theory models.
  • Systematically evaluated and benchmarked the performance of various ML algorithms, including Random Forests, SVMs, and Neural Networks, to identify and validate the optimal approach.
  • Performed in-depth feature importance analysis to identify the key phenomenological properties used by the model for classification, providing critical insights into the underlying model structure.
  • Developed a predictive tool capable of extrapolating from the training data to predict the most probable origin of the MSSM, effectively narrowing a complex search landscape.
  • Co-authored and successfully published the complete methodology and findings in the peer-reviewed journal "Progress of Physics," validating the scientific impact of the results.

Scikit-learn PyTorch TensorFlow Keras LightGBM XGBoost Bash Git
Python R Machine Learning

Skills

Focus areas

GenAI
Expert
LLM
Expert
NLP
Expert
RAG
Expert
