Data and ML Engineering
Updated on 23.08.2024
Profil
Freelancer / self-employed
Remote work
Available from: 01.10.2024
Availability: 100%
of which on-site: 40%
Python
AWS
Data Engineering
GCP
Kubernetes
Kubeflow
Scala
Seldon
SQL
BigQuery
Postgres
scikit-learn
Apache Spark
pandas
Albanian
Mother tongue
English
C2
German
C1

Work Locations

Germany, Switzerland, Austria possible

Projects

4 years 5 months
2020-04 - present

Sales Gap Analysis based on publicly available datasets and transactional data

Tech Lead

Using machine learning and NLP techniques, the product infers which products customers are not yet buying from the supplier and targets these gaps with personalized sales promotions. It assists sales managers in increasing their customers' basket sizes.

  • Moved a locally running MVP to a production solution on Kubernetes in the cloud.
  • Led incremental performance improvements to meet the demands of production load.
  • Spearheaded the adoption of modern ML deployment technologies and frameworks in the project.
  • Automated data ingestion, model training, evaluation, and deployment.
  • Automated unit, integration, and end-to-end tests.
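The gap-analysis idea can be sketched in a few lines of plain Python. This is a minimal illustration with hypothetical customers and products, not the production logic, which used NLP features and trained models:

```python
from collections import Counter

def assortment_gaps(purchases, segments, top_n=3):
    """Rank, per customer, the products that peers in the same
    segment buy but the customer does not (a 'sales gap')."""
    # Product popularity within each customer segment.
    seg_counts = {}
    for cust, items in purchases.items():
        seg_counts.setdefault(segments[cust], Counter()).update(items)

    gaps = {}
    for cust, items in purchases.items():
        owned = set(items)
        popularity = seg_counts[segments[cust]]
        missing = [p for p, _ in popularity.most_common() if p not in owned]
        gaps[cust] = missing[:top_n]
    return gaps

# Hypothetical transactional data: three customers in one segment.
purchases = {
    "c1": ["flour", "sugar", "oil"],
    "c2": ["flour", "sugar", "yeast"],
    "c3": ["flour", "yeast", "oil"],
}
segments = {"c1": "bakery", "c2": "bakery", "c3": "bakery"}
print(assortment_gaps(purchases, segments))
# → {'c1': ['yeast'], 'c2': ['oil'], 'c3': ['sugar']}
```

In production such a ranking would come from trained models and run as a pipeline step (e.g. on Kubeflow) rather than in-process.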
Kubernetes Kubeflow Apache Spark Python BigQuery Helm scikit-learn Docker
Multinational wholesale chain
Düsseldorf
5 months
2019-11 - 2020-03

Precalculation Layer in Analytics Platform

Senior Data Engineer

At stock exchange companies, the volume of collected data increases daily, and so does its market value. Given these dynamics, we created an architecture that provides users with historical data and lets them enrich it within their own scope.

  • Created an architecture for fast, cost-efficient access to historical data.
  • Developed a data transformation strategy to persist extensions of the pre-calculation layer.
  • Developed a schema evolution strategy.
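A schema evolution strategy can be illustrated with an additive-only merge, a common choice for append-heavy historical data. This is a sketch with hypothetical column names, not the project's actual implementation:

```python
def evolve_schema(current, incoming):
    """Additive schema evolution: unknown columns are appended
    (readers see NULLs for old data); a type change on an existing
    column is rejected instead of silently rewriting history."""
    merged = dict(current)
    for col, dtype in incoming.items():
        if col not in merged:
            merged[col] = dtype
        elif merged[col] != dtype:
            raise ValueError(f"type conflict on {col!r}: {merged[col]} vs {dtype}")
    return merged

v1 = {"isin": "string", "price": "double"}
v2 = {"isin": "string", "price": "double", "venue": "string"}
print(evolve_schema(v1, v2))
# → {'isin': 'string', 'price': 'double', 'venue': 'string'}
```

Rejecting type changes keeps every historical partition readable with the latest schema, at the cost of requiring an explicit migration for incompatible changes.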
Kubernetes SQL Python Presto Scala Apache Spark AWS S3 Hive Docker
Stock Exchange Broker
Frankfurt am Main
4 months
2019-07 - 2019-10

Processing of Cash Register Data

Senior Data Engineer

The project created a decoupled architecture to process and persist data collected into GCP on a daily basis. Within its scope, we also defined the CI/CD strategy for all components of the data pipeline.

  • Created the ingestion layer that makes data available in BigQuery daily.
  • Built the CI/CD pipeline for all components.
  • End-to-end integration testing (design, data generation, execution).
  • Documentation.
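Daily ingestion of this kind is typically kept idempotent by writing into date partitions, so a rerun overwrites one partition instead of duplicating rows. A sketch using BigQuery's `table$YYYYMMDD` partition decorator (the table name and loaded-set tracking are hypothetical):

```python
from datetime import date

def partition_target(table, day):
    """BigQuery partition decorator: loading into 'table$YYYYMMDD'
    replaces exactly that day's partition, making reruns idempotent."""
    return f"{table}${day.strftime('%Y%m%d')}"

def plan_loads(table, days, already_loaded):
    """Return the partition targets still missing for the given days."""
    return [partition_target(table, d) for d in days if d not in already_loaded]

days = [date(2019, 9, 1), date(2019, 9, 2)]
print(plan_loads("sales.receipts", days, already_loaded={date(2019, 9, 1)}))
# → ['sales.receipts$20190902']
```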
GCP Python BigQuery Docker Gitlab
Retailer
Salzburg
1 year 3 months
2018-05 - 2019-07

Metadata and data management application on top of a data lake environment

Senior Data Engineer

Created a self-service framework for ingesting and preprocessing data from different sources, designed around the requirements of the data's end consumers. The framework targets a cloud-native environment; a set of HTTP REST APIs gives users the necessary functionality around on-demand AWS EMR clusters.

  • Developed data transformation and aggregation algorithms (Scala) to support new load types.
  • Developed the metadata management and data transformation orchestration application (Python) to support new destination environments.
  • Developed unit, integration, and end-to-end tests for all Scala and Python components of the application.
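The self-service surface around on-demand clusters reduces to a small lifecycle state machine behind the REST endpoints. A pure-Python sketch; the states and method names are hypothetical, not the framework's real API:

```python
import uuid

class ClusterManager:
    """Bookkeeping behind a request/release REST API for on-demand
    compute clusters (e.g. AWS EMR)."""

    def __init__(self):
        self.clusters = {}

    def request(self, load_type):
        # In the real framework this would call the cloud provider;
        # here we only record the desired state.
        cid = uuid.uuid4().hex[:8]
        self.clusters[cid] = {"load_type": load_type, "state": "RUNNING"}
        return cid

    def release(self, cid):
        self.clusters[cid]["state"] = "TERMINATED"

mgr = ClusterManager()
cid = mgr.request("daily_sales_load")
mgr.release(cid)
print(mgr.clusters[cid]["state"])
# → TERMINATED
```

Keeping the desired state separate from the provider call is what lets such an API stay responsive while clusters spin up asynchronously.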
Scala Python Apache Spark Jenkins Bitbucket AWS S3 AWS EMR Docker Hive
Global Sports Brand
Herzogenaurach
1 year 5 months
2017-01 - 2018-05

Information Marketplace

Data Engineer

Implemented a big data platform for ingesting, storing, and processing batch and real-time data from a variety of sources within the customer's organization. As part of this, a centralized search engine was designed and implemented for company-wide knowledge and documentation of data sources. In a follow-up project, we migrated the persistence layer of an analytics application from Oracle to Impala. Data arrived on a regular basis, and the Spark/Scala processing produced a set of tables with more than 10,000 columns; these jobs were scheduled with Apache Airflow.
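Scheduling such jobs in Airflow boils down to running them in dependency order; the same ordering can be sketched with the standard library (the job names and dependencies here are hypothetical):

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical dependencies of a nightly Oracle-to-Impala load:
# extract feeds the wide-table build, which feeds the Impala publish.
dag = {
    "build_wide_table": {"extract_oracle"},
    "publish_impala": {"build_wide_table"},
    "refresh_stats": {"publish_impala"},
}
order = list(TopologicalSorter(dag).static_order())
print(order)
# → ['extract_oracle', 'build_wide_table', 'publish_impala', 'refresh_stats']
```

Airflow does the same topological resolution at runtime, plus retries, backfills, and calendar-based scheduling on top.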

Apache Spark Docker Python Scala Airflow Hive Kafka
Truck Manufacturer
Munich

Education and Training

2 years 8 months
2013-10 - 2016-05

Software Systems Engineering

Master of Science, RWTH Aachen

Data and Information Management

2 years 10 months
2010-10 - 2013-07

Computer Science, Bachelor of Science

Graduated with Honors, University of Tirana, Tirana, Albania

Position

Data Engineer or Machine Learning Engineer building production-grade data pipelines or model training and deployment pipelines.

Competencies

Top-Skills

Python AWS Data Engineering GCP Kubernetes Kubeflow Scala Seldon SQL BigQuery Postgres scikit-learn Apache Spark pandas

Products / Standards / Experience / Methods

WORK EXPERIENCE:
03/2021 – present

Role: Senior Consultant, ML/Data Engineer
Client: Freelance
Location: Munich, Germany


12/2016 – 03/2021
Role: Senior Consultant, ML/Data Engineer
Client: Data Reply
Location: Munich, Germany

EARLIER WORK EXPERIENCES:

01/2014 – 07/2015

Role: Student Helper

Client: RWTH Aachen University, Informatik Zentrum, Lehrstuhl für Informatik 5

Tasks:

Developed web services for storing and annotating multimedia data, using relational and non-relational databases in a microservice architecture.

09/2010 – 08/2013

Role: Software Developer

Client: Helius Systems, Tirana, Albania

Tasks:

Full-stack software developer on several projects used by up to 1,500 users. Languages and tools used: VB, SQL, C#, ASP.NET, Crystal Reports.

Cloud Providers:

  • AWS
  • GCP
  • Azure

Programming Languages

Java
Python
Scala
SQL

Tools & Frameworks

Docker
Helm
Jenkins
Kubeflow
Kubernetes
scikit-learn
Spark

Databases & Storage

AWS S3
BigQuery
Hive
MongoDB
MySQL
Presto
Redshift
