Freelancer: Data & AI Software Engineer: Java/Python/Scala, Spring Boot, AWS, Docker, Airflow, AI/Machine Learning, Algorithms, Apache Spark

Freelancer / Self-employed

Remote work

Available from: 03.06.2024

Availability: 90%

of which on-site: 5%

Top skills

Java

Machine Learning

Python

Scala

Docker

Artificial Intelligence

Algorithms

Design Patterns

Apache Spark

AWS

OOP

Clean Code

Agile software development

Spring Boot

Languages

German

English

Work locations

Countries

Germany, Austria, Switzerland

Remote work

possible

Projects

Selected project list - the complete list will be made available upon request.

Senior Machine Learning Engineer / Consultant, N.N.

07/2022 - 09/2022, Karlsruhe, Germany

o In-depth analysis, evaluation and improvement of the current onboarding process for remote workers

o Needs analysis for automated testing in real-time, large-scale machine learning data pipelines

o Evaluation of the level and quality of important internal technical documentation

o Devising efficient knowledge-transfer strategies between the Data Science and Software Engineering teams

o Gradient-boosted trees (XGBoost), k-means clustering, Pandas, SQLAlchemy, GCP, BigQuery, Kubeflow, Apache Spark, Scala, Java 15, Python, Podman, GitLab, Git, Slack, Jira, Confluence, IntelliJ, macOS

Scientific Full Stack Developer/Consultant, Covestro Digital R&D

07/2020 - 11/2020, Leverkusen, Germany

Porting of existing R&D high-performance compute scripts from Bash/Slurm to Python 3
Containerization of these R&D high-performance Python codes using Docker/Podman for use in the AWS Cloud
Compilation of HPC code packages such as LAMMPS and Gaussian on various compute platforms
Exploration of Apache Airflow as an orchestration tool in a complex automated multi-scale quantum chemistry workflow
Consulting on professional software development practices in HPC environments: Git, GitLab, coding conventions, testing, DevOps (CI/CD) concepts, etc.
Python 3.x, Java, AWS Cloud, AWS Batch, Anaconda 3, Apache Airflow, Jupyter, Docker, Podman, Kubernetes (minikube, MicroK8s), Git, GitLab, CentOS

Big Data Full Stack Developer, N.N.
03/2019 - 09/2019, München, Germany

Evaluation of migrating existing machine-learning/ETL pipelines to an Apache Airflow-based system
Exploration of the Python packages Dask and Numba for parallel machine learning on big data sets
Evaluation of Apache Arrow as a fast in-memory Big Data processing layer in heterogeneous Hadoop/Spark analytics pipelines
Quality and performance evaluation of various novel machine learning algorithms such as LightGBM (decision trees) and genetic programming with gplearn (symbolic regression)
Exploration/setup of Docker-based PySpark Jupyter notebook containers for use in Hadoop/Spark clusters
Python 3.7, Java, PySpark, C++, Anaconda 3, Airflow, Apache Arrow, Jupyter, Spyder, Docker/Podman, Kubernetes (minikube), Git, GitHub, Ubuntu 18.04 LTS

Big Data Science Consultant, N.N. AG
01/2018 - 12/2018, München, Germany

Consulting/evaluation in the area of Big Data engineering, search and analytics of heterogeneous data sets in a combined Hadoop-Spark + Elasticsearch cluster environment
Elasticsearch, Hadoop + Spark 2.2.0, Cloudera 5.14, Elasticsearch-Hadoop connector, SparkR, sparklyr, Apache Zeppelin, Jupyter, Java, Scala, R, Python, JSON, Git, Jira, Confluence, Scrum
Extensive benchmarking and performance optimization of various Big Data ETL, data engineering, data analysis and data visualization use cases
Exploration and performance benchmarking of the Elasticsearch-Hadoop (EH) connector for use in a combined EH Big Data analysis platform
Exploration of submitting generated Scala code to an Apache Spark cluster using a programmatic API (Apache Livy)
Consulting on the use of Artificial Intelligence/Machine Learning (AI/ML) algorithms within existing R&D projects

Big Data Scientist & Machine Learning Software Engineer, Voith Digital Solutions
10/2016 - 04/2017, Heidenheim, Germany

Development of a large-scale Internet-of-Things (IoT, Industrie 4.0) platform using the Hadoop stack: Cloudera 5.9, HDFS, Apache Spark Streaming & MLlib, HBase, Impala, Python, Pandas, Apache Kafka, Hue, Java 8, Spring Boot, Scala, JAXP, Git, IntelliJ, Jira, Scrum, etc.
Porting a complex real-time outlier-detection algorithm for sensor-based time-series data from Python scripts to object-oriented Java 8 within the Lambda architecture paradigm
Performance optimization of the Java-based machine-learning algorithm on the Hadoop cluster
Consulting in various in-house Hadoop/Data Science/Machine-learning projects
