Data Architect & Data Engineer, Databricks, Python, Spark, Azure Data Engineer Associate, GCP, Software Development, BI, Machine Learning Engineering
Updated 08.08.2024
Profile
Freelancer / self-employed
Remote work
Available from: 01.10.2024
Availability: 60%
of which on-site: 5%
Python
Apache Spark
SQL
Databricks
Data Lakehouse
BigQuery
Airflow
PostgreSQL
Hadoop
Elasticsearch
Bash
Docker
Jenkins
GCP
Flask
Kafka
Azure
Azure Data Factory
Synapse
Kubernetes
OpenShift
AWS
German
native
Serbo-Croatian
native
English
fluent

Work locations

Germany, Switzerland, Austria
possible

Projects

1 year 3 months
2023-06 - present

Databricks Data Lakehouse solution

Data Architect

- Led consulting on a new data architecture (Data Warehouse vs. Data Lakehouse) for a German manufacturing company and facilitated the adoption of a Lakehouse architecture on Azure + Databricks

- Planned and developed a proof-of-concept logical data warehouse on Azure (Synapse, ADLS)

Databricks
2 years 8 months
2022-01 - present

Building my own online business

Solo Founder
  • Designed, built, and deployed (VPS: Debian, Nginx) an internet scavenger hunt app (on request)
  • Currently pre-beta launch: a schedule-generation web app based on the user's Google Calendar availability and to-dos (built with Flask, Redis, Celery, Postgres)
5 months
2023-01 - 2023-05

Monitoring Migration to OpenShift

SRE Engineer and Airflow Architect

  • Automated the generation of Kibana monitors and Grafana dashboards
  • Set up Airflow on OpenShift


Kubernetes, OpenShift, Elasticsearch, Airflow, Python
ECC AG / Deutsche Börse AG
1 year 6 months
2020-06 - 2021-11

Designed and built a data mart for subscriber activity metrics

Data Engineer
Zattoo is one of the leading TV streaming providers in Europe and was acquired by TX Ventures. 

  • Designed and built a data mart for subscriber activity metrics as part of a company-wide effort to consolidate success metrics (with Airflow and BigQuery)
  • Contributed to a GCP-based data warehouse redesign, introducing Kafka, a data lake, and BigQuery
  • Set up Airflow as the new main orchestration tool, along with best practices and a Docker-based development environment

Zattoo, Berlin, Germany
1 year 7 months
2018-04 - 2019-10

Designed and implemented a query engine using PySpark

Data Engineer
Motionlogic was a Deutsche Telekom-owned startup offering traffic & location reports.

  • Designed and implemented a query engine using PySpark, HDFS, Redis & MongoDB to produce individually billable reports, which significantly expanded the product line
  • Collaborated closely with the Data Science team on quality and performance improvements of the core business algorithm for trip/activity extraction from movement chains, ensuring product satisfaction for some of Europe's largest telecommunications companies. Implemented in PySpark, processing more than 3 TB per day on an on-premises cluster with >1,000 cores, 60 servers, and >10 TB RAM.

Motionlogic, Berlin, Germany
7 months
2015-04 - 2015-10

Implemented extensive integration and regression testing

Quality Control Analyst Intern
i4i is a VC-funded startup providing structured content apps for the life sciences industry to address compliance requirements.

  • Implemented extensive integration and regression testing in Python and maintained documentation

i4i, Toronto, Canada

Education and training

Competencies

Top skills

Python, Apache Spark, SQL, Databricks, Data Lakehouse, BigQuery, Airflow, PostgreSQL, Hadoop, Elasticsearch, Bash, Docker, Jenkins, GCP, Flask, Kafka, Azure, Azure Data Factory, Synapse, Kubernetes, OpenShift, AWS

Products / Standards / Experience / Methods

PROFILE 

Big Data Engineer with 5+ years of international industry experience and a proven track record of designing data-intensive pipelines and data mining algorithms. I am a Databricks-certified Spark developer who is enthusiastic about scalable data architectures that drive measurable business value.


Skills

Data Engineering

Hadoop (HDFS, YARN), Spark, SQL, BigQuery, Vertica, Postgres, Elasticsearch, MongoDB, Scylla, Celery, Redis, bash scripting, Airflow, Jenkins, Docker, Flask, Django 


Data Science

Tensorflow, Jupyter, gensim, spaCy, Hugging Face, Pandas, Numpy


BI

Tableau 


DevOps

GCP, AWS, Azure, Debian, Ubuntu, Nginx, Terraform  


SIDE PROJECTS & HACKATHONS 

2019 - 2020 

Client: rssBriefing


Tasks:

Built a briefing web app in Python powered by NLP models (URL on request)


2019

Client: DEEP BERLIN hackathon


Tasks:

Contributed to a computer-vision object detection and classification team; built a sliding-window module


Programming languages

Python
JavaScript
Go
