SENIOR DATA SCIENTIST / build machine learning applications
Updated on 02.04.2020
Profile
Employee of a service provider
Available from: 01.05.2020
Availability: 100%
of which on site: 100%
Skill profile of a permanent employee of the service provider
Chinese: Beginner
English: Native
German: Intermediate
Japanese: Fluent

Work locations

Germany
not possible

Projects

24 years 3 months
2000-01 - present

Predict the tool breakages

Senior Data Scientist

To predict tool breakages in a cold forming machinery line for a Tier 2 automotive supplier by collecting relevant data from IT systems such as SAP ERP, PP and WM, and sensor data from edge devices.

  • Conceptualized a Semantic Business Rule Engine for edge devices so that a large share of the processing runs on the edge device itself and only relevant data is passed on for analytics
  • Implemented the rules to harmonize the data in the data lake using ETL tools such as Hive and Kafka
  • Ranked the input parameters for the given business use case using algorithms such as Boruta, RFE and BE
  • Gave plant users clear visibility of the scrap coming out of the machine and recommendations to reduce it
MQTT, AWS S3, AWS EMR, Lambda, Boruta, RFE, FFT Transformation, XGB, Grafana
Automobile Industry
4 months
2019-10 - 2020-01

Developed AI solution to optimize ambient conditions to increase sales

Senior Data Scientist
  • Gathered requirements from retail domain experts and set expectations in terms of FP, FN, TP, TN
  • Optimized SQL queries using complex joins to reduce data retrieval time
  • Implemented the data pipeline for computing KPIs from the consumer database, environmental sensors, computing devices and log data
  • Implemented data preparation and parallelized mass training of forecast models (multiple 100k models trained)
  • Built a custom dashboard for quick assessment by controllers
FIWARE, Spark, Hadoop, HBase, Solr, Apache Range, Scikit-Optimize, Tableau
Retail
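
The FP, FN, TP, TN expectations mentioned in the bullets above refer to the four outcomes of a binary prediction. A minimal sketch of how they can be tallied is shown here; the names `ConfusionCounts` and `count_outcomes` and the sample labels are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative

    @property
    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    @property
    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0


def count_outcomes(y_true, y_pred) -> ConfusionCounts:
    """Tally TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 1:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


# Made-up labels, purely for illustration.
counts = count_outcomes([1, 0, 1, 1, 0, 0], [1, 1, 0, 1, 0, 0])
print(counts, counts.precision, counts.recall)
```

Agreeing on acceptable FP and FN counts with domain experts up front makes it possible to judge a model against the business cost of each error type rather than against accuracy alone.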
8 months
2018-11 - 2019-06

Predicting the quality status of aluminium oxide

Senior Data Scientist

To predict the quality status of alumina's (Al2O3) Specific Surface Area (SSA) for upcoming batches and take corrective action on input process parameters to prevent and reduce the number of faults (slides) and anomalies.

  • Created data pipelines and workflows to extract data from machines such as the kiln, Combin Rotators, cooling tower, Multiclone and Polyclone precipitators and pan filter
  • Implemented data cleanup algorithms based on the Mahalanobis distance, chosen to fit the data distribution
  • Selected decision-tree-based algorithms by analyzing the distribution of the dataset, then benchmarked Scikit-based RF and XGB against H2O-based XGB to arrive at a high-accuracy, deployable model
  • Built a prediction of SSA quality attributes 2 hours in advance, plus a ranking of input process parameters by their influence on a given quality attribute
H2O, Scikit, Data Shift Algorithms, Apache NiFi, Kafka, Stream Analytics
Chemical Industry
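
To illustrate the Mahalanobis-distance cleanup mentioned above, here is a minimal sketch restricted to two features so that the covariance inverse can be written out by hand. The function name `mahalanobis_filter`, the threshold and the sample readings are assumptions for illustration only; the project's actual implementation is not described in this profile.

```python
import math


def mahalanobis_filter(points, threshold=3.0):
    """Keep 2-D points whose Mahalanobis distance from the mean is <= threshold."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points to estimate a covariance")
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n

    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)

    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")

    kept = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, with the 2x2 inverse written out.
        d_squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        if math.sqrt(d_squared) <= threshold:
            kept.append((x, y))
    return kept


# Hypothetical readings: eight clustered points and one obvious outlier.
readings = [(0, 0), (1, 0), (0, 1), (1, 1),
            (0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
cleaned = mahalanobis_filter(readings, threshold=2.0)
print(len(cleaned))
```

Unlike a per-feature z-score, the Mahalanobis distance accounts for correlation between features. Note that with a sample covariance of n points the distance cannot exceed (n - 1) / sqrt(n), so the threshold has to be chosen with the sample size in mind.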
6 months
2018-06 - 2018-11

Predicting the quality status of aluminium coil

Senior Data Scientist

To predict the quality status of aluminium coil coming from the TCF casting machine for upcoming batches and take corrective action on input process parameters to reduce the number of defective pieces.

  • Gathered requirements from metal domain experts and collected data from machines such as the twin chamber furnace, caster, milling and VDU units, as well as quality lab test data
  • Researched Bayesian Optimization for the given data and set up open source implementations of it
  • Created GLM models to handle scaling of the various process parameter data
  • Built a web app for visualizing experiment data and the surrogate function, and for proposing new formulations to test next
Azure Bucket, Python, Pandas, Plotly Dash, Scikit-Optimize, GPyOpt, Grafana
Aluminium Foundry
6 months
2017-12 - 2018-05

Predicting the maintenance of electric motors to increase uptime and reduce inventory costs

Senior Data Scientist
  • Gathered requirements from printing press domain experts and set expectations in terms of FP, FN, TP, TN
  • Developed algorithms to identify anomalies and investigate patterns in existing parameters such as motor speed and voltage
  • Developed an AI engine using machine learning algorithms for predictive analytics, with an evaluation and risk matrix
  • Built a customized dashboard and a mobile application for quick and easy access to the data and KPIs
SCADA, phpMyAdmin, POJO/MOJO, ARM Cortex, Edge AI
Printing Industry
7 months
2017-06 - 2017-12

Reduce inventory cost

Data Scientist

To reduce inventory cost by predicting whether the OEM needs a backup battery in the truck.

  • Collected requirements from an asset management company that provides trucks on a rental basis
  • Created a multivariate regression model to predict battery health from parameters such as distance covered, fuel consumption in eco/start mode, battery current and voltage
  • Deployed the model in real time on the OEM dashboard, showing the battery health of all trucks in the fleet
R language, Airflow, Kibana, on-premise Intel servers and technology
Original equipment manufacturers
5 months
2016-12 - 2017-04

Optimize long term and intraday contracts

Data Scientist

To optimize long-term and intraday contracts by predicting energy consumption and generation uncertainties.

  • Gathered requirements from energy domain experts and set expectations in terms of FP, FN, TP, TN
  • Created data pipelines and workflows to extract data from SCADA systems, the grid operator, log files, the market and relevant indexes
  • Developed an AI/ML engine to predict energy supply and demand uncertainties
  • Built a web application for visualizing relevant KPIs, with the simulation and prediction models for different energy trading positions built in
Trendminer, ARIMA univariate, Taylor series, Markov chain, CRTM, ELTK framework
Energy Trading

Education and training

2010 - 2015

Institution: TU Darmstadt & Indian Institute of Technology, Bombay, India

Education: Major Signal Processing Engineering

Graduation: Bachelor's degree & Master's degree in engineering (Integrated 5 years)

Skills

Products / Standards / Experience / Methods

Apache Spark
AWS (S3, EMR, SageMaker, Fargate, Autoscaling, Lambda)
Azure (Generation 2, Event Hubs, Stream Analytics, IoT Hub, DevOps, CosmosDB)
Grafana
Power BI
SAP ERP
Tableau

Profile

With a Master's in Signal Processing from TU Darmstadt and IIT Bombay and extensive experience in building machine learning applications, he covers the entire AI value chain, from use case identification and feasibility analysis to implementation of custom-made statistical models and applications. Throughout projects, he stays focused on solving the business problem at hand and creating value from data.

Technical:

  • Machine Learning
  • Data Manipulation
  • Data Extraction
  • Feature engineering
  • Anomaly Detection
  • Prescriptive Analytics
  • Recommendation Engines

Professional Experience

12/2016 - today

Role: Senior Data Scientist

Customer: on request

Place: Frankfurt

Tasks:

Built highly profitable, world-class industrial Artificial Intelligence software products

  • Designed and developed ETL pipelines for efficient data extraction and transformation
  • Built an AI platform for manufacturing data analytics with 100+ self-developed modules for automated data cleanup, data transformation, data quality checks and automated metric calculation
  • Set up NoSQL data warehouses such as Hadoop, Elasticsearch and InfluxDB clusters for clients
  • Specified and implemented Kafka and Flume data streaming architectures
  • Built AI/ML software solutions for predictive maintenance, predicting quality and production planning

07/2015 - 11/2016

Role: Cloud Design Engineer

Customer: NTT Communications

Place: Tokyo

Tasks:

  • Designed cloud computing software based on a distributed systems architecture for automotive customers including Honda & Toyota, generating cumulative revenue of USD 1.2 million
  • Adapted the Discrete Wavelet Transformation along with pre-emphasis, de-emphasis and K-means clustering after exploring various feature extraction algorithms
  • Evaluated this on a 5x7 LED matrix, displaying spoken alphabet commands with 92% accuracy
  • Created a production environment using agile tools such as Git, Travis, Jenkins and Rundeck
  • Identified 60,000 fraudulent transactions by building multivariate classification algorithms for financial institutions such as Mitsubishi and Mizuho
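
The pre-emphasis and de-emphasis steps named above are commonly implemented as a first-order filter pair. A minimal sketch follows; the function names, the coefficient 0.97 (a common default in speech processing) and the sample values are assumptions, not details of the original project.

```python
def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass: y[n] = x[n] - alpha * x[n-1], with y[0] = x[0]."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


def de_emphasis(signal, alpha=0.97):
    """Inverse filter: x[n] = y[n] + alpha * x[n-1], with x[0] = y[0]."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] + alpha * out[n - 1])
    return out


# Hypothetical audio samples; de-emphasis should undo pre-emphasis.
samples = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5]
emphasized = pre_emphasis(samples)
restored = de_emphasis(emphasized)
print(emphasized)
print(restored)
```

Pre-emphasis boosts the high-frequency content of a voice signal before feature extraction, and de-emphasis applies the inverse filter to recover the original spectrum.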

04/2014 - 04/2015

Role: Data Engineer, Self driving Rover

Customer: Mars Society Australia

Place: Melbourne

Tasks:

  • Represented India in simulated Mars Mission of 24 multidisciplinary experts from 3 countries
  • Engineered robust suspension for rugged terrains and robotic arm for soil collection
  • Developed algorithms for ingesting and preprocessing data for labelling and segmentation tasks
  • Worked on 3D disparity algorithms, mapping whole terrain using lidar and camera
  • Designed communication hub for data collection and radio telemetry establishment
  • Achieved high speed internet via Markov Chain Bandwidth Estimation modelling
  • Obtained HD quality video using H.265 high efficiency video coding technique

Programming languages

Bash
C
C++
Golang
Matlab
Perl
Python
R
Spark

Databases

Elasticsearch
Hadoop
InfluxDB
MySQL
Postgres

