Development of an LLM-based application, including design of the AI application architecture in collaboration with the architect
Prompt engineering: development and optimization of prompts; integration of Anthropic Claude and OpenAI
Iterative chunking method to improve model efficiency
Prompting techniques, e.g. chain-of-thought and few-shot learning
Integrated the GenAI app via the AWS Bedrock API to send precise prompts and receive the corresponding responses for each use case
Integrated the latest trends in AI technology
Development of the UI for the web-based application
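The chunking step listed above is not described in detail. As a rough illustration only, the sketch below assumes word-based windows with a small overlap between consecutive chunks; the function name `chunk_words` and both parameters are hypothetical and not taken from the project.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Made-up input: ten placeholder words, windows of four words, one word shared.
sample_chunks = chunk_words(" ".join(f"w{i}" for i in range(10)), chunk_size=4, overlap=1)
```

Each chunk could then be sent to the model as a separate prompt; the overlap keeps some context across chunk boundaries.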
Python, SQL, UML, MagicDraw
Eschborn, Taunus
4 months
2023-07 - 2023-10
Customer segmentation and core target identification to examine customer behavior
Business Analyst - Project Lead
Data pre-processing: Performed extensive data preparation, including exploratory data analysis, data cleansing, feature engineering, and feature selection to optimize model performance
Utilized Python libraries (Pandas, NumPy) for efficient data analysis, transformation, and manipulation using optimized array and DataFrame structures.
Model development: Implementation of core target profiling through cluster modeling using RFM analysis
Comparative analysis: Analyzed differences and similarities in consumer behavior before and after the COVID-19 pandemic
Machine learning model to predict future customer behavior
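The RFM step mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names `customer_id`, `date`, and `amount`, the snapshot date, and the sample transactions are all assumptions made up for the example.

```python
import pandas as pd


def rfm_table(tx: pd.DataFrame, snapshot: pd.Timestamp) -> pd.DataFrame:
    """Aggregate transactions into recency, frequency, and monetary value per customer."""
    grouped = tx.groupby("customer_id").agg(
        last_purchase=("date", "max"),
        frequency=("date", "count"),
        monetary=("amount", "sum"),
    )
    # Recency: days between the snapshot date and the customer's last purchase.
    grouped["recency"] = (snapshot - grouped["last_purchase"]).dt.days
    return grouped[["recency", "frequency", "monetary"]]


# Hypothetical sample transactions.
tx = pd.DataFrame(
    {
        "customer_id": ["a", "a", "b"],
        "date": pd.to_datetime(["2023-01-01", "2023-01-10", "2023-01-05"]),
        "amount": [10.0, 20.0, 5.0],
    }
)
rfm = rfm_table(tx, snapshot=pd.Timestamp("2023-01-11"))
```

The resulting table (one row per customer) is the kind of input a clustering model could then segment.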
Python, MS Excel, Pandas, NumPy
3 months
2023-04 - 2023-06
Machine learning model for determining more comprehensive and accurate planned revenue, enabling a fair evaluation of sales staff performance
Data Scientist
Evaluation of sales staff performance, taking into account their sales characteristics, e.g., multi-channel engagement, revenue generation, and customer churn
Comprehensive data preparation, including exploratory data analysis, data cleansing, feature engineering, and feature selection
Applied Pandas and NumPy for structured data transformation and numerical analysis, leveraging DataFrames and multidimensional arrays in quantitative data workflows
Regression modeling using different attributes
Feature selection via a correlation matrix to uncover relationships between different features
Random Forest and XGBoost modeling for fair target revenue planning
Evaluated the new framework against the old framework using a confusion matrix
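One common way to use a correlation matrix for feature selection, as mentioned in the list above, is to flag one feature from each highly correlated pair. The sketch below shows that idea under assumptions: the 0.9 threshold, the function name `correlated_features`, and the sample columns are illustrative and not taken from the project.

```python
import numpy as np
import pandas as pd


def correlated_features(features: pd.DataFrame, threshold: float = 0.9) -> list[str]:
    """Return columns whose absolute correlation with an earlier column exceeds threshold."""
    corr = features.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    return [col for col in upper.columns if (upper[col] > threshold).any()]


# Hypothetical data: y is an exact multiple of x, z is only weakly related.
sample = pd.DataFrame(
    {
        "x": [1, 2, 3, 4, 5],
        "y": [2, 4, 6, 8, 10],
        "z": [5, 1, 4, 2, 3],
    }
)
to_drop = correlated_features(sample)
```

The flagged columns are candidates for removal before fitting models such as Random Forest or XGBoost.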
Python, Pandas, NumPy
3 months
2023-03 - 2023-05
Predicting Kickstarter Campaign Success
Data Scientist
Built a predictive machine learning model to assess campaign success using text-derived features, demonstrating an end-to-end AI pipeline from data ingestion to interpretable model output, aligned with generative AI use cases in finance
Applied natural language processing (NLP) techniques to extract advanced linguistic features (e.g., readability, sentiment, lexical diversity, emotion) from project descriptions using tools such as the AFINN lexicon and transformer-based models (DistilRoBERTa)
Utilised Python libraries including Pandas and NumPy for large-scale data processing, feature engineering, and statistical analysis across 160K+ Kickstarter campaigns involving 121 structured attributes and text-based features
Performed econometric modeling (logistic, linear, and Poisson regressions) with feature selection and multicollinearity handling, achieving a 63% increase in explained variance when including text-based features, analogous to enhancing financial factor models with AI insights
Jupyter Notebook
Python, Pandas, NumPy, Natural Language Processing
1 year 3 months
2020-03 - 2021-05
Identification of vehicle theft hotspots
Analysis of FIR data from police departments and marking of the riskiest pin codes in Delhi
Used Pandas and NumPy for data cleaning, transformation, and spatial pattern analysis in vehicle theft hotspot detection
Data preparation: data collection, data cleansing, exploratory data analysis
Filtering and extraction of PIN codes using regex
Identified the 8 PIN codes with the highest incidence, including the most frequently stolen vehicles and the most common times for thefts
Organized fragmented datasets into DataFrames
Using a correlation matrix, a trend analysis was conducted to identify the most frequently stolen vehicle models and determine the peak times for vehicle thefts during the day.
Built an ML web app with the Python framework Dash
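The regex-based PIN code extraction and the ranking of the most affected PIN codes described above could look roughly like this. It is a sketch under assumptions: it relies on Delhi PIN codes being six digits starting with 110, and the function names and sample records are invented for illustration.

```python
import re
from collections import Counter

# Assumption: Delhi PIN codes are six digits beginning with 110.
PIN_PATTERN = re.compile(r"\b110\d{3}\b")


def extract_pins(text: str) -> list[str]:
    """Return all Delhi-style PIN codes found in a free-text record."""
    return PIN_PATTERN.findall(text)


def top_pins(records: list[str], n: int = 8) -> list[tuple[str, int]]:
    """Count PIN codes across records and return the n most frequent."""
    counts = Counter(pin for record in records for pin in extract_pins(record))
    return counts.most_common(n)


# Made-up records; the phone number must not be mistaken for a PIN code.
records = [
    "Theft reported near Rohini, Delhi 110085",
    "Vehicle stolen, PIN: 110085, contact 9811000123",
    "Incident in Saket 110017",
]
ranking = top_pins(records, n=8)
```

The word boundaries in the pattern prevent matches inside longer digit runs such as phone numbers.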
Python, MS Excel, Pandas, NumPy, Dash
7 months
2019-08 - 2020-02
Identification of crime crises and hotspot analysis
Program Analyst
Python, MS Excel, Pandas...
Led end-to-end hotspot analysis of Delhi crime data using Python to identify high-risk zones for targeted resource allocation
Ingested and cleaned multi-format datasets; computed per-capita crime rates and year-over-year growth for theft, assault, burglary, etc.
Leveraged Pandas and NumPy for structured data processing and statistical analysis to identify crime hotspots from large geospatial datasets.
Engineered a composite crime risk index in Python by min-max normalizing per-capita rates (theft, assault, burglary) and applying stakeholder-derived severity weights
Grouped zones into risk clusters on composite scores using percentile tiers (top 20% = High, next 40% = Medium, bottom 40% = Low)
Cross-validated clusters to detect multicollinearity and avoid double-counting of highly correlated metrics
Conducted sensitivity analyses on weight assignments and outlier handling (percentile clipping) to ensure model robustness and stakeholder confidence
Translated composite scores into a High/Medium/Low framework, enabling clear prioritization of resource allocation
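The composite index and tiering described above can be sketched as follows. This is an illustration under assumptions, not the original implementation: the weights, zone names, and rates are invented, and the tier cut-offs simply follow the 20/40/40 split stated in the list.

```python
import pandas as pd


def composite_risk(rates: pd.DataFrame, weights: dict[str, float]) -> pd.Series:
    """Min-max normalize each rate column, then take the weighted average per zone."""
    cols = list(weights)
    lows = rates[cols].min()
    span = (rates[cols].max() - lows).replace(0, 1)  # avoid division by zero
    normalized = (rates[cols] - lows) / span
    w = pd.Series(weights)
    return (normalized * w).sum(axis=1) / w.sum()


def risk_tier(scores: pd.Series) -> pd.Series:
    """Bottom 40% = Low, next 40% = Medium, top 20% = High, by percentile rank."""
    pct = scores.rank(pct=True)
    return pd.cut(pct, bins=[0, 0.4, 0.8, 1.0], labels=["Low", "Medium", "High"])


# Hypothetical per-capita rates for five zones and hypothetical severity weights.
rates = pd.DataFrame(
    {
        "theft": [10, 20, 30, 40, 50],
        "assault": [1, 2, 3, 4, 5],
        "burglary": [5, 6, 7, 8, 9],
    },
    index=["A", "B", "C", "D", "E"],
)
weights = {"theft": 1.0, "assault": 3.0, "burglary": 2.0}
scores = composite_risk(rates, weights)
tiers = risk_tier(scores)
```

Rerunning `composite_risk` with different weight dictionaries is one simple way to check how sensitive the tiers are to the chosen weights.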