Test Manager | QA Lead specializing in AI & GenAI Testing | LLM Evaluation | RAG Testing | AI Output Validation | Quality & Safety of AI Systems
Updated on 18.03.2026
Profile
Freelancer / Self-employed
Remote work
Available from: 16.03.2026
Availability: 100%
of which on-site: 25%
Test Analyst
Artificial Intelligence
Quality Engineering
Tester
Test Manager
QA Lead
AI & GenAI Testing
LLM Evaluation
RAG Testing
AI Output Validation
Quality & Safety of AI Systems
Hallucination detection
Prompt injection testing
LLM Security Testing
LLM Jailbreak Testing
AI Red Teaming
Generative AI Testing
Manual Tester
AI Evaluation Trainer
Quality Assurance
Polish
Native language
English
Advanced
German
Basic knowledge

Work locations

Wrocław (+150km)
Germany: possible

Projects

9 months
2025-05 - 2026-01

Amsterdam Contract project

AI QA Lead/ Generative AI Test Lead
  • Led testing and quality assurance for AI-powered features, including text summarization, knowledge item generation and semantic search
  • Designed and executed manual and automated tests for generative AI systems, validating LLM responses for accuracy, relevance, consistency and safety
  • Performed prompt engineering and adversarial testing, including prompt injection and code injection scenarios
  • Conducted testing focused on hallucination detection, PII/data privacy protection, response quality evaluation and multilingual behaviour
  • Designed custom evaluation methodologies for AI outputs, enabling systematic assessment of model performance and behavior
  • Built and maintained automated AI evaluation tests using Promptfoo, integrated with CI/CD workflows in collaboration with developers
  • Created advanced test scenarios including edge cases, robustness and stress testing
  • Led test strategy and quality assurance processes across the AI product lifecycle, from early prototyping to production deployment
  • Delivered internal training "Introduction to AI Testing", training 15+ manual testers on testing approaches for LLM-based systems
  • Collaborated with AI engineers, developers and product teams to analyze model behavior and improve reliability of AI-generated outputs
LLM, Generative AI, Prompt Engineering, Promptfoo, AI Evaluation, NLP Systems, CI/CD
Scenifi & TOPdesk
5 years 10 months
2019-09 - 2025-06

various

Test Consultant - Test Manager/ QA Engineer
  • Performed manual testing of enterprise applications, supporting automated testing by preparing test data, configuration files and analysing automated test execution results
  • Executed module, integration, system, API, end-to-end (E2E) and user acceptance tests (UAT)
  • Conducted regression, exploratory, smoke and stability testing, including edge-case scenarios and UX validation
  • Participated in full project lifecycle activities: requirements analysis, test design, execution, defect tracking and release validation
  • Tested complex distributed systems, including voucher platforms, ticketing microservices and data analysis tools for international clients (Germany, Japan)
  • Designed test cases, test scenarios and test data, ensuring broad functional coverage and reliability of critical system components
  • Collaborated with developers, product owners and international stakeholders to ensure high software quality and smooth release cycles
  • Analysed defects, investigated root causes and contributed to process improvements and quality optimization
GlobalLogic
1 year
2024-06 - 2025-05

Independent research & testing

Generative AI & LLM Testing | UX and Behavior Analysis
  • Conducted manual testing of large language models (LLMs) including GPT-4 and Gemini, as well as generative AI platforms such as ChatGPT and Midjourney
  • Evaluated LLM outputs for factual accuracy, hallucinations, context consistency and logical reasoning
  • Performed prompt engineering and scenario-based testing to explore edge cases and model robustness
  • Conducted comparative evaluation of generative AI models, analysing differences in response behavior and quality
  • Documented interaction logs and maintained structured feedback loops for model behaviour analysis
  • Analysed context memory, narrative consistency and conversational coherence in long-form interactions
  • Evaluated AI-human interaction quality, tone and UX alignment from a user-centered perspective
  • Tracked recurring model issues and behavioral changes across model updates and versions
1 year 10 months
2017-12 - 2019-09

Telecommunications infrastructure migration

Quality Assurance Leader
  • Acted as Manual QA Tester and Test Lead, leading frontend testing activities across large-scale telecommunications projects and distributed QA teams
  • Led testing within national telecom infrastructure migration programs (Deutsche Telekom - copper to fiber), supporting complex multi-system deployments
  • Owned the full test lifecycle, including test planning, scope definition, test case design, execution, defect management, reporting and release validation
  • Defined test strategies, quality metrics, timelines and risk mitigation plans in fast-paced, high-complexity environments
  • Conducted accessibility testing (ARIA compliance) using tools such as JAWS screen reader and contrast analyzers, contributing to two successful ARIA audits
  • Coordinated distributed QA teams, providing guidance and support in international, cross-functional environments
  • Collaborated closely with system architects, business analysts, product owners and development teams to ensure software quality and release readiness
  • Supported release planning and deployment validation, ensuring system stability and production readiness
  • Contributed to the Capgemini NTC testing community, co-creating knowledge-sharing initiatives and delivering internal and external QA training sessions
Capgemini
1 year 2 months
2016-12 - 2018-01

various

Senior Test Analyst/ QA Lead & Project Coordinator
  • Progressed from Senior Test Analyst/ Test Team Lead to Quality Assurance Test Lead within large-scale international programs
  • Led end-to-end testing processes, including test planning, scope definition, test design, execution, defect management, reporting and release validation
  • Defined test strategies, quality metrics, timelines, resource estimation and risk mitigation plans to ensure product reliability
  • Coordinated distributed QA teams, providing leadership and maintaining consistent testing standards across projects
  • Conducted frontend, integration, regression and user acceptance testing (UAT) across enterprise systems in the insurance and telecommunications domains
  • Performed accessibility testing (ARIA compliance) using tools such as JAWS screen reader and contrast analyzers
  • Collaborated with system architects, product owners, business analysts and development teams to ensure alignment of quality, scope and delivery
HP ALM, Zephyr, Jira, Confluence, Bitbucket
Capgemini
1 year 4 months
2015-09 - 2016-12

Guidewire implementation

Manual Tester
  • Contributed to one of the largest global Guidewire implementations (PolicyCenter, ClaimCenter, BillingCenter) within a program involving 400+ contributors
Capgemini
1 year 1 month
2014-06 - 2015-06

Real estate marketing

Senior Real Estate Commercialization Specialist & Tester - PKP S.A.
Polish Railway Company
6 months
2012-09 - 2013-02

Management Intern

Hamilton Health Center
Harrisburg, Pennsylvania (USA)

Education and training

Studies - Computer Science and Finance/ Human Resource Management
Wrocław University of Economics, Faculty of Management
Degree: Master and Bachelor

Certifications
  • ISTQB Foundation Level
  • Scrum Master

Skills

Top skills

Test Analyst, Artificial Intelligence, Quality Engineering, Tester, Test Manager, QA Lead, AI & GenAI Testing, LLM Evaluation, RAG Testing, AI Output Validation, Quality & Safety of AI Systems, Hallucination detection, Prompt injection testing, LLM Security Testing, LLM Jailbreak Testing, AI Red Teaming, Generative AI Testing, Manual Tester, AI Evaluation Trainer, Quality Assurance

Products / Standards / Experience / Methods

  • Methodologies
    • Scrum (Agile)
    • Kanban
    • Waterfall
  • AI & Generative AI Tools
    • ChatGPT 
    • GPT-4 
    • OpenAI Playground 
    • Claude 
    • Midjourney
    • Promptfoo
  • Testing & QA Tools
    • Postman 
    • Swagger 
    • TestRail 
    • Zephyr 
    • HP ALM
  • Collaboration & Development Tools
    • Jira 
    • Confluence 
    • Atlassian ecosystem 
    • Visual Studio Code
  • Platforms/ Systems
    • Guidewire

Experience and knowledge
  • Test Lead 
  • LLM Testing 
  • LLM Jailbreak Testing 
  • Generative AI Testing
  • Prompt Engineering 
  • AI Output Validation 
  • Prompt Injection Testing 
  • LLM Security Testing
    • QA professional with 12+ years of experience in international IT projects, specializing in software testing and quality assurance for standard and generative AI systems
    • Strong expertise in manual testing, exploratory testing and end-to-end testing in Agile and hybrid environments (Scrum, Kanban, Waterfall)
    • Proven ability to lead testing activities and coordinate cross-functional teams in complex software delivery environments
    • Hands-on experience in LLM testing, AI output validation and prompt engineering, focusing on response quality, reliability and system behaviour
    • Experience identifying AI-specific issues, including hallucinations, prompt vulnerabilities, context inconsistencies and response quality problems
  • Key Skills
    • Generative AI testing 
    • LLM testing 
    • Prompt engineering 
    • AI output validation 
    • Prompt injection testing
    • Test strategy 
    • API testing 
    • End-to-end testing 
    • Exploratory testing 
    • Manual testing
  • AI Testing & Evaluation Areas
    • LLM output validation: accuracy, relevance, consistency and reasoning quality
    • Hallucination detection: identification of fabricated or unsupported information in model outputs
    • Prompt engineering & robustness testing: testing prompt variations, edge cases and model stability
    • Prompt injection & adversarial testing: identifying vulnerabilities caused by malicious or manipulative inputs
    • AI evaluation methodologies: designing structured evaluation frameworks for generative AI outputs
    • Context consistency & memory behaviour testing: validation of long-context interactions and conversational coherence
    • Multilingual AI testing: evaluating model behaviour across languages and localization scenarios
    • Bias & fairness testing: identifying biased outputs and fairness risks in generative systems
    • AI safety & reliability testing: testing guardrails, safety responses and system resilience
    • Human-AI interaction & UX evaluation: assessing tone, usability and user alignment in AI-generated responses
