Freelancer: Test Manager | QA Lead specializing in AI & GenAI Testing | LLM Evaluation | RAG Testing | AI Output Validation | Quality & Safety of AI Systems

Freelancer / Self-employed

Remote work

Available from: 16.03.2026

Availability: 100%

of which on-site: 25%

Top-Skills

Test Analyst

Artificial Intelligence

Quality Engineering

Tester

Test Manager

QA Lead

AI & GenAI Testing

LLM Evaluation

RAG Testing

AI Output Validation

Quality & Safety of AI Systems

Hallucination detection

Prompt injection testing

LLM Security Testing

LLM Jailbreak Testing

AI Red Teaming

Generative AI Testing

Manual Tester

AI Evaluation Trainer

Quality Assurance

Languages

Polish

English

German

Work locations

Cities

Wrocław (+150km)

Countries

Germany

Remote work

possible

Projects

9 months

2025-05 - 2026-01

Amsterdam Contract project

AI QA Lead/ Generative AI Test Lead LLM Generative AI Prompt Engineering ...

Role

AI QA Lead/ Generative AI Test Lead

Project details

Led testing and quality assurance for AI-powered features, including text summarization, knowledge item generation and semantic search
Designed and executed manual and automated tests for generative AI systems, validating LLM responses for accuracy, relevance, consistency and safety
Performed prompt engineering and adversarial testing, including prompt injection and code injection scenarios
Conducted testing focused on hallucination detection, PII/data privacy protection, response quality evaluation and multilingual behaviour
Designed custom evaluation methodologies for AI outputs, enabling systematic assessment of model performance and behavior
Built and maintained automated AI evaluation tests using Promptfoo, integrated with CI/CD workflows in collaboration with developers
Created advanced test scenarios including edge cases, robustness and stress testing
Led test strategy and quality assurance processes across the AI product lifecycle, from early prototyping to production deployment
Delivered internal training "Introduction to AI Testing", training 15+ manual testers on testing approaches for LLM-based systems
Collaborated with AI engineers, developers and product teams to analyze model behavior and improve reliability of AI-generated outputs

Skills

LLM, Generative AI, Prompt Engineering, Promptfoo, AI Evaluation, NLP Systems, CI/CD

Client

Scenifi & TOPdesk

5 years 10 months

2019-09 - 2025-06

various

Test Consultant - Test Manager/ QA Engineer

Role

Test Consultant - Test Manager/ QA Engineer

Project details

Performed manual testing of enterprise applications, supporting automated testing by preparing test data, configuration files and analysing automated test execution results
Executed module, integration, system, API, end-to-end (E2E) and user acceptance tests (UAT)
Conducted regression, exploratory, smoke and stability testing, including edge-case scenarios and UX validation
Participated in full project lifecycle activities: requirements analysis, test design, execution, defect tracking and release validation
Tested complex distributed systems, including voucher platforms, ticketing microservices and data analysis tools for international clients (Germany, Japan)
Designed test cases, test scenarios and test data, ensuring broad functional coverage and reliability of critical system components
Collaborated with developers, product owners and international stakeholders to ensure high software quality and smooth release cycles
Analysed defects, investigated root causes and contributed to process improvements and quality optimization

Client

GlobalLogic

1 year

2024-06 - 2025-05

Independent research & testing

Generative AI & LLM Testing | UX and Behavior Analysis

Role

Generative AI & LLM Testing | UX and Behavior Analysis

Project details

Conducted manual testing of large language models (LLMs) including GPT-4 and Gemini, as well as generative AI platforms such as ChatGPT and Midjourney
Evaluated LLM outputs for factual accuracy, hallucinations, context consistency and logical reasoning
Performed prompt engineering and scenario-based testing to explore edge cases and model robustness
Conducted comparative evaluation of generative AI models, analysing differences in response behavior and quality
Documented interaction logs and maintained structured feedback loops for model behaviour analysis
Analysed context memory, narrative consistency and conversational coherence in long-form interactions
Evaluated AI-human interaction quality, tone and UX alignment from a user-centered perspective
Tracked recurring model issues and behavioral changes across model updates and versions

1 year 10 months

2017-12 - 2019-09

Telecommunications infrastructure migration

Quality Assurance Leader

Role

Quality Assurance Leader

Project details

Acted as Manual QA Tester and Test Lead, leading frontend testing activities across large-scale telecommunications projects and distributed QA teams
Led testing within national telecom infrastructure migration programs (Deutsche Telekom - copper to fiber), supporting complex multi-system deployments
Owned the full test lifecycle, including test planning, scope definition, test case design, execution, defect management, reporting and release validation
Defined test strategies, quality metrics, timelines and risk mitigation plans in fast-paced, high-complexity environments
Conducted accessibility testing (ARIA compliance) using tools such as JAWS screen reader and contrast analyzers, contributing to two successful ARIA audits
Coordinated distributed QA teams, providing guidance and support in international, cross-functional environments
Collaborated closely with system architects, business analysts, product owners and development teams to ensure software quality and release readiness
Supported release planning and deployment validation, ensuring system stability and production readiness
Contributed to the Capgemini NTC testing community, co-creating knowledge-sharing initiatives and delivering internal and external QA trainings

Client

Capgemini

1 year 2 months

2016-12 - 2018-01

various

Senior Test Analyst/ QA Lead & Project Coordinator HP ALM Zephyr Jira ...

Role

Senior Test Analyst/ QA Lead & Project Coordinator

Project details

Progressed from Senior Test Analyst/ Test Team Lead to Quality Assurance Test Lead within large-scale international programs
Led end-to-end testing processes, including test planning, scope definition, test design, execution, defect management, reporting and release validation
Defined test strategies, quality metrics, timelines, resource estimation and risk mitigation plans to ensure product reliability
Coordinated distributed QA teams, providing leadership and maintaining consistent testing standards across projects
Conducted frontend, integration, regression and user acceptance testing (UAT) across enterprise systems in the insurance and telecommunications domains
Performed accessibility testing (ARIA compliance) using tools such as JAWS screen reader and contrast analyzers
Collaborated with system architects, product owners, business analysts and development teams to ensure alignment of quality, scope and delivery

Skills

HP ALM, Zephyr, Jira, Confluence, Bitbucket

Client

Capgemini

Role

Manual Tester

Project details

Contributed to one of the largest global Guidewire implementations (Policy Center, Claim Center, Billing Center) within a program involving 400+ contributors

Client

Capgemini

1 year 1 month

2014-06 - 2015-06

Real estate marketing

Senior Real Estate Commercialization Specialist & Tester - PKP S.A.

Role

Senior Real Estate Commercialization Specialist & Tester - PKP S.A.

Client

Polish Railway Company

Client

Hamilton Health Center

Location

Harrisburg, Pennsylvania (USA)

Education and training

Studies - Computer Science and Finance/ Human Resource Management
Wrocław University of Economics, Faculty of Management
Degree: Master and Bachelor

Certifications

ISTQB Foundation Level
Scrum Master

Competencies

Products / Standards / Experience / Methods

Methodologies
- Scrum (Agile)
- Kanban
- Waterfall
AI & Generative AI Tools
- ChatGPT
- GPT-4
- OpenAI Playground
- Claude
- Midjourney
- Promptfoo
Testing & QA Tools
- Postman
- Swagger
- TestRail
- Zephyr
- HP ALM
Collaboration & Development Tools
- Jira
- Confluence
- Atlassian ecosystem
- Visual Studio Code
Platforms/ Systems
- Guidewire

Experience and knowledge

Test Lead
LLM Testing
LLM Jailbreak Testing
Generative AI Testing
Prompt Engineering
AI Output Validation
Prompt Injection Testing
LLM Security Testing
- QA professional with 12+ years of experience in international IT projects, specializing in software testing and quality assurance for standard and generative AI systems
- Strong expertise in manual testing, exploratory testing and end-to-end testing in Agile and hybrid environments (Scrum, Kanban, Waterfall)
- Proven ability to lead testing activities and coordinate cross-functional teams in complex software delivery environments
- Hands-on experience in LLM testing, AI output validation and prompt engineering, focusing on response quality, reliability and system behaviour
- Experience identifying AI-specific issues, including hallucinations, prompt vulnerabilities, context inconsistencies and response quality problems
Key Skills
- Generative AI testing
- LLM testing
- Prompt engineering
- AI output validation
- Prompt injection testing
- Test strategy
- API testing
- End-to-end testing
- Exploratory testing
- Manual testing
AI Testing & Evaluation Areas
- LLM output validation: accuracy, relevance, consistency and reasoning quality
- Hallucination detection: identification of fabricated or unsupported information in model outputs
- Prompt engineering & robustness testing: testing prompt variations, edge cases and model stability
- Prompt injection & adversarial testing: identifying vulnerabilities caused by malicious or manipulative inputs
- AI evaluation methodologies: designing structured evaluation frameworks for generative AI outputs
- Context consistency & memory behaviour testing: validation of long-context interactions and conversational coherence
- Multilingual AI testing: evaluating model behaviour across languages and localization scenarios
- Bias & fairness testing: identifying biased outputs and fairness risks in generative systems
- AI safety & reliability testing: testing guardrails, safety responses and system resilience
- Human-AI interaction & UX evaluation: assessing tone, usability and user alignment in AI-generated responses
