This client faced a bottleneck in producing high-quality, up-to-date research for their website. The existing manual process was too resource-intensive and error-prone to scale to the required volume of content.
ACTION
I built a multi-modal Retrieval-Augmented Generation (RAG) system for automated research production, with direct website content injection. I also integrated 5+ data sources, including web search and an in-house content farm, and added periodic reflection mechanisms, all on a modular Large Language Model (LLM) framework with fine-grained oversight.
OUTCOME
Scaled research production far beyond the capacity of the manual process while keeping it cost-effective. Also ensured continual content relevance and iterative accuracy improvement.
The client required an industry-grade OCR tool capable of handling a large volume and wide variety of internal documents in real time. The data set included 1M+ multilingual documents across 30+ languages and 20+ formats.
ACTION
Architected and deployed an AWS/Azure multi-cloud solution, while managing a cross-functional team of Data Scientists and MLOps engineers.
OUTCOME
The solution not only met all of the client's requirements but also provided a robust monitoring system that recorded every file entry and its status across the pipeline, giving fine-grained oversight of the process.
The Research and Development (R&D) team had trouble organising and visualising large volumes of research papers, due to their verbose nature. They also struggled to understand how research papers related to each other at the topic level, beyond their references and citations.
ACTION
Developed an NLP solution using topic modelling and search engine techniques to plot research papers onto a user-friendly 2D graph (zoom out), while also allowing users to search for specific information (zoom in) within the large dataset.
OUTCOME
Visualisation of 10k+ papers in an interactive plot that clusters research papers by topic, giving a bird's-eye view of the data and accelerating research discovery.
The client's team faced a bottleneck in manually extracting critical information from large volumes of documents.
ACTION
Created a Named Entity Recognition (NER) microservice for real-time data extraction, detecting patterns in those documents and optimising for the most important templates in the dataset.
OUTCOME
Reduced manual document search time by 100+ hours annually. The service also included a validation step to ensure reliability and a modular design for cross-departmental reuse.
The Human Resources (HR) team found that standard Applicant Tracking Systems (ATS) lacked customisation, and that their effectiveness fell short in use cases where metrics needed to be tailored to specific technical offers.
ACTION
Built an in-house ATS tool with NLP-powered analytics that accepts any number of resumes and recruiter requirements and displays multiple metrics tailored to the specific technical offer.
OUTCOME
An interactive dashboard with matching results, 40% more customisable than standard tools, for streamlining recruitment processes and providing critical insights for hiring decisions.
In the aviation sector, human factors are the primary cause of safety incidents. Intelligent prediction systems capable of evaluating human state and managing risk have been developed over the years to identify human factors and prevent related incidents. However, the lack of large, useful labelled datasets has often hindered the development of these systems.
ACTION
Presented a methodology to identify and classify human factor categories from aviation incident reports, introducing a novel classification framework and pioneering methods linking Machine Learning (ML) to aviation safety.
OUTCOME
The best predictive models achieved Micro F1 scores of 0.900, 0.779, and 0.875 across the respective levels of the taxonomic framework, showing that favourable predictive performance can be achieved when classifying human factors from text data. The published research also influenced subsequent academic work, driving innovations in minimising aviation incidents through advanced human factors analysis.
Profile:
Tomás is a seasoned NLP Data Scientist with a history of developing AI solutions spanning both academic research and industry applications. He has proven proficiency in architecting and implementing high-performance systems while collaborating with cross-functional teams. He excels at aligning technological innovations with business objectives and continuously expands his knowledge to stay current with AI advancements.
SKILLS
Work Experience
2023 - 2024
Role: Generative AI Engineer
Customer: BIG Ai
Tasks:
2021 - 2023
Role: NLP Data Scientist
Customer: Siemens Energy
Tasks:
2020 - 2021
Role: NLP Data Science Researcher
Customer: Instituto Superior Técnico
Tasks: