Rita Clemente Pascual
About me
Hey! My name is Rita Clemente, and I’m a Data Science graduate driven by curiosity, creativity and the desire to make things better.
What truly inspires me is exploring complex problems from different angles — whether it’s technology, design, communication or strategy. I enjoy moving across disciplines, translating data into stories, ideas into action, and insights into meaningful impact.
I thrive in dynamic environments where I can learn fast, adapt, and contribute with both analytical thinking and an intuitive sense of what works. For me, innovation is not just about models or tools — it’s about asking the right questions, understanding context, and creating value that lasts.
Skills
Programming & Data Science Skills
Python & R – Proficient in data analysis, machine learning (scikit-learn, caret), clustering and text processing (SBERT, tidytext). Extensive use in academic and applied projects, including startup segmentation and advanced visualization.
SQL & Pandas – Management, querying and transformation of structured and relational databases for exploratory analysis and dashboard creation, with experience in query optimization.
MATLAB – Used in automation, scripting and technical prototyping tasks during undergrad.
Other tools – Tableau, Power BI, advanced Excel, Jupyter Notebooks, Anaconda, Git.
Languages
Spanish – Native proficiency.
Valencian – Native proficiency (C1 level).
English – Very fluent, both spoken and written.
Open to learning new languages
Education
Bachelor’s Degree in Data Science
Universitat Politècnica de València (UPV)
Valencia, Spain | 2021 – 2025
Specialized in data analysis, artificial intelligence and machine learning
Final Project: Market and investment analysis in preventive medicine using unsupervised learning on multimodal data
Exchange Semester – Erasmus+ Programme
Högskolan i Skövde
Skövde, Sweden | 2024 – 2025
Gained intercultural skills and expanded academic perspective
Bachillerato Tecnológico
IES Penyagolosa
Castellón, Spain | 2019 – 2021
Oriented towards mathematics, physics, and technical drawing
Professional and Elementary Music Studies – Violin
Conservatorio Profesional de Música Mestre Tàrrega
Castellón, Spain | 2011 – 2021
Completed 10 years of classical music education with specialization in violin
Developed discipline, creativity and artistic sensitivity
Projects – LookAlike
Understanding fashion styles is crucial for personal identity and market trends. Our
research focuses on developing an unsupervised machine learning model that analyzes
and categorizes outfits based on stylistic patterns. By leveraging vast data resources,
our model, “Style Matchmaker: Find Similar, Shop Smarter,” aims to enhance the
prediction of fashion preferences, providing valuable insights for the industry.
We employed advanced techniques such as unsupervised feature learning, clustering,
autoencoders, and Natural Language Processing (NLP) to analyze and group outfits.
Significant effort was devoted to data preparation, including embedding textual
descriptions using SBERT and processing images with state-of-the-art models like
ResNet50 and Vision Transformers (ViT). These embeddings were used to cluster outfits
using methods like K-means++ and Gaussian Mixture Models, evaluated through
internal metrics and visual validation.
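As a rough illustration of the clustering and evaluation step described above, here is a minimal Python sketch using scikit-learn. It is not the project's actual code: the embeddings are synthetic stand-ins generated with `make_blobs`, the helper name `cluster_and_score` is hypothetical, and the silhouette score is just one possible internal metric.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for outfit embeddings (assumption: the real vectors
# would come from SBERT text embeddings and ResNet50/ViT image features).
embeddings, _ = make_blobs(
    n_samples=300, n_features=16, centers=4, cluster_std=1.0, random_state=0
)


def cluster_and_score(X, n_clusters, seed=0):
    """Fit K-means++ and a Gaussian Mixture Model, then score both partitions."""
    kmeans = KMeans(
        n_clusters=n_clusters, init="k-means++", n_init=10, random_state=seed
    )
    km_labels = kmeans.fit_predict(X)

    gmm = GaussianMixture(
        n_components=n_clusters, covariance_type="diag", random_state=seed
    )
    gmm_labels = gmm.fit_predict(X)

    # Silhouette is an internal metric: it needs no ground-truth labels.
    return {
        "kmeans": (km_labels, silhouette_score(X, km_labels)),
        "gmm": (gmm_labels, silhouette_score(X, gmm_labels)),
    }


results = cluster_and_score(embeddings, n_clusters=4)
for name, (labels, score) in results.items():
    print(f"{name}: silhouette = {score:.3f}")
```

In practice the number of clusters would be chosen by comparing such scores across several values and by inspecting the resulting groups visually.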
Our model successfully identifies distinct fashion styles, offering benefits such as
personalized recommendations, improved inventory management, and refined
marketing strategies. This comprehensive approach bridges the gap between
consumer preferences and market offerings, providing a robust foundation for future
research and practical applications in fashion technology. Early adopters, including
online buyers and fashion professionals, can utilize these insights to stay ahead in a
competitive market and deliver a more personalized shopping experience.

If you are interested in any of these projects, please contact me via email.
Projects – PCOS
Using multivariate statistical techniques such as Principal Component Analysis (PCA), Multiple Correspondence Analysis (MCA) and clustering, this research deepens the understanding of polycystic ovary syndrome (PCOS) and supports the search for biomarkers that could improve its diagnostic and treatment protocols. In addition, supervised classification methods such as Linear Discriminant Analysis (LDA), Partial Least Squares Discriminant Analysis (PLS-DA) and logistic regression were applied, allowing effective discrimination between patients with and without PCOS.
Our results show significant variations in hormonal and anthropometric profiles between the groups, highlighting the complexity of PCOS and the diversity of its symptoms. Furthermore, the study examines how different symptomatic and metabolic variables are related to PCOS, providing new insights into the potential underlying mechanisms influencing this pathology. The use of machine learning methods and multivariate analysis makes it possible to better classify the different types of PCOS, facilitating a more individualized and efficient approach to its management.
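To give a feel for the supervised part of this workflow, the sketch below compares LDA with a PCA plus logistic regression pipeline using cross-validation in scikit-learn. The data is synthetic (generated with `make_classification`) and the number of features and components are arbitrary assumptions; it does not reproduce the study's dataset or results.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic two-class data standing in for hormonal and anthropometric
# measurements (assumption: the real study used clinical variables).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=5, n_redundant=2, random_state=0
)

models = {
    # LDA applied directly to the standardized variables.
    "lda": make_pipeline(StandardScaler(), LinearDiscriminantAnalysis()),
    # PCA for dimensionality reduction, followed by logistic regression.
    "pca+logreg": make_pipeline(
        StandardScaler(), PCA(n_components=5), LogisticRegression(max_iter=1000)
    ),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
for name, acc in scores.items():
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```

Wrapping the scaler and PCA inside each pipeline keeps the preprocessing fitted only on the training folds, which avoids leaking information from the validation folds.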

Projects – Final Thesis
The aging population and the growing need for solutions that improve our quality of life have driven the development of sectors such as longevity and preventive medicine. In this scenario, emerging technologies such as “digital twins” and artificial intelligence are revolutionizing the way we diagnose, monitor and intervene in the field of health.
This Final Degree Project was organized in three stages. First, an exploratory analysis of these sectors was carried out, highlighting their main trends, technological applications and market projections. Then, a market analysis was performed on a multimodal database of 316 samples and 37 variables. This included the mapping of startups and investment rounds, which made it possible to contextualize entrepreneurial activity and capital flow in the ecosystem. Finally, unsupervised learning techniques were applied to the multimodal data (numerical, categorical and textual), using SBERT embeddings, dimensionality reduction (PCA, t-SNE, UMAP) and clustering algorithms such as K-means, in order to identify coherent groups of startups according to their technological and strategic focus.
The analysis segmented the ecosystem into seven complementary groups, which were validated against a manual mapping by industry experts. This segmentation provides a structured view of the market, facilitating the identification of investment opportunities and the design of strategies in the field of preventive medicine and longevity. It also highlights the potential of unsupervised methods for organizing complex information and extracting valuable knowledge in contexts of innovation and technological investment.
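The way numerical, categorical and textual information can be combined before clustering is sketched below. This is only an illustration under stated assumptions: all values are random, the category names and the helper `build_features` are hypothetical, the text embeddings are random vectors standing in for SBERT output, and only PCA and K-means are shown (t-SNE and UMAP are omitted).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n_startups = 316  # sample count taken from the text; the values are random

# Hypothetical inputs: three numeric variables, one categorical variable,
# and 32-dimensional vectors standing in for SBERT text embeddings.
numeric = rng.normal(size=(n_startups, 3))
categorical = rng.choice(["seed", "series_a", "series_b"], size=(n_startups, 1))
text_embeddings = rng.normal(size=(n_startups, 32))


def build_features(numeric, categorical, text_embeddings):
    """Scale numeric columns, one-hot encode categories, append embeddings."""
    num = StandardScaler().fit_transform(numeric)
    cat = OneHotEncoder(handle_unknown="ignore").fit_transform(categorical).toarray()
    return np.hstack([num, cat, text_embeddings])


features = build_features(numeric, categorical, text_embeddings)

# Reduce dimensionality, then cluster into seven groups as in the text.
reduced = PCA(n_components=10, random_state=0).fit_transform(features)
labels = KMeans(n_clusters=7, n_init=10, random_state=0).fit_predict(reduced)

print(features.shape, reduced.shape, np.bincount(labels))
```

With real data, the relative weight of each block of features matters, since a long embedding vector can dominate a handful of numeric columns unless the blocks are rescaled.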

Projects – Risk Management
DigiNotar, a digital certificate provider founded in 1998, was responsible for issuing SSL and PKI certificates. In 2011, it was hacked, allowing attackers to issue over 500 fake certificates. This breach severely damaged the company’s reputation, leading to customer distrust and investigations.
This report conducts a risk analysis for the company DigiNotar to systematically identify its security vulnerabilities and develop appropriate measures for risk mitigation. The purpose of risk management is to detect potential threats early, reduce the impact of security incidents, and ensure long-term business continuity. This analysis aims to help DigiNotar better defend against future attacks and restore customer trust.
The report is structured as follows: First, the company’s context is established, and the fundamental criteria for the analysis are defined. Then, the key assets, potential threats, existing and planned controls, vulnerabilities, and possible consequences are identified and listed. Based on this, a comprehensive risk analysis is conducted, and various approaches to risk mitigation are presented. This step-by-step approach ensures a thorough assessment of DigiNotar’s security posture and provides valuable recommendations to enhance its risk management framework.

Projects – Natural Language and Information Retrieval
Conspiracy theories are stories that try to explain why important events happen by
blaming powerful hidden groups. Natural Language Processing (NLP) is a field of computer
science that deals with the interaction between computers and human language, enabling
computers to understand, interpret and generate human language in a meaningful and
natural way.
In this project, NLP models will be used to automatically distinguish conspiracy narratives
from critical ones. The dataset consists of Telegram messages related to the COVID-19
pandemic, in English and Spanish, with their respective labels. The task will be handled
as a binary classification problem, where it is necessary to determine whether a given
text belongs to the conspiracy theory category or the critical thinking category.
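
The binary classification setup described above can be sketched with a simple baseline. Note the assumptions: TF-IDF features with logistic regression are a swapped-in stand-in rather than the models used in the project, and the handful of example messages are invented for illustration and do not come from the Telegram dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy messages (assumption: the real data is a labelled Telegram corpus).
texts = [
    "A secret elite planned the pandemic to control everyone",
    "They are hiding the truth, the virus was released on purpose",
    "Hidden groups use the vaccine to track the population",
    "The vaccine rollout was too slow and officials should explain why",
    "Lockdown rules changed too often and the data was not published",
    "We need independent studies before trusting this policy",
]
labels = [
    "conspiracy", "conspiracy", "conspiracy",
    "critical", "critical", "critical",
]

# Word and bigram TF-IDF features feeding a linear classifier.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)

new_message = ["Officials should publish the data behind the lockdown policy"]
prediction = clf.predict(new_message)
probabilities = clf.predict_proba(new_message)
print(prediction[0], dict(zip(clf.classes_, probabilities[0].round(3))))
```

A bilingual corpus like the one described would normally call for language-aware preprocessing or multilingual embeddings, which this baseline does not attempt.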

Hobbies
In my free time, I love combining creativity and curiosity across many areas. I enjoy graphic design, digital fashion, and experimenting with new tools that mix art and tech. I also explore marketing and visual storytelling — from branding ideas to analysing how people interact with content.
Outside of screen time, I enjoy discovering new cities, planning aesthetic trips, and capturing everyday beauty through photos and writing.
Music has also been part of my life for over a decade — I studied violin for 10 years at a professional conservatory, which taught me discipline, sensitivity and teamwork.
I’m always looking for new things to learn, try or connect — and I never close the door to a surprising interest.