My Career
Work Experience
4+ years of building AI systems, ML pipelines, and data solutions across various industries.
Consultant Data Scientist
Nature & Découvertes
Design and deployment of a hybrid recommendation system combining collaborative filtering and content-based filtering to enhance the user experience, with large-scale production deployment in a cloud environment.
Key Achievements
- Designed and implemented a recommendation engine in Python combining collaborative filtering, content-based filtering, and hybrid approaches.
- Built real-time user data ingestion and processing pipelines using Apache Kafka and Spark MLlib.
- Performed large-scale data preparation, cleaning, and transformation using Pandas and Spark, with advanced indexing and search capabilities via Elasticsearch.
- Developed, trained, and optimized Machine Learning and Deep Learning models using Scikit-learn, TensorFlow, and PyTorch.
- Led industrialization and MLOps practices, including Docker containerization, experiment tracking with MLflow, CI/CD integration, monitoring, and deployment on AWS SageMaker.
- Explored and integrated Large Language Models (LLMs) to enrich the recommendation system.
- Automated the generation of personalized content tailored to user profiles.
- Results: Improved recommendation relevance with an 18% increase in Click-Through Rate (CTR); reduced inference latency by 35% through model optimization; achieved reliable, scalable production deployment through MLOps practices and AWS cloud infrastructure.
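As an illustration of the score-blending step such a hybrid engine relies on, here is a minimal sketch; the function names and the `alpha` weighting are illustrative assumptions, not the production code:

```python
import numpy as np

def hybrid_scores(cf_scores, content_scores, alpha=0.7):
    """Blend collaborative-filtering and content-based item scores.

    Both inputs are per-item scores for one user; `alpha` weights the
    collaborative component (a tunable assumption in this sketch).
    """
    cf = np.asarray(cf_scores, dtype=float)
    cb = np.asarray(content_scores, dtype=float)

    def norm(x):
        # Min-max normalise each source so the two scales are comparable.
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng else np.zeros_like(x)

    return alpha * norm(cf) + (1 - alpha) * norm(cb)

# Rank items for a user from both signals and pick the top one.
blended = hybrid_scores([0.2, 0.9, 0.4], [0.8, 0.1, 0.5])
top_item = int(np.argmax(blended))
```

In practice the weighting would be tuned offline (e.g. against CTR), and cold-start users would fall back to the content-based component alone.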
Technologies
Research Data Scientist
UPEC
Applied R&D project focused on designing and optimizing NLP models capable of automatically detecting abusive language, toxic speech, and hate content on social media platforms. The project emphasized memory and latency optimization, as well as industrialization for production use.
Key Achievements
- Designed and developed NLP models based on Transformer architectures (BERT, RoBERTa) for abusive content classification.
- Fine-tuned Large Language Models (LLMs) on domain-specific corpora to improve detection accuracy and robustness to informal language and social media–specific contexts.
- Applied model distillation techniques to reduce model size and accelerate inference while maintaining high performance.
- Built end-to-end NLP pipelines, including data cleaning, preprocessing, tokenization, embeddings, and vectorization.
- Developed Retrieval-Augmented Generation (RAG) prototypes combining LLMs with document retrieval systems (vector databases and Elasticsearch) to improve classification accuracy and to provide contextual explanations of model decisions in support of Explainable AI (XAI).
- Results: Improved abusive language detection performance with a 15% increase in F1-score on specialized datasets; delivered an explainable RAG prototype enabling moderators to better understand the context behind model decisions.
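The distillation step mentioned above typically minimises a temperature-softened KL divergence between teacher and student logits. A minimal NumPy sketch of that loss, following the standard Hinton-style convention (function names are illustrative):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T > 1 flattens the distribution."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft-target KL divergence used in knowledge distillation.

    The T**2 factor keeps gradient magnitudes comparable across
    temperatures (the usual convention in the distillation literature).
    """
    p = softmax(teacher_logits, T)  # teacher's softened distribution
    q = softmax(student_logits, T)  # student's softened distribution
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))
```

In a real training loop this term is combined with the ordinary cross-entropy on hard labels, and the student is a smaller Transformer than the BERT/RoBERTa teacher.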
Technologies
AI & Data Consultant
Deloitte
Contributed to two major innovation projects supporting the adoption of retail solutions provided by OCADO and RELEX, with the goal of improving pre-sales efficiency, automating customer interactions, and increasing the effectiveness of data and ML workflows in a cloud environment.
Key Achievements
- Designed and implemented a full end-to-end MLOps architecture, including automated data collection, continuous model deployment via CI/CD pipelines (GitLab CI, Jenkins), and proactive monitoring of model performance in production.
- Developed an intelligent AI-powered pre-sales assistant leveraging advanced Prompt Engineering techniques and Large Language Models (LLMs) to analyze customer needs and automate personalized responses.
- Designed and managed a structured cloud-based Data Warehouse (AWS Redshift, BigQuery), including schema optimization and integration through automated ELT pipelines (Talend, Airflow).
- Implemented robust data quality controls, including KPI definition and monitoring, automated validation processes, and anomaly detection using Python and BI tools (Power BI).
- Orchestrated and containerized ML workflows and data pipelines using Docker and Kubernetes, ensuring scalability, reliability, and high availability.
- Collaborated within an Agile (SAFe) environment, actively participating in sprints, reviews, iterative planning, and continuous delivery aligned with business requirements.
- Results: Increased pre-sales productivity through the AI assistant, achieving a 30% reduction in customer request processing time and improved recommendation accuracy.
- Results: Enhanced data quality and reporting through automated KPIs and anomaly alerts, ensuring compliance and consistency for business decision-making.
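A minimal sketch of the kind of statistical anomaly check described in the data-quality bullet above; the function name and z-score threshold are illustrative, and the production setup layered BI tooling and automated validation on top:

```python
import statistics

def flag_anomalies(values, z_threshold=3.0):
    """Return indices of KPI values whose z-score exceeds a threshold.

    A minimal stand-in for automated KPI validation; real checks would
    also cover data freshness, null rates, and schema drift.
    """
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # constant series: nothing to flag
    return [i for i, v in enumerate(values)
            if abs(v - mean) / stdev > z_threshold]
```

Flagged indices can then feed an alerting channel or a Power BI report rather than blocking the pipeline outright.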
Technologies
R&D Data Scientist (Internship)
SiliconeSignal Technologies
Satellite image classification and segmentation project aimed at identifying and localizing buildings and other infrastructures from Sentinel-2 data, leveraging advanced Deep Learning approaches.
Key Achievements
- Analyzed and preprocessed Sentinel-2 imagery, including radiometric correction, normalization, patch extraction, and creation of deep learning–ready datasets.
- Conducted benchmarking and evaluation of existing methods and research projects for satellite image classification.
- Compared traditional machine learning and deep learning approaches, including CNNs, autoencoders, and Transformers, for image classification and segmentation tasks.
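The patch-extraction step in the preprocessing bullet above can be sketched as follows; patch sizes and function names are illustrative, and a real pipeline would apply radiometric correction and normalization first:

```python
import numpy as np

def extract_patches(image, patch_size=64, stride=64):
    """Cut an (H, W, bands) Sentinel-2-style array into square patches.

    With stride == patch_size the patches tile the image without
    overlap; a smaller stride would produce overlapping patches.
    """
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            patches.append(image[y:y + patch_size, x:x + patch_size])
    return np.stack(patches)

# A 256x256 scene with 4 spectral bands yields a 4x4 grid of patches.
patches = extract_patches(np.zeros((256, 256, 4)), patch_size=64)
```

Each patch then becomes one training sample for the classification or segmentation model.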