Personal Website

Hi, my name is

Faisal Amin

I am a German national who has recently returned to Germany after completing a master's degree in Singapore. With fluency in both German and English, I bring a diverse perspective honed through international experience. Currently based in Darmstadt, Germany, I am eager to leverage my skills and relocate for the right career opportunity. My academic and professional background, combined with hands-on projects and a passion for continuous learning, has equipped me with a unique blend of technical expertise and adaptability to thrive in dynamic environments.

Education

National University of Singapore (NUS)

Singapore

Master of Computing - Artificial Intelligence Specialisation

Aug 2022 – June 2024

GPA: 4.7/5.0
Awards: Dean’s List – Top 5% of Cohort in Academic Performance
Relevant Modules: Neural Networks and Deep Learning (I and II), Natural Language Processing (A+), Text Mining (A+), Uncertainty Modelling in AI (A+), Big Data Analytics (A+), Applied Analytics (A+)

University of Durham

Durham, United Kingdom

Bachelor of Science in Computer Science

Oct 2017 - Aug 2020

GPA: 4.0/4.0 (Graduated with First Class Honours)
Awards: Outstanding Achievement Distinction, BCS Chartered Institute for IT Prize - Top 5 Student for Academic Achievement among Graduating BSc and MEng Students

ISF International School Frankfurt Rhein-Main

Frankfurt, Germany

Bilingual Diploma of the International Baccalaureate

Aug 2015 - July 2017

Final Scoring: 42/45 (Equivalent to 1.0 Abitur)

ISF International School Frankfurt Rhein-Main

Frankfurt, Germany

High School Diploma with Distinction

Aug 2005 - June 2017

GPA: 3.94 / 4.0

Work Experience

Machine Learning Engineer

Singapore

Savvy - NUS Social Impact Catalyst Nonprofit Organisation

Jan 2024 - May 2024

Integrated a RAG Chatbot with Python and LangChain, leveraging Pinecone for the vector database, into an application offering a lesson-based curriculum for elderly users to acquire digital literacy skills
Conducted comprehensive testing of local, API-based and AWS SageMaker-based Large Language Models with regard to scalability, pricing, reliability and performance
Developed classical and neural network models to analyse and interpret user data as well as evaluate pacing and effectiveness of the learning program
Synthesized a learning algorithm for the curriculum roadmap and fine-tuned reward weights based on the aforementioned user data analytics to improve user engagement and retention rates

NLP Teaching Assistant

Singapore

NUS CS5246 Text Mining

Jan 2024 – May 2024

Mentored and provided tailored after-class support to over 100 students in a graduate-level NLP course
Crafted nuanced assignments and projects to gauge student comprehension and mastery of NLP principles
Administered final exams and quizzes while accurately managing logistics and grading standards

Research Assistant

Singapore

NUS Asian Institute of Digital Finance – Credit Research Initiative

Oct 2023 – Jan 2024

Curated robust datasets by scraping international stock exchange announcements with Python and Selenium
Employed NLP and information extraction techniques to parse relevant financial websites and documents
Utilized LLMs like OpenAI GPT 3.5 and Llama 2 to classify default events for publicly traded companies

Data Analyst

Darmstadt, Germany

AM Group

May 2023 - July 2023, Jan 2021 - July 2022

Analysed delivery data within delivery zones to optimize distribution of coupons using PostgreSQL, lowering the number of coupons distributed by about 35% with a profit margin increase of up to 20%
Conducted analysis of energy costs via Python and PostgreSQL, identifying periods of exceptional fluctuations in electricity expenses which led to a 14% reduction in electricity costs
Systemized a weekly delivery sales comparison chart with Microsoft Excel to track restaurant performance, highlighting underperforming restaurants and visualizing their ranking against competitor restaurants
Compiled data from 100+ delivery vehicles and regularly presented findings to restaurant managers and stakeholders as part of our Digitalization Project using interactive dashboards in Tableau

Software Engineering Intern

Darmstadt, Germany

Software AG

July 2019 - Sep 2019

Developed an education package composed of JavaServer Pages to teach users about the Internet of Things
Managed it with OpenCMS and constructed in-house usability tests for maintenance and iterative upgrading
Facilitated seamless communication of updates and suggestions to cross-functional and international teams

Completed Projects

Twitter Bot Detection

Extracted and generated two robust datasets from the 90GB+ industry-standard TwiBot-22 dataset
Conducted extensive feature engineering, EDA and data visualizations to identify critical patterns of bots
Developed two sets of multiple classifiers, one for unstructured and one for structured data, using both classical and deep neural network models, various text embedding strategies and bootstrapping
Implemented stacking with majority voting for combining all model predictions to enhance model performance and robustness, approaching the state-of-the-art results reported in the original paper

Python, PyTorch, Scikit-learn, Pandas, NLTK, Matplotlib, Seaborn
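To illustrate the majority-voting step mentioned above, here is a minimal sketch in plain Python. It is not the project code: the function name `majority_vote` and the toy predictions are hypothetical, and it covers only the voting part, not a full stacking setup.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    # zip(*...) groups the votes that all models cast for the same sample.
    for sample_votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest count;
        # ties resolve to the label that was encountered first.
        label, _ = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical predictions from three classifiers on four accounts
# (1 = bot, 0 = human).
model_a = [1, 0, 1, 0]
model_b = [1, 1, 0, 0]
model_c = [0, 1, 1, 0]

final = majority_vote([model_a, model_b, model_c])
print(final)
```

Using an odd number of models avoids ties in the binary case, which is one reason voting ensembles are often built from three or five classifiers.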

WeCare – Personalized Health Chatbot

Developed a healthcare chatbot companion utilizing a RAG-LLM setup with OpenAI, LangChain, and Pinecone vector databases to provide personalized medical assistance
Incorporated over 1000 scraped medical articles from respected Singaporean health websites and encyclopaedias, ensuring comprehensive, relevant and up-to-date information for users
Engineered a system that analyses user-uploaded medical reports, leverages top-k relevant knowledge from Pinecone, and incorporates conversation history to deliver personalized context for more accurate responses
Orchestrated backend integration with Python and FastAPI as well as a user-friendly frontend with React

Python, OpenAI, LangChain, Pinecone, FastAPI, React, Selenium
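The top-k retrieval idea behind the setup above can be sketched without any vector database. The example below is an assumption-laden illustration, not the WeCare implementation: it ranks a few hand-written toy vectors by cosine similarity in plain Python, whereas the project delegated this step to Pinecone. The document IDs and vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=2):
    """Return the IDs of the k documents most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in documents.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical 3-dimensional embeddings; real embeddings have far more dimensions.
documents = {
    "article_flu": [0.9, 0.1, 0.0],
    "article_diabetes": [0.1, 0.9, 0.2],
    "article_sleep": [0.0, 0.2, 0.9],
}
query = [0.8, 0.3, 0.0]

results = top_k(query, documents, k=2)
print(results)
```

In a RAG pipeline, the text of the returned documents would then be placed into the LLM prompt together with the conversation history.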

Mental Disorder Classification

Conducted a multi-label classification deep learning task in Python to detect depression, anxiety and neutral sentiments in Reddit and Twitter posts via classical and neural network-based approaches
Scraped over 40,000 Reddit posts via subreddit-based queries to train/evaluate the models and over one million tweets to evaluate transfer learning capabilities of the models on a different domain
Created classical models like KNN and Logistic Regression, utilizing a TF-IDF representation with a maximum accuracy of 86% on Reddit data and 93% on Twitter data
Built neural network models including MLP, CNN, RNN, Transformers and DeBERTa using fastText pretrained word embeddings with a maximum accuracy of 90% on Reddit data and 94% on Twitter data
Also completed a separate Fake News Detection project on a pre-existing dataset in a similar manner, with a larger focus on domain transfer learning and large language models like BERT and ELECTRA

Python, PyTorch, Scikit-learn, Pandas, NLTK, Matplotlib, Seaborn

WiseGuard

Led the development of WiseGuard, a full-stack project leveraging LLMs to empower Singapore's seniors with scam prevention and awareness.
Utilized Python, OpenAI, and LangChain to integrate GPT 3.5, crafting realistic scam conversations and quizzes highlighting red flags and safety measures across six categories of scams.
Implemented Flask in the backend for response validation and processing of LLM responses
Engineered a user-friendly frontend using Django and Jinja, optimizing usability and accessibility, while deploying the application seamlessly with Docker and Render and managing user authentication through SQLite.

Python, OpenAI, LangChain, Flask, Django, Jinja, Docker, Render, SQLite

Portfolio Website

Developed a modern portfolio website utilizing Next.js and Tailwind CSS
Enhanced website interactivity and visual appeal by integrating Lottie Files, React Hot Toast, and other React libraries
Implemented an inbuilt web form that can send me a personal email via Resend
Ensured a highly responsive design for optimal user experience across all devices
Deployed the website via Vercel with a custom domain

React, Next.js, Tailwind CSS, Lottie Files, React Hot Toast, Resend, Vercel

Grad Student Association of Computing Executive

Led the organization and execution of online and physical hackathons while actively engaging with potential sponsors to secure support for events and networking opportunities for participants
Fostered a sense of community via social events, facilitating connections among members and allowing for interdisciplinary discussions
Conceptualized innovative challenges tailored to members' interests and skill levels, encouraging participation and collaboration during events

ConTra - Self-Supervised Contrastive Approach to Text Classification using Transformers

Designed and implemented ConTra, a self-supervised contrastive learning approach using transformers for text classification tasks as an alternative to masked language modeling
Explored and analyzed the effectiveness of 7 different text data augmentation techniques like synonym substitution, word deletion, and contextual replacements for generating positive and negative sample pairs
Developed a transformer encoder model trained with a contrastive loss objective inspired by SimCLR, utilizing up to 4 chained augmentations to learn robust text representations
Conducted extensive experiments comparing ConTra against DistilBERT, achieving competitive performance (less than 1% accuracy difference) on a text classification dataset when pre-trained on limited data

Python, PyTorch, Scikit-learn, Pandas, NLTK, Matplotlib, Seaborn

Topic Modelling on Social Media Customer Service

Performed data wrangling and preprocessing on a dataset of 3 million tweets using PySpark's parallel processing to group relevant tweets into customer support conversations
Applied unsupervised learning techniques like K-Means clustering, Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), and Non-Negative Matrix Factorization (NMF) to identify major topics in the Twitter customer support data
Conducted comprehensive data analysis and generated insights per topic, including response time analysis, sentiment-based resolution/escalation rate studies, and identification of private vs urgent exchanges
Provided data-driven recommendations to enhance customer service strategies on social media, such as optimizing resource allocation, improving response times, addressing dissatisfaction for time-sensitive issues, and increasing visibility of private support channels

Python, PySpark, Scikit-learn, Pandas, NLTK, Matplotlib, Seaborn

Fake News Detection

Designed and implemented 9 different deep learning models including MLPs, CNNs, LSTMs, Transformers, and pre-trained language models like BERT and RoBERTa for multi-label fake news classification
Performed comprehensive data preprocessing, exploratory analysis, and augmentation techniques on two fake news datasets from different domains to enhance model performance
Conducted extensive experiments to evaluate model effectiveness at solving the fake news detection task and transferring capabilities across domains, achieving state-of-the-art F1 scores
Optimized model architectures, hyperparameters, and training strategies like multi-task learning, regularization, and early stopping to improve generalization and prevent overfitting

Python, PyTorch, Scikit-learn, Pandas, NLTK, Matplotlib, Seaborn

AWS PartyRock Generative AI Hackathon - Chimera Lab

Leveraged AWS PartyRock to develop Chimera Lab, an educational AI tool for young learners where they combine animal body parts, triggering educational content and generating unique final creature descriptions and images
Employed prompt engineering techniques within PartyRock to seamlessly integrate user choices, generating a cohesive creature description and accompanying image

AWS PartyRock, Prompt Engineering

AI Singapore National AI Student Challenge 2024

Derived actionable insights on pet adoption rates from a real-world dataset provided by PetFinder.my
Conducted Data Cleaning, Exploratory Data Analysis, Feature Engineering, Feature Selection, Model Building, and Model Evaluation
Effectively managed cleaning and merging of multiple datasets
Tested and evaluated multiple classical machine learning models using a pipeline, cross-validation, and grid search

Python, Scikit-learn, Pandas, Matplotlib, Seaborn

American Politics Classifier

Developed multiple binary classification deep learning models in Python capable of identifying pro-Trump and pro-Biden sentiments in tweets
Scraped over 40,000 tweets using hashtag-based queries and applied NLP preprocessing steps to build higher quality training/validation/testing datasets
Formulated multiple models (MLP, CNN, LSTM, BERT) with different feature engineering techniques (bag-of-words, TF-IDF, and word embeddings) to achieve a maximum accuracy of 80%

Python, PyTorch, Scikit-learn, Pandas, NLTK, Matplotlib, Seaborn

ChatCraft

Created ChatCraft, a conversational simulation tool aimed at enhancing communication skills through AI-driven interactions.
Designed and developed a user-friendly frontend interface using Streamlit, optimizing the user experience and accessibility of the application.
Leveraged Python, LangChain, and OpenAI for backend development, enabling seamless integration of GPT 3.5 to simulate realistic conversations.

Python, LangChain, OpenAI, Streamlit

Connect4Good

Conceptualized and developed a full-stack event-to-volunteer matching application with personalized task generation
Leveraged SQLAlchemy in Python to interact with PostgreSQL databases
Utilized OpenAI GPT 3.5 and LangChain to generate personalized tasks for each matched event
Utilized FastAPI for streamlined backend routing and Django with Jinja for frontend development

Python, OpenAI, LangChain, SQLAlchemy, PostgreSQL, FastAPI, Django, Jinja

Flatland Challenge

Implemented DQN-based reinforcement learning models (DQN, Double DQN, Dueling DQN, Dueling Double DQN) to solve the Vehicle Rescheduling Problem in the Flatland Challenge
Constructed a multi-agent 2D grid world using Python to optimize agents navigating train networks
Formulated optimal sparse reward functions to balance local single and global multi-agent reward signals

Python, PyTorch, OpenAI Gym, Reinforcement Learning

AI Art Hack 2023

Leveraged ChatGPT 3.5 for innovative story plot generation
Utilized Midjourney to generate relevant images, employing prompt engineering for stylization choices and ensuring consistency in story character portrayals
Employed Canva to craft a visually appealing 2-page comic and inserted appropriate text bubbles

ChatGPT, Midjourney, Canva

Movie Recommender System

Performed exploratory data analysis and preprocessing on large user, movie, and ratings datasets to prepare data for building a movie recommender system
Developed and optimized two machine learning approaches for movie recommendations: an ensemble SVM model and a multi-layer perceptron neural network, achieving high precision scores of 0.68 and 0.69 respectively
Implemented multi-armed bandit algorithms with epsilon-greedy strategies and exploration functions to balance exploration and exploitation for personalized movie recommendations, attaining up to 80% overlap with most-liked movies
Evaluated movie recommendation models using precision, recall, F1-score metrics, and analyzed strengths and limitations of machine learning vs multi-armed bandit approaches

Python, Scikit-learn, Pandas, Matplotlib, Seaborn
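The epsilon-greedy strategy mentioned above can be sketched as follows. This is a generic illustration under stated assumptions, not the project code: each "arm" stands for a movie with a hypothetical, fixed probability of being liked, and the function name, parameters and like rates are made up for the example.

```python
import random


def epsilon_greedy(true_like_rates, steps=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over arms with fixed like probabilities."""
    rng = random.Random(seed)      # local generator keeps runs reproducible
    n_arms = len(true_like_rates)
    counts = [0] * n_arms          # how often each arm was recommended
    values = [0.0] * n_arms        # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                       # explore
        else:
            arm = max(range(n_arms), key=lambda i: values[i])  # exploit
        # Reward is 1 if the simulated user likes the recommendation, else 0.
        reward = 1.0 if rng.random() < true_like_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the mean reward for the chosen arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values


# Hypothetical like probabilities for three movies.
counts, values = epsilon_greedy([0.2, 0.5, 0.8])
print(counts, [round(v, 2) for v in values])
```

A larger `epsilon` spends more recommendations on exploration, which helps when preferences are uncertain but lowers the share of recommendations going to the best-known option.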

Singlife Datathon 2023

Analyzed a real-world anonymized dataset comprising 300+ client features to extract actionable insights aimed at enhancing customer experience and increasing insurance product sales
Executed end-to-end data analysis pipeline encompassing data cleaning, exploratory data analysis, feature engineering, and model building, resulting in robust predictive models to drive business decisions
Demonstrated proficiency in handling diverse missing or invalid data values within the dataset, ensuring data integrity and reliability throughout the analysis process
Applied Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance challenges, enhancing the performance and generalization of predictive models
Leveraged pipeline architecture, cross-validation, and grid search techniques to systematically evaluate and compare multiple classical machine learning models, optimizing model performance and scalability

Python, Scikit-learn, Pandas, Matplotlib, Seaborn

Ongoing + Upcoming Projects

Knowledge Nexus

I will develop a personal research workspace, empowering users to effortlessly gather, analyze, and interact with diverse media types. Users can upload YouTube links, PDFs containing research papers, articles, or books, and audio files, with the system automatically extracting and cataloging pertinent information from each upload into a vector database. The platform generates a comprehensive summary for every uploaded item, offering condensed insights for efficient comprehension. Central to the experience is the integration of a RAG-LLM chatbot. This chatbot, equipped with access to the uploaded media's information, guides users to relevant sections within the media, facilitating swift access to answers and insights.

Skills

AI/ML

PyTorch, TensorFlow, Keras, OpenAI, LangChain, Hugging Face, scikit-learn, pandas, NumPy, Matplotlib, Seaborn, NLTK, TextBlob

Frontend/Backend Development

React, Next.js, Tailwind CSS, FastAPI, Streamlit

More Programming

Python, PySpark, SQLAlchemy, Selenium, SQL, JavaScript, Java, C++, Go, R

Technologies

Tableau, PostgreSQL, pgAdmin 4, Pinecone, Amazon Web Services (AWS), Git, Docker, Vercel, RStudio, Visual Studio Code, Hadoop MapReduce, Spark

General

LaTeX, Microsoft Excel, Microsoft Word, Microsoft PowerPoint, Google Sheets, Google Docs, Google Slides

Languages

German (Native), English (Native), Bengali (Conversational), Mandarin (Elementary)

Certifications

AI

AI For Industry - Foundations in AI
AI For Industry - Literacy in AI
IBM Machine Learning Professional Certificate
Google Data Analytics Professional Certificate

Other

AWS Certified Solutions Architect – Associate
Google IT Automation with Python
Google Fundamentals of Digital Marketing
TikTok Tech Immersion 2023
TikTok Tech Immersion 2024

Let's Connect

I'm currently exploring new career opportunities and would love to connect with potential employers. Feel free to reach out through the contact form. I'm also happy to network with fellow developers, learners, and professionals in the tech community. Whether you have an exciting opportunity, seek collaboration on a project, or simply want to discuss the latest trends and technologies, I welcome the chance to connect and exchange ideas.