About Me
AI and Data Professional with 5+ years of experience delivering data-driven solutions in AI/ML, Data Engineering, and Applied Research across industries in North America, including financial services, real estate, and utilities. Certified Google Cloud Professional Machine Learning Engineer, holding an MSc in Computer Science (York University) and a BSc in Software Engineering. Applied research experience at the Data Mining Lab, York University, with award-winning work published at the IEEE MDM Conference (Versailles, France). Internationally recognized through EU funding to complete my Bachelor’s thesis project at Uppsala University, Sweden. Proven ability to design and deploy scalable ML/AI models, implement analytics pipelines, and collaborate with stakeholders to drive measurable business impact.
I conducted research in the area of Cloud Computing as an Erasmus Scholar at the Department of Information Technology, Uppsala University, Sweden.
During my graduate studies at York University, I worked under the supervision of Prof. Manos Papagelis.
Interests: data mining, machine learning, data science, natural language processing (NLP), cloud computing, Agentic AI, LLMs, RAG, etc.
Tools utilized and experimented with:
• Data Mining & Visualization - Python (matplotlib, seaborn, pandas, numpy), PySpark, Power BI/Tableau
• Programming Languages - Python, Java/C#, C/C++, JavaScript/TypeScript
• Databases - PostgreSQL/PostGIS, MongoDB, MySQL, Oracle
• Cloud & DevOps - AWS (Lambda, EC2/ECR), GCP (Vertex AI, BigQuery), Azure Databricks, Docker, GitHub Actions, Bitbucket, Jira
• Development & BI Tools - VS Code, Sourcetree, DBeaver, Power BI/Tableau, MATLAB
• Operating Systems - Linux/Mac, Windows
• AI Tools & LLMs - Google Gemini, DeepSeek, OpenAI, Cohere, Ollama, Cursor, Claude
Experience
Optimize Financial Group
AI Engineer
Jan 2026 - Apr 2026
• Engineered autonomous AI agents to analyze advisor-representative consultations, extracting compliance insights to improve client service quality and regulatory adherence
• Automated contact enrichment pipeline using Playwright and Node.js to scrape and verify financial advisor data from regulatory databases and Google, reducing manual contact research time by 50% for the wealth management team
• Containerized AI agent services using Docker and Amazon ECR, deploying on EC2 instances to ensure reliable, cost-effective availability for on-demand workloads
• Architected serverless workflows via AWS Lambda to execute complex AI agent prompts, complementing container-based deployments for lightweight, on-demand tasks
Mercor
Data Science Expert
Nov 2025 - Jan 2026
• Contributed to rubric-driven AI model training, focusing on reasoning quality, instruction adherence, and robustness
• Built structured ML Jupyter notebooks from prompt → starter code → final reference implementation
• Designed datasets and evaluation criteria to systematically test model performance and failure modes
N. Harris Computer Corporation
Data Engineering Consultant
Apr 2023 - Mar 2025
• Designed data pipelines to extract, transform, and load (ETL) data from customer information systems (CIS) into meter data management (MDM) software using Rules Engine (a proprietary ETL tool) and SQL
• Managed the processing, transformation, and extraction of meter data from advanced metering infrastructure (AMI) systems such as Sensus, Itron, and L+G using MultiSpeak Web Services (SOAP/WSDL, REST APIs)
• Implemented modules for line loss analysis, leak detection, and transformer load analysis, reducing utility losses by 10-20% and improving operational efficiency - data solutions & data models
• Collaborated with internal stakeholders and clients to define business requirements and deliver tailored solutions - compliance, service delivery
• Provided technical support and troubleshooting throughout project implementation - project management, configuration management
• Contributed to the development and integration of a proprietary LLM-based analytics module, extending MDM insights - Python, Power BI, LLM, google-t5 (text-to-text transformer-based model)
Badal.io
Machine Learning (MLOps) Engineer
Apr 2022 - Jan 2023
• Built end-to-end machine learning models (training, deployment & evaluation) using AutoML and custom model training, leveraging Vertex AI, BigQuery, Docker, and Dataflow for a client in the financial space, improving their ability to train ML models efficiently on GCP - Python, ML libraries (scikit-learn, TensorFlow/Keras/PyTorch), pandas, numpy
• Developed and deployed semantic models to detect bias in appraisal documents, leveraging Document AI and Cohere API
• Certified Google Cloud Professional Machine Learning Engineer URL: https://bit.ly/3WhLBzg
EECS, Lassonde School of Engineering, York University
Graduate Teaching Assistant
Jan 2018 - Dec 2022
Taught multiple courses at the Electrical Engineering and Computer Science (EECS) department
• Courses: Programming for Mobile Computing (EECS 1022), Introduction to Database Systems (EECS 3412), Object Oriented Programming from Sensors to Actuators (EECS 1021), Software Design (EECS 3311)
• Tasks included directing tutorials, invigilating exams, leading midterm and final exam review sessions, grading assignments and exams, and holding office hours - OOP, Java, Android Studio, IntelliJ
Data Mining Lab, Lassonde School of Engineering, York University
http://dminer.eecs.yorku.ca/
Graduate Research Assistant
Jan 2018 - Apr 2020
• The research was related to trajectory data mining, machine learning, and statistical inference
• Developed a method that uses the trajectories of moving objects (cars, pedestrians, etc.) to infer semantic similarities between geographical areas
• Published the research, titled "Learning Semantic Relationships of Geographical Areas based on Trajectories", at the IEEE Mobile Data Management Conference 2020 (Versailles, France), where it received the Best Paper Award
Swedish National Infrastructure for Computing, Uppsala University
Erasmus Scholar
Aug 2015 - Jan 2016
• Designed a framework inside SNIC using Apache Spark, SparkR & Jupyter Notebook to simplify computations of highly parallel scientific applications
• Our project, titled "Towards Moving Scientific Applications in the Cloud", enabled researchers to seamlessly deploy their applications on the Spark server and scale them to multiple worker nodes as needed
Notable Projects
Learning Semantic Relationships of Geographical Areas based on Trajectories
https://github.com/saimmehmood/semantic_relationships
Python (networkx, pandas, numpy, seaborn, matplotlib), PostgreSQL, PostGIS, MATLAB, Google Cloud (Places & Directions) API
• Developed a framework to understand semantic relationships between geographical areas based on object movement paths (trajectories). Received the Best Paper Award at the IEEE Conference on Mobile Data Management 2020.
Expert Developer Recommendation Using Very Large Datasets
https://github.com/saimmehmood/ExpertDeveloperRecommendation
SQL, Google BigQuery, Elasticsearch
• Built a search engine to find expert developers using GitHub datasets.
• Reduced 3 TB of data to 600 MB by keeping only developer-specific information such as number of commits, first and last commit, and average time between commits.
Towards Moving Scientific Applications in the Cloud
https://github.com/saimmehmood/Towards-Moving-Scientific-Applications-in-the-Cloud
Cloud Computing, OpenStack, Apache Spark, Jupyter Notebook
• Cloud computing provides usability, scalability, and on-demand availability of remote computational and storage resources, which are the characteristics scientific applications require. The project had two parts: the first addressed the benefits of cloud infrastructure for end users, and the second analyzed performance.
COVID-19 - Risk of Geographical Areas being infected
https://medium.com/data-science/covid-19-risk-of-geographical-areas-being-infected-a81938a5e286
PostgreSQL, PostGIS, Python (numpy, pandas)
• This experimental project was a use case for predicting COVID-19 infection hotspots during a probable second wave of cases in the Manhattan area.
Education
York University
MSc Computer Science
Jan 2018 - Jun 2020
York University is a public research university in Toronto, Ontario, Canada. It is Canada's third-largest university, with approximately 55,700 students, 7,000 faculty and staff, and over 315,000 alumni worldwide.
My studies at York University were research-focused, covering Data Mining, Big Data, Data Science, and Machine Learning. With my supervisor Manos Papagelis, I published a research track paper titled "Learning Semantic Relationships of Geographical Areas Based on Trajectories" at the 21st IEEE International Conference on Mobile Data Management 2020. Our paper received the Best Paper Award.
Notable Courses: Data Mining, Mining Software Engineering Data
University of the Punjab
Bachelor of Science in Software Engineering
Sep 2012 - Jul 2016
University of the Punjab is a public research university located in Lahore, Punjab, Pakistan. It is the oldest public university in Pakistan.
Four years of undergraduate study at the University of the Punjab helped shape my understanding of cloud computing, software development, and requirements engineering. During my studies, I earned an Erasmus Mundus scholarship to spend an exchange semester at Uppsala University.
Notable Courses: Applied Cloud Computing, Software Requirements Engineering, Database Systems
Certifications
Google Cloud Certified Professional Machine Learning Engineer
https://www.credential.net/42cc4d30-75be-410c-8486-a94afbe73eff
Introduction to Quantum Computing
http://www.linkedin.com/learning/introduction-to-quantum-computing
Certificate No: AY7IBy3zoehD_C4j4fc-gqdE_brr
Volunteer Experience
• Helping recent graduates and folks from different backgrounds to transition into Data Science.
• Open sourcing NLP (natural language processing) packages to increase the visibility of aggregate intellect in the tech community - GitHub
• Actively contributed to organizing IBM’s CASCON x EVOKE 2019 conference.
Honors and Awards
Alongside my interests in data mining and software engineering, I have earned the following honors and awards:
- Awarded York University Graduate Fellowship for the entire duration of M.Sc. Computer Science, January 2018
- Electrical Engineering and Computer Science Graduate Student Association (EECS-GSA) York University, Vice-President Organization, Sep 2018 - 2019
- Represented Pakistani youth in China, Pakistan Youth Delegation, August 2016
- Won Erasmus Mundus Exchange Scholarship to spend an exchange semester at Uppsala University, May 2015
- Winner 17th In-House Speed Programming Competition, University of the Punjab, May 2015