Srijith Rajamohan, Ph.D.
AI Lead
Washington, DC
Education
2009-2014
Ph.D. in Computational Engineering, The University of Tennessee
2007-2009
MS in Electrical Engineering, The Pennsylvania State University
Leadership
- Lead product and research data science teams
- Develop the vision and long-term strategy for applied data science
- Lead and develop a cross-product research portfolio
- Mentored data scientists from graduate to principal levels
Core Competencies
Machine Learning | Deep Learning (NLP) |
Reinforcement Learning | Graph Learning |
Statistical Learning | Bayesian Learning |
MLOps | Data platforms |
GenAI | Information Visualization |
Experience (selected)
2022-Present
Staff AI Research Scientist, Sage AI
2022
Senior Data Scientist, NerdWallet
2020-2022
Senior Developer Advocate (Data Science & ML), Databricks
2014-2020
Computational Scientist, Virginia Tech
2009-2014
Graduate Research Assistant, SimCenter: Center of Excellence in Computational Engineering, UT
Staff AI Research Scientist
I currently lead the data science and research portfolio within product teams at Sage AI. My responsibilities include working across teams with Product Managers, Data Scientists, ML Engineers and business stakeholders to find novel solutions to difficult business problems.
Links: = Webpage of project
Projects
Thought leadership
- - Developed and led the research portfolio for developing an accounting-specific LLM
- - 7 patent submissions
- - Authored the strategy document ‘Applied data science within product teams’
- - Built a prior-induced, plan-and-act, LLM-based framework for decision making, ‘Constrained Hierarchical Planner using LLMs’
- - The Road to AGI
Jan 2024 - Present
Tech lead for domain-specific LLM training (Collaboration with AWS)
- - Lead a team of PMs, engineers and data scientists to train an accounting-specific LLM from OSS LLMs
- - Roadmap planning, experimental design, stakeholder management
- - Pretraining and finetuning of OSS LLMs such that they ingest new accounting knowledge and can be used in a variety of product use-cases
July 2023 - Present
Retrieval Augmented Generation (RAG) - LLM for Q&A (Patent filing in progress)
- - Lead the data science effort to build a QA engine for a customer-facing product
- - Architected a robust QA system (from SQL DB) using a redesigned vector store
- - Proposed a novel POS-enhanced retrieval system from vector store for context extraction
June 2023 - Sep 2023
Tech lead for cashflow forecasting
- - Led a team of three to perform cashflow forecasting from invoice data
- - Identified patterns in invoice history for short-term forecasts of Accounts Payable data
- - Built a forecasting endpoint and a Dash dashboard
Mar 2023 - Present
Generative AI - LLM-based model for response generation (Patent pending)
- - Lead the data science effort to deploy OpenAI ChatGPT for intent detection and response generation
- - Proposed a method to improve the model’s intent-detection accuracy by 20%
- - Identify and quantify certainty of hallucinated responses (text-davinci)
- - Built a model monitoring notebook
Mar 2023 - Jun 2023
Intent detection from text using classical ML
- - Built and deployed an XGBoost classifier to predict intent from text
- - Compared the results of the XGBoost classifier with a zero-shot ChatGPT classifier
- - Improved model accuracy using active learning and incremental learning for small data
Feb 2023 - Mar 2023
Applied Reinforcement Learning for Financial Applications
- - Overview of RL (including RLHF) algorithms presented to the Sage AI team
- - Identifying opportunities for applying RL within financial accounting
- - Evaluation of RLLib for online learning and D3RLPY for offline learning
September 2022 - Feb 2023
Identifying Recurring Trusted Transactions from vendor buyer interactions (Patent pending)
- - Identified transactions that are recurring and can be classified as trusted from vendor-buyer transactions
- - This helps users of Sage products automate payments
- - Interacted with stakeholders, identified and translated the problem, and created a research proposal
- - Architected an ML framework using unsupervised density estimation to solve the above problem
- - Set up Kedro pipelines for repeatable DS experimentation
September 2022 - October 2022
Graph Neural Networks for Financial Applications
- - Presentation on using GNNs for various applications in financial accounting such as fraud detection, trusted transactions etc.
Sr. Data Scientist
In the data science team at NerdWallet, I design experiments and perform modeling to solve data science problems for business units such as lead optimization. I also interface with the data engineering and ML infrastructure teams to optimize ML workflows.
Projects
April 2022 - June 2022
Contextual bandits for product placement/ordering
- - Designed a contextual bandit-based framework to address the cold-start problem in product placement and recommendation.
- - This takes as input a context vector corresponding to a class of users and learns the appropriate ordering for that class over time
- - Evaluated contextual versions of algorithms such as epsilon-greedy, Thompson sampling, LinUCB etc.
May 2022 - July 2022
User understanding for prospective and propensity models
- - Maintained an ML/DL pipeline for propensity and prospect models built from user data
- - User data obtained from clickstream and TransUnion data populated from Snowflake
- - Unsupervised autoencoder-based model for generating user vectors
- - Vectors are fed to supervised models (XGBoost and Neural Network) for both prospect and propensity modeling
- - Pipeline was scheduled and automated using Airflow DAGs
April 2022 - July 2022
ML monitoring for production pipelines
- - Architected a proposal for monitoring ML models that are in production (using Evidently.ai)
- - Endpoints consist of both AWS CloudWatch and Datadog
June 2022 - Jul 2022
ML job submission and workflow management
- - Designed a solution for the ML experimentation and production platform
- - Command-line tools designed to keep track of and query submitted jobs, metadata and results
- - Read logs from AWS CloudWatch using the awslogs Python package
- - Kedro for data science experiments using the PyData stack
- - Data lineage and provenance of generated data
- - Systematic tracking and visualization of experiments
Sr. Developer Advocate (Data science/Machine learning)
My role as a Developer Advocate in data science allows me to serve as a thought leader in machine learning and data science, and educate the community about the state-of-the-art. This role also allows me to engage in internal advocacy, and work cross-functionally across various units such as product management, product marketing, solutions and engineering. Some of my responsibilities include:
- - Thought leadership articles and presentations on enterprise and open-source ML/Data science
- - Provide guidance and feedback to product management and product marketing
- - Act as a subject matter expert to the solutions and engineering team
My technical areas of expertise here are Deep Learning for Natural Language Understanding (NLU), Bayesian inference and large-scale processing using PySpark.
Projects
2021
Led the DevRel efforts on Machine Learning at Databricks
- - Engage with all stakeholders, i.e. everyone from the executive team to the practitioner community to drive adoption of ML on Databricks
- - Guide the ML/DS Product Management team with regard to product features
- - Work with the Product Marketing team to help reach the appropriate community of practitioners
- - Provide OSS solutions for ML on the Databricks platform
2021
Apache Spark
- - Led the efforts on the growth and messaging architecture of Spark and Koalas
- - Engage with C-suite executives to define the growth strategy for Spark
- - Offer strategic guidance on improvements to the Spark website and documentation, and grow community adoption
2021
Led the advocacy efforts on OSS MLflow
- - Interface with the MLflow product team at Databricks and drive adoption of OSS MLflow
2020-2021
Authored & open-sourced a set of courses ‘Introduction to Computational Statistics for Data Scientists’
- - A practical guide to getting started with scalable Bayesian Inference using PyMC3
- - Introduction to Bayesian Statistics
- - Bayesian Inference with MCMC
- - PyMC3 for Bayesian Modeling and Inference
2020-2022
Articles on Data Science and Machine Learning
- - GPU-accelerated Sentiment Analysis Using Pytorch and Huggingface on Databricks
- - Are GPUs really expensive? A benchmark study for inference in NLP
- - How wrong is your model?
- - An Experimentation Pipeline for Extracting Topics From Text Data Using PySpark
- - MLflow for Bayesian Experiment Tracking
- - Bayesian Modeling of the Temporal Dynamics of COVID-19 using PyMC3
- - Using Bayesian Hierarchical Models to Infer the Disease Parameters of COVID-19
- - Beyond LDA: Dive into BigARTM for Topic Modeling
- - The Modern Chief Data Officer: Transitioning From Defense to Offense
- - Reproduce Anything: Machine Learning meets Data Lakehouse
2020-2022
Presentations/Talks
Machine Learning at Scale (Nov/Aug 2021)
- - Best practices for the full lifecycle of ML projects, along with issues such as reproducibility, explainability, trustworthy AI and governance
- - Keynote at the IEEE IDSTA conference
- - Invited talk at the ADBIS workshop
Deep Learning at Scale at Databricks (Oct 2021)
- - Presentation on how to scale Deep learning workloads at Databricks at the Big Data Symposium in South Korea
MLflow for the ML Lifecycle (Nov 2021)
- - Presentation on how to use OSS MLflow for model management, reproducibility with MLflow projects and Model registry, and model inference/serving
Bayesian Modeling of the Temporal Dynamics of COVID-19 using PyMC3 at the Data+AI Summit (Nov 2020)
- - Bayesian inference to estimate the disease parameters of COVID-19 from real case data with PyMC3
- - Scale data science codes written in Python using a JIT approach with Numba
- - Offload compute-intensive portions of the Python code using Eigen and Xtensor in C++
Computational Scientist
This role as a Computational Scientist involved providing Scientific Computing expertise, enabling High-Performance Computing and Visualization solutions and performing research in Machine Learning.
Research Projects
2019-2020
Interactive Network Analysis of Social Graphs
- - Network analysis of social media network of political figures
- - Created a distributed deep learning sentiment analysis pipeline with Huggingface Roberta embeddings
- - Generated networks using Graphtools and interactive visualizations of these networks with Sigma.js
- - Network data obtained from Twitter, ETL with PySpark
- - Metabase as a BI dashboard for SMEs
2018-2020
Determining political affiliation from short texts using stance detection (NLP)
- - Created a deep neural network architecture for stance extraction from a weakly-supervised classifier using contextual embeddings (Elmo/BERT) to determine political affiliation
- - Fixed embedding generation frameworks such as Doc2Vec and Fasttext were also evaluated
- - Created an interactive web-based visualization tool for stance visualization
- - Self-attention for model interpretability
- - Tweets were stored and preprocessed in MongoDB
- - Python RQ for data acquisition
- - PySpark and Spacy used for corpus cleaning
2019-2020
Deep Learning lead for the project ‘Eye Gaze tracking for Surgical Training’
- - Oversaw a team of 5 attempting to learn gaze patterns of resident surgeons using Deep Learning
- - Collaboration between the ISE department at Virginia Tech and the Carilion School of Medicine
2019-2020
Co-PI on Jefferson National Lab funded project ‘Next-generation Visual Analysis Workspace for Multidimensional Nuclear Femtography Data’
- - Visual analytics for nuclear femtography data.
- - Big Data analytics and visualization: Investigating various novel ways to enable understanding of nuclear physics phenomena with three-dimensional visualizations.
2015-2020
General Dynamics Collaboration with the Discovery Analytics Center (VT)
- - Computational statistics and dimensionality reduction using unsupervised techniques such as PCA, MDS
- - Weighted Multi-Dimensional Scaling (WMDS) for semantic interaction
- - Formulated a highly-accurate Inverse MDS algorithm
- - Formulated and implemented the optimization scheme for the solution of Inverse MDS in Python using the NLopt optimization package
- - Parallelized Inverse MDS Python code using Numba
- - Accelerated Inverse MDS C++ code for low latency with Eigen
2019-2020
Generative Methods for Stance Detection and Visualization
- - Investigated Variational Autoencoders for extracting stance
- - Used a semi-supervised learning approach with partially labeled data
2019
Reinforcement learning for Eye Tracking in Laparoscopic Surgery
- - Reinforcement learning (ADNet) for learning eye gaze patterns in surgeons during laparoscopic surgery
- - Used to train and evaluate residents
2019
Cost analysis of On-premise Cloud vs. Public Cloud for Virginia Tech
- - Performed a cost analysis of Virginia Tech’s on-premise cloud and compared it with the offerings of third-party external cloud providers for scientific computing
- - Analyzed the tradeoffs of an on-premise cloud for the research community, which led to the creation of an interactive report used to estimate and compare costs
2018-2019
ICAT SEAD Grant 2018
- - Co-PI on ICAT SEAD grant to analyze and visualize the health and nutrition policies across countries
- - Interdisciplinary collaboration with the Health and Nutrition, Business and SOVA departments
- - Architected an open-source interactive visual query framework
2018-2019
Scheduling and Visualization Application for Idaho National Lab
- - Built a Django and D3 based scheduling and visualization tool
- - Tool helps the Idaho National Lab manage outage tasks for the nuclear power plants
- - Managed and mentored two students in this development project
2016-2017
HNFE Project for Visualization of National Food and Beverage Endorsements
- - Produced interactive web-based visualizations for exploratory analysis of Food and Beverage endorsements for the HNFE department
- - Work was presented to policy makers to understand the impact of celebrity endorsements
- - Produced a web-based analytics framework for visual querying of endorsement data
- - Framework extended using Dask for analyzing Big Data
2016-2017
Interactive 3D Visualization for Nuclear Reactor Pool
- - Multi-user interactive visualization using open-source technology (X3D) in web browsers
- - This involved collaborating with the Nuclear Engineering department to interface with their RAPID toolset to generate a visualization and querying tool.
Infrastructure projects
Cloud Computing
- - Played a key role in architecting the cloud computing infrastructure for research computing using OpenStack at Virginia Tech
- - Involved in designing and setting up cloud computing best practices and architectures using OpenStack, LXC and Docker containers
Scientific Visualization
- - ParaView and VisIt for large-scale scientific visualization through remote rendering on supercomputers
Collaborative Computing Platforms
- - Architected collaborative-computing platforms for data science and ML using JupyterHub.
Miscellaneous
- • Research member of several NSF funded projects
- • Virginia Tech campus champion for XSEDE, representative for ACI-REF
- • Poster and paper reviewer, and session chair, at XSEDE 2015-2018
- • Mentored and supervised several Graduate Research Assistants
Graduate School
Select research work conducted during my Masters and Ph.D. programs.
Graduate Research Assistant
2009-2014
Experience developing and maintaining a parallel 3D time domain electromagnetic solver using the finite element method for open/closed boundary problems such as radar cross sections, waveguides, frequency-dependent materials etc.
- - Distributed code using MPI that can execute over hundreds of nodes
- - GPU offloading using CUDA for acceleration
Graduate Student
2007-2009
At the Microsystems Design Lab at the Pennsylvania State University, I implemented a neural network for skin-tone detection on the IBM Cell processor, which resulted in a 23x speedup.
Teaching
2018-2019
SuperComputing 2018-2019 short talks: Talks@VT series
- - Introduction to Generative Modeling
- - Cloud Cost Comparison: On-premise vs. External Vendor
- - Deep Learning on GPUs with PyTorch for Text Analysis
- - AutoML: An Overview of Automated Machine Learning
2016 - 2018
Lectures and workshops at Virginia Tech
- - Taught undergraduate class ‘CS1064: Introduction to Python’ in the Spring 2016 semester*
- - Taught ‘Text Summarization with Word Embeddings using PyTorch’ for CS4984/5984 in Fall 2018
- - Taught ‘Introduction to OpenACC’ lecture for undergraduate class ‘CMDA 3634: Comp Sci Foundations for CMDA’*
- - Introduction to Scientific Python: 150 min hands-on workshop
- - Introduction to Data Visualization with Plot.ly: 150 min hands-on workshop
2015-2019
Select seminar classes taught at Virginia Tech
- - Taught workshop titled ‘Introduction to ARC Cloud using OpenStack for Machine Learning’*
- - Introduction to Data Visualization
- - Introduction to CUDA
- - TensorFlow for Machine Learning
- - Dask for Out-of-Core Computing: Big Data solutions on your laptop
- - Unsupervised Machine Learning using scikit-learn and TensorFlow
- - Supervised Machine Learning using scikit-learn and TensorFlow
- - Deep Learning using TensorFlow and Keras
2015-2018
XSEDE conference workshops
- - A Data Scientist’s Python Toolbox
- - Workshop titled ‘Introduction to Machine Learning’ taught at PEARC18
- - Introduction to Scientific Computing using Python
- - Python Pandas for Data Analytics
Publications
- Chaitanya S. Kulkarni, Tianzi Wang, Nathan Lau, Jacob Hartman-Kenzler, Sarah E. Parker, Srijith Rajamohan, Laura E. Barnes, Shawn D. Safford. Applying Deep Learning to Provide Eye Gaze Guidance for the Peg Transfer Task, Jan 1 2021, 16th Academic Surgical Congress
- Srijith Rajamohan, Robert Settlage. Informing the On/Off-prem Cloud Discussion in Higher Education, PEARC20, ACM, Portland
- Robert Settlage, Srijith Rajamohan. Enabling AI/DL Workloads on HPC Infrastructure through Containers and Open OnDemand. HPCKP20, High-Performance Computing Knowledge Meeting, Barcelona, July 2020
- Rincón-Gallardo Patiño, Sofía, Srijith Rajamohan, Kathleen Meaney, Eloise Coupey, Elena Serrano, Valisa E. Hedrick, Fabio da Silva Gomes, Nicholas Polys, and Vivica Kraak. Development of a Responsible Policy Index to Improve Statutory and Self-Regulatory Policies that Protect Children’s Diet and Health in the America’s Region. International Journal of Environmental Research and Public Health 17, no. 2 (2020): 495.
- Robert Settlage, Srijith Rajamohan, Kevin Lahmers, Alan Chalker, Eric Franz, Steve Gallo, David Hudak. Portals for Interactive Steering of HPC Workflows. Nov 2019, Third Workshop on Interactive High-Performance Computing, SC19
- Srijith Rajamohan, Alana Romanella, Amit Ramesh. A Weakly-Supervised Attention-based Visualization Tool for Assessing Political Affiliation. Aug 2019, arXiv:1908.02282 [cs.CL], https://arxiv.org/abs/1908.02282
- Zhou, M., Rajamohan, S., Hedrick, V., Rincón-Gallardo Patiño, S., Abidi, F., Polys, N., & Kraak, V. (2019). Mapping the Celebrity Endorsement of Branded Food and Beverage Products and Marketing Campaigns in the United States, 1990–2017. International Journal of Environmental Research and Public Health 16.19 (2019): 3743
- Valerio Mascolino, Alireza Haghighat, Nicholas Polys, Nathan J. Roskoff, and Srijith Rajamohan. 2019. A Collaborative Virtual Reality System (VRS) with X3D Visualization for RAPID, The 24th International Conference on 3D Web Technology (Web3D ’19), ACM, New York, NY, USA, 1-8.
- Srijith Rajamohan and Faiz Abidi, Web-based Visualization and Querying of Food and Beverage Endorsements by Celebrities, PEARC19, ACM, Chicago
- Rajamohan, S., Romanella, A., Ramesh, A., A Human-in-the-Loop Deep Learning Based Document Tagging for Stance Detection, CHCI 2019: Algorithms that make you think, Blacksburg.
- Rajamohan, S. and Anderson, W.K. A Modified Streamline Upwind/Petrov-Galerkin Stabilization Matrix for Time-Domain FEM, ACES 2018, Denver
- Rajamohan, S. and Anderson, W.K. Using an Approximate Streamline Upwind/Petrov-Galerkin Stabilization Matrix for the Solution of Maxwell’s Equations in Dispersive Materials, ACES 2018, Denver.
- Abidi, F., Polys, N., Rajamohan, S., Arsenault, L., Mohammed, A. (2018, April). Remote high performance visualization of big data for immersive science. In Proceedings of the High Performance Computing Symposium (p. 5). Society for Computer Simulation International.
- Zhou M, Kraak VI, Rajamohan S, Abidi F, Polys N. Mapping the Celebrity Marketing of Branded Food and Beverage Products in the United States: Policy Implications and Research Needs. 15th World Congress on Public Health. April 3-7, 2017. Melbourne, Victoria, Australia
- Nicholas Polys, Ayat Mohammed, Jagathshree Iyer, Peter Radics, Faiz Abidi, Lance Arsenault, and Srijith Rajamohan. Immersive Analytics: Crossing the Gulfs with High-Performance Visualization. IEEE VR 2016 Workshop on Immersive Analytics
- Rajamohan, S. and Anderson, W.K., HPC for Legacy EM Code, a Mixed Language Approach using CUDA. Applied Computational Electromagnetic Society 2012, Volume: GPU for CEM.
- Porting Algorithms to the IBM Cell Processor - an FFT case study. Penn State Research Symposium 2009.