Srijith Rajamohan, Ph.D.

AI Lead

Washington, DC

srijith.rajamohan@gmail.com - srijithr.gitlab.io - LinkedIn

Education

2009-2014 Ph.D. in Computational Engineering The University of Tennessee

2007-2009 MS in Electrical Engineering The Pennsylvania State University

Leadership

Lead product and research data science teams
Develop the vision and long-term strategy for applied data science
Lead and develop a cross-product research portfolio
Mentored data scientists from graduate to principal levels

Core Competencies

Machine Learning	Deep Learning (NLP)
Reinforcement Learning	Graph Learning
Statistical Learning	Bayesian Learning
MLOps	Data platforms
GenAI	Information Visualization

Experience (selected)

2022-Present Staff AI Research Scientist, Sage AI

2022-2022 Senior Data Scientist, NerdWallet

2020-2022 Senior Developer Advocate (Data Science & ML), Databricks

2014-2020 Computational Scientist, Virginia Tech

2009-2014 Graduate Research Assistant, SimCenter: Center of Excellence in Computational Engineering, UT

Staff AI Research Scientist

I currently lead the data science and research portfolio within product teams at Sage AI. My responsibilities include collaborating across teams with Product Managers, Data Scientists, ML Engineers and business stakeholders to find novel solutions to difficult business problems.

Projects

Thought leadership

- Developed and led the research portfolio for developing an accounting-specific LLM
- 7 patent submissions
- Authored the strategy document ‘Applied data science within product teams’
- Built a prior-induced plan-and-act LLM-based framework for decision making, ‘Constrained Hierarchical Planner using LLMs’
- The Road to AGI

Jan 2024 - Present

Tech lead for domain-specific LLM training (Collaboration with AWS)

- Lead a team of PMs, engineers and data scientists to train an accounting-specific LLM from OSS LLMs
- Roadmap planning, experimental design, stakeholder management
- Pretraining and finetuning of OSS LLMs such that they ingest new accounting knowledge and can be used in a variety of product use-cases

July 2023 - Present

Retrieval Augmented Generation (RAG) - LLM for Q&A (Patent filing in progress)

- Lead the data science effort to build a QA engine for a customer-facing product
- Architected a robust QA system (from SQL DB) using a redesigned vector store
- Proposed a novel POS-enhanced retrieval system from vector store for context extraction

June 2023 - Sep 2023

Tech lead for cashflow forecasting

- Led a team of three to perform cashflow forecasting from invoice data
- Identified patterns in invoice history for short-term forecasts of Accounts Payable data
- Built a forecasting endpoint and a Dash dashboard

Mar 2023 - Present

Generative AI - LLM-based model for response generation (Patent pending)

- Lead the data science effort to deploy OpenAI ChatGPT for intent detection and response generation
- Proposed a method to improve the accuracy of the intent model by 20%
- Identified and quantified the certainty of hallucinated responses (text-davinci)
- Built a model monitoring notebook

Mar 2023 - Jun 2023

Intent detection from text using classical ML

- Built and deployed an XGBoost classifier to predict intent from text
- Compared the results of the XGBoost classifier with a zero-shot ChatGPT classifier
- Improved model accuracy using active learning and incremental learning for small data

Feb 2023 - Mar 2023

Applied Reinforcement Learning for Financial Applications

- Overview of RL (including RLHF) algorithms presented to the Sage AI team
- Identifying opportunities for applying RL within financial accounting
- Evaluation of RLlib for online learning and d3rlpy for offline learning

September 2022 - Feb 2023

Identifying Recurring Trusted Transactions from vendor-buyer interactions (Patent pending)

- Identified recurring transactions that can be classified as trusted from vendor-buyer transactions
- This helps users of Sage products automate payments
- Interacted with stakeholders, identified and translated the problem, and created a research proposal
- Architected an ML framework using unsupervised density estimation to solve the above problem
- Set up Kedro pipelines for repeatable DS experimentation

September 2022 - October 2022 Graph Neural Networks for Financial Applications

- Presentation on using GNNs for various applications in financial accounting such as fraud detection, trusted transactions etc.

Sr. Data Scientist

In the data science team at NerdWallet, I design experiments and perform modeling to solve data science problems for business units such as lead optimization. I also interface with the data engineering and ML infrastructure teams to optimize ML workflows.

Projects

April 2022 - June 2022 Contextual bandits for product placement/ordering

- Designed a contextual bandit-based framework to address the cold-start problem in product placement and recommendation.
- This takes as input a context vector corresponding to a class of users and learns the appropriate ordering for that class over time
- Evaluated contextual versions of algorithms such as epsilon-greedy, Thompson sampling, LinUCB etc.

May 2022 - July 2022 User understanding for prospective and propensity models

- Maintained an ML/DL pipeline for propensity and prospect models from user data
- User data obtained from clickstream and TransUnion data populated from Snowflake
- Unsupervised autoencoder-based model for generating user vectors
- Vectors are fed to supervised models (XGBoost and Neural Network) for both prospect and propensity modeling
- Pipeline was scheduled and automated using Airflow DAGs

April 2022 - July 2022 ML monitoring for production pipelines

- Architected a proposal for monitoring ML models that are in production (using Evidently.ai)
- Endpoints consist of both AWS CloudWatch and Datadog

June 2022 - Jul 2022 ML job submission and workflow management

- Designed a solution for the ML experimentation and production platform
- Command-line tools designed to keep track of and query submitted jobs, metadata and results
- Read logs from AWS CloudWatch using the awslogs Python package
- Kedro for data science experiments using the PyData stack
- Data lineage and provenance of generated data
- Systematic tracking and visualization of experiments

Sr. Developer Advocate (Data science/Machine learning)

My role as a Developer Advocate in data science allows me to serve as a thought leader in machine learning and data science, and educate the community about the state-of-the-art. This role also allows me to engage in internal advocacy, and work cross-functionally across various units such as product management, product marketing, solutions and engineering. Some of my responsibilities include:

- Thought leadership articles and presentations on enterprise and open-source ML/Data science
- Provide guidance and feedback to product management and product marketing
- Act as a subject matter expert to the solutions and engineering team

My technical areas of expertise here are Deep Learning for Natural Language Understanding (NLU), Bayesian inference and large-scale processing using PySpark.

Projects

2021 Lead the DevRel efforts on Machine Learning at Databricks

- Engage with all stakeholders, i.e. everyone from the executive team to the practitioner community to drive adoption of ML on Databricks
- Guide the ML/DS Product Management team with regard to product features
- Work with the Product Marketing team to help reach the appropriate community of practitioners
- Provide OSS solutions for ML on the Databricks platform

2021 Apache Spark

- Lead the efforts on the growth and messaging architecture of Spark and Koalas
- Engage with C-suite executives to define the growth strategy for Spark
- Offer strategic guidance on improvements to the Spark website and documentation, and grow community adoption

2021 Lead the advocacy efforts on OSS MLflow

- Interface with the MLflow product team at Databricks and drive adoption of OSS MLflow

2020-2021 Authored & open-sourced a set of courses ‘Introduction to Computational Statistics for Data Scientists’

- A practical guide to getting started with scalable Bayesian Inference using PyMC3
- Introduction to Bayesian Statistics
- Bayesian Inference with MCMC
- PyMC3 for Bayesian Modeling and Inference

2020-2022 Articles on Data Science and Machine Learning

- GPU-accelerated Sentiment Analysis Using Pytorch and Huggingface on Databricks
- Are GPUs really expensive? A benchmark study for inference in NLP
- How wrong is your model?
- An Experimentation Pipeline for Extracting Topics From Text Data Using PySpark
- MLflow for Bayesian Experiment Tracking
- Bayesian Modeling of the Temporal Dynamics of COVID-19 using PyMC3
- Using Bayesian Hierarchical Models to Infer the Disease Parameters of COVID-19
- Beyond LDA: Dive into BigARTM for Topic Modeling
- The Modern Chief Data Officer: Transitioning From Defense to Offense
- Reproduce Anything: Machine Learning meets Data Lakehouse

2020-2022 Presentations/Talks

Machine Learning at Scale (Nov/Aug 2021)

- Best practices for the full lifecycle of ML projects along with issues such as reproducibility, explainability, trustworthy AI and governance
- Keynote at the IEEE IDSTA conference
- Invited talk at the ADBIS workshop

Deep Learning at Scale at Databricks (Oct 2021)

- Presentation on how to scale deep learning workloads at Databricks at the Big Data Symposium in South Korea

MLflow for the ML Lifecycle (Nov 2021)

- Presentation on how to use OSS MLflow for model management, reproducibility with MLflow projects and Model registry, and model inference/serving

Bayesian Modeling of the Temporal Dynamics of COVID-19 using PyMC3 at the Data+AI Summit (Nov 2020)

- Bayesian inference to estimate the disease parameters of COVID-19 from real case data with PyMC3

Presented ‘Maintainable HPC and Data Science with Python/C++’ at the National Center for Atmospheric Research (NCAR) (Mar 2021)

- Scale data science codes written in Python using a JIT approach with Numba
- Offload compute-intensive portions of the Python code using Eigen and Xtensor in C++

Computational Scientist

This role as a Computational Scientist involved providing Scientific Computing expertise, enabling High-Performance Computing and Visualization solutions and performing research in Machine Learning.

Research Projects

2019-2020 Interactive Network Analysis of Social Graphs

- Network analysis of social media network of political figures
- Created a distributed deep learning sentiment analysis pipeline with Hugging Face RoBERTa embeddings
- Generated networks using Graphtools and interactive visualizations of these networks with Sigma.js
- Network data obtained from Twitter, ETL with PySpark
- Metabase as a BI dashboard for SMEs

2018-2020 Determining political affiliation from short texts using stance detection (NLP)

- Created a deep neural network architecture for stance extraction from a weakly-supervised classifier using contextual embeddings (ELMo/BERT) to determine political affiliation
- Fixed embedding generation frameworks such as Doc2Vec and fastText were also evaluated
- Created an interactive web-based visualization tool for stance visualization
- Self-attention for model interpretability
- Tweets were stored and preprocessed in MongoDB
- Python RQ for data acquisition
- PySpark and spaCy used for corpus cleaning

2019-2020 Deep Learning lead for the project ‘Eye Gaze tracking for Surgical Training’

- Oversaw a team of 5 attempting to learn gaze patterns of resident surgeons using Deep Learning
- Collaboration between the ISE department at Virginia Tech and the Carilion School of Medicine

2019-2020 Co-PI on Jefferson National Lab funded project ‘Next-generation Visual Analysis Workspace for Multidimensional Nuclear Femtography Data’

- Visual analytics for nuclear femtography data.
- Big Data analytics and visualization: Investigating various novel ways to enable understanding of nuclear physics phenomena with 3-dimensional visualizations.

2015-2020 General Dynamics Collaboration with the Discovery Analytics Center (VT)

- Computational statistics and dimensionality reduction using unsupervised techniques such as PCA, MDS
- Weighted Multi-Dimensional Scaling (WMDS) for semantic interaction
- Formulated a highly accurate Inverse MDS algorithm
- Formulated and implemented the optimization scheme for the solution of Inverse MDS in Python using the NLopt optimization package
- Parallelized Inverse MDS Python code using Numba
- Accelerated Inverse MDS C++ code for low latency with Eigen

2019-2020 Generative Methods for Stance Detection and Visualization

- Investigate Variational Autoencoders for extracting stance
- Use a semi-supervised learning approach with partially labeled data

2019-2019 Reinforcement learning for Eye Tracking in Laparoscopic Surgery

- Reinforcement learning (ADNet) for learning eye gaze patterns in surgeons during laparoscopic surgery
- Used to train and evaluate residents

2019 Cost analysis of On-premise Cloud vs. Public Cloud for Virginia Tech

- Performed a cost analysis of Virginia Tech’s on-premise cloud and compared it to those offered by third-party external cloud providers for scientific computing
- Analyzed the tradeoffs of an on-premise cloud for the research community, which led to the creation of an interactive report used to estimate and compare costs

2018-2019 ICAT SEAD Grant 2018

- Co-PI on ICAT SEAD grant to analyze and visualize the health and nutrition policies across countries
- Interdisciplinary collaboration with the Health and Nutrition, Business and SOVA departments
- Architected an open-source interactive visual query framework

2018-2019 Scheduling and Visualization Application for Idaho National Lab

- Built a Django and D3 based scheduling and visualization tool
- Tool helps the Idaho National Lab manage outage tasks for the nuclear power plants
- Managed and mentored two students in this development project

2016-2017 HNFE Project for Visualization of National Food and Beverage Endorsements

- Produced interactive web-based visualizations for exploratory analysis of Food and Beverage endorsements for the HNFE department
- Work was presented to policy makers to understand the impact of celebrity endorsements
- Produced a web-based analytics framework for visual querying of endorsement data
- Framework extended using Dask for analyzing Big Data

2016-2017 Interactive 3D Visualization for Nuclear Reactor Pool

- Multi-user interactive visualization using open-source technology (X3D) in web browsers
- This involved collaborating with the Nuclear Engineering department to interface with their RAPID toolset to generate a visualization and querying tool.

Infrastructure projects

Cloud Computing

- Played a key role in architecting the cloud computing infrastructure for research computing using OpenStack at Virginia Tech
- Involved in designing and setting up cloud computing best practices and architectures using OpenStack, LXC and Docker containers

Scientific Visualization

- ParaView and VisIt for large-scale scientific visualization through remote rendering on supercomputers

Collaborative Computing Platforms

- Architected collaborative-computing platforms for data science and ML using JupyterHub.

Miscellaneous

• Research member of several NSF funded projects
• Virginia Tech campus champion for XSEDE, representative for ACI-REF
• Poster and paper reviewer, and session chair, at XSEDE 2015-2018
• Mentored and supervised several Graduate Research Assistants

Graduate School

Select research work conducted during my Master’s and Ph.D. programs.

Graduate Research Assistant

2009-2014 Experience developing and maintaining a parallel 3D time domain electromagnetic solver using the finite element method for open/closed boundary problems such as radar cross sections, waveguides, frequency-dependent materials etc.

- Distributed code using MPI that can execute over hundreds of nodes
- GPU offloading using CUDA for acceleration

Graduate Student

2007-2009 At the Microsystems Design Lab at the Pennsylvania State University, I implemented a neural network for skin-tone detection on the IBM Cell processor, which resulted in a 23x speedup.

Teaching

2018-2019 SuperComputing 2018-2019 short talks: Talks@VT series

- Introduction to Generative Modeling
- Cloud Cost Comparison: On-premise vs. External Vendor
- Deep Learning on GPUs with PyTorch for Text Analysis
- AutoML: An Overview of Automated Machine Learning

2016 - 2018 Lectures and workshops at Virginia Tech

- Taught undergraduate class ‘CS1064: Introduction to Python’ in the Spring 2016 semester*
- Taught ‘Text Summarization with Word Embeddings using PyTorch’ for CS4984/5984 in Fall 2018
- Taught the ‘Introduction to OpenACC’ lecture for the undergraduate class ‘CMDA 3634: Comp Sci Foundations for CMDA’*
- Introduction to Scientific Python: 150 min hands-on workshop
- Introduction to Data Visualization with Plot.ly: 150 min hands-on workshop

2015-2019 Select seminar classes taught at Virginia Tech

- Taught workshop titled ‘Introduction to ARC Cloud using OpenStack for Machine Learning’*
- Introduction to Data Visualization
- Introduction to CUDA
- TensorFlow for Machine Learning
- Dask for Out-of-Core Computing: Big Data solutions on your laptop
- Unsupervised Machine Learning using Scikit-learn and TensorFlow
- Supervised Machine Learning using Scikit-learn and TensorFlow
- Deep Learning using TensorFlow and Keras

2015-2018 XSEDE conference workshops

- A Data Scientist’s Python Toolbox
- Workshop titled ‘Introduction to Machine Learning’ taught at PEARC18
- Introduction to Scientific Computing using Python
- Python Pandas for Data Analytics

Publications

*- Chaitanya S. Kulkarni, Tianzi Wang, Nathan Lau, Jacob Hartman-Kenzler, Sarah E. Parker, Srijith Rajamohan, Laura E. Barnes, Shawn D. Safford. Applying Deep Learning to Provide Eye Gaze Guidance for the Peg Transfer Task, Jan 1 2021, 16th Academic Surgical Congress

- Srijith Rajamohan, Robert Settlage. Informing the On/Off-prem Cloud Discussion in Higher Education, PEARC20, ACM, Portland

- Robert Settlage, Srijith Rajamohan Enabling AI/DL Workloads on HPC Infrastructure through Containers and Open OnDemand. HPCKP20, High-Performance Computing Knowledge Meeting, Barcelona, July 2020

- Rincón-Gallardo Patiño, Sofía, Srijith Rajamohan, Kathleen Meaney, Eloise Coupey, Elena Serrano, Valisa E. Hedrick, Fabio da Silva Gomes, Nicholas Polys, and Vivica Kraak. Development of a Responsible Policy Index to Improve Statutory and Self-Regulatory Policies that Protect Children’s Diet and Health in the America’s Region. International Journal of Environmental Research and Public Health 17, no. 2 (2020): 495.

- Robert Settlage, Srijith Rajamohan, Kevin Lahmers, Alan Chalker, Eric Franz, Steve Gallo, David Hudak. Portals for Interactive Steering of HPC Workflows. Nov 2019, Third Workshop on Interactive High-Performance Computing, SC19

- Srijith Rajamohan, Alana Romanella, Amit Ramesh. A Weakly-Supervised Attention-based Visualization Tool for Assessing Political Affiliation. Aug 2019, arXiv:1908.02282 [cs.CL], https://arxiv.org/abs/1908.02282

- Zhou, M., Rajamohan, S., Hedrick, V., Rincón-Gallardo Patiño, S., Abidi, F., Polys, N., & Kraak, V. (2019). Mapping the Celebrity Endorsement of Branded Food and Beverage Products and Marketing Campaigns in the United States, 1990–2017, International Journal of Environmental Research and Public Health 16.19 (2019): 3743

- Valerio Mascolino, Alireza Haghighat, Nicholas Polys, Nathan J. Roskoff, and Srijith Rajamohan. 2019. A Collaborative Virtual Reality System (VRS) with X3D Visualization for RAPID, The 24th International Conference on 3D Web Technology (Web3D ’19), ACM, New York, NY, USA, 1-8.

- Srijith Rajamohan and Faiz Abidi, Web-based Visualization and Querying of Food and Beverage Endorsements by Celebrities, PEARC19, ACM, Chicago

- Rajamohan, S., Romanella, A., Ramesh, A., A Human-in-the-Loop Deep Learning Based Document Tagging for Stance Detection, CHCI 2019: Algorithms that make you think, Blacksburg.

- Rajamohan, S. and Anderson, W.K. A Modified Streamline Upwind/Petrov-Galerkin Stabilization Matrix for Time-Domain FEM, ACES 2018, Denver

- Rajamohan, S. and Anderson, W.K. Using an Approximate Streamline Upwind/Petrov-Galerkin Stabilization Matrix for the Solution of Maxwell’s Equations in Dispersive Materials, ACES 2018, Denver.

- Abidi, F., Polys, N., Rajamohan, S., Arsenault, L., Mohammed, A. (2018, April). Remote high performance visualization of big data for immersive science. In Proceedings of the High Performance Computing Symposium (p. 5). Society for Computer Simulation International.

- Zhou M, Kraak VI, Rajamohan S, Abidi F, Polys N. Mapping the Celebrity Marketing of Branded Food and Beverage Products in the United States: Policy Implications and Research Needs. 15th World Congress on Public Health. April 3-7, 2017. Melbourne, Victoria, Australia

- Nicholas Polys, Ayat Mohammed, Jagathshree Iyer, Peter Radics, Faiz Abidi, Lance Arsenault, and Srijith Rajamohan. Immersive Analytics: Crossing the Gulfs with High-Performance Visualization. IEEE VR 2016 Workshop on Immersive Analytics

- Rajamohan, S. and Anderson, W.K., HPC for Legacy EM Code, a Mixed Language Approach using CUDA. Applied Computational Electromagnetic Society 2012, Volume: GPU for CEM.

- Porting Algorithms to the IBM Cell Processor - an FFT case study. Penn State Research Symposium 2009.