Hi, I'm Pramod Toraskar
Principal Data Engineer @ Red Hat | AI/ML Enthusiast | Open Source Contributor | Platform Thinker
Welcome to my GitHub! With 14+ years in IT and nearly a decade at Red Hat, I lead large-scale data engineering, AI/ML integration, and cloud platform projects. I'm passionate about building intelligent, scalable systems—especially where data, AI, and business meet.
Current Focus
- Building AI-powered data product pipelines using Snowflake, dbt, and OpenShift AI
- Leading projects across data engineering, agentic memory systems, and CI/CD automation
- Contributing to enterprise transformations through platform modernization and open-source AI tools
Key Skills
| Area | Tools |
|---|---|
| Programming | Python, SQL, Shell, JavaScript |
| Data Platforms | Snowflake, Starburst, dbt, Fivetran |
| Orchestration | Apache Airflow (Astrocloud), GitLab CI/CD |
| AI & ML | LLMs (Gemini, OpenAI, HuggingFace), vLLM, Triton, Inference |
| Cloud & DevOps | AWS, OpenShift, Git, Docker |
| Sales/Marketing Data | Marketo, Salesforce, HG Insights, Adobe AAM, Bombora, Eloqua |
Projects & Contributions
Marketo to Snowflake Pipeline
Fivetran + dbt + Snowflake
Marketing automation sync from Marketo → Snowflake using auto-ingest and modular dbt models.
Streamlit Data Product Factory
Streamlit + Gemini + dbt + GitLab
Smart UI to generate staging models, join configs, tests, and full dbt data products using Gemini 2.5 Flash + LangChain.
Hela GitLab CEE Migration
Migrated Git repos and CI/CD workflows. Solved blockers for broken CI/CD in marketing pipelines.
Hela Domino AAM Migration to AWS
Rebuilt the batch ETL job that pulls data from Starburst, formats it, and pushes it to the Adobe AAM S3 bucket.
HG Insights SADP
Built a scalable job to ingest firmographic data into Snowflake from the HG v2 feed, replacing legacy systems.
Salesforce & Eloqua Integration
Migrated legacy data pipelines to Dataverse with modular dbt modeling and Snowpipe ingestion.
Open Source Projects
| Repo | Description |
|---|---|
| fluvii | Framework to streamline marketing data flows across internal teams |
| qlikreader | CLI/SDK to extract structured data from Qlik dashboards |
| python-outreach | Python client to interact with Outreach CRM API |
| dwm | Personal fork with productivity enhancements to dynamic window manager |
AI & Memory Systems: Memorix AI
I'm currently building Memorix AI – a next-gen Agentic Memory SDK and platform to give AI agents recall, context, and privacy-aware memory.
🔧 Key Repos:
| Repo | Purpose |
|---|---|
| memorix-sdk | Core SDK with modular API, YAML config, tiered memory |
| memorix-server | REST API server exposing MemoryAPI |
| memorix-embedders | Embedding plugins for OpenAI, Gemini, etc. |
| memorix-vectorstores | Vector store plugins for FAISS, Qdrant, Chroma |
| memorix-meta | Metadata integrations (DuckDB, Postgres, etc.) |
| memorix-roadmap | Public issues, feature planning, RFCs |
Innovation, Recognition & Leadership
- Sprint Facilitator – Led cross-team agile planning & delivery
- CI/CD Optimization – Repaired and automated broken CI/CD pipelines across GitLab Marketing repos
- Data Governance Lead – Approved for PII hosting in non-prod Snowflake, implemented privacy-safe workflows
- Patent Planning – Currently drafting patents around agentic memory systems and AI platform design
- Red Hat Mentoring Program – Mentee and future mentor in the company-wide career growth initiative
Career Goals
- Launch open-source tools that bridge data platforms and agentic AI workflows
- Advance in LLM inference, memory systems, and AI orchestration
- Build enterprise-grade AI-first data platforms
- Work in memory systems, inference optimization, and agent design
Let’s Connect
Thanks for stopping by!
Let’s build something impactful together—especially at the intersection of data, AI, and open source.