🚧 Current Projects
- R5SEBA - An N-way out-of-order RISC-V processor with speculative execution, branch recovery, and associative caches
- ConvoLite - A CUDA-accelerated deep learning engine with shared memory tiling and pipelined convolution
- μTracker - A microarchitectural profiling framework with integrated testbenches and RTL debug automation
Live walkthroughs available by request.
Source Code Policy
Most of my work is private to protect intellectual property and maintain academic integrity.
I do not share sensitive source code publicly, especially coursework or competitive designs. However, I'm happy to:
- Share proof of authorship and commit history
- Walk through projects live over a verified call
- Discuss design, test, and debug methodology
🛡️ Contact me for verified access or demonstrations.
🛠️ Skills & Capabilities
🔬 Architecture & RTL Design
- Superscalar, out-of-order, speculative execution pipelines
- Tomasulo's algorithm, register renaming, CDB, PRF, ROB
- Associative and non-blocking cache design (write-back, write-allocate)
- RTL: SystemVerilog, Verilog, Chisel (exploring)
🛠️ Hardware Design & Verification
- Synthesis, P&R, timing closure, static timing analysis
- ASIC & FPGA workflows - Vivado, Quartus, ModelSim, Verdi
- Formal verification via JasperGold, plus UVM-style validation
- CMOS and VLSI design techniques
⚡ High-Performance Computing & Parallelism
- CUDA (shared memory tiling, coalesced access, INT8 quantization)
- OpenMP, multithreading, low-latency pipelines
- Real-time image/video processing on FPGAs (30 FPS @ high-res)
Software & Simulation
- C/C++, Python, Bash, MATLAB, TCL, Perl
- Embedded systems (Arduino, STM32, custom SoCs)
- Web development (HTML/CSS/JS, Flask, Node, Firebase)
- Linux internals, kernel mods, custom drivers, networking stacks
- Database design, file system tweaking, cross-arch simulation
🧰 Tools & Platforms
- Cadence Virtuoso, Synopsys Verdi, Vivado, Quartus, SLURM, Makefiles
- Git, GitHub, GitOps workflows, CI/CD
- OS-native scripting (Linux, macOS, Windows PowerShell)
🧩 AI/ML Hardware Optimization
- ResNet-50 inference acceleration with TensorRT
- INT8 quantization, CUDA tensor cores
- 4x+ speedup over CPU baselines for real-time inference
🧑‍🔬 Meta-Skills
- 🧠 Extreme learning agility - I pick up new stacks like I've been doing them for years
- 🔧 Builder's instinct - from fixing engines to optimizing memory subsystems
- 💡 Hyper-resourceful - nothing gets blocked, everything gets solved
- 🔥 Full-stack to full-system - I don't need a framework. I am the framework
Connect With Me
📧 Email: balaboud@umich.edu · bureirENGR@gmail.com
A commit a day keeps the bugs away
If you treat it like a video game, where the daily and weekly "bonuses" are fewer bugs and less trouble, it becomes much easier to make even one small improvement a day, and those improvements compound over time!
