Eunik Park | ML Engineer
About Me
I am an ML Engineer focused on efficient inference and hardware-aware optimization. My work spans LLM serving, model optimization, and runtime performance across GPU, NPU, and mobile environments. I enjoy turning research and systems ideas into practical, production-ready improvements in throughput, latency, and reliability.
Skills
Work Experience
06/2022 - Present
- Optimized models for target hardware and platforms
- Improved performance-speed trade-offs through PTQ and QAT
- Benchmarked vLLM and TensorRT-LLM serving
07/2021 - 08/2021
- Built a 3-tier web service on AWS using Terraform
Projects
vLLM for RBLN
12/2025 - Present
[Repo]
- Worked on serving-path optimization for decoding, scheduling, and structured generation
- Improved end-to-end inference performance through runtime profiling and targeted optimizations
- Built supporting benchmark and validation workflows for repeatable performance analysis
MAX
01/2026 - Present
[Repo]
- Integrated model pipelines into inference platforms and production-style serving paths
- Optimized interactions between preprocessing, model execution, and postprocessing stages
- Added verification and benchmarking coverage to support stable iteration
08/2023 - 12/2025
[Website] [Github] [OwLite Examples]
- Developed a framework for easy model quantization from PyTorch to TensorRT
- Implemented various quantization algorithms and simulations
- Produced various examples and identified optimization patterns
02/2024 - 06/2024
[Website]
- Conducted comprehensive performance benchmarking of LLM serving frameworks
- Implemented an evaluation module
- Wrote a blog post, [vLLM vs TensorRT-LLM], on weight-activation quantization
Efficient Keyword Spotting Research
02/2024 - 06/2024
- Presented a poster at Interspeech 2024, "RepTor: Re-parameterizable Temporal Convolution for Keyword Spotting via Differentiable Kernel Search"
- Developed a CNN-based KWS model using structural reparameterization
- Implemented latency-aware neural architecture search
- Achieved 97.9% accuracy with 183 μs latency on a Galaxy S10 CPU
Education
POSTECH
- Bachelor's in IT Convergence Engineering
- 03/2016 - 09/2022
Changwon Science High School
- 03/2014 - 02/2016