pei0033 - Overview

Eunik Park | ML Engineer

LinkedIn Email

About Me


I am an ML Engineer focused on efficient inference and hardware-aware optimization. My work spans LLM serving, model optimization, and runtime performance across GPU, NPU, and mobile environments. I enjoy turning research and systems ideas into practical, production-ready improvements in throughput, latency, and reliability.

Skills


Python PyTorch C++ CUDA
vLLM SGLang TensorRT TensorRT-LLM
MAX ONNX LiteRT Android

Work Experience


ML Engineer @ SqueezeBits

06/2022 - Present

  • Optimizing models for target hardware & platforms
  • Enhancing accuracy-speed trade-offs through PTQ and QAT
  • Benchmarked vLLM and TensorRT-LLM serving

Internship @ LG CNS

07/2021 - 08/2021

  • Built a 3-tier web service on AWS using Terraform

Projects


vLLM for RBLN

12/2025 - Present

[Repo]

  • Worked on serving-path optimization for decoding, scheduling, and structured generation
  • Improved end-to-end inference performance through runtime profiling and targeted optimizations
  • Built supporting benchmark and validation workflows for repeatable performance analysis

MAX

01/2026 - Present

[Repo]

  • Integrated model pipelines into inference platforms and production-style serving paths
  • Optimized interactions between preprocessing, model execution, and postprocessing stages
  • Added verification and benchmarking coverage to support stable iteration

OwLite

08/2023 - 12/2025

[Website] [GitHub] [OwLite Examples]

  • Developed a framework for easy model quantization from PyTorch to TensorRT
  • Implemented various quantization algorithms and simulations
  • Produced various examples and identified optimization patterns

Fits-on-Chips

02/2024 - 06/2024

[Website]

Efficient Keyword Spotting Research

02/2024 - 06/2024

Education


POSTECH

  • Bachelor's in IT Convergence Engineering
  • 03/2016 - 09/2022

Changwon Science High School

  • 03/2014 - 02/2016