🔬 AI Researcher
Ph.D. Student @ HKU | Research Associate @ UCSB | Specializing in Efficient AI, LLM Optimization, and Hardware-Aware Training
🚀 Actively seeking full-time opportunities in AI research, ML systems, or hardware-software co-design starting in 2025/2026.
📫 zhoutomas177@gmail.com (Personal) |
LinkedIn |
✉️ ryjjc@connect.hku.hk |
📍 Mountain View, California (Current)
🎓 Education
Ph.D., EEE | The University of Hong Kong (December 2025)
- Focus: Efficient LLM training/inference, quantization, reasoning, and edge deployment
- Supervised by Prof. Ngai Wong
- Publications in NLP- and FPGA-related conferences and journals
M.S., IC Design | HKUST (December 2019)
- Advisor: Prof. Mansun Chan
- Focus: VLSI Design, Embedded Systems, Semiconductor Devices
B.Eng., IC Design | National Huaqiao University (June 2018)
- Multiple scholarships and an exchange-student program in Taiwan
💼 Experience
Research Intern @ Samsung Research America
May 2025 – Present
Research Associate @ UCSB
Sep. 2023 – Apr. 2025 | Advisor: Prof. Zheng Zhang
- Developed low-bit quantized fine-tuning techniques for LLMs (QuZO, LoRA variants); see the sketch after this role
- Collaborated with the Amazon AGI team on scalable training paradigms
- NAACL 2024 spotlight paper (LoRETTA): demonstrated stronger scalability than other parameter-efficient tuning baselines
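
For readers curious about the mechanics, below is a minimal sketch of the two-point zeroth-order gradient estimator that this line of work builds on; `zo_step`, `loss_fn`, and the hyperparameters are illustrative names, and QuZO's actual quantized recipe is more involved than this plain-precision version.

```python
import torch

@torch.no_grad()
def zo_step(model, loss_fn, lr=1e-4, eps=1e-3, seed=0):
    """One two-point zeroth-order update: probe the loss along a shared
    random direction z with two forward passes, then step along z.
    No backward pass, so no activation memory is held for autograd."""
    params = [p for p in model.parameters() if p.requires_grad]

    def perturb(scale):
        torch.manual_seed(seed)  # same seed => identical z each call, so z is never stored
        for p in params:
            p.add_(scale * torch.randn_like(p))

    perturb(+eps)                                # theta + eps * z
    loss_plus = loss_fn(model)
    perturb(-2 * eps)                            # theta - eps * z
    loss_minus = loss_fn(model)
    perturb(+eps)                                # restore theta
    g = (loss_plus - loss_minus) / (2 * eps)     # scalar estimate of grad(L) . z
    perturb(-lr * g)                             # theta <- theta - lr * g * z
    return loss_plus
```

Because the update touches the weights only through forward passes and in-place additions, it composes naturally with low-bit quantized inference, which is the regime QuZO targets.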
Research Assistant @ CUHK
Apr. 2021 – Dec. 2021 | Advisor: Prof. Guoliang Xing
- Co-designed FPGA-GPU hybrid acceleration schemes
- Led NFC wireless charging system project from concept to prototype
Mixed-Signal IC Design Engineer @ ASTRI
Sep. 2019 – Mar. 2021 | Technology Co-Design Group
- Designed key analog IPs, including ADCs, comparators, and amplifiers
- Delivered a taped-out chip with a 10-bit ADC and PMU subsystems
🧠 Selected Publications
- QuZO: Quantized Zeroth-Order Fine-Tuning for LLMs – Under Review
- LoRETTA: Tensor-Train Adaptation for LLMs – NAACL 2024
- DyBit: Dynamic Bit-Precision Inference – IEEE TCAD 2023
- MSD: Mixing Signed Digits on FPGAs – FCCM 2023
- NoiseZO: RRAM-Driven ZO Optimization – DAC 2025
- HKLUT: Hundred-Kilobyte Lookup Tables for Super-Resolution – IJCAI 2024
- PECAN: Product-Quantized CAM Network – DATE 2023
- Lite It Fly: All-Deformable Butterfly Network – IEEE TNNLS (brief)
📚 Full publication list on Google Scholar
🔍 Research Highlights
My research spans machine learning and systems, with a focus on efficient training and inference:
- Efficient LLM Fine-Tuning: Developed the QuZO and LoRETTA frameworks to push the limits of parameter-efficient and quantized tuning; see the low-rank adapter sketch after this list.
- Hardware-Aware ML: Designed acceleration methods on FPGAs and NPU chips for DNN inference and edge AI.
- Algorithm/Hardware Co-Design: Collaborated on hardware compiler optimization spanning algorithms and model-level simulation.
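
To make the first bullet concrete, here is a minimal, self-contained sketch of the LoRA-style low-rank adapter idea (LoRETTA pushes this further by factorizing the update into tensor-train cores); `LoRALinear` and its hyperparameters are illustrative, not the frameworks' actual APIs.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pretrained linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B (A x), where only A and B are trained."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():         # freeze the pretrained weights
            p.requires_grad_(False)
        self.scale = alpha / r
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # zero init: adapter starts as a no-op

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Wrap one projection and train only the adapter parameters A and B.
layer = LoRALinear(nn.Linear(768, 768), r=8)
out = layer(torch.randn(2, 16, 768))             # (batch, seq, hidden)
```

Each wrapped layer adds only r * (in_features + out_features) trainable parameters; tensor-train factorizations like LoRETTA's shrink that count further.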
📷 Fun Fact
I enjoy exploring the intersection of AI algorithms and hardware, whether it's crafting efficient LLMs, squeezing memory on an edge chip, or analyzing training efficiency.
🤝 Academic
I'm passionate about bridging academia and decentralized technology, whether it's co-authoring papers on efficient LLM training, collaborating with global research labs, or exploring blockchain projects that bring AI infrastructure and intelligent agents on-chain.
🧰 Technical Skills
Languages: Python, C/C++, MATLAB, Verilog
Frameworks & Platforms: PyTorch, TensorFlow (incl. Lite & Keras), CUDA
Tools: Cadence, Xilinx Vivado & ISE, HSPICE, ModelSim, VS Code
