San Francisco Bay Area

Kuan Zhou

I am currently a machine learning engineer focusing on ML/AI systems, including distributed training (with US patents), inference service performance, Kubernetes-based AI platform engineering, and MLOps. Additionally, I have a keen interest in building AI applications that leverage the power of generative AI and in understanding the mathematics and physics behind neural networks.

Before immersing myself in AI systems, I worked on scientific research in physics. I developed mathematical analysis, research capabilities, and programming skills during my undergraduate studies in Physics (thesis: computational simulation for NMR-based quantum computing systems, advised by Prof. Xinhua Peng and Prof. Jiangfeng Du) at the University of Science and Technology of China, and during my PhD in Computational Physics (thesis: electronic properties modeling of two-dimensional materials, advised by Prof. Roger Lake) at the LATTE lab at the University of California, Riverside.

My journey from physics to ML/AI started with reading news about ML/AI, attending ML/AI seminars in Prof. Linli Xu's group, taking ML courses in the CS department, participating in Kaggle competitions, and completing the Insight Data Science bootcamp. My passion for math and physics was ignited in high school by reading inspiring stories about Albert Einstein and Richard Feynman and by participating in Math and Physics Olympiads.

In my spare time, I enjoy films, music, hiking, camping, biking, traveling, trying new foods, and spending time with my family and our two cats, Gemma (orange tabby) and Nova (ragdoll).

Passion

Exploring the synergy between science and technology, building AI applications, and understanding the math and physics behind neural networks.

Tech Stack

Proficient in, familiar with, or able to contribute after a brief learning period

Programming Languages

Python, Golang, C/C++, Java, JavaScript/TypeScript, Rust

AI Frameworks

PyTorch, HF Transformers, JAX, TensorFlow, Triton, CUDA

Distributed Systems

Torch Distributed, Megatron-LM, DeepSpeed

ML Platforms

Docker, gRPC, Kubernetes, Istio, OpenTelemetry, Kubebuilder

MLOps

MLflow, Weights & Biases, BentoML, Flyte, Kubeflow, Hydra

ML Compilers

MLIR, LLVM, TVM

Service Serving

vLLM, Triton Inference Server, Text Generation Inference

AI applications

Electron, Swift/SwiftUI, Streamlit

Frontend

React, Next.js, Material UI, Tailwind CSS, FastAPI

Databases

PostgreSQL, BoltDB, SQL

Scientific Tools

Mathematica, Julia, MATLAB, LaTeX

Others

Bazel, Mermaid, pybind11, Pydantic, JSON Schema, Spark, Hadoop, OR-Tools, Numba

Experience

  • Principal Engineer - Machine Learning, SambaNova Systems

    April 2020 - Present, Palo Alto, CA

    • Tech lead for containerizing and deploying generative AI models onto SambaStudio, a Kubernetes platform
      • Led a team of 5+ engineers to deploy foundation-model-based solutions to business customers
      • Prototyped the generative AI model deployment pipeline and Kubernetes platform
      • Built general and extensive infrastructure for continuous model integration and deployment
      • Standardized the model bring-up and integration procedure by refactoring ML applications
    • Co-designed and co-developed distributed learning infrastructure for extremely large models
    • Contributed to core features of the SambaNova AI framework
      • Designed, implemented, and maintained a binary data extractor as a bridge between the compiler and runtime
      • Refactored and upgraded the AI framework codebase to support functional-programming-style dataflow execution
      • Implemented various deep learning operators end to end, from low-level compiler kernels to the AI framework
      • Optimized the performance of deep learning models (HIPNN, etc.) based on the SambaNova AI framework and dataflow architecture
      • Integrated TensorBoard into the SambaNova AI framework as a visualization and accuracy-debugging tool
  • Software Engineer - Machine Learning, Petuum Inc.

    February 2019 - March 2020, Sunnyvale, CA

    • Leveraged OCR engines and deep learning models to process logistics bills automatically with 0.87 accuracy
    • Collaborated on implementing various anomaly detection models for equipment health prediction
    • Contributed to machine learning pipeline refactoring and model improvement across various use cases
  • Artificial Intelligence Fellow, Insight Data Science

    June 2018 - September 2018, San Francisco, CA

    • Architected SketchTML, which takes in several hand-drawn sketches and produces an interactive HTML website
    • Leveraged the pix2code framework to build a more robust image captioning model with different styles
    • Improved the BLEU score to as high as 0.88 through inventive data augmentation methods and weighted loss functions

Education

© 2024 Kuan Zhou. Crafted using Gatsby framework.