Chuyan Zhou

Junior Student in Computer Science, ShanghaiTech University
I’m currently an undergraduate student at ShanghaiTech University majoring in Computer Science & Technology. I am also an undergraduate researcher in the field of NLP, with a focus on LLMs and related areas. Here is my Curriculum Vitae.

Research Interest

Currently, I am most interested in, or committed to, the following research topics:

  • Post-training of LLMs
    • Reinforcement Finetuning, Process Reward Models & Design
    • Distillation
  • Diffusion LLMs and their Scaling
  • Latent/Soft Token LLMs
  • Efficient LLMs, such as
    • Parallel Decoding (Speculative/Jacobi/…)
    • Quantization
    • KV Cache Compression/Pruning
  • Combination of Connectionist and Probabilistic/Symbolic Methods for NLP
  • Mechanistic Interpretability
    • Sparse Autoencoders
  • AI for Biology

The bold items are those in which I have nonzero experience. More generally, I am interested in research in fields involving Natural Language Processing, Machine Learning Systems & Theory, Reinforcement Learning, AI for Biology, and Computer Vision.


Misc

  • C1 level in Japanese & English (JLPT N1 169/180, TOEFL iBT 108/120)
  • I post blog entries here on research, notes, and random things such as my travels: 123

Below are my publications and research project experience.

Publications

GiLT: Augmenting Transformer Language Models with Dependency Graphs

ACL 2026 Main Conference
Tianyu Huang, Yida Zhao, Chuyan Zhou, Kewei Tu [Paper]

Recent Posts

INFOTH Note 23: Parallel Gaussian Channels, WSS, Distributed Source Compression

Parallel Gaussian Channel Model, Water-Filling Theorem, Szegő Theorem for Colored Noise, Distributed Source Compression and Slepian-Wolf Theorem

INFOTH Note 22: AWGN Channel and Shannon Limit

Waveform Channel Model, Additive White Gaussian Noise with its Shannon Theorem, and Spectral Efficiency

INFOTH Note 21: Polar Codes and Related Theories (Bhattacharyya, Martingale)

Polar Codes, Bhattacharyya Parameter, Martingale Theory, and Channel Polarization

INFOTH Note 20: Channel Coding Schemes

Maximum Likelihood Decoding, Block Coding (Binary Linear, Hamming, etc.) & Polar Codes

INFOTH Note 19: Channel Coding Theorem for DMC 2

Proof of Shannon's Channel Coding Theorem, Converse Theorem

Research Experience

Undergraduate Researcher

  • Other Contributions
    • (2023.12-2025.1, as a main contributor) Developed a binary classification and regression model, built on ESM encoders, to predict packaging efficiency for the synthesis (inpainting) of gene sequence insertions into AAV2 capsid proteins, motivated by their use as gene therapy vectors.
    • (2024.9-2025.2, as the primary contributor) Investigated the feasibility of a continuous variant of Jacobi decoding; further work (e.g., parallel continuous CoT) is under investigation.

More Projects

Reconstruction and Re-evaluation of SFCNN for PLA Scoring

COMPSCI 177 Research Project · ShanghaiTech University
2025.4 - Present
  • (2025.4) Built a PyTorch implementation of SFCNN (Scoring Function 3D Convolutional Neural Network) for protein-ligand binding affinity prediction, covering the model architecture, training, and evaluation.
  • (2025.4-6, expected) Developing a novel benchmark for similar models that allows inference directly on the 3D structure of protein-ligand complexes instead of on decoupled protein and ligand structures.

LLM-powered Lecture Generation

COMPSCI 194-196 Research Project · University of California, Berkeley
2024.8 - 2024.12
  • (2024.8-9) Independently developed the backend framework of the lecture generation pipeline as a deployable web service using FastAPI. The backend includes asynchronous task execution via multithreading, task management via a Redis-backed API, and a metadata system for managing generated data.
  • (2024.9-10) Worked as the main developer integrating the individual model components into the backend framework.
  • (2024.10-11) Developed an additional LLM-powered QA agent on top of the backend that interacts with LLMs using a long context of generated lectures and a RAG system that dynamically indexes grounding sources (e.g., textbooks).