Chuyan Zhou

Junior Student in CS Major · ShanghaiTech University

I’m currently an undergraduate student at ShanghaiTech University majoring in Computer Science & Technology. I am also an undergraduate researcher in the field of NLP, with a focus on LLMs and related areas. Here is my Curriculum Vitae.

Research Interest

Currently, I am most interested in or committed to the following research topics:

LLM Alignment and Tuning
- Reinforcement Finetuning including Process Reward Models
Mechanistic Interpretability
- Sparse Autoencoders
Diffusion LLMs and their Scaling
Efficient LLMs, such as
- Parallel Decoding (Speculative/Jacobi/…)
- Quantization
- KV Cache Compression/Pruning
Combination of Connectionist and Probabilistic/Symbolic Methods for NLP
AI for Biology

The bold items are those in which I have nonzero experience. Generally, I am interested in research in fields involving Natural Language Processing, Machine Learning Systems & Theories, Reinforcement Learning, AI for Biology, and Computer Vision.

Language Skills

English: Fluent, TOEFL iBT 108/120 (2025.07.09, before the test reform), GRE 331/340 (2025.03.08, V161/Q170)
Japanese: Fluent, JLPT N2 170/180 (2024.07)
Mandarin: Native

I post blog entries here on research, notes, and random things such as my travels.

Publications

Coming soon...

Recent Posts

INFOTH Note 20: Channel Coding Schemes

Maximum Likelihood Decoding, Block Coding (Binary Linear, Hamming, etc.) & Polar Codes

2025-11-13

2 min read

Notes

Information Theory and Coding (UCB FA24 EE229A, rewritten in 2025)

INFOTH Note 19: Channel Coding Theorem for DMC 2

Proof of Shannon's Channel Coding Theorem, Converse Theorem

2025-11-06

1 min read

Notes

Information Theory and Coding (UCB FA24 EE229A, rewritten in 2025)

INFOTH Note 18: Channel Coding Theorem for DMC 1

Discrete Memoryless Channel, Shannon's Channel Coding Theorem (2nd)

2025-11-04

1 min read

Notes

Information Theory and Coding (UCB FA24 EE229A, rewritten in 2025)

Note: Gated DeltaNet & Qwen3-Next

Notes for Gated DeltaNet & Qwen3-Next.

2025-10-31

1 min read

Paper Reading

2025 Paper Reading, Architectures

Paper Reading: SEER (Structured Reasoning and Explanation via RL)

SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning.

2025-10-20

3 min read

Paper Reading

2025 Paper Reading

Research Experience

Undergraduate Researcher

Supervised by Dr. Kewei Tu · VDI Center, ShanghaiTech University

2023.10 - Present

Other Contributions
- (2023.12-2025.1, as a main contributor) Developed a binary classification and regression model to predict packaging efficiency, leveraging ESM encoders, for the synthesis (inpainting) of gene sequence insertions into AAV2 capsid proteins, with the motif for gene therapy vectors.
- (2024.9-2025.2, as the primary contributor) Investigated the feasibility of a Jacobi decoding method in a continuous fashion; further work (e.g. parallel continuous CoT) is under investigation.

More Projects

Reconstruction and Re-evaluation of SFCNN for PLA Scoring

COMPSCI 177 Research Project · ShanghaiTech University

2025.4 - Present

- (2025.4) Built a PyTorch implementation of SFCNN (Scoring Function 3D Convolutional Neural Network) for protein-ligand binding affinity prediction, covering the model architecture, training, and evaluation.
- (2025.4-6, expected) Developed a novel benchmark for similar models that allows inference directly on the 3D structure of protein-ligand complexes instead of on decoupled protein and ligand structures.

LLM-powered Lecture Generation

COMPSCI 194-196 Research Project · University of California, Berkeley

2024.8 - 2024.12

- (2024.8-9) Independently developed the backend framework of the lecture generation pipeline as a deployable web service using FastAPI. The backend includes asynchronous task execution via multithreading, task management via an API backed by Redis databases, and a metadata system for managing generated data.
- (2024.9-10) Worked as the main developer integrating the respective model components into the backend framework.
- (2024.10-11) Developed an additional LLM-powered QA agent on top of the backend that interacts with LLMs using a long context of generated lectures and a RAG system that dynamically indexes grounding sources (e.g., textbooks).