ML Research - Ruizhang Zhou

LLM & GNN Research for Medical Imaging

RWTH Chair of DBIS | Research Assistant | Jul 2023 - Mar 2024 (9 months)

Tech Stack

Python, PyTorch, Transformers, LLaMA, BLIP, PyG (PyTorch Geometric), Knowledge Graphs, GPU Cluster

Research Contributions

Medical Report Generation: Worked on a research paper combining the BLIP vision-language model with LLMs for automated chest X-ray report generation
Knowledge Graph Integration: Investigated integrating medical knowledge graphs with GNNs to improve diagnostic accuracy and clinical relevance
LLM Deployment & Serving: Deployed LLaMA models on a GPU cluster using Ollama, built a FastAPI REST service for research access, and experimented with vLLM for inference optimization
Large-scale Experimentation: Conducted experiments on GPU cluster infrastructure, managing distributed training and hyperparameter optimization
Regular Research Presentations: Presented findings weekly at DBIS group meetings and discussed the latest developments in LLMs and GNNs
GNN-based Fake News Detection: Reproduced and experimented with GNN architectures (GCN, GAT, GraphSAGE) for fake news classification on social network propagation data
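
For readers unfamiliar with the GNN architectures named above, the following is a minimal sketch of a single GCN propagation step in plain Python. It is illustrative only and is not the code used in the research, which relied on PyG. The function names gcn_layer and matmul and the toy graph are assumptions made for this example.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adjacency, features, weight):
    """One GCN step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Symmetric normalization by the degree of each node (self-loop included).
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    norm = [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]
    # Aggregate neighbor features, apply the linear map, then ReLU.
    aggregated = matmul(norm, features)
    return [[max(0.0, v) for v in row] for row in matmul(aggregated, weight)]


# Toy graph (hypothetical): two connected nodes with one feature each.
toy_adjacency = [[0.0, 1.0], [1.0, 0.0]]
toy_features = [[1.0], [3.0]]
toy_weight = [[1.0]]
toy_output = gcn_layer(toy_adjacency, toy_features, toy_weight)
```

In this toy graph both nodes end up with the average of the two input features, which shows how a GCN layer smooths information across connected nodes. GAT and GraphSAGE replace the fixed normalization with learned attention weights or sampled neighbor aggregation, respectively.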

Research Impact

          ✓ Paper in preparation on LLM-based medical report generation

          ✓ Demonstrated feasibility of knowledge graph-enhanced medical AI

          ✓ Contributed to understanding of multi-modal LLM applications

Automatic Speech Recognition System

RWTH Software Project | ASR Research | 2022-2023

Tech Stack

C++, HMM, Neural Networks, Signal Processing

Research Overview

Implemented a custom ASR system from scratch using both traditional HMM-based methods and neural network approaches. Gained a deep understanding of acoustic modeling, language modeling, and decoding algorithms.

Components

Feature Extraction: MFCC and filterbank features for audio representation
Acoustic Modeling: HMM-GMM and DNN-HMM hybrid systems
Language Modeling: N-gram models for linguistic constraints
Decoding: Viterbi algorithm for optimal sequence prediction
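
The decoding step can be illustrated with a minimal log-space Viterbi sketch. The project itself was written in C++; the Python below is only an illustration, and the two-state toy model (state names s1 and s2, observation symbols a, b, c, and all probabilities) is hypothetical and not taken from the project.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence and its log-probability."""
    # best[t][s]: best log-score of any path that ends in state s at time t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Pick the predecessor that maximizes the path score into s.
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical toy HMM, used only to exercise the function.
states = ["s1", "s2"]
log_start = {"s1": math.log(0.6), "s2": math.log(0.4)}
log_trans = to_log({"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}})
log_emit = to_log({"s1": {"a": 0.5, "b": 0.4, "c": 0.1},
                   "s2": {"a": 0.1, "b": 0.3, "c": 0.6}})

best_path, best_log_prob = viterbi(["a", "b", "c"], states, log_start, log_trans, log_emit)
```

Working in log-space avoids numerical underflow on long utterances. In a real decoder the emission scores would come from the acoustic model and the transition scores would incorporate the language model.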

Links

GitHub Repository

Research Seminars & Coursework

RWTH Master's Program | 2021 - 2023

Large Scale Language Models and GPTs (SS 2023)

Topic: Reinforcement Learning from Human Feedback for LLMs

Deep dive into RLHF methodology for aligning LLMs with human preferences
Analysis of the InstructGPT and ChatGPT training procedures
Reward modeling, PPO fine-tuning, and safety considerations
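
As a small illustration of the reward modeling step, the sketch below computes the pairwise preference loss described in the InstructGPT paper, -log(sigmoid(r_chosen - r_rejected)), for a single comparison. The function name pairwise_reward_loss and the example reward values are assumptions made for this example; a real reward model would produce the scores from a neural network and average the loss over many comparisons.

```python
import math


def pairwise_reward_loss(reward_chosen, reward_rejected):
    """Negative log-likelihood that the chosen response outranks the rejected one."""
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) equals softplus(-margin); this form is numerically stable.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical reward scores for one human comparison.
loss_agree = pairwise_reward_loss(2.0, -1.0)     # model agrees with the human label
loss_disagree = pairwise_reward_loss(-1.0, 2.0)  # model disagrees with the human label
```

The loss shrinks as the reward model scores the human-preferred response higher than the rejected one, which is the signal later used for PPO fine-tuning.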

End-to-End Machine Translation (WS 2021/22)

Neural machine translation with seq2seq and transformer architectures
Attention mechanisms and their role in translation quality
Evaluation metrics (BLEU, METEOR) and translation challenges

Computer Vision (SS 2022)

CNNs for image classification, object detection, and segmentation
Vision transformers and attention-based architectures
Generative models for image synthesis

Machine Learning (WS 2022/23)

Supervised, unsupervised, and reinforcement learning fundamentals
Deep learning architectures and optimization techniques
Practical implementation with PyTorch and TensorFlow

Reinforcement Learning and Learning-based Control (SS 2023)

MDP formulation and dynamic programming
Q-learning, policy gradient methods, actor-critic algorithms
Applications to robotics and autonomous systems

High-Performance Computing (WS 2022/23)

Parallel computing architectures (CPU, GPU, distributed)
CUDA programming for deep learning acceleration
Performance optimization and scalability analysis

Master Thesis: AI-Based Scenario Generation for Autonomous Vehicles

RWTH Cyber-Physical Mobility Group | Oct 2023 - Apr 2024

Tech Stack

Python, PyTorch, TimeGAN, Diffusion-TS, GANs, CUDA, IKA Datasets

Research Contribution

Comparative study of generative AI approaches (TimeGAN vs Diffusion-TS) for synthesizing realistic traffic scenarios from time series trajectory data. Addresses the challenge of generating diverse and realistic edge cases for testing autonomous vehicle motion planners.

Research Methodology

Problem Formulation: Formulated scenario generation as a conditional time series generation problem for vehicle trajectories
Data Collection: Used IKA real-world datasets (inD - intersections, rounD - roundabouts, exiD - highway exits and entries) containing complex traffic interactions
Comparative Analysis: Implemented and compared TimeGAN (GAN-based) and Diffusion-TS (diffusion-based) approaches for trajectory generation
Evaluation Framework: Designed an evaluation framework using PCA and t-SNE visualizations and density analysis to assess the realism and diversity of generated trajectories
Integration & Validation: Converted generated scenarios to XML format for CPM Remote platform integration and validation with motion planning algorithms
Edge Case Discovery: Analyzed the capability of both methods to discover challenging scenarios not covered by traditional rule-based generation
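
Time series generative models are commonly trained on fixed-length, normalized sequences. The sketch below shows one way recorded trajectories could be prepared for such models; it is an assumption for illustration and not necessarily the preprocessing used in the thesis. The function names min_max_scale and sliding_windows, the window length, and the sample trajectory are hypothetical.

```python
def min_max_scale(trajectory):
    """Scale each feature column of a trajectory (list of rows) to [0, 1]."""
    columns = list(zip(*trajectory))
    lows = [min(c) for c in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)]
            for row in trajectory]


def sliding_windows(trajectory, length, stride=1):
    """Cut one trajectory into overlapping windows of a fixed length."""
    return [trajectory[i:i + length]
            for i in range(0, len(trajectory) - length + 1, stride)]


# Hypothetical trajectory: (x, y) positions over five time steps.
sample_trajectory = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
scaled = min_max_scale(sample_trajectory)
windows = sliding_windows(scaled, length=3)
```

Each window then serves as one training sample. A conditional model would additionally receive a label per window, for example the scenario type.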

Research Impact

          ✓ Demonstrated that time series generative models can produce realistic traffic scenarios

          ✓ Comparative insights on GAN vs Diffusion approaches for trajectory generation

          ✓ Identified edge cases and challenging scenarios for autonomous vehicle testing

          ✓ Provided framework for automated scenario generation from real-world data

Links

GitHub Repository | Thesis PDF