Multi-tenant AI Platform (Production)
Build a production-grade, multi-tenant LLM assistant platform that handles streaming responses, accurate token/cost tracking across diverse models, secure document handling, and enterprise SSO integration while maintaining EU data compliance.
- Unified Tokenization: Built a single tokenization layer for both text and vision inputs across OpenAI, Llama 3, Gemma 3, Mistral, Qwen, and DeepSeek models, using Microsoft.ML.Tokenizers plus custom Python bridges via pythonnet
- Real-time Streaming: Implemented streaming chat with SignalR for real-time token-by-token responses with precise cost tracking
- Admin Dashboard: Built comprehensive Vue 3 + Inertia admin UI for model/deployment management, user quotas, and reasoning effort configuration
- Security & Compliance: Integrated Shibboleth SSO and MongoDB Client-Side Field Level Encryption (CSFLE), and ensured EU data-region compliance
- Reliability: Implemented retry policies with Polly, circuit breakers, and graceful degradation
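The per-request cost tracking behind the chargeback reporting can be sketched as follows. This is a minimal illustration, not the platform's actual code: the model names and per-1K-token prices are hypothetical placeholders, and a real deployment would load pricing per model and region from configuration.

```python
from dataclasses import dataclass

# Hypothetical per-1K-token prices (USD); real pricing is loaded per model/region.
PRICING = {
    "gpt-4o":      {"prompt": 0.005, "completion": 0.015},
    "llama-3-70b": {"prompt": 0.001, "completion": 0.001},
}

@dataclass
class Usage:
    department: str
    model: str
    prompt_tokens: int
    completion_tokens: int

def cost_usd(u: Usage) -> float:
    """Compute the cost of one request from its token counts."""
    p = PRICING[u.model]
    return (u.prompt_tokens / 1000) * p["prompt"] + \
           (u.completion_tokens / 1000) * p["completion"]

def allocate(usages: list[Usage]) -> dict[str, float]:
    """Aggregate request costs per department for chargeback reports."""
    totals: dict[str, float] = {}
    for u in usages:
        totals[u.department] = totals.get(u.department, 0.0) + cost_usd(u)
    return totals
```

Streaming responses complicate this slightly: completion tokens are counted as they arrive, so the tracker accumulates counts during the stream and prices the request once the final chunk lands.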
✓ Accurate cost allocation across departments through precise token tracking
✓ 99.9% uptime with robust error handling and failover mechanisms
✓ Reduced AI infrastructure complexity for end users with single-sign-on
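The retry behavior is handled by Polly in the actual .NET stack; the underlying pattern, retrying a transient failure with exponential backoff, looks like this when sketched in Python. The attempt count and base delay are illustrative defaults, and `sleep` is injectable so the delays can be observed in tests.

```python
import time

def retry(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn, retrying on failure with exponential backoff (0.5s, 1s, 2s, ...)."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise                       # out of attempts: surface the error
            sleep(base_delay * 2 ** i)      # back off before the next attempt
```

In production this sits alongside a circuit breaker, which stops calling a failing deployment entirely after repeated errors rather than retrying indefinitely.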
Time Series Generative Models for Traffic Scenario Generation
Implement and train time series generative models to synthesize realistic vehicle trajectory data for autonomous vehicle testing. The work involves handling high-dimensional multivariate time series, training complex models on GPU infrastructure, and evaluating generation quality.
- Model Implementation: Implemented TimeGAN (GAN-based) and Diffusion-TS (diffusion-based) architectures in PyTorch for multivariate time series generation
- Data Pipeline: Processed IKA real-world driving datasets (inD, rounD, exiD) with sliding window extraction, normalization, and data augmentation
- GPU Training: Trained models on NVIDIA H100 and Quadro RTX 6000 GPUs with hyperparameter optimization, learning rate scheduling, and early stopping
- Model Evaluation: Implemented evaluation pipeline using PCA, t-SNE visualization, and statistical metrics to assess trajectory realism and diversity
- Integration: Built conversion pipeline to XML format for CPM Remote platform, enabling generated scenarios to be used in autonomous vehicle simulation
- Production Deployment: Packaged models for inference, implemented batch generation, and integrated with existing testing infrastructure
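The sliding-window extraction and normalization from the data pipeline can be sketched as below. This is a pure-Python illustration of the idea; the actual pipeline presumably vectorizes both steps with NumPy, and the window length and stride are tuning parameters, not fixed values from the project.

```python
def sliding_windows(series, window, stride=1):
    """Extract fixed-length overlapping windows from a trajectory.

    series is a list of timesteps; each window becomes one training sample.
    """
    return [series[i:i + window]
            for i in range(0, len(series) - window + 1, stride)]

def minmax_normalize(series):
    """Scale each feature dimension (e.g. x, y, velocity) to [0, 1]."""
    dims = len(series[0])
    lo = [min(t[d] for t in series) for d in range(dims)]
    hi = [max(t[d] for t in series) for d in range(dims)]
    return [[(t[d] - lo[d]) / (hi[d] - lo[d]) if hi[d] > lo[d] else 0.0
             for d in range(dims)]
            for t in series]
```

Overlapping windows (stride smaller than the window length) act as a cheap form of data augmentation, since each recorded trajectory yields many partially shifted training samples.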
✓ Reduced manual scenario design effort by 80% while improving coverage
✓ Discovered 50+ edge cases not covered by traditional rule-based methods
✓ Enabled continuous testing pipeline for autonomous vehicle motion planners
LLM & GNN Research for Medical Image Analysis
Generate accurate and clinically relevant radiology reports from chest X-ray images by combining vision and language models. Traditional methods struggle with medical terminology and require extensive manual annotation.
- Developed a pipeline combining BLIP (vision-language model) with LLMs for chest X-ray report generation
- LLM Deployment & Serving: Deployed LLaMA models on GPU cluster using Ollama, built REST API service with FastAPI for internal research access, managed model inference and user requests
- Conducted research on integrating Knowledge Graphs with GNNs to improve medical domain understanding
- Ran large-scale experiments on GPU cluster infrastructure for model training and evaluation, experimented with vLLM for high-throughput inference
- Built end-to-end LLM chat platform: GPU deployment → API service → web interface for research demonstrations
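Ollama streams generations as newline-delimited JSON, each chunk carrying a partial `response` string and a `done` flag on the final chunk; the API service reassembles these into the full reply. A minimal parser sketch (the field names match Ollama's documented streaming format, but error handling and metadata fields are omitted here):

```python
import json

def accumulate_stream(lines):
    """Join the token fragments from an Ollama-style NDJSON stream into one string."""
    parts = []
    for line in lines:
        if not line.strip():
            continue                      # skip keep-alive blank lines
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break                         # final chunk: generation finished
    return "".join(parts)
```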
✓ Demonstrated feasibility of automated radiology report generation
✓ Contributed to knowledge graph integration techniques for medical AI
Semantic Kernel with Local LLMs
Integrate locally hosted LLMs with Microsoft Semantic Kernel to enable AI orchestration without cloud dependencies, which is useful for privacy-sensitive applications.
- Developed connectors to integrate local LLM endpoints with Semantic Kernel
- Implemented function calling and semantic memory for local models
- Demonstrated how to build AI agents using on-premise infrastructure
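One common way to connect Semantic Kernel to a local model is to point its OpenAI connector at a local server (Ollama, LM Studio, and similar tools expose an OpenAI-compatible `/v1/chat/completions` endpoint). A sketch of the request such a connector issues; the base URL and model name are illustrative, and this is a simplified stand-in for the actual connector code:

```python
import json
import urllib.request

def chat_request(base_url, model, messages):
    """Build an OpenAI-compatible chat completion request for a local LLM server."""
    payload = {"model": model, "messages": messages, "stream": False}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Because the wire format matches the hosted OpenAI API, higher-level Semantic Kernel features such as function calling and semantic memory work against the local endpoint without protocol changes, subject to the local model supporting them.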