Evolution Meets Diffusion: Efficient Neural Architecture Generation
Introduction
Neural Architecture Search (NAS) has gained widespread attention for its transformative potential in deep learning model design. However, the vast and complex search space of NAS leads to significant computational and time costs. Neural Architecture Generation (NAG) addresses this by reframing NAS as a generation problem, enabling the precise generation of optimal architectures for specific tasks. Despite its promise, mainstream methods such as diffusion models face limitations in global search capability and are still hindered by high computational and time demands. To overcome these challenges, we propose Evolutionary Diffusion-based Neural Architecture Generation (EDNAG), a novel approach that achieves efficient and training-free architecture generation. EDNAG leverages evolutionary algorithms to simulate the denoising process in diffusion models, using fitness to guide the transition from random Gaussian distributions to optimal architecture distributions. This approach combines the strengths of evolutionary strategies and diffusion models, enabling rapid and effective architecture generation. Extensive experiments demonstrate that EDNAG achieves state-of-the-art (SOTA) performance in architecture optimization, with accuracy improvements of up to 10.45%. Furthermore, it eliminates the need for time-consuming training and boosts inference speed by an average of 50×, showcasing its exceptional efficiency and effectiveness.
Contributions
- Propose a novel Evolutionary Diffusion-based framework for Neural Architecture Generation (EDNAG), which simulates the denoising process of diffusion models within an evolutionary paradigm.
- Introduce a Fitness-guided Denoising (FD) strategy, which utilizes fitness instead of trained networks to generate offspring architecture samples from previous ones, achieving the first network-free neural architecture generation approach.
- Conduct extensive experiments to demonstrate the SOTA performance of EDNAG in architecture generation with significantly fewer computational resources, as well as its outstanding adaptability and transferability.
Methods
The core innovation of EDNAG is its simultaneous enhancement of both efficiency and generation quality. Specifically, we introduce evolutionary algorithms to simulate the denoising process within diffusion models. Furthermore, we propose a Fitness-guided Denoising (FD) strategy, which generates new architectures from previous ones at each denoising iteration. Each newly generated architecture can be represented as a weighted summation of prior samples, where superior architectures are assigned higher weights and thus exert greater influence. As a result, architecture samples progressively evolve toward high-fitness subspaces through multiple FD-guided denoising iterations, ultimately yielding high-performing and globally optimal neural architectures.
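As a concrete illustration of this loop, the following is a minimal, self-contained sketch. The function names, the softmax fitness-to-weight mapping, the noise schedule, and the toy fitness function are illustrative assumptions rather than the paper's implementation (EDNAG uses a dataset-aware neural predictor for fitness, as described below).

import numpy as np

def predict_fitness(population):
    """Toy stand-in for a neural predictor: scores each encoding by
    closeness to an arbitrary target vector (illustration only)."""
    target = np.ones(population.shape[1])
    return -np.linalg.norm(population - target, axis=1)

def evolve_architectures(pop_size=32, dim=16, steps=50, seed=0):
    """Evolutionary denoising loop: a Gaussian population is iteratively
    pulled toward high-fitness regions by fitness-weighted denoising."""
    rng = np.random.default_rng(seed)
    x_t = rng.standard_normal((pop_size, dim))      # random Gaussian initialization
    alphas = np.linspace(0.999, 1e-3, steps)        # assumed cumulative-alpha schedule
    for t in range(steps - 1, 0, -1):
        fitness = predict_fitness(x_t)
        weights = np.exp(fitness - fitness.max())   # simplified fitness-to-weight mapping
        weights /= weights.sum()
        x0_hat = weights @ x_t                      # weighted summation of prior samples
        # Deterministic DDIM-style move from x_t toward x_{t-1}
        eps_hat = (x_t - np.sqrt(alphas[t]) * x0_hat) / np.sqrt(1.0 - alphas[t])
        x_t = np.sqrt(alphas[t - 1]) * x0_hat + np.sqrt(1.0 - alphas[t - 1]) * eps_hat
    return x_t[np.argmax(predict_fitness(x_t))]     # best generated encoding

print(evolve_architectures())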
In particular, EDNAG leverages a neural predictor for task-specific neural architecture generation, serving as a dataset-aware fitness evaluator that guides the conditional generation process within the FD strategy. Notably, the FD strategy removes the reliance on the score networks used by traditional diffusion models during the denoising process, enabling network-free neural architecture generation and substantially improving generation efficiency.
Derivations
For diffusion models such as DDIM, the forward process blends data with Gaussian noise, perturbing the original data x_0 into the final diffused data x_T, as in Eq (1).
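Assuming the standard DDIM formulation, in which α_t denotes the cumulative noise-schedule coefficient, Eq (1) presumably takes the form

    x_t = \sqrt{\alpha_t}\, x_0 + \sqrt{1 - \alpha_t}\, \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I),

so that q(x_t \mid x_0) = \mathcal{N}\big(\sqrt{\alpha_t}\, x_0, (1 - \alpha_t) I\big).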
In the reverse process, DDIM removes the noise from the randomly initialized data step by step, as represented by Eq (2). Traditional DDIM leverages a neural network ε_θ to predict the perturbed noise ε̂_θ(x_t, t) at time step t, guiding the denoising process. By iteratively repeating this denoising step in Eq (2), DDIM progressively generates the final samples.
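For reference, the standard DDIM denoising update, which Eq (2) presumably corresponds to, is

    x_{t-1} = \sqrt{\alpha_{t-1}} \left( \frac{x_t - \sqrt{1 - \alpha_t}\, \hat{\epsilon}_\theta(x_t, t)}{\sqrt{\alpha_t}} \right) + \sqrt{1 - \alpha_{t-1} - \sigma_t^2}\; \hat{\epsilon}_\theta(x_t, t) + \sigma_t \epsilon_t,

where σ_t controls the stochasticity of the step (σ_t = 0 gives the deterministic DDIM sampler).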
For each denoising iteration, let x̂_0 denote the predicted x_0 in Eq (2). With Eq (3), the denoising process can be transformed into Eq (4).
Therefore, to perform the denoising process from x_t to x_{t-1}, it is only necessary to determine the predicted optimal architectures x̂_0 based on x_t, α_t, and t.
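Under the standard DDIM derivation, the prediction of the clean sample and the rewritten update, which Eqs (3) and (4) presumably correspond to, read

    \hat{x}_0 = \frac{x_t - \sqrt{1 - \alpha_t}\, \hat{\epsilon}_\theta(x_t, t)}{\sqrt{\alpha_t}},
    \qquad
    x_{t-1} = \sqrt{\alpha_{t-1}}\, \hat{x}_0 + \sqrt{1 - \alpha_{t-1} - \sigma_t^2}\; \frac{x_t - \sqrt{\alpha_t}\, \hat{x}_0}{\sqrt{1 - \alpha_t}} + \sigma_t \epsilon_t,

which makes explicit that the step from x_t to x_{t-1} requires only x_t, α_t, and an estimate of x̂_0.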
To estimate x̂_0, the Fitness-guided Denoising (FD) strategy maps the distribution of x_t to the distribution of x̂_0 under fitness guidance. Specifically, we view the denoising process of DDIM as the evolutionary process of GAs, both of which progressively transform samples from an initial random Gaussian distribution to an ultimate optimal distribution [47]. We consider that individuals with higher fitness are more likely to be retained during evolution, resulting in a higher probability of appearing in the final samples x̂_0. Therefore, higher fitness corresponds to a higher probability density in x̂_0. We model this relationship as a mapping function g(·), which maps the fitness f(x_t) of x_t to the probability density p(x̂_0) of x̂_0, as given in Eq (5).
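The exact mapping is not reproduced here; a plausible form of Eq (5), consistent with the surrounding description, is that the target density is proportional to the mapped fitness:

    p(\hat{x}_0 = x) \propto g\big(f(x)\big).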
Hence, the optimal architectures x̂_0 can be estimated from the samples x_t at time step t and their fitness f(x_t), as detailed in Eq (6), where p(x_t | x_0 = x) is derived from the diffusion process described by Eq (1), as shown in Eq (7).
We further define x_t as the previous samples consisting of N architectures, where x_t = [x_t^1, x_t^2, ..., x_t^N]. Combining Eq (6) and Eq (7), x̂_0 can be estimated as in Eq (8),
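a plausible reconstruction of which, obtained by applying Bayes' rule with p(x_0 = x) ∝ g(f(x)) and the Gaussian forward kernel of Eq (1), is the fitness- and proximity-weighted summation

    \hat{x}_0 \approx \frac{1}{p(x_t)} \sum_{i=1}^{N} g\big(f(x_t^i)\big)\, p\big(x_t \mid x_0 = x_t^i\big)\, x_t^i,
    \qquad
    p\big(x_t \mid x_0 = x_t^i\big) \propto \exp\!\left( -\frac{\lVert x_t - \sqrt{\alpha_t}\, x_t^i \rVert^2}{2 (1 - \alpha_t)} \right),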
where p(x_t) actually serves as a regularization term. In fact, x̂_0 can be regarded as the weighted summation of each neural architecture x_t^i in the samples x_t, where architectures with higher fitness have a greater influence on x̂_0. Finally, with Eq (4) and Eq (8), we can denoise from x_t to x_{t-1}, as in Eq (9).
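As an illustration, the following sketch implements one denoising step along the lines of Eqs (7)-(9) under the reconstructions above. The choice of g (a softmax over fitness) and the setting σ_t = 0 (deterministic DDIM-style step) are assumptions for this sketch, not the paper's exact settings.

import numpy as np

def fd_step(x_t, fitness, alpha_t, alpha_prev, temperature=1.0):
    """Fitness-guided denoising step x_t -> x_{t-1} (illustrative sketch).

    x_t:        (N, D) array, current population of architecture encodings
    fitness:    (N,) array, predictor scores f(x_t^i)
    alpha_t:    cumulative alpha at step t; alpha_prev: at step t-1
    """
    # g(f): map fitness to an unnormalized density (assumed softmax form)
    g = np.exp((fitness - fitness.max()) / temperature)            # (N,)

    # p(x_t^j | x_0 = x_t^i): Gaussian kernel from the forward process, cf. Eq (7)
    diff = x_t[:, None, :] - np.sqrt(alpha_t) * x_t[None, :, :]    # (N, N, D)
    log_k = -np.sum(diff ** 2, axis=-1) / (2.0 * (1.0 - alpha_t))  # (N, N)
    kernel = np.exp(log_k - log_k.max(axis=1, keepdims=True))

    # Eq (8): weighted summation of the population, normalized per individual
    w = g[None, :] * kernel                                        # (N, N)
    w /= w.sum(axis=1, keepdims=True)                              # p(x_t) as normalizer
    x0_hat = w @ x_t                                               # (N, D)

    # Eq (9): deterministic DDIM-style update (sigma_t = 0 assumed)
    eps_hat = (x_t - np.sqrt(alpha_t) * x0_hat) / np.sqrt(1.0 - alpha_t)
    return np.sqrt(alpha_prev) * x0_hat + np.sqrt(1.0 - alpha_prev) * eps_hat

# Example: one step on a random population of 8 encodings of dimension 4
x = np.random.default_rng(0).standard_normal((8, 4))
f = np.random.default_rng(1).standard_normal(8)
x_prev = fd_step(x, f, alpha_t=0.5, alpha_prev=0.6)

In a full run, this step is repeated from t = T down to t = 1, with the neural predictor supplying the fitness values at every iteration.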
For fitness evaluation, we utilize a neural predictor to evaluate the fitness of each architecture.
The entire process is detailed in the following figure.
Our predictor comprises coreset selection, dataset encoding, architecture encoding, and accuracy prediction modules, as detailed in the following figure.
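A schematic of how such a predictor might be organized is sketched below; the module interfaces and all internals are hypothetical placeholders, not the released code.

import numpy as np

class NeuralPredictor:
    """Illustrative skeleton of a dataset-aware fitness evaluator with the
    four modules described above. `dataset` is assumed to be an (n, d) array."""

    def __init__(self, coreset_size=64, seed=0):
        self.coreset_size = coreset_size
        self.rng = np.random.default_rng(seed)

    def select_coreset(self, dataset):
        # Coreset selection: keep a small, representative subset of the dataset.
        idx = self.rng.choice(len(dataset), size=min(self.coreset_size, len(dataset)), replace=False)
        return dataset[idx]

    def encode_dataset(self, coreset):
        # Dataset encoding: summarize the coreset as a fixed-length vector.
        return coreset.mean(axis=0)

    def encode_architecture(self, arch):
        # Architecture encoding: here the architecture is already a flat vector.
        return np.asarray(arch, dtype=float)

    def predict_accuracy(self, dataset_emb, arch_emb):
        # Accuracy prediction: stand-in scoring function instead of a trained head.
        return float(-np.linalg.norm(arch_emb - dataset_emb.mean()))

    def fitness(self, dataset, archs):
        d_emb = self.encode_dataset(self.select_coreset(dataset))
        return np.array([self.predict_accuracy(d_emb, self.encode_architecture(a)) for a in archs])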
Experiments
NAS-Bench-101 & NAS-Bench-301
NAS-Bench-201
TransNASBench-101
MobileNetV3
GPU Time Costs