Soccer Player Re-Identification through Broadcast Video Streams

computer-vision machine-learning python opencv yolo deep-learning tracking

Overview

In recent years, computer vision has demonstrated its potential in fields like surveillance, entertainment, and sports analytics. One area that presents intriguing challenges is re-identifying soccer players from broadcast video streams.

This project develops a real-time computer vision pipeline capable of:

  • Detecting soccer players in broadcast footage
  • Tracking player movements across frames
  • Re-identifying individual players using jersey numbers

The pipeline relies on deep learning models fine-tuned on custom datasets to cope with occlusion, low resolution, and the difficulty of reading jersey numbers in broadcast footage.


Literature Review

Foundations of Re-Identification and Jersey Number Recognition

The literature surrounding soccer player re-identification can be broadly categorized into player detection, jersey number recognition, and tracking methodologies. Various researchers have attempted to solve these problems with a combination of classic computer vision techniques and deep learning models.

Key Challenges

One prominent challenge in soccer player identification is the occlusion of visual cues, such as facial features or jersey numbers, which often become blurred or hidden during fast-paced gameplay.

Traditional techniques that pair handcrafted features, such as Histogram of Oriented Gradients (HOG), with Support Vector Machines (SVMs) generalize poorly under realistic match conditions and reach only moderate accuracy. Convolutional neural networks (CNNs), particularly YOLO (“You Only Look Once”) models, have since emerged as a reliable way to localize jersey numbers on players’ backs.

Person Re-Identification (ReID)

Person re-identification builds upon feature extraction techniques to identify players across frames:

  • Siamese Networks: Effectively match visual features from different frames, maintaining individual identities despite partial occlusions
  • Gated Siamese CNN: Emphasizes salient local features, achieving improved results over traditional approaches
  • Attention Mechanisms + Body Pose: Significantly improve player tracking and re-identification, especially under occlusion-heavy conditions
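
The matching step shared by these approaches can be illustrated by comparing embedding vectors. The following is a minimal sketch, assuming each player crop has already been mapped to a feature vector by some embedding network (not shown). The function names, the toy vectors, and the 0.7 threshold are illustrative assumptions, not taken from the cited work.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def match_query(query, gallery, threshold=0.7):
    """Return the gallery identity most similar to `query`.

    `gallery` maps an identity to its reference embedding. Returns None
    when the best similarity falls below `threshold` (assumed value).
    """
    best_id, best_sim = None, -1.0
    for identity, embedding in gallery.items():
        sim = cosine_similarity(query, embedding)
        if sim > best_sim:
            best_id, best_sim = identity, sim
    return best_id if best_sim >= threshold else None


# Toy embeddings; real ones would come from a trained network.
gallery = {"player_a": [0.9, 0.1, 0.0], "player_b": [0.0, 0.2, 0.9]}
print(match_query([0.8, 0.2, 0.1], gallery))
```

A Siamese network is trained so that crops of the same player land close together under such a similarity measure, which is what makes a simple threshold usable at inference time.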

Literature Summary

Technique | Purpose | Key Models | Results
Handcrafted Features + SVM | Jersey Number Recognition | HOG, SVM | Moderate accuracy, limited generalization
Convolutional Neural Networks | Player and Jersey Localization | YOLO, Spatial Transformer | High accuracy in jersey localization
Siamese Networks | Person Re-Identification | Siamese CNN | Effective feature matching despite occlusions
Gated Siamese CNN | Enhanced Feature Matching | Gated Siamese CNN | Improved over traditional Siamese networks
Attention + Body Pose | Improved Tracking | Pose-Guided R-CNN | Significant improvement under occlusion

Our Solution

Our proposed solution is a comprehensive computer vision pipeline that integrates person tracking and jersey number recognition, carefully balancing accuracy and processing speed to function effectively in real time.
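
One way to integrate the two signals is to attach every per-frame jersey prediction to its track ID and take a confidence-weighted vote per track. The sketch below illustrates that idea only. The voting scheme, the `min_confidence` cut-off, and all names are assumptions for illustration, not a description of the pipeline's actual fusion logic.

```python
from collections import defaultdict


def assign_jersey_numbers(observations, min_confidence=0.5):
    """Pick one jersey number per track by confidence-weighted voting.

    `observations` is an iterable of (track_id, jersey_number, confidence)
    tuples, one per frame in which a number was read. Predictions below
    `min_confidence` (an assumed threshold) are ignored.
    """
    votes = defaultdict(lambda: defaultdict(float))
    for track_id, number, confidence in observations:
        if confidence >= min_confidence:
            votes[track_id][number] += confidence
    # For each track, keep the number with the largest accumulated score.
    return {tid: max(scores, key=scores.get) for tid, scores in votes.items()}


observations = [
    (1, 10, 0.9), (1, 10, 0.8), (1, 18, 0.6),  # track 1: mostly reads "10"
    (2, 7, 0.7), (2, 1, 0.4),                  # track 2: the low-confidence "1" is dropped
]
print(assign_jersey_numbers(observations))
```

Voting across frames means a single misread number, for example during an occlusion, does not change the identity attached to a track.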

1. Dataset Creation and Fine-Tuning

To develop our player detection and jersey number recognition models, we crafted two private datasets:

  1. Player Localization Dataset: Designed for localizing soccer players in broadcast frames
  2. Jersey Number Dataset: Targeted jersey number localization and recognition

We also modified Google’s Street View House Numbers (SVHN) dataset to cover whole numbers from 0 to 99, so that the recognition model could learn the full range of jersey numbers before being fine-tuned on soccer footage.
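
The exact remapping of SVHN labels is not described here. As a hedged sketch of one possible approach: SVHN's full-number annotations carry one label per digit, with the digit 0 stored as label 10 in the original format, so sequences of one or two digits can be collapsed into a single class between 0 and 99 and longer sequences discarded. The function name and the discard rule are assumptions.

```python
def svhn_labels_to_jersey_class(digit_labels):
    """Collapse per-digit SVHN labels into one class in 0..99.

    Returns None for sequences that cannot be a jersey number in this
    range (empty, or more than two digits). A leading zero such as
    "07" maps to 7.
    """
    if not 1 <= len(digit_labels) <= 2:
        return None
    # The original SVHN format stores the digit 0 as label 10.
    digits = [0 if label == 10 else label for label in digit_labels]
    if any(not 0 <= d <= 9 for d in digits):
        raise ValueError(f"unexpected digit label in {digit_labels!r}")
    number = 0
    for d in digits:
        number = number * 10 + d
    return number


print(svhn_labels_to_jersey_class([2, 3]))
print(svhn_labels_to_jersey_class([1, 10]))
print(svhn_labels_to_jersey_class([1, 2, 3]))
```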

Dataset Statistics:

  • 31,000+ training images
  • 4,000+ test samples
  • Adapted for varying lighting conditions and jersey styles

2. Player Detection and Tracking

Detection Model: YOLOv5-based object detector fine-tuned on custom soccer player dataset

  • Detection accuracy: >90%
  • Real-time performance on broadcast video frames

Tracking Model: DeepSORT model fine-tuned on custom player tracking sequences

  • Pre-trained on CrowdHuman dataset
  • Fine-tuned on 85 tracks from Premier League matches
  • Rank-1 recognition rate: 95.3%
  • Real-time tracking accuracy: 85%

Coupling the YOLOv5 detector with the DeepSORT tracker yields real-time tracking at the accuracy reported above.
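
This coupling follows the tracking-by-detection pattern: the detector proposes boxes in each frame and the tracker assigns them to existing tracks. The sketch below shows only a simplified association step based on greedy IoU matching. DeepSORT itself also uses Kalman-filter motion prediction, appearance embeddings, and Hungarian assignment, all of which are omitted here. The `(x1, y1, x2, y2)` box format, the threshold, and the function names are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, iou_threshold=0.3):
    """Greedily match existing tracks to new detections by IoU.

    `tracks` maps track_id -> last known box; `detections` is a list of
    boxes from the current frame. Returns (matches, unmatched), where
    `matches` maps track_id -> detection index and `unmatched` lists the
    detection indices that would start new tracks.
    """
    pairs = sorted(
        ((iou(box, det), tid, idx)
         for tid, box in tracks.items()
         for idx, det in enumerate(detections)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    matches, used_tracks, used_dets = {}, set(), set()
    for score, tid, idx in pairs:
        if score < iou_threshold:
            break  # remaining pairs overlap even less
        if tid in used_tracks or idx in used_dets:
            continue
        matches[tid] = idx
        used_tracks.add(tid)
        used_dets.add(idx)
    unmatched = [i for i in range(len(detections)) if i not in used_dets]
    return matches, unmatched


tracks = {1: (0, 0, 10, 20), 2: (50, 0, 60, 20)}
detections = [(51, 1, 61, 21), (1, 0, 11, 20), (100, 100, 110, 120)]
print(associate(tracks, detections))
```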

3. Jersey Number Localization

Jersey number localization enables player identification by detecting the position of numbers on jerseys.

Model: YOLOv5 trained on custom dataset of jersey number bounding boxes

Dataset Details:

  • Manually annotated from 21 Premier League matches
  • Covers different scenarios and team combinations

Performance:

  • Mean Average Precision (mAP): 96.3%
  • High precision and recall rates
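
mAP is the mean of per-class average precision (AP). As a hedged illustration of the metric, and not the evaluation code used for the figure above, AP for one class can be computed from detections ranked by confidence, each flagged as a true or false positive at some chosen IoU threshold. The sketch uses all-point interpolation; the function name and input format are assumptions.

```python
def average_precision(scored_hits, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class.

    `scored_hits` is a list of (confidence, is_true_positive) pairs, one
    per detection; `num_ground_truth` is the number of annotated boxes.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda hit: hit[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Interpolate: precision at each point becomes the maximum precision
    # at that recall or any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas over each increase in recall.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


hits = [(0.9, True), (0.8, False), (0.7, True)]
print(round(average_precision(hits, num_ground_truth=2), 4))
```

Averaging this value over all classes, and in some conventions over several IoU thresholds, gives mAP.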

4. Jersey Number Recognition

Model: ResNet18 trained on modified SVHN dataset (numbers 0-99)

Training Approach:

  • Fine-tuned on custom dataset of jersey numbers from Premier League broadcasts
  • Weighted cross-entropy loss to handle class imbalance
  • Extensive augmentation techniques for robustness
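
A minimal sketch of the weighted loss, assuming inverse-frequency class weights (the actual weighting scheme is not stated). It is written in plain Python for clarity; in PyTorch the same effect is typically obtained through the `weight` argument of `torch.nn.CrossEntropyLoss`. The function names are illustrative.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Weight each class by total / (num_classes * class_count).

    Classes that never occur in `labels` get weight 0.0.
    """
    counts = Counter(labels)
    total = len(labels)
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def weighted_cross_entropy(logits, targets, weights):
    """Weighted mean of per-sample cross-entropy.

    `logits` is a list of per-class score lists, `targets` the true class
    indices. The sum is normalized by the total weight of the targets.
    """
    weighted_sum, weight_total = 0.0, 0.0
    for row, target in zip(logits, targets):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in row))
        nll = log_sum_exp - row[target]
        weighted_sum += weights[target] * nll
        weight_total += weights[target]
    return weighted_sum / weight_total if weight_total else 0.0


labels = [0, 0, 0, 1]  # class 1 is rare, so it gets the larger weight
weights = inverse_frequency_weights(labels, num_classes=2)
print(weights)
print(weighted_cross_entropy([[2.0, 0.5], [0.1, 1.5]], [0, 1], weights))
```

With such weights, a mistake on a rarely seen jersey number costs more than a mistake on a common one, which counteracts the imbalance in the training data.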

Performance:

  • Average accuracy: 88%
  • Robust across different viewing angles and lighting conditions

Results Summary

Component | Dataset | Model/Technique | Performance
Player Detection | Custom soccer player frames | YOLOv5 | >90% detection accuracy
Player Tracking | Custom ReID dataset | DeepSORT + YOLOv5 | 95.3% rank-1 recognition, 85% tracking accuracy
Jersey Localization | 21 Premier League matches | YOLOv5 | 96.3% mAP
Jersey Recognition | Modified SVHN + custom data | ResNet18 | 88% accuracy

Conclusion

The result is a real-time computer vision pipeline that localizes, tracks, and re-identifies soccer players in broadcast video streams.

Key Achievements

  • ✅ Real-time processing of broadcast footage
  • ✅ Robust player detection and tracking under occlusions
  • ✅ High-accuracy jersey number recognition
  • ✅ Scalable architecture for sports analytics applications

Technical Stack

The system combines:

  • YOLOv5 models for localization
  • ResNet for jersey number recognition
  • DeepSORT for player tracking

Together, these components balance real-time processing against the demands of player identification, even under challenging conditions such as occlusion and fast motion.

Impact

This project applies computer vision to sports analytics and provides a foundation for:

  • Player performance analysis
  • Team strategy optimization
  • Automated broadcast enhancements
  • Real-time game statistics