SirenEye
Siren Eyes is an AI-powered system that simultaneously uses sound recognition and video processing to detect ambulances, dynamically controlling traffic signals to reduce delays and save lives.
Videos
Description
Siren Eyes: AI-Powered Emergency Vehicle Detection
Problem Statement
Severe Ambulance Delays: Traffic congestion causes critical delays in emergency response.
Inaccurate Existing Systems: Single-mode detection (only siren or video) leads to false positives.
Manual Traffic Management: Current systems are slow, manual, and unresponsive.
Objective
Ensure faster ambulance movement by accurate detection and dynamic traffic control.
Solution: Hybrid Detection System
Dual-Mode Detection:
Audio Recognition: Keras-based model detects siren sounds.
Video Processing: YOLOv8 detects ambulances in video frames.
Confirmation Logic:
Only if both audio and video detections match (≥70% confidence), the system acts.
Automated Traffic Control:
Smartly changes traffic signals to prioritize ambulances in real-time.
Working Flow and Tech stack
Upload real-time traffic video through GUI.
Extract and analyze audio (KERAS model).
Detect ambulances in frames (YOLOv8, OpenCV).
Cross-validate both detections for maximum accuracy.
Impact
Reduce average ambulance response time from 17 minutes to 8 minutes.
Increase survival rate by up to 63%.
Future Scope
Predict ambulance arrival times at signals.
Instead of uploading a video, this model can work with integrated real-time cameras.
Enable coordination between multiple traffic signals via cloud computing.
Expand detection to other emergency vehicles like fire trucks and police vans.
Conclusion
Siren Eyes is a step toward smarter cities — saving lives by using AI to optimize emergency traffic management.
Progress During Hackathon
Ideation: Identified ambulance delays at traffic signals as a critical problem. Research: Designed a multi-modal system combining audio (siren detection) and video (YOLO-based ambulance tracking). Implementation: Integrated YOLOv8 and a Keras audio classifier. Built logic to detect sirens first, then run video detection, and finally trigger smart signal override. Evaluation: Tested on 4 real-world videos. Achieved 100% Recall with fine-tuned thresholds and obtained an accuracy of 75%. Optimization: ensured CPU-compatibility.
Tech Stack
Fundraising Status
Actively seeking funding to expand model training, integrate hardware systems at traffic signals, and pilot the project across urban intersections.