hackquest logo

SirenEye

Siren Eyes is an AI-powered system that simultaneously uses sound recognition and video processing to detect ambulances, dynamically controlling traffic signals to reduce delays and save lives.

Videos

Description

Siren Eyes: AI-Powered Emergency Vehicle Detection

Problem Statement

  1. Severe Ambulance Delays: Traffic congestion causes critical delays in emergency response.

  2. Inaccurate Existing Systems: Single-mode detection (only siren or video) leads to false positives.

  3. Manual Traffic Management: Current systems are slow, manual, and unresponsive.

Objective

  • Ensure faster ambulance movement by accurate detection and dynamic traffic control.

Solution: Hybrid Detection System

  • Dual-Mode Detection:

    1. Audio Recognition: Keras-based model detects siren sounds.

    2. Video Processing: YOLOv8 detects ambulances in video frames.

  • Confirmation Logic:

    Only if both audio and video detections match (≥70% confidence), the system acts.

  • Automated Traffic Control:

    Smartly changes traffic signals to prioritize ambulances in real-time.

Working Flow and Tech stack

  1. Upload real-time traffic video through GUI.

  2. Extract and analyze audio (KERAS model).

  3. Detect ambulances in frames (YOLOv8, OpenCV).

  4. Cross-validate both detections for maximum accuracy.

Impact

  • Reduce average ambulance response time from 17 minutes to 8 minutes.

  • Increase survival rate by up to 63%.

Future Scope

  1. Predict ambulance arrival times at signals.

  2. Instead of uploading a video, this model can work with integrated real-time cameras.

  3. Enable coordination between multiple traffic signals via cloud computing.

  4. Expand detection to other emergency vehicles like fire trucks and police vans.

Conclusion

Siren Eyes is a step toward smarter cities — saving lives by using AI to optimize emergency traffic management.

Progress During Hackathon

Ideation: Identified ambulance delays at traffic signals as a critical problem. Research: Designed a multi-modal system combining audio (siren detection) and video (YOLO-based ambulance tracking). Implementation: Integrated YOLOv8 and a Keras audio classifier. Built logic to detect sirens first, then run video detection, and finally trigger smart signal override. Evaluation: Tested on 4 real-world videos. Achieved 100% Recall with fine-tuned thresholds and obtained an accuracy of 75%. Optimization: ensured CPU-compatibility.

Tech Stack

Python
Streamlit
Tkinter
OpenCV
YOLOv8(Ultralytics)
Librosa
MoviePy
Keras

Fundraising Status

Actively seeking funding to expand model training, integrate hardware systems at traffic signals, and pilot the project across urban intersections.

Team LeaderRRohit Sharma
Sector
AIOther