
Siren Eyes

Siren Eyes is an AI-powered system that simultaneously uses sound recognition and video processing to detect ambulances, dynamically controlling traffic signals to reduce delays and save lives.

Video

Description

Siren Eyes: AI-Powered Emergency Vehicle Detection

Problem Statement

  1. Severe Ambulance Delays: Traffic congestion causes critical delays in emergency response.

  2. Inaccurate Existing Systems: Single-mode detection (only siren or video) leads to false positives.

  3. Manual Traffic Management: Current systems are slow, manual, and unresponsive.

Objective

  • Ensure faster ambulance movement through accurate detection and dynamic traffic control.

Solution: Hybrid Detection System

  • Dual-Mode Detection:

    1. Audio Recognition: Keras-based model detects siren sounds.

    2. Video Processing: YOLOv8 detects ambulances in video frames.

  • Confirmation Logic:

    The system acts only when both the audio and video detections agree, each at ≥70% confidence.

  • Automated Traffic Control:

    Automatically switches traffic signals to prioritize ambulances in real time.
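The dual-threshold confirmation rule above can be sketched as a small Python function. The function name, signature, and the 0.70 default are illustrative assumptions, not the project's actual code:

```python
# Hedged sketch of the confirmation logic: the system acts only when
# BOTH modalities report at least the threshold confidence.
CONFIDENCE_THRESHOLD = 0.70  # assumed default, per the >=70% rule above

def confirm_ambulance(audio_confidence: float,
                      video_confidence: float,
                      threshold: float = CONFIDENCE_THRESHOLD) -> bool:
    """Return True only when the siren classifier AND the video
    detector both reach at least `threshold` confidence."""
    return audio_confidence >= threshold and video_confidence >= threshold

# Siren heard at 0.82 and ambulance seen at 0.74 -> trigger the override
print(confirm_ambulance(0.82, 0.74))  # True
# Strong siren but weak visual match -> do nothing (avoids false positives)
print(confirm_ambulance(0.95, 0.40))  # False
```

Requiring agreement from both modalities is what addresses the false-positive problem of single-mode systems noted in the problem statement.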

Workflow and Tech Stack

  1. Upload a real-time traffic video through the GUI.

  2. Extract and analyze the audio track (Keras model).

  3. Detect ambulances in frames (YOLOv8, OpenCV).

  4. Cross-validate both detections for maximum accuracy.
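The four steps above can be sketched as a single orchestration function. To keep the sketch self-contained and runnable without Keras or YOLOv8 installed, the two models are injected as plain callables that return a confidence score; all names here are assumptions, not the project's real interfaces:

```python
# Illustrative sketch of the detection pipeline's control flow.
from typing import Callable, Sequence

def run_pipeline(audio_clip,
                 frames: Sequence,
                 siren_model: Callable[[object], float],
                 ambulance_model: Callable[[object], float],
                 threshold: float = 0.70) -> bool:
    """Cross-validate audio and video detections for one video segment."""
    # Step 2: score the extracted audio (a Keras classifier in the real system)
    audio_conf = siren_model(audio_clip)
    if audio_conf < threshold:
        return False  # no siren heard -> skip the costlier video pass
    # Step 3: score each frame (YOLOv8 + OpenCV in the real system)
    video_conf = max((ambulance_model(f) for f in frames), default=0.0)
    # Step 4: act only when both modalities agree
    return video_conf >= threshold

# Usage with stub models standing in for the trained networks:
siren_stub = lambda clip: 0.90       # pretend a siren was detected
ambulance_stub = lambda frame: 0.80  # pretend an ambulance was seen
print(run_pipeline(None, [0, 1, 2], siren_stub, ambulance_stub))  # True
```

Running the cheap audio check before the frame-by-frame video pass is one plausible ordering; it matches the siren-first flow described in the hackathon progress notes below.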

Impact

  • Projected to reduce average ambulance response time from 17 minutes to 8 minutes.

  • Projected to increase survival rates by up to 63%.

Future Scope

  1. Predict ambulance arrival times at signals.

  2. Replace video uploads with integrated real-time camera feeds at intersections.

  3. Enable coordination between multiple traffic signals via cloud computing.

  4. Expand detection to other emergency vehicles like fire trucks and police vans.

Conclusion

Siren Eyes is a step toward smarter cities — saving lives by using AI to optimize emergency traffic management.

Hackathon Progress

  • Ideation: Identified ambulance delays at traffic signals as a critical problem.

  • Research: Designed a multi-modal system combining audio (siren detection) and video (YOLO-based ambulance tracking).

  • Implementation: Integrated YOLOv8 and a Keras audio classifier; built logic to detect sirens first, then run video detection, and finally trigger the smart signal override.

  • Evaluation: Tested on 4 real-world videos; achieved 100% recall and 75% accuracy with fine-tuned thresholds.

  • Optimization: Ensured CPU compatibility.
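For context on how recall and accuracy relate, the two metrics can be computed from confusion-matrix counts. The counts below are made-up illustrations chosen to match the shape of the reported figures, not the actual hackathon data:

```python
# Standard binary-classification metrics from confusion-matrix counts.
def recall(tp: int, fn: int) -> float:
    """Fraction of real ambulances that were detected (misses are fatal here)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all decisions that were correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0

# Illustration: 3 detected ambulances, 0 misses, 1 false alarm
print(recall(tp=3, fn=0))                # 1.0  -> 100% recall
print(accuracy(tp=3, tn=0, fp=1, fn=0))  # 0.75 -> 75% accuracy
```

Tuning thresholds for 100% recall at the cost of some accuracy is a sensible trade-off for this application, since a missed ambulance is far worse than an occasional unnecessary signal change.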

Tech Stack

Python
Streamlit
Tkinter
OpenCV
YOLOv8 (Ultralytics)
Librosa
MoviePy
Keras

Funding Status

Actively seeking funding to expand model training, integrate hardware systems at traffic signals, and pilot the project across urban intersections.