Project Description
I’m building a hybrid artificial-intelligence pipeline that can spot low-flying drones by fusing two independent sensing channels: computer-vision video feeds and raw RF captures. The computer-vision branch will run YOLOv8 for real-time object detection on Anti-UAV and VisDrone frames, while the RF branch will pass HackRF / RTL-SDR captures (plus the DeepSig RadioML corpus) through a CNN trained on spectrograms.
Once both branches output their per-frame or per-burst confidence scores, I need a fusion layer (either a clear rule-based logic tree or a lightweight neural network) that decides when a drone is present. Whichever fusion strategy you implement, the final system must report accuracy, precision, recall and false alarm rate on a held-out test split of each dataset.
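To make the rule-based option concrete, here is a minimal sketch of a two-level logic tree over the two confidence scores. The function name, the `FusionThresholds` class and both threshold values are illustrative assumptions, not requirements; real thresholds would be tuned on a validation split.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FusionThresholds:
    # Hypothetical placeholder values; tune them on a validation split.
    strong: float = 0.80  # a single branch above this score is enough
    joint: float = 0.50   # both branches must reach this score to agree


def fuse_rule_based(
    vision_conf: Optional[float],
    rf_conf: Optional[float],
    thresholds: FusionThresholds = FusionThresholds(),
) -> bool:
    """Return True when the rules decide a drone is present.

    A score of None means that branch produced no output for this
    time window and is treated as zero confidence.
    """
    v = 0.0 if vision_conf is None else vision_conf
    r = 0.0 if rf_conf is None else rf_conf

    # Rule 1: either sensor alone is very confident.
    if v >= thresholds.strong or r >= thresholds.strong:
        return True
    # Rule 2: both sensors are moderately confident at the same time.
    if v >= thresholds.joint and r >= thresholds.joint:
        return True
    return False
```

For example, `fuse_rule_based(0.6, 0.6)` fires on joint agreement, while `fuse_rule_based(0.6, 0.2)` does not. How per-frame and per-burst scores are aligned into one time window is left open here.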
Key technical notes
• Vision model: YOLOv8 (PyTorch) fine-tuned on Anti-UAV & VisDrone
• RF model: CNN on spectrograms (TensorFlow/Keras or PyTorch, whichever you are faster with)
• Fusion: rule engine or small MLP, well-documented so it can be swapped out later
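One way to keep the fusion stage swappable is to hide it behind a small shared interface. The sketch below is only an illustration in plain Python: the `Fusion` protocol, the class names and the hand-picked weights are all assumptions, and a real MLP would be trained in PyTorch or Keras rather than hard-coded.

```python
import math
from typing import Protocol, Sequence


class Fusion(Protocol):
    """Hypothetical interface that any fusion strategy would implement."""

    def predict_proba(self, vision_conf: float, rf_conf: float) -> float: ...


class MaxRuleFusion:
    """Rule-style fusion: trust whichever branch is more confident."""

    def predict_proba(self, vision_conf: float, rf_conf: float) -> float:
        return max(vision_conf, rf_conf)


class TinyMLPFusion:
    """One hidden ReLU layer followed by a sigmoid output."""

    def __init__(
        self,
        w1: Sequence[Sequence[float]],  # one [w_vision, w_rf] pair per hidden unit
        b1: Sequence[float],
        w2: Sequence[float],
        b2: float,
    ) -> None:
        self.w1, self.b1, self.w2, self.b2 = w1, b1, w2, b2

    def predict_proba(self, vision_conf: float, rf_conf: float) -> float:
        hidden = [
            max(0.0, w[0] * vision_conf + w[1] * rf_conf + b)
            for w, b in zip(self.w1, self.b1)
        ]
        logit = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return 1.0 / (1.0 + math.exp(-logit))


def detect(fusion: Fusion, vision_conf: float, rf_conf: float,
           threshold: float = 0.5) -> bool:
    """Decide drone presence with whichever fusion object is plugged in."""
    return fusion.predict_proba(vision_conf, rf_conf) >= threshold


# Placeholder weights for illustration only; they are not trained values.
mlp = TinyMLPFusion(w1=[[1.0, 1.0]], b1=[0.0], w2=[1.0], b2=-1.0)
```

Because `detect` depends only on `predict_proba`, swapping `MaxRuleFusion()` for `mlp` needs no change to the calling code.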
Expected deliverables
1. Clean, reproducible training scripts for both branches and the fusion stage
2. Saved model weights and inference scripts capable of running on a single GPU (8 GB VRAM) or CPU-only fallback
3. Evaluation notebook or script that prints the four metrics and exports confusion matrices plus ROC curves in PNG/PDF
4. Brief write-up (2–3 pages) explaining data preprocessing, model choices, fusion logic and observed performance
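For reference, the four metrics can be derived from the binary confusion counts as sketched below. This assumes labels of 1 for "drone" and 0 for "no drone", and it assumes false alarm rate means FP / (FP + TN), i.e. the false positive rate; if a different definition is intended, that should be agreed up front. The function name is illustrative.

```python
from typing import Dict, Sequence


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and false alarm rate.

    Labels are assumed to be 1 (drone present) or 0 (no drone).
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num: int, den: int) -> float:
        # Return 0.0 instead of failing when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "false_alarm_rate": ratio(fp, fp + tn),  # assumed definition: FP / (FP + TN)
    }


if __name__ == "__main__":
    metrics = detection_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 1, 0, 0, 1])
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

scikit-learn's `confusion_matrix` and `roc_curve` would cover the confusion-matrix and ROC exports; the plain-Python version above only shows how the four numbers relate to each other.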
Acceptance criteria
• Each metric computed on at least one public vision dataset and one RF dataset
• End-to-end inference latency under one second per sample on an NVIDIA RTX 3060 or equivalent
• All code packaged in a Git repo with a README that lets me replicate results with a single command
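The latency criterion could be checked with a small timing harness along these lines. It is a sketch under assumptions: `run_once` stands in for the full pipeline (both branches plus fusion) and is hypothetical, and the warm-up count is arbitrary. On a GPU, the callable should also wait for queued work to finish (for PyTorch, by calling `torch.cuda.synchronize()`) so the timer captures the real cost.

```python
import statistics
import time
from typing import Any, Callable, Dict, Sequence


def measure_latency(
    run_once: Callable[[Any], Any],
    samples: Sequence[Any],
    warmup: int = 3,
) -> Dict[str, float]:
    """Time run_once on every sample and summarise per-sample latency in seconds."""
    # Warm-up runs absorb one-off costs such as model loading and caching.
    for sample in samples[:warmup]:
        run_once(sample)

    timings = []
    for sample in samples:
        start = time.perf_counter()
        run_once(sample)
        timings.append(time.perf_counter() - start)

    return {
        "n": len(timings),
        "mean_s": statistics.mean(timings),
        "max_s": max(timings),
    }


if __name__ == "__main__":
    # Dummy workload standing in for the real end-to-end inference call.
    report = measure_latency(lambda s: sum(range(1000)), list(range(20)))
    print(report)
    print("meets 1 s budget:", report["max_s"] < 1.0)
```

Reporting the maximum alongside the mean makes it visible whether any individual sample exceeds the one-second budget.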
PyTorch, TensorFlow, scikit-learn, seaborn/matplotlib (for plots) and, if necessary, GNU Radio (for SDR preprocessing) are all fair game, as long as all dependencies are pinned in a requirements.txt or environment.yml.
If anything about the datasets, tooling or deliverables is unclear, flag it early so we can adjust before training begins.