This commit is contained in:
commit 98103806ef

112 GUIDE_TRAIN_TEST_YOLO.md Normal file
@@ -0,0 +1,112 @@
# YOLOv8 Training & Testing Guide

This guide details how to prepare your dataset, train the YOLOv8 model for Potholes and Road Signs, and test the trained model.

## 1. Dataset Preparation

YOLOv8 requires data in a specific format.

### A. Data Structure

Organize your dataset folder like this:

```
datasets/
    road_signs_potholes/
        train/
            images/
                img1.jpg
                ...
            labels/
                img1.txt
                ...
        val/
            images/
                ...
            labels/
                ...
        data.yaml
```
### B. Label Format

Each `.txt` file in `labels/` corresponds to an image.
Format: `class_id center_x center_y width height` (normalized 0-1).

Example `img1.txt`:

```
0 0.5 0.5 0.2 0.3
1 0.1 0.1 0.05 0.1
```

**Class ID Mapping (Example):**

- `0`: Traffic Sign
- `1`: Pothole
- `2`: Manhole
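The conversion from a pixel-space annotation to one of these normalized label lines is simple arithmetic. A minimal sketch (the helper name and example numbers are ours, not from the repo):

```python
def to_yolo_line(class_id: int, x: float, y: float, w: float, h: float,
                 img_w: int, img_h: int) -> str:
    """Convert a top-left pixel box (x, y, w, h) to a normalized YOLO label line."""
    center_x = (x + w / 2) / img_w
    center_y = (y + h / 2) / img_h
    return f"{class_id} {center_x:.6f} {center_y:.6f} {w / img_w:.6f} {h / img_h:.6f}"

# A 128x192 box with top-left corner (256, 144) in a 640x480 image:
print(to_yolo_line(1, 256, 144, 128, 192, 640, 480))
# → 1 0.500000 0.500000 0.200000 0.400000
```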
### C. Creating `data.yaml`

Create a `data.yaml` file inside your dataset folder (or anywhere accessible).

```yaml
path: ../datasets/road_signs_potholes # dataset root dir
train: train/images # train images (relative to 'path')
val: val/images # val images (relative to 'path')

# Classes
names:
  0: Traffic Sign
  1: Pothole
  2: Manhole
```
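A common training failure is images without matching label files. A small sketch (our own helper, not part of the repo) that flags unpaired images given the two folder listings:

```python
from pathlib import Path

def unpaired_images(image_names: list[str], label_names: list[str]) -> list[str]:
    """Return stems of images that have no corresponding .txt label file."""
    label_stems = {Path(n).stem for n in label_names}
    return sorted(Path(n).stem for n in image_names if Path(n).stem not in label_stems)

# 'img2.jpg' has no img2.txt, so it is reported:
print(unpaired_images(["img1.jpg", "img2.jpg"], ["img1.txt"]))  # → ['img2']
```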
## 2. Training the Model

We have provided a script `backend/models/train_yolo.py`.

**Command:**
Open your terminal in `d:\Time-Pass-Projects\pothole-roadsign detection`.

```bash
# Activate your environment if needed
python backend/models/train_yolo.py
```

_Note: You will need to edit `backend/models/train_yolo.py` slightly to point to your actual `data.yaml` path if you haven't already, or pass it as an argument if you modify the script to accept args._
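If you do modify the script to accept arguments, the parsing could look like this sketch (the flag names are our own invention, not part of the repo):

```python
import argparse

def parse_train_args(argv=None):
    """Parse a data.yaml path and epoch count for a training run."""
    parser = argparse.ArgumentParser(description="Train YOLOv8 on a custom dataset")
    parser.add_argument("--data", default="../datasets/road_signs_potholes/data.yaml",
                        help="Path to data.yaml")
    parser.add_argument("--epochs", type=int, default=50, help="Training epochs")
    return parser.parse_args(argv)

args = parse_train_args(["--data", "my_data.yaml", "--epochs", "100"])
print(args.data, args.epochs)  # → my_data.yaml 100
```

Inside `train_yolo.py` you would then call `train_yolo(args.data, epochs=args.epochs)` instead of the hard-coded path.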
**Training Process:**

1. The script downloads `yolov8n.pt` (nano) as a starting point.
2. It runs for 50 epochs (adjustable).
3. **Result:** Weights are saved in `runs/detect/train/weights/best.pt`.
## 3. Testing the Model

Once trained, you should test it visually.

### A. Locate Your Specific Model

Find `runs/detect/trainX/weights/best.pt`.
Copy this file to `backend/models/best.pt` (or update the paths in the scripts).

### B. Run the Test Script

We have created `backend/test_model.py` for quick verification.

```bash
python backend/test_model.py
```

_Make sure to update the `video_path` or `image_path` in `test_model.py` to point to a real file._
## 4. Integration

After you are satisfied with `best.pt`:

1. Move `best.pt` to `backend/models/`.
2. Update `backend/pipelines/video_processor.py` line 10:

```python
self.yolo = YOLOManager("backend/models/best.pt")
```
1 backend/.gitignore vendored Normal file
@@ -0,0 +1 @@
/venv
43 backend/README.md Normal file
@@ -0,0 +1,43 @@
# Pothole & Road Sign Detection Backend

This project implements a Two-Stage Detection Pipeline using YOLOv8 (for detection & tracking) and CLIP (for zero-shot classification).

## Setup

1. **Install Dependencies**:
   ```bash
   pip install -r backend/requirements.txt
   ```
2. **Training YOLO (Crucial Step)**:
   You MUST train a YOLO model on your custom dataset (Pothole, Manhole, Traffic Signs) for this to work effectively.
   - Prepare your `dataset.yaml`.
   - Run the training script:
     ```python
     from backend.models.train_yolo import train_yolo
     train_yolo("path/to/dataset.yaml", epochs=50)
     ```
   - This will generate `best.pt`. Move this file to `backend/models/best.pt`.
## Running the API

Start the FastAPI server:

```bash
cd backend
python main.py
```

The server will start at `http://0.0.0.0:8000`.
## API Usage

**Endpoint**: `POST /detect/video`

- **Body**: `multipart/form-data`, key `file` (Video file).
- **Response**: JSON summary of unique objects detected.
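Based on the fields `main.py` and `VideoProcessor` assemble, a response might look like the following (the concrete values here are illustrative, not real output):

```json
{
  "video_id": "3f2b-example-uuid",
  "processed": true,
  "unique_objects": [
    {
      "id": 1,
      "type": "Traffic Sign",
      "subtype": "stop sign",
      "confidence": 0.91,
      "frame_idx": 42
    }
  ]
}
```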
## Configuration

- Modify `backend/pipelines/video_processor.py` to change the `yolo_model_path` to your trained model path (e.g., `backend/models/best.pt`).
- You can also adjust the CLIP candidate labels in `VideoProcessor.__init__`.
59 backend/main.py Normal file
@@ -0,0 +1,59 @@
from fastapi import FastAPI, File, UploadFile, HTTPException
from fastapi.responses import JSONResponse
import shutil
import os
import uuid
from backend.pipelines.video_processor import VideoProcessor

app = FastAPI(title="Pothole & Road Sign Detection API")

# Initialize Processor (loading models takes time, do it on startup)
# In production, use lifespan events or dependency injection
print("Initializing Video Processor...")
try:
    processor = VideoProcessor()
except Exception as e:
    print(f"Warning: Could not initialize processor (check model paths). Error: {e}")
    processor = None

UPLOAD_DIR = "uploads"
os.makedirs(UPLOAD_DIR, exist_ok=True)


@app.get("/")
def health_check():
    return {"status": "running", "models_loaded": processor is not None}


@app.post("/detect/video")
async def detect_video(file: UploadFile = File(...)):
    if processor is None:
        raise HTTPException(status_code=503, detail="Models not loaded.")

    # Save uploaded file
    file_id = str(uuid.uuid4())
    file_location = os.path.join(UPLOAD_DIR, f"{file_id}_{file.filename}")

    with open(file_location, "wb") as buffer:
        shutil.copyfileobj(file.file, buffer)

    try:
        # Run processing
        results = processor.process_video(file_location)

        # Cleanup
        os.remove(file_location)

        return {
            "video_id": file_id,
            "processed": True,
            "unique_objects": results
        }
    except Exception as e:
        # Cleanup on error
        if os.path.exists(file_location):
            os.remove(file_location)
        raise HTTPException(status_code=500, detail=str(e))


if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
62 backend/models/clip_manager.py Normal file
@@ -0,0 +1,62 @@
from transformers import CLIPProcessor, CLIPModel
import torch
from PIL import Image


class CLIPManager:
    def __init__(self, model_id: str = "openai/clip-vit-base-patch32"):
        """
        Initializes the CLIP model and processor.

        Args:
            model_id (str): Hugging Face model ID.
        """
        print(f"Loading CLIP model: {model_id}...")
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.model = CLIPModel.from_pretrained(model_id).to(self.device)
        self.processor = CLIPProcessor.from_pretrained(model_id)
        print(f"CLIP loaded on {self.device}.")

    def classify_image(self, image: Image.Image, candidate_labels: list[str]):
        """
        Classifies an image against a list of text labels.

        Args:
            image (PIL.Image): The cropped image to classify.
            candidate_labels (list[str]): List of strings to compare against.

        Returns:
            dict: {label: score} sorted by confidence.
        """
        if not candidate_labels:
            return {}

        inputs = self.processor(text=candidate_labels, images=image, return_tensors="pt", padding=True).to(self.device)

        with torch.no_grad():
            outputs = self.model(**inputs)

        logits_per_image = outputs.logits_per_image  # image-text similarity scores
        probs = logits_per_image.softmax(dim=1)  # softmax gives label probabilities

        # Convert to dictionary
        scores = probs.cpu().numpy()[0]
        result = {label: float(score) for label, score in zip(candidate_labels, scores)}

        # Sort by score descending
        sorted_result = dict(sorted(result.items(), key=lambda item: item[1], reverse=True))
        return sorted_result

    def get_best_match(self, image: Image.Image, candidate_labels: list[str], threshold: float = 0.5):
        """
        Returns the single best match if it exceeds the threshold.
        """
        results = self.classify_image(image, candidate_labels)
        if not results:
            return None, 0.0

        best_label = list(results.keys())[0]
        best_score = list(results.values())[0]

        if best_score >= threshold:
            return best_label, best_score
        return "Uncertain", best_score
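The key step in `classify_image` is the softmax that turns CLIP's similarity logits into probabilities. A pure-Python sketch of that operation (illustrative only; the class above uses `torch`'s built-in softmax):

```python
import math

def softmax(logits: list[float]) -> list[float]:
    """Turn raw similarity scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Higher logit -> higher probability; the ordering is preserved:
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])
```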
39 backend/models/train_yolo.py Normal file
@@ -0,0 +1,39 @@
from ultralytics import YOLO


def train_yolo(data_yaml_path: str, model_size: str = "yolov8n.pt", epochs: int = 50):
    """
    Trains a YOLOv8 model on a custom dataset.

    Args:
        data_yaml_path (str): Path to the dataset.yaml file.
        model_size (str): Pre-trained model to start from (e.g., yolov8n.pt, yolov8s.pt).
        epochs (int): Number of training epochs.
    """
    print(f"Loading {model_size}...")
    model = YOLO(model_size)

    print(f"Starting training for {epochs} epochs using {data_yaml_path}...")
    model.train(data=data_yaml_path, epochs=epochs, imgsz=640)

    print("Training complete. Validating...")
    metrics = model.val()
    print(f"Validation metrics: {metrics}")

    print("Exporting model...")
    path = model.export(format="onnx")
    print(f"Model exported to {path}")


if __name__ == "__main__":
    # Example usage:
    # Ensure you have a data.yaml file configured for your dataset
    # train_yolo("path/to/data.yaml")

    # Use a path relative to where the user runs the script (the 'backend'
    # folder), so the dataset sits at '../datasets/...'
    dataset_path = "../datasets/road_signs_potholes/data.yaml"

    # Or an absolute path if needed:
    # dataset_path = "d:/Time-Pass-Projects/pothole-roadsign detection/datasets/road_signs_potholes/data.yaml"

    print(f"Using dataset: {dataset_path}")
    train_yolo(dataset_path, epochs=100)  # Increased epochs for better results on small data
35 backend/models/yolo_manager.py Normal file
@@ -0,0 +1,35 @@
from ultralytics import YOLO
import cv2
import numpy as np


class YOLOManager:
    def __init__(self, model_path: str = "yolov8n.pt"):
        """
        Initializes the YOLO model for inference.

        Args:
            model_path (str): Path to the trained YOLO model weights (.pt file).
        """
        print(f"Loading YOLO model from {model_path}...")
        self.model = YOLO(model_path)

    def track(self, frame, conf: float = 0.25, iou: float = 0.5):
        """
        Runs YOLO tracking on a single frame.

        Args:
            frame: Numpy array (image).
            conf (float): Confidence threshold.
            iou (float): IoU threshold.

        Returns:
            Results object from Ultralytics.
        """
        # persist=True is crucial for tracking to work across frames
        results = self.model.track(frame, persist=True, conf=conf, iou=iou, tracker="bytetrack.yaml", verbose=False)
        return results[0]

    def detect(self, frame):
        """Standard detection without tracking."""
        results = self.model.predict(frame, verbose=False)
        return results[0]
165 backend/pipelines/video_processor.py Normal file
@@ -0,0 +1,165 @@
import cv2
import time
from collections import defaultdict
from backend.models.yolo_manager import YOLOManager
from backend.models.clip_manager import CLIPManager
from backend.utils.image_utils import is_blurry, crop_image, convert_cv2_to_pil


class VideoProcessor:
    def __init__(self, yolo_model_path="yolov8n.pt", clip_model_id="openai/clip-vit-base-patch32"):
        self.yolo = YOLOManager(yolo_model_path)
        self.clip = CLIPManager(clip_model_id)

        # Buffer to store the best shot for each track ID
        # Format: {track_id: {'crop': np.array, 'area': float, 'frame_idx': int, 'bbox': list}}
        self.active_tracks = {}

        # Store final results
        self.final_results = []

        # CLIP candidate labels
        self.pothole_labels = ["pothole", "shadow", "patch work", "manhole", "road crack"]
        self.sign_labels = ["stop sign", "yield sign", "speed limit 30", "speed limit 40", "speed limit 50", "speed limit 60", "pedestrian crossing", "no u-turn", "traffic light", "keep right"]

        # Frame counter
        self.frame_count = 0

    def process_video(self, video_path: str):
        cap = cv2.VideoCapture(video_path)
        if not cap.isOpened():
            print(f"Error opening video: {video_path}")
            return []

        width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

        print(f"Processing video {video_path} ({width}x{height})...")

        while cap.isOpened():
            ret, frame = cap.read()
            if not ret:
                break

            self.frame_count += 1

            # 1. Run YOLO tracking
            results = self.yolo.track(frame)

            if results.boxes is None or results.boxes.id is None:
                continue

            boxes = results.boxes.xyxy.cpu().numpy()
            track_ids = results.boxes.id.cpu().numpy()
            class_ids = results.boxes.cls.cpu().numpy()  # 0, 1, 2 depending on training

            current_frame_ids = set()

            for box, track_id, cls in zip(boxes, track_ids, class_ids):
                track_id = int(track_id)
                current_frame_ids.add(track_id)

                x1, y1, x2, y2 = box
                w_box = x2 - x1
                h_box = y2 - y1
                area = w_box * h_box

                # Check if this is the "best shot" so far
                if track_id not in self.active_tracks:
                    self.active_tracks[track_id] = {
                        'crop': crop_image(frame, box),
                        'area': area,
                        'frame_idx': self.frame_count,
                        'class_id': int(cls),
                        'bbox': box,
                        'processed': False
                    }
                else:
                    # Update if the box is bigger and the track is not processed yet
                    if area > self.active_tracks[track_id]['area'] and not self.active_tracks[track_id]['processed']:
                        self.active_tracks[track_id].update({
                            'crop': crop_image(frame, box),
                            'area': area,
                            'frame_idx': self.frame_count,
                            'bbox': box
                        })

                # Trigger classification if the object is near the edge (leaving frame)
                # Margin of 50 pixels
                if x1 < 50 or y1 < 50 or x2 > width - 50 or y2 > height - 50:
                    self._classify_and_store(track_id)

            # Cleanup tracks that are no longer present (exited the frame).
            # ByteTrack keeps "lost" tracks alive for a few frames, so a track
            # ID missing from current_frame_ids has effectively left the view;
            # classify it now. Iterate over a copy of the keys to avoid a
            # RuntimeError from mutating the dict while iterating.
            for tid in list(self.active_tracks.keys()):
                if tid not in current_frame_ids:
                    self._classify_and_store(tid)
                    # Remove from active tracks to save memory
                    if self.active_tracks[tid].get('processed'):
                        del self.active_tracks[tid]

        cap.release()

        # Process any remaining tracks
        for tid in list(self.active_tracks.keys()):
            self._classify_and_store(tid)

        print("Processing complete.")
        return self.final_results

    def _classify_and_store(self, track_id):
        track_data = self.active_tracks.get(track_id)
        if not track_data or track_data.get('processed'):
            return

        crop = track_data['crop']

        # Blur check - if too blurry we could skip or mark low confidence.
        # For now we process anyway, but it could be logged:
        # if is_blurry(crop): ...

        # Prepare for CLIP
        pil_image = convert_cv2_to_pil(crop)

        # We could pick the label list from the YOLO class (e.g. 0=Sign,
        # 1=Pothole/Manhole), but that mapping depends on your training.
        # Here we simply classify against both lists and take the max confidence.
        candidates = self.sign_labels + self.pothole_labels
        best_label, score = self.clip.get_best_match(pil_image, candidates, threshold=0.5)

        obj_type = "Traffic Sign" if best_label in self.sign_labels else "Road Damage"

        result = {
            "id": track_id,
            "type": obj_type,
            "subtype": best_label,
            "confidence": float(score),
            "frame_idx": track_data['frame_idx'],
            # In a real app, you might save the crop to disk and return a URL
            # "crop_path": save_to_disk...
        }

        self.final_results.append(result)
        self.active_tracks[track_id]['processed'] = True


if __name__ == "__main__":
    # Test run
    processor = VideoProcessor()
    # processor.process_video("test_video.mp4")
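The "best shot" buffering above can be reduced to a small dict-based sketch (names are ours): keep the largest crop seen per track ID until the track is processed.

```python
def update_best_shot(active_tracks: dict, track_id: int, area: float, frame_idx: int) -> dict:
    """Keep only the largest (best) detection seen so far for each track."""
    entry = active_tracks.get(track_id)
    if entry is None:
        active_tracks[track_id] = {"area": area, "frame_idx": frame_idx, "processed": False}
    elif area > entry["area"] and not entry["processed"]:
        entry.update(area=area, frame_idx=frame_idx)
    return active_tracks

tracks = {}
update_best_shot(tracks, 7, 100.0, 1)
update_best_shot(tracks, 7, 250.0, 2)  # bigger box replaces the buffered shot
update_best_shot(tracks, 7, 50.0, 3)   # smaller box is ignored
print(tracks[7]["area"], tracks[7]["frame_idx"])  # → 250.0 2
```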
11 backend/requirements.txt Normal file
@@ -0,0 +1,11 @@
ultralytics
transformers
torch
fastapi
uvicorn
opencv-python-headless
pillow
numpy
ftfy
regex
tqdm
109 backend/runs/detect/train/args.yaml Normal file
@@ -0,0 +1,109 @@
task: detect
mode: train
model: yolov8n.pt
data: ../datasets/road_signs_potholes/data.yaml
epochs: 100
time: null
patience: 100
batch: 16
imgsz: 640
save: true
save_period: -1
cache: false
device: cpu
workers: 8
project: null
name: train
exist_ok: false
pretrained: true
optimizer: auto
verbose: true
seed: 0
deterministic: true
single_cls: false
rect: false
cos_lr: false
close_mosaic: 10
resume: false
amp: true
fraction: 1.0
profile: false
freeze: null
multi_scale: 0.0
compile: false
overlap_mask: true
mask_ratio: 4
dropout: 0.0
val: true
split: val
save_json: false
conf: null
iou: 0.7
max_det: 300
half: false
dnn: false
plots: true
end2end: null
source: null
vid_stride: 1
stream_buffer: false
visualize: false
augment: false
agnostic_nms: false
classes: null
retina_masks: false
embed: null
show: false
save_frames: false
save_txt: false
save_conf: false
save_crop: false
show_labels: true
show_conf: true
show_boxes: true
line_width: null
format: torchscript
keras: false
optimize: false
int8: false
dynamic: false
simplify: true
opset: null
workspace: null
nms: false
lr0: 0.01
lrf: 0.01
momentum: 0.937
weight_decay: 0.0005
warmup_epochs: 3.0
warmup_momentum: 0.8
warmup_bias_lr: 0.1
box: 7.5
cls: 0.5
dfl: 1.5
pose: 12.0
kobj: 1.0
rle: 1.0
angle: 1.0
nbs: 64
hsv_h: 0.015
hsv_s: 0.7
hsv_v: 0.4
degrees: 0.0
translate: 0.1
scale: 0.5
shear: 0.0
perspective: 0.0
flipud: 0.0
fliplr: 0.5
bgr: 0.0
mosaic: 1.0
mixup: 0.0
cutmix: 0.0
copy_paste: 0.0
copy_paste_mode: flip
auto_augment: randaugment
erasing: 0.4
cfg: null
tracker: botsort.yaml
save_dir: D:\Time-Pass-Projects\pothole-roadsign detection\backend\runs\detect\train
71 backend/test_model.py Normal file
@@ -0,0 +1,71 @@
from backend.models.yolo_manager import YOLOManager
import cv2
import os


def test_model(model_path="backend/models/best.pt", source="test_video.mp4"):
    """
    Tests the YOLO model on a video or image.
    """
    if not os.path.exists(model_path):
        print(f"Model not found at {model_path}. Using standard yolov8n.pt for demo.")
        model_path = "yolov8n.pt"

    yolo = YOLOManager(model_path)

    # Check if source is an image or a video (a webcam index has no extension)
    ext = os.path.splitext(source)[1].lower() if isinstance(source, str) else ""
    if ext in ['.jpg', '.jpeg', '.png', '.bmp']:
        frame = cv2.imread(source)
        if frame is None:
            print(f"Could not read image: {source}")
            return

        results = yolo.detect(frame)
        res_plotted = results.plot()
        cv2.imshow("YOLO Detection", res_plotted)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
    else:
        # Video (or webcam index)
        cap = cv2.VideoCapture(source)
        if not cap.isOpened():
            print(f"Could not open video: {source}")
            return

        print("Press 'q' to exit.")
        while True:
            ret, frame = cap.read()
            if not ret:
                break

            # Use 'track' or 'detect'
            results = yolo.track(frame)

            # Plot results on the frame
            annotated_frame = results.plot()

            cv2.imshow("YOLO Tracking", annotated_frame)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break

        cap.release()
        cv2.destroyAllWindows()


if __name__ == "__main__":
    # CHANGE THIS to your test file
    TEST_FILE = "d:/path/to/your/test/video_or_image.jpg"

    if not os.path.exists(TEST_FILE):
        if TEST_FILE == "0":
            # Webcam
            test_model(source=0)
        else:
            print(f"File {TEST_FILE} not found.")
            TEST_FILE = input("Enter path to image/video (or 0 for webcam): ").strip('"')
            if TEST_FILE == "0":
                test_model(source=0)
            else:
                test_model(source=TEST_FILE)
    else:
        test_model(source=TEST_FILE)
52 backend/utils/image_utils.py Normal file
@@ -0,0 +1,52 @@
import cv2
import numpy as np
from PIL import Image


def is_blurry(image: np.ndarray, threshold: float = 100.0) -> bool:
    """
    Checks if an image is blurry using the Laplacian variance method.

    Args:
        image (np.ndarray): The image to check (BGR format).
        threshold (float): The variance threshold below which the image is considered blurry.

    Returns:
        bool: True if blurry, False otherwise.
    """
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    variance = cv2.Laplacian(gray, cv2.CV_64F).var()
    return variance < threshold


def crop_image(frame: np.ndarray, bbox: list[float], padding_percent: float = 0.1) -> np.ndarray:
    """
    Crops an image based on a bounding box with optional padding.

    Args:
        frame (np.ndarray): The full image frame.
        bbox (list): [x1, y1, x2, y2]
        padding_percent (float): Percentage of padding to add around the box.

    Returns:
        np.ndarray: The cropped image.
    """
    h, w, _ = frame.shape
    x1, y1, x2, y2 = bbox

    width = x2 - x1
    height = y2 - y1

    pad_w = width * padding_percent
    pad_h = height * padding_percent

    # Apply padding ensuring we stay within frame boundaries
    new_x1 = max(0, int(x1 - pad_w))
    new_y1 = max(0, int(y1 - pad_h))
    new_x2 = min(w, int(x2 + pad_w))
    new_y2 = min(h, int(y2 + pad_h))

    return frame[new_y1:new_y2, new_x1:new_x2]


def convert_cv2_to_pil(cv2_image: np.ndarray) -> Image.Image:
    """Converts a cv2 BGR image to a PIL RGB image."""
    color_converted = cv2.cvtColor(cv2_image, cv2.COLOR_BGR2RGB)
    return Image.fromarray(color_converted)
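What `is_blurry` computes, spelled out without OpenCV: convolve the grayscale image with the 4-neighbour Laplacian kernel and take the variance of the response. A pure-Python sketch on nested lists (illustrative only; the module above uses `cv2.Laplacian`):

```python
def laplacian_variance(gray: list[list[int]]) -> float:
    """Variance of the 4-neighbour Laplacian response over interior pixels."""
    h, w = len(gray), len(gray[0])
    vals = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            vals.append(lap)
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

flat = [[100] * 5 for _ in range(5)]          # uniform patch -> zero variance (blurry)
edge = [[0, 0, 255, 0, 0] for _ in range(5)]  # sharp edge -> high variance (sharp)
print(laplacian_variance(flat), laplacian_variance(edge))
```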
66 backend/utils/label_helper.py Normal file
@@ -0,0 +1,66 @@
import cv2
import sys
import os


def get_yolo_coordinates(image_path):
    """
    Opens an image, lets the user draw a box, and prints the YOLO format coordinates.
    """
    if not os.path.exists(image_path):
        print(f"Error: File {image_path} not found.")
        return

    # Load image
    img = cv2.imread(image_path)
    if img is None:
        print("Error: Could not read image.")
        return

    height, width, _ = img.shape

    print("---------------------------------------------------------")
    print(f"Loaded {image_path} ({width}x{height})")
    print("INSTRUCTIONS:")
    print("1. Draw a box around the object using your mouse.")
    print("2. Press ENTER or SPACE to confirm the box.")
    print("3. Press 'c' to cancel.")
    print("---------------------------------------------------------")

    # Select ROI (fromCenter=False, showCrosshair=True)
    r = cv2.selectROI("Draw Box (Press Enter to Confirm)", img, fromCenter=False, showCrosshair=True)
    cv2.destroyAllWindows()

    # r is (x, y, w, h) in pixels
    x_pixel, y_pixel, w_pixel, h_pixel = r

    if w_pixel == 0 or h_pixel == 0:
        print("No box selected.")
        return

    # Convert to YOLO format (normalized center_x, center_y, width, height)
    center_x = (x_pixel + (w_pixel / 2)) / width
    center_y = (y_pixel + (h_pixel / 2)) / height
    norm_w = w_pixel / width
    norm_h = h_pixel / height

    # Limit precision to 6 decimal places
    print("\nSUCCESS! Here is your YOLO label line:")
    print("---------------------------------------------------------")
    print(f"<class_id> {center_x:.6f} {center_y:.6f} {norm_w:.6f} {norm_h:.6f}")
    print("---------------------------------------------------------")
    print("Replace <class_id> with:")
    print("0 -> if it is a Traffic Sign")
    print("1 -> if it is a Pothole")
    print("2 -> if it is a Manhole")
    print("---------------------------------------------------------")


if __name__ == "__main__":
    if len(sys.argv) > 1:
        path = sys.argv[1]
    else:
        path = input("Enter the path to your image: ").strip('"')

    get_yolo_coordinates(path)
BIN backend/yolov8n.pt Normal file
Binary file not shown.
31
datasets/road_signs_potholes/README.md
Normal file
31
datasets/road_signs_potholes/README.md
Normal file
@ -0,0 +1,31 @@
# Dataset Structure Guide

This folder contains the structure required for YOLOv8 training.

## What goes where?

1. **Images**:
   - Put your training images (80% of data) in: `train/images/`
   - Put your validation images (20% of data) in: `val/images/`
   - Supported formats: `.jpg`, `.png`, `.bmp`.

2. **Labels**:
   - For every image `image1.jpg`, you need a text file `image1.txt` in the corresponding `labels/` folder.
   - Example:
     - `train/images/road_01.jpg`
     - `train/labels/road_01.txt`

3. **data.yaml**:
   - This file configures the dataset paths and class names.
   - It is the entry point for the training script.

## Label Format

YOLO expects a `.txt` file with one line per object:
`<class_id> <x_center> <y_center> <width> <height>`

- **class_id**: Integer (0, 1, 2...) from `data.yaml`.
- **coordinates**: Normalized between 0 and 1.

Example:
`0 0.5 0.5 0.2 0.4` -> Class 0, centered in the middle, 20% width, 40% height.
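Going the other way, from a normalized label back to pixel corners, is useful when drawing boxes to verify labels (e.g. with `cv2.rectangle`). A minimal sketch; `yolo_to_pixels` is an illustrative name, not repo code:

```python
def yolo_to_pixels(cx, cy, w, h, img_w, img_h):
    """Convert a normalized YOLO box (center_x, center_y, width, height)
    to pixel-space corners (x_min, y_min, x_max, y_max)."""
    x_min = round((cx - w / 2) * img_w)
    y_min = round((cy - h / 2) * img_h)
    x_max = round((cx + w / 2) * img_w)
    y_max = round((cy + h / 2) * img_h)
    return x_min, y_min, x_max, y_max

# The README example `0 0.5 0.5 0.2 0.4` on a 1000x500 image:
print(yolo_to_pixels(0.5, 0.5, 0.2, 0.4, 1000, 500))  # -> (400, 150, 600, 350)
```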
16
datasets/road_signs_potholes/data.yaml
Normal file
@ -0,0 +1,16 @@
# Train/Val/Test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: D:/Time-Pass-Projects/pothole-roadsign detection/datasets/road_signs_potholes # dataset root dir
train: train/images
val: val/images

# Classes
names:
  0: Traffic Sign
  1: Pothole
  2: Manhole

# Key:
# data.yaml is the "Map" for YOLO.
# 1. It tells YOLO where to find the images for training and validation.
# 2. It tells YOLO how many classes there are and what their names are (0, 1, 2...).
# 3. YOLO reads this file first to verify everything exists.
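Since YOLO silently skips images without labels, it is worth checking the structure before training. A quick sketch of such a check; `find_unlabeled` is an illustrative helper, not part of the repo:

```python
from pathlib import Path


def find_unlabeled(root: str, split: str) -> list[str]:
    """Return image filenames under <root>/<split>/images that have
    no matching .txt file in <root>/<split>/labels."""
    base = Path(root) / split
    labels = base / "labels"
    return sorted(
        img.name
        for img in (base / "images").glob("*")
        if img.is_file() and not (labels / f"{img.stem}.txt").exists()
    )
```

Run it as `find_unlabeled("datasets/road_signs_potholes", "train")` (and again for `"val"`); an empty list means every image has a label file.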
BIN
datasets/road_signs_potholes/train/images/40.png
Normal file
Binary file not shown. (Size: 29 KiB)
BIN
datasets/road_signs_potholes/train/images/60.png
Normal file
Binary file not shown. (Size: 73 KiB)
BIN
datasets/road_signs_potholes/train/images/manhole.png
Normal file
Binary file not shown. (Size: 88 KiB)
Binary file not shown. (Size: 51 KiB)
BIN
datasets/road_signs_potholes/train/images/pothole1.png
Normal file
Binary file not shown. (Size: 431 KiB)
BIN
datasets/road_signs_potholes/train/images/pothole2.png
Normal file
Binary file not shown. (Size: 222 KiB)
1
datasets/road_signs_potholes/train/labels/40.txt
Normal file
@ -0,0 +1 @@
0 0.481481 0.488739 0.757202 0.815315
1
datasets/road_signs_potholes/train/labels/60.txt
Normal file
@ -0,0 +1 @@
0 0.492045 0.492081 0.979545 0.979638
1
datasets/road_signs_potholes/train/labels/manhole.txt
Normal file
@ -0,0 +1 @@
2 0.485294 0.511561 0.858289 0.734104
@ -0,0 +1 @@
0 0.470684 0.442231 0.700326 0.749004
1
datasets/road_signs_potholes/train/labels/pothole1.txt
Normal file
@ -0,0 +1 @@
1 0.511981 0.741117 0.321086 0.172589
1
datasets/road_signs_potholes/train/labels/pothole2.txt
Normal file
@ -0,0 +1 @@
1 0.679221 0.808594 0.454545 0.296875
1
datasets/road_signs_potholes/val/labels/road_003.txt
Normal file
@ -0,0 +1 @@
0 0.4 0.4 0.15 0.15
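Each label file above holds one line per object. Parsing a line back into typed values takes only a few lines of Python; a sketch, with `parse_label_line` as an illustrative name:

```python
def parse_label_line(line: str) -> tuple:
    """Parse one YOLO label line into (class_id, cx, cy, w, h)."""
    class_id, *coords = line.split()
    return (int(class_id), *map(float, coords))

# One of the pothole2.txt lines above:
print(parse_label_line("1 0.679221 0.808594 0.454545 0.296875"))
```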