YOLOv11m - Question Segmentation for PDFs
A fine-tuned YOLOv11m model designed to detect and segment questions in Turkish educational documents (PDFs).
- Task: Object Detection
- Classes: question (single class)
- Resolution: 1280x1280
Model Details
- Base Model: yolo11m (Ultralytics)
- Parameters: ~20M
- Training Epochs: 44
- Compute: Trained for ~4 hours on an NVIDIA RTX 4090 Mobile GPU.
- Precision: 0.971 (Very Low False Positives)
- mAP@50: 0.716
Intended Use & Limitations
This model is optimized for extracting question blocks from dense test papers, worksheets, and exam booklets.
✅ Best For
- Two-Column Tests: Standard exam layouts where questions are split into columns.
- Dense Worksheets: Pages packed with questions.
- LGS Style: It may also work on single-column, LGS-style next-generation questions that include detailed graphics, though performance is most robust on two-column layouts.
⚠️ Limitations
- Header Merging: In some cases, the model might accidentally merge the test header/title with the first question.
- Answer Format: The model is heavily biased towards typical "choice-based" questions (A, B, C, D, E). It may fail to detect questions that lack these choice markers or follow an open-ended format.
- Irregular Layouts: Extremely irregular layouts or overlapping text boxes might cause the model to predict inaccurate question boundaries.
- Confidence: It is recommended to use a confidence threshold of 0.3 - 0.4 depending on the specific test.
Usage & Best Practices
from ultralytics import YOLO
from ultralytics.utils.downloads import safe_download
# Load the model
model_url = "https://huggingface.co/erayyapagci/yolo11m-question-segmentation/resolve/main/yolov11m-question-seg.pt"
model = YOLO(safe_download(model_url))
# Run Inference
# Recommended conf: 0.3 - 0.4
results = model("page_image.jpg", imgsz=1280, conf=0.35)
# Show results
results[0].show()
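For downstream steps such as OCR, each detected box usually has to be cropped out of the page. The helper below is a minimal sketch of that step, not part of the model itself: the function name `padded_crop_box` and the default padding of 8 pixels are assumptions chosen for illustration. It expands a box by a small margin and clamps it to the page so the crop never falls outside the image.

```python
import math
from typing import Sequence, Tuple


def padded_crop_box(
    box: Sequence[float],
    image_width: int,
    image_height: int,
    pad: int = 8,  # assumed margin in pixels; tune for your scans
) -> Tuple[int, int, int, int]:
    """Expand an (x1, y1, x2, y2) box by `pad` pixels and clamp it to the image."""
    x1, y1, x2, y2 = box
    left = max(0, math.floor(x1) - pad)
    top = max(0, math.floor(y1) - pad)
    right = min(image_width, math.ceil(x2) + pad)
    bottom = min(image_height, math.ceil(y2) + pad)
    return left, top, right, bottom
```

The boxes can be read from `results[0].boxes.xyxy.tolist()`, and the returned tuple has the (left, upper, right, lower) order that Pillow's `Image.crop` expects.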
Example Output
Here is a side-by-side comparison on a real ÖSYM test sample:

Filtering False Positives (Heuristics)
To further eliminate false positives (e.g., random paragraphs detected as questions), it is highly recommended to use an OCR library (like Tesseract, EasyOCR, or PaddleOCR) on the cropped question image.
- Check for Question Numbers: verify if the text starts with a number pattern like 1., 2), Soru 3:.
- Check for Choices: verify if the text contains multiple choice markers like A), B), C), D), E).
- If a detected box contains neither, it is likely a false positive (header, instruction text) and can be discarded.
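The checks above can be sketched as a small text filter applied to the OCR output of each crop. This is only an illustration: the regular expressions, the function name `looks_like_question`, and the `min_choices` default are assumptions, and they will likely need tuning for other numbering or choice styles. The OCR call itself is left out, so the function takes plain text.

```python
import re

# Assumed patterns: a leading "1.", "2)", or "Soru 3:" style question number,
# and upper-case choice markers such as "A)" ... "E)".
QUESTION_NUMBER = re.compile(r"^\s*(?:soru\s*)?\d{1,3}\s*[.):]", re.IGNORECASE)
CHOICE_MARKER = re.compile(r"(?<![A-Za-z])([A-E])\)")


def looks_like_question(ocr_text: str, min_choices: int = 2) -> bool:
    """Keep a box if its text starts with a question number or has enough choice markers."""
    has_number = bool(QUESTION_NUMBER.match(ocr_text))
    distinct_choices = set(CHOICE_MARKER.findall(ocr_text))
    return has_number or len(distinct_choices) >= min_choices
```

A box whose text fails both checks, for example a test title or an instruction line, would be discarded. Requiring at least two distinct choice markers is a design choice meant to avoid keeping a paragraph that merely contains a single "A)".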
Training Data & Citations
Trained on a dataset of 14,693 images (after strict filtering) sourced from 10 public Roboflow datasets. The data was split using Document-Aware Splitting to ensure no data leakage between training and validation sets.
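The exact splitting procedure is not described here, so the following is only a sketch of the general idea behind a document-aware split: every page of a source document is assigned to the same split, so near-identical pages cannot end up in both training and validation. The function names, the `_page` file-naming scheme, and the 20% validation fraction are all hypothetical.

```python
import hashlib
from typing import Callable, Iterable, List, Tuple


def doc_id_from_name(image_name: str) -> str:
    """Hypothetical naming scheme: 'examA_page3.jpg' belongs to document 'examA'."""
    return image_name.rsplit("_page", 1)[0]


def document_aware_split(
    image_names: Iterable[str],
    doc_id_of: Callable[[str], str] = doc_id_from_name,
    val_fraction: float = 0.2,  # assumed fraction, not taken from the model card
) -> Tuple[List[str], List[str]]:
    """Split images so that all pages of one document land in the same split."""
    train, val = [], []
    for name in sorted(image_names):
        # Hash the document ID (not the image name) into [0, 1) so the
        # assignment is deterministic and identical for every page of a document.
        digest = hashlib.sha256(doc_id_of(name).encode("utf-8")).hexdigest()
        bucket = int(digest[:8], 16) / 0x100000000
        (val if bucket < val_fraction else train).append(name)
    return train, val
```

Because the assignment depends only on the document ID, the resulting validation share is approximate and the split stays stable when new documents are added later.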
We gratefully acknowledge the key datasets used in this training:
- PDF Soru Cikarma (tanimazsinu): Link
- WholeQuestionDetection (Gazi University): Link
- ExamBuddy (ExamBuddy): Link
- Questions (Terry Li): Link
- Question Parsing from Document (Sefa): Link
- Question Dedector (Nur Etinkaya): Link
- Sorukes (Sorualgilama): Link
- Question Detection (Cognizen): Link
- Questions2 (Fiver): Link
- Question-New (Question): Link