Dataset: Labeled Website Screenshots for Overlay Detection

This dataset supports research on automatically detecting overlays (such as pop-ups, cookie banners, and ads) in website screenshots, a common obstacle in web data analysis and scraping. It contains 1,397 website screenshots, manually labeled into two classes: 'Overlay Present' (285 images, label 1) and 'No Overlay' (1,112 images, label 0). The screenshots were collected from various websites and resized to 224x224 pixels for use with deep learning models. We used this dataset in our ICAR'15 conference paper to train and evaluate CNN models for overlay classification. The dataset, including train/validation/test splits, is available at the link below.

git clone https://hf.co/datasets/goker/overlay

https://hf.co/datasets/goker/overlay
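
The two classes are noticeably imbalanced (285 'Overlay Present' vs. 1,112 'No Overlay'), which is worth accounting for when training or evaluating a classifier. The sketch below uses only the counts stated above. It computes inverse-frequency class weights and the accuracy of a trivial majority-class predictor. The weighting formula is the common "balanced" heuristic, an assumption for illustration rather than the method used in the paper, and the function names are hypothetical.

```python
# Class counts as stated in the dataset description.
# Label 0 = 'No Overlay', label 1 = 'Overlay Present'.
CLASS_COUNTS = {0: 1112, 1: 285}


def balanced_class_weights(counts):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    This is a common heuristic (assumed here, not taken from the paper);
    rarer classes receive proportionally larger weights.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * count) for label, count in counts.items()}


def majority_baseline_accuracy(counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(counts.values()) / sum(counts.values())


weights = balanced_class_weights(CLASS_COUNTS)
baseline = majority_baseline_accuracy(CLASS_COUNTS)

for label, weight in sorted(weights.items()):
    print(f"label {label}: weight {weight:.3f}")
print(f"majority-class baseline accuracy: {baseline:.3f}")
```

A model that always predicts 'No Overlay' already scores roughly 80% accuracy on the full set, so per-class metrics such as precision and recall for label 1 tend to be more informative than accuracy alone.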

1.0.2