Introduction for CVSD
To facilitate the exploration of video shadow detection in the wild, we build a new dataset named Complex Video Shadow Dataset (CVSD). It comprises 196 video clips featuring diverse scenarios encompassing various shadow patterns across 149 categories, resulting in a wide range of challenging cases and shadow characteristics. Within the dataset, we carefully annotate 309,183 disjoint shadow areas, yielding a collection of 19,757 frames with high-quality shadow masks for training and evaluating video shadow detection methods in real-world and complex scenarios.
CVSD offers several notable features: complex and diverse shadow patterns, higher resolution with crowded objects, and an expanded set of illumination scenarios. Together, these create new opportunities and challenges for video shadow detection.
CVSD Dataset in Detail
Enhanced Resolution and Crowded Objects
Previous video shadow datasets suffer from low resolution and simple scenes, which limits their usefulness for detecting small or distant shadows in real-world footage. By comparison, the videos in our dataset were carefully selected to reflect real-world conditions, with much higher resolution and a deliberate inclusion of crowded objects. We provide visual examples of intricate backgrounds with high object density. These densely packed backgrounds add another layer of complexity to shadow detection. The dataset also addresses the difficulty of recognizing shadows of small and distant objects, as shown in Column 1, Row 2, where the shadows of distant pedestrians and pigeons are accurately delineated. Unlike annotations that focus only on dominant shadow instances, ours offer a much finer level of detail, as seen in Column 3, Row 1, where the shadows on the fence are labeled in detail. Together, these properties strengthen the dataset's practical utility and robustness for real-world scenarios.
Diverse Shadow Patterns and Illumination Scenarios
Our dataset covers a wide range of shadow patterns shaped by many factors, including different kinds of motion, changes in perspective, and a variety of object and scene types. The visual examples below show footage from different camera types (such as macro, fisheye, and drone aerial photography), which introduce viewpoint shifts, motion blur, and many dynamic views.
CVSD includes shadows generated by multiple light sources, extending the range of illumination scenarios and scene types beyond the conventional categories of indoor, outdoor, day, and night to 12 distinct types. Specifically, we broaden 'indoor' to cover stage lighting, bar lighting, and common indoor lighting. The 'night' category is expanded to include spotlights and floodlights. Similarly, 'day' is extended to include sunrise, dusk, overcast, and sunny variations, and 'outdoor' is expanded to urban, waterfront, and natural types. This expansion is driven by our observation that shadows differ even within a single indoor category: bar lighting produces softer shadows because the light is dim, while stage lighting produces sharp shadows but the light switches very frequently. Shadows caused by these different factors have different patterns, so we expect models trained on the newly introduced scenarios to be more robust in practice.
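The grouping described above can be written down as a small lookup table. This is only an illustrative sketch: the names `ILLUMINATION_TYPES` and `coarse_category` and the exact key strings are assumptions made for this example, not official CVSD label names.

```python
# Coarse category -> finer illumination or scene types listed in the text above.
# The dictionary name and the key strings are hypothetical, chosen for illustration.
ILLUMINATION_TYPES = {
    "indoor": ["stage lighting", "bar lighting", "common indoor lighting"],
    "night": ["spotlight", "floodlight"],
    "day": ["sunrise", "dusk", "overcast", "sunny"],
    "outdoor": ["urban", "waterfront", "natural"],
}


def coarse_category(fine_type):
    """Return the conventional category that a fine-grained type belongs to."""
    for coarse, fine_types in ILLUMINATION_TYPES.items():
        if fine_type in fine_types:
            return coarse
    raise KeyError(fine_type)


# Number of fine-grained types across all four conventional categories.
total_types = sum(len(fine_types) for fine_types in ILLUMINATION_TYPES.values())
```

Counting the entries gives the 12 types mentioned above (3 indoor, 2 night, 4 day, 3 outdoor).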
▾ <train>/
    ▾ images/
        ▾ 000/
            0000.jpg
            0001.jpg
            ...
        ▾ trees/
            0000.jpg
            ...
        ...
    ▾ labels/
        ▾ 000/
            0000.png
            0001.png
            ...
        ▾ trees/
            0000.png
            ...
        ...
▾ <test>/
    ...
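Given the layout above, frames and masks can be paired by clip folder and file stem. The following is a minimal sketch, not an official loader: the function name `pair_frames` is hypothetical, and it assumes that every mask at `labels/<clip>/<frame>.png` corresponds to the image at `images/<clip>/<frame>.jpg`, as the tree suggests.

```python
from pathlib import Path


def pair_frames(split_dir):
    """Return sorted (image_path, label_path) pairs for one split directory.

    Assumes the layout shown above: images/<clip>/<frame>.jpg alongside
    labels/<clip>/<frame>.png. Frames without a matching mask are skipped.
    """
    split_dir = Path(split_dir)
    pairs = []
    for image_path in sorted((split_dir / "images").glob("*/*.jpg")):
        clip_name = image_path.parent.name
        label_path = split_dir / "labels" / clip_name / (image_path.stem + ".png")
        if label_path.is_file():
            pairs.append((image_path, label_path))
    return pairs
```

Calling `pair_frames` on the training split directory (whatever path `<train>` stands for locally) would return one tuple per annotated frame, ordered by clip and then by frame name.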
Statistics of CVSD
CVSD includes many shadow attributes: