PANews reported on December 6 that AI organizations Grass, Ontocord, and LAION have jointly released VALID (Video-Audio Large Interleaved Dataset).

The dataset is built on Grass's video repository and contains 30 million audio clips interleaved with images and text. According to the announcement, it is the industry's first video-audio interleaved dataset. VALID is expected to provide new training data for multimodal AI models.