PANews reported on December 12 that, according to Google's official blog, Google has released Gemini 2.0, its new-generation artificial intelligence model. Gemini 2.0 accepts multimodal input, including text, images, video, and audio, and offers multimodal output such as native image generation and multilingual text-to-speech (TTS). It runs at twice the speed of Gemini 1.5 Pro, with improved multimodal reasoning, complex instruction following, and tool use, and it can call Google Search, code execution, and third-party functions.

An experimental version of Gemini 2.0 Flash is now available to developers. The multimodal capabilities are scheduled for a full rollout in January 2025, and a multimodal real-time API will be launched to give developers broader application support.