Apple’s Stealthy Move – Unveiling Ferret, an Open-Source Multimodal LLM

Apple Inc. and Cornell University researchers quietly introduced Ferret in October 2023, with no formal announcement. The open-source, multimodal large language model (LLM) breaks with Apple’s tradition of secrecy and marks a significant step into the AI space. Ferret accepts regions of an image as part of a query. Its unannounced release on GitHub initially slipped under the radar but has since drawn considerable interest from artificial intelligence enthusiasts and researchers.

Ferret’s ingenious operation – A closer look

Ferret works by examining specific regions within an image, identifying the elements that could be useful as a query, and enclosing each one in a bounding box. Users can then refer to those elements in a query, and Ferret answers in ordinary text, much as a conventional LLM would.

For example, when a user highlights an animal in an image and asks Ferret what species it is, the model identifies the animal and responds accordingly. Ferret can also draw on the context of other elements in the image to give more detailed responses, offering a glimpse of its multimodal capabilities.
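
The idea of pairing a highlighted region with a question can be pictured with a short Python sketch. It is an illustration only and not Ferret’s actual interface: the `BoundingBox` type, the `build_region_query` function, and the `<region>` tag format are hypothetical names invented for this example, and scaling coordinates to the 0 to 1 range is an assumption about one reasonable way to describe a region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""

    left: int
    top: int
    right: int
    bottom: int

    def normalized(self, image_width: int, image_height: int) -> tuple[float, float, float, float]:
        """Scale the box to the 0..1 range so it does not depend on image size."""
        return (
            self.left / image_width,
            self.top / image_height,
            self.right / image_width,
            self.bottom / image_height,
        )


def build_region_query(question: str, box: BoundingBox, image_width: int, image_height: int) -> str:
    """Attach a highlighted region to a text question.

    The "<region>[...]</region>" format is made up for illustration;
    it is not the prompt format used by Ferret.
    """
    inside_x = 0 <= box.left < box.right <= image_width
    inside_y = 0 <= box.top < box.bottom <= image_height
    if not (inside_x and inside_y):
        raise ValueError("bounding box must lie inside the image and have positive area")
    coords = ", ".join(f"{value:.2f}" for value in box.normalized(image_width, image_height))
    return f"{question} <region>[{coords}]</region>"


# A user highlights an animal in a 640x480 image and asks about its species.
query = build_region_query(
    "What species is this animal?",
    BoundingBox(left=160, top=120, right=480, bottom=360),
    image_width=640,
    image_height=480,
)
print(query)
```

In this sketch the model would receive both the question and the location of the highlighted region, which is the pairing the article describes. How Ferret itself encodes a region is not covered here.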

Apple AI research scientist Zhe Gan has described the open-source model as able to refer to and connect elements of an image at varying levels of granularity. The release marks a significant departure for Apple.

Because the company is known for its secrecy, its willingness to share AI advancements with the open-source community comes as a surprise. This openness positions Apple as a significant player in multimodal AI and challenges the industry’s expectations.

Apple’s strategic pivot – Navigating the AI landscape

Ferret’s release not only marks Apple’s foray into open-source AI but also reflects the company’s strategic response to challenges in the AI industry. As tech blogger Ben Dickson has noted, Apple faces stiff competition from rivals like Microsoft Corp. and Google LLC, in part because of limitations in its computing resources. Unlike the companies behind models such as ChatGPT, Apple does not have infrastructure equipped to serve LLMs at scale.

This predicament leaves Apple with two viable options. The first is to form strategic partnerships with hyperscale cloud providers to bolster its AI capabilities. The second, as the release of Ferret suggests, is to embrace an open-source approach similar to the strategy of Meta Platforms Inc. Either choice, collaboration or community sharing, reflects Apple’s commitment to remaining competitive in the rapidly evolving AI landscape.

As Ferret quietly charts new territory in multimodal AI, its release raises a broader question about Apple’s future in the field.

Will Ferret propel Apple to the forefront of multimodal AI, challenging industry norms and fostering collaboration? Or does it signal a broader shift in the AI landscape, in which industry giants balance proprietary technology with communal innovation? Ferret’s stealthy arrival invites speculation about Apple’s evolving role in shaping the future of artificial intelligence, and the answer will depend on how technology, collaboration, and the shifting dynamics of the AI industry play out.

Source: https://www.cryptopolitan.com/apples-ferret-an-open-source-multimodal-llm/