
Siri will continue to improve; eventually, it might even comprehend how iPhone apps operate.

If you were a child growing up in the 1990s, Siri was perhaps the first artificial intelligence (AI) tool you ever used. The AI-powered voice assistant was introduced in 2011 with the iPhone 4S. Siri was entertaining to interact with and made life easier, whether it was answering our calls or helping us set an alarm. But there haven’t been any significant announcements about Siri in the past few years. Meanwhile, AI has gained enormous attention since the release of OpenAI’s chatbot ChatGPT, and it has now been suggested that Siri may become smarter in the future.

There have long been rumours that Apple is developing generative AI capabilities for Siri. Recently, a research paper posted to the arXiv preprint server (hosted by Cornell University) described a novel MLLM (Multimodal Large Language Model) that may be able to comprehend how a phone’s interface operates. The study, Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs, explains that although the technology has advanced significantly, screen understanding still faces major challenges.

Siri

Ferret-UI, an MLLM built on the Ferret model introduced in October of last year, is being developed to comprehend UI screens and, potentially, how phone apps function. According to the research, the model also possesses “referring, grounding, and reasoning capabilities.”

A major obstacle to improving AI’s comprehension of app interfaces is the variety of aspect ratios and the small visual components present on smartphone screens. To overcome this, Ferret-UI magnifies details and works with enhanced visual features so that even the tiniest buttons and icons are recognisable. The paper also notes that, thanks to rigorous training, Ferret-UI has outperformed other models at comprehending and interacting with application interfaces. If integrated into Siri, Apple’s voice assistant, Ferret-UI could make the tool considerably more intelligent.
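The magnification idea described in the paper works roughly like this: besides a downsized global view of the screen, the screenshot is split into sub-images along its longer axis, so each half keeps enough pixels for small buttons and icons to stay legible. Here is a minimal, hypothetical sketch of that splitting step (the function name `split_screen` and the exact half-and-half split are illustrative assumptions, not the paper’s actual code):

```python
# Illustrative sketch of splitting a phone screenshot into sub-images,
# loosely following the "magnified details" idea described for Ferret-UI.
# This is NOT the authors' implementation; names and the 50/50 split
# are assumptions for demonstration only.

def split_screen(width, height):
    """Return bounding boxes (left, top, right, bottom) of two sub-images,
    split along the longer axis so each half retains more detail when
    resized to a model's fixed input resolution."""
    if height >= width:
        # Portrait screen: stack two halves vertically.
        mid = height // 2
        return [(0, 0, width, mid), (0, mid, width, height)]
    else:
        # Landscape screen: place two halves side by side.
        mid = width // 2
        return [(0, 0, mid, height), (mid, 0, width, height)]

# Example: an iPhone-style portrait screen (1170 x 2532 pixels).
boxes = split_screen(1170, 2532)
```

Each sub-image, together with the full-screen view, would then be encoded separately, which is how small UI elements avoid being blurred away by aggressive downsampling.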

In the future, the digital assistant might carry out intricate operations within apps. Ask Siri to make a reservation or book a flight, and it could do so with ease by interacting with the relevant app on your behalf.

As for Ferret itself, it is an open-source multimodal large language model developed by Apple in collaboration with Columbia University, the product of in-depth research into how large language models recognise and comprehend visual elements. This implies that an assistant built on Ferret could field questions similar to those handled by ChatGPT or Gemini. Ferret was released in October of last year for research purposes.

By MunafeKiDeal
