Ahora Microsoft tiene un nuevo modelo de ia

Servidores servidores

Image: Morsa Images/Getty Images

Microsoft has unveiled Kosmos-1, which it describes as a multimodal large language model (MLLM) that can not only respond to language prompts but also visual cues, which can be used for an array of tasks, including image captioning, visual question answering, and more.

OpenAI's ChatGPT has helped popularize the concept of LLMs, such as the GPT (Generative Pre-trained Transformer) model, and the possibility of transforming a text prompt or input into an output.

Also:OpenAI is hiring developers to make ChatGPT better at coding

While people are impressed by these chat capabilities, LLMs still struggle with multimodal inputs, such as image and audio prompts, Microsoft's AI researchers argue in a paper called 'Language Is Not All You Need: Aligning Perception with Language Models'. The paper suggests that multimodal perception, or knowledge acquisition and "grounding" in the real world, is needed to move beyond ChatGPT-like capabilities to artificial general intelligence (AGI).

"More importantly, unlocking multimodal input greatly widens the applications of language models to more high-value areas, such as multimodal machine learning, document intelligence, and robotics," the paper says.

Alphabet-owned robotics firm Everyday Robots and Google's Brain Team showed off the role of grounding last year when using LLMs to get robots to follow human descriptions of physical tasks. The approach involved grounding the language model in tasks that are possible within a given real-world context. Microsoft also used grounding in its Prometheus AI model for integrating OpenAI's GPT models with real-world feedback from Bing search ranking and search results.

Microsoft says its Kosmos-1 MLLM can perceive general modalities, follow instructions (zero-shot learning), and learn in context (few-shot learning). "The goal is to align perception with LLMs, so that the models are able to see and talk," the paper says.

The demonstrations of Kosmos-1's outputs to prompts include an image of a kitten with a person holding a paper with a drawn smile over its mouth. The prompt is: 'Explain why this photo is funny?' Kosmos-1's answer is: "The cat is wearing a mask that gives the cat a smile."

Other examples show it: perceiving from an image that a tennis player has a pony tail; reading the time on an image of a clock face at 10:10; calculating the sum from an image of 4 + 5; answering 'what is TorchScale?' (which is a PyTorch machine-learning library), based on a GitHub description page; and reading the heart rate from an Apple Watch face.

Each of the examples demonstrates a potential for MLLMs like Kosmos-1 to automate a task in multiple situations, from telling a Windows 10 user how to restart their computer (or any other task with a visual prompt), to reading a web page to initiate a web search, interpreting health data from a device, captioning images, and so on. The model, however, does not include video-analysis capabilities.

Also: What is ChatGPT? Here's everything you need to know

The researchers also tested how Kosmos-1 performed in the zero-shot Raven IQ test. The results found a "large performance gap between the current model and the average level of adults", but also found that its accuracy showed potential for MLLMs to "perceive abstract conceptual patterns in a nonverbal context" by aligning perception with language models.

The research into "web page question answering" is interesting given Microsoft's plan to use Transformer-based language models to make Bing a better rival to Google search.

"Web page question answering aims at finding answers to questions from web pages. It requires the model to comprehend both the semantics and the structure of texts. The structure of the web page (such as tables, lists, and HTML layout) plays a key role in how the information is arranged and displayed. The task can help us evaluate our model's ability to understand the semantics and the structure of web pages," the researchers explain.

Image: Microsoft

Cisco Price, Dell Price, Huawei Price, ZTE HPE Fortinet Switch Router Server At Low Price

Servidores servidores

Noticias calientes

Huawei S5735-L24P4S-A1 Review: Reliable Gigabit Access with Enterprise-Grade Features

What Is an Orthogonal Architecture?

Huawei s5735-l24p4s-a-v2 Delivers Scalable, Secure, and Smart PoE Access for Modern IT Infrastructures

Huawei S5735-L48T4XE-A-V2 Switch Delivers Enterprise-Grade Performance in a Compact Design

Huawei S5735-L48P4XE-A-V2 Review: Versatile Campus Switch with iStack and Full L3 Support

Differences Between Huawei CE Series and S Series Switches

Huawei CloudEngine S5735 Switches Set the Benchmark for High-Performance, Energy-Efficient Switching

Huawei CloudEngine S5731‑S48P4X Datasheet

Huawei CloudEngine S5731‑S24P4X Datasheet

Huawei S5731-S Empowers Next-Generation Campus Networks with Advanced Capabilities

Huawei S5731-H24P4XC Switch Review: Power-Packed Performance and Smart PoE

Huawei S5731-H Series Switches Redefine Campus Networking with Intelligent High-Performance Architecture

Top Features of the Huawei S5731-S24T4X: The Ultimate Gigabit Access Switch for Modern Networks

General Power Module Fault Location Procedure (CE8800 & 7800 & 6800 & 5800)

How Do I Split a Stack? How to clear the stacking configuration?

Huawei CloudEngine S5731 Datasheet

Huawei CloudEngine S5731-S24P4X: Powerful Enterprise-Grade Switch Explained

Huawei S5731-S48T4X Review: Powerful Enterprise Switch for High-Speed Networking

Why are network cables limited to 100 meters?

Huawei S5731-S32ST4X: Powerful, Enterprise-Ready Gigabit Switch with Advanced Capabilities

Huawei S5731-H48T4XC Review: High-Performance Switching for Modern IT Infrastructures

Huawei S5731-H48P4XC: Comprehensive Overview

Common display Commands for Huawei Devices

Stacking Card Stacking vs. Service Port Stacking: Application Scenarios for the Two Switch Stacking Methods

Huawei S5731-H24T4XC: High-Performance Intelligent Gigabit Switch

Huawei S5731-S48P4X: High-Performance PoE Switch with Flexible Power and Uplink Options

Huawei S5731 Series: Advanced Networking Solutions for Enterprises

Difference between campus switch and data center switch

Huawei S6730-H28Y4C Campus CloudEngine Switch Datasheet

S6730-H48Y6C: Unleashing Power and Flexibility for Modern Networking

Now Microsoft has a new AI model - Kosmos-1

See also

Etiquetas calientes: Inteligencia Artificial innovación

Ordering Guide

Recursos recursos

Sobre nosotros

Cisco Price, Dell Price, Huawei Price, ZTE HPE Fortinet Switch Router Server At Low Price

Servidores servidores

Noticias calientes

Huawei S5735-L24P4S-A1 Review: Reliable Gigabit Access with Enterprise-Grade Features

What Is an Orthogonal Architecture?

Huawei s5735-l24p4s-a-v2 Delivers Scalable, Secure, and Smart PoE Access for Modern IT Infrastructures

Huawei S5735-L48T4XE-A-V2 Switch Delivers Enterprise-Grade Performance in a Compact Design

Huawei S5735-L48P4XE-A-V2 Review: Versatile Campus Switch with iStack and Full L3 Support

Differences Between Huawei CE Series and S Series Switches

Huawei CloudEngine S5735 Switches Set the Benchmark for High-Performance, Energy-Efficient Switching

Huawei CloudEngine S5731‑S48P4X Datasheet

Huawei CloudEngine S5731‑S24P4X Datasheet

Huawei S5731-S Empowers Next-Generation Campus Networks with Advanced Capabilities

Huawei S5731-H24P4XC Switch Review: Power-Packed Performance and Smart PoE

Huawei S5731-H Series Switches Redefine Campus Networking with Intelligent High-Performance Architecture

Top Features of the Huawei S5731-S24T4X: The Ultimate Gigabit Access Switch for Modern Networks

General Power Module Fault Location Procedure (CE8800 & 7800 & 6800 & 5800)

How Do I Split a Stack? How to clear the stacking configuration?

Huawei CloudEngine S5731 Datasheet

Huawei CloudEngine S5731-S24P4X: Powerful Enterprise-Grade Switch Explained

Huawei S5731-S48T4X Review: Powerful Enterprise Switch for High-Speed Networking

Why are network cables limited to 100 meters?

Huawei S5731-S32ST4X: Powerful, Enterprise-Ready Gigabit Switch with Advanced Capabilities

Huawei S5731-H48T4XC Review: High-Performance Switching for Modern IT Infrastructures

Huawei S5731-H48P4XC: Comprehensive Overview

Common display Commands for Huawei Devices

Stacking Card Stacking vs. Service Port Stacking: Application Scenarios for the Two Switch Stacking Methods

Huawei S5731-H24T4XC: High-Performance Intelligent Gigabit Switch

Huawei S5731-S48P4X: High-Performance PoE Switch with Flexible Power and Uplink Options

Huawei S5731 Series: Advanced Networking Solutions for Enterprises

Difference between campus switch and data center switch

Huawei S6730-H28Y4C Campus CloudEngine Switch Datasheet

S6730-H48Y6C: Unleashing Power and Flexibility for Modern Networking

Now Microsoft has a new AI model - Kosmos-1

See also

Etiquetas calientes: Inteligencia Artificial innovación

Ordering Guide

Recursos recursos

Sobre nosotros

Huawei CloudEngine S5731‑S48P4X Datasheet