Llama 3.2: Meta's New Multimodal Model with Vision Capabilities


Meta has released Llama 3.2, a new family of models that combines vision and text capabilities and is available on Hugging Face. The release includes ten open-weight models, five multimodal and five text-only, with variants tuned for different applications.


Llama 3.2 ships vision models in two sizes: 11B, designed for efficient deployment on consumer GPUs, and 90B, designed for large-scale applications. Both come in base and instruction-tuned versions. The release also introduces Llama Guard 3, a safety model that evaluates model inputs and outputs to detect harmful content.
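To make the gap between the two sizes concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at common precisions. It treats the nominal sizes (11B and 90B) as exact parameter counts and ignores activations, the KV cache, and framework overhead, so the numbers are a lower bound rather than a hardware requirement. The function name and the precision table are illustrative, not part of any Llama tooling.

```python
# Weight-only memory estimate. Assumption: nominal sizes are exact parameter
# counts; activations, KV cache, and runtime overhead are not included.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    # (params_billions * 1e9 params) * (bytes per param) / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[dtype]


for size in (11, 90):
    row = ", ".join(
        f"{dtype}: {weight_memory_gb(size, dtype):.1f} GB"
        for dtype in BYTES_PER_PARAM
    )
    print(f"{size}B -> {row}")
```

Under these assumptions the 11B weights take about 22 GB in bf16 and fit a single high-end consumer card once quantized, while the 90B weights need 180 GB in bf16 and typically call for multiple data-center GPUs.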

The new text models, at 1B and 3B parameters, are suited to on-device execution and do well at tasks such as prompt rewriting and summarization. The vision models were trained on a large dataset of 6 billion image-text pairs, which supports robust performance on visual understanding and reasoning tasks.

- Vision models with image understanding and reasoning capabilities.
- Integration with platforms such as Google Cloud and Amazon SageMaker.
- Licensing policy changes that affect users in the EU.

Llama 3.2 is designed to support multiple languages and is especially effective at tasks that combine text and images. Meta also emphasized safety and ethics in the use of these models, particularly with the introduction of Llama Guard.
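The idea of a safety model that checks both inputs and outputs can be sketched as a thin wrapper around generation. The snippet below is only an illustration of that flow: `guarded_generate`, `toy_is_safe`, and `toy_generate` are hypothetical names, and the keyword check stands in for a real call to a classifier such as Llama Guard, whose actual interface is not shown here.

```python
from typing import Callable

REFUSAL = "Request declined by safety filter."


def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],
    is_safe: Callable[[str], bool],
) -> str:
    """Screen the prompt, generate a reply, then screen the reply."""
    if not is_safe(prompt):      # input check
        return REFUSAL
    reply = generate(prompt)
    if not is_safe(reply):       # output check
        return REFUSAL
    return reply


# Toy stand-ins (assumptions): a real system would call a safety classifier
# and a language model here instead.
def toy_is_safe(text: str) -> bool:
    return "forbidden" not in text.lower()


def toy_generate(prompt: str) -> str:
    return f"Echo: {prompt}"


print(guarded_generate("Describe this image", toy_generate, toy_is_safe))
```

Keeping the safety check as a separate callable means the same wrapper works whether the classifier runs locally or behind a hosted endpoint.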

- Multimodal capabilities: text and image.
- Support for several languages, including Portuguese.
- Integrations with popular development tools.

The innovations in Llama 3.2 not only extend what language models can do but also open new opportunities for developers and companies looking to build artificial intelligence into their applications.

The launch of Llama 3.2 is a significant advance in language and vision models, with important implications for building smarter and safer applications. Meta continues to lead in innovations that combine different forms of artificial intelligence.

AUTHOR

Gino AI

October 1, 2024 at 12:42:45
