NVIDIA's Nemotron-3-8B: Diffusion Design using DALL-E 3

NVIDIA AI Foundation Models for Enterprise Applications

Introduction

NVIDIA’s Nemotron-3 8B family is a suite of Large Language Models (LLMs) designed to power enterprise applications such as AI-driven chatbots and co-pilots.

Features & Technical Details

  • Nemotron-3-8B Base: Enables domain-adapted LLMs with parameter-efficient fine-tuning and continuous pretraining capabilities.
  • Chat Models:
    • Nemotron-3-8B-Chat-SFT: Supervised fine-tuned model intended as the starting point for user-defined alignment such as RLHF or SteerLM.
    • Nemotron-3-8B-Chat-RLHF: Delivers superior chat performance, with strong MT-Bench scores.
    • Nemotron-3-8B-Chat-SteerLM: Offers flexible alignment at inference time, so model behaviour can be steered and refined continuously.
  • Nemotron-3-8B-QA: Specialized Q&A model achieving a zero-shot F1 score of 41.99% on the Natural Questions dataset (token-overlap F1; see the sketch below).
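
For context on that metric: open-domain QA benchmarks such as Natural Questions are typically scored with token-overlap F1 between the predicted and reference answers. Below is a minimal sketch of how that score is computed; the example strings are made up, and this is not NVIDIA's evaluation harness.

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """SQuAD-style token-overlap F1 between a predicted and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: a partially correct answer scores between 0 and 1.
print(token_f1("queen elizabeth ii", "elizabeth ii"))  # 0.8
```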

Requirements & Prerequisites

  • TensorRT-LLM: Supports advanced optimization techniques for efficient LLM inference on NVIDIA GPUs.
  • NVIDIA Data Center GPUs: Requires at least one A100 (40 GB/80 GB), H100 (80 GB), or L40S GPU.
  • NVIDIA NeMo Framework: Necessary for deploying and customizing the Nemotron-3-8B models, including training and inference containers.

Benefits

  • Efficiency and Flexibility: Streamlines development and deployment for rapid, customized AI solutions.
  • Optimized Performance: Enhanced accuracy, low latency, and high throughput through integration with NVIDIA TensorRT-LLM and NVIDIA Triton Inference Server.
  • Data Privacy and Security Compliance: Helps meet regulatory requirements, with tools such as NeMo Guardrails for keeping model behaviour within enterprise policies.

Deployment and Customization

  • Inference and Optimization Techniques: Supports KV caching, efficient attention modules, in-flight batching, and low-precision quantization.
  • NeMo Framework: Provides the tooling for applying TensorRT-LLM optimizations and for hosting the models with Triton Inference Server (a client-side sketch follows this list).
  • Prompting Techniques: Defines specific single-turn and multi-turn prompt formats for each chat model.
  • Further Customization: Supports domain-specific dataset customization via SFT, RLHF, and SteerLM, backed by ready-to-use scripts in the NeMo framework.
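
To make the serving path concrete, here is a minimal client-side sketch of querying a Nemotron-3-8B model hosted behind Triton Inference Server with the TensorRT-LLM backend. It assumes an engine has already been built and deployed under a model name of your choosing and that the server exposes the HTTP generate extension; the host, model name, and prompt are placeholders rather than values from NVIDIA's documentation.

```python
import requests

# Assumed local Triton Inference Server hosting a TensorRT-LLM engine built
# from a Nemotron-3-8B checkpoint; adjust the host and model name to your deployment.
TRITON_URL = "http://localhost:8000/v2/models/nemotron-3-8b-chat/generate"

payload = {
    "text_input": "List three optimizations TensorRT-LLM applies to LLM inference.",
    "max_tokens": 256,
    "temperature": 0.2,
}

response = requests.post(TRITON_URL, json=payload, timeout=120)
response.raise_for_status()
print(response.json()["text_output"])
```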

Conclusion

The Nemotron-3 8B model family, complemented by the NeMo framework, offers a comprehensive and flexible solution for enterprises looking to integrate advanced AI in their operations. Its combination of customization, performance optimization, and adherence to privacy and legal standards makes it a standout choice in the realm of enterprise AI applications.

Google DeepMind: Diffusion Design using DALL-E 3

Introducing Google DeepMind’s Lyria – Transforming Music Creation

Introduction

Google DeepMind announced Lyria, its most advanced AI music generation model, alongside two innovative AI experiments, Dream Track and a suite of music AI tools. These developments aim to redefine the landscape of music creation by integrating AI into the creative process.

Features & Technical Details

  • Lyria Model: Built by Google DeepMind, Lyria excels in generating high-quality music, encompassing instrumentals and vocals. It performs complex tasks like transformation and continuation, offering nuanced control over style and performance.
  • Dream Track Experiment: A YouTube Shorts experiment to enhance connections between artists, creators, and fans. Dream Track enables the creation of unique soundtracks using the AI-generated voice and style of various artists.
  • Music AI Tools: Developed with artists, songwriters, and producers, these tools assist in the creative process, allowing users to transform melodies and chords into realistic musical elements.

Benefits

  • Creative Empowerment: Lyria and the associated AI tools offer a new dimension in musical creativity, enabling artists and creators to experiment with AI-generated music.
  • Enhanced Musical Experience: These tools aim to deepen the connection between artists and their audience, offering unique, AI-powered musical experiences.

Other Technical Details

  • Watermarking with SynthID: Lyria-generated content is watermarked using SynthID, ensuring responsible deployment and identification of AI-generated audio. The watermark remains detectable through various audio modifications, maintaining integrity and authenticity.
  • Responsible Development: DeepMind emphasizes responsible development and deployment, aligning with YouTube’s AI principles to protect artists and their work, ensuring that these technologies benefit the wider music community.

Conclusion

Google DeepMind’s Lyria, along with its AI experiments, represents a significant stride in the fusion of AI and music. By responsibly integrating advanced AI models into music creation, DeepMind is not only enhancing the creative capacities of artists but also setting new standards in the responsible development and use of AI in the arts. This initiative promises to transform the future of music creation, offering unprecedented tools for artistic expression and innovation.

Silo AI Poro: Diffusion Design using DALL-E 3

Silo AI’s Poro – Open Source Language Model for Europe

Introduction

Silo AI, a Finland-based artificial intelligence startup, launched Poro, a groundbreaking large language model (LLM), marking Europe’s significant foray into the realm of advanced AI models comparable to those from major players like OpenAI.

Features & Technical Details

  • Open Source: Poro is available under the Apache 2.0 License, suitable for both commercial and research applications (see the loading sketch after this list).
  • Multilingual Capabilities: Focused on enhancing AI capabilities for European languages, starting with English and Finnish.
  • Model Architecture: Poro uses the BLOOM transformer architecture with 34.2 billion parameters.
  • Training Data: The model is being trained on a dataset of 1 trillion tokens spanning English, Finnish, and programming-language code.
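
Because Poro ships under Apache 2.0, it can be pulled and run like any other open checkpoint. The snippet below is a minimal sketch using Hugging Face transformers; the repository id LumiOpen/Poro-34B and the generation settings are assumptions, and a 34-billion-parameter model needs multiple GPUs or quantization in practice.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LumiOpen/Poro-34B"  # assumed Hugging Face repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # a 34B model needs substantial GPU memory
    device_map="auto",            # shard across available GPUs
)

prompt = "Suomen pääkaupunki on"  # Finnish: "The capital of Finland is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```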

Benefits

  • Language Diversity: Aims to eventually support all 24 official European Union languages, enhancing multilingual AI capabilities in Europe.
  • Accessibility: Being open source, Poro is available for a wide range of applications, fostering innovation and research in AI.
  • Technical Sophistication: The use of a large-scale transformer architecture positions Poro as a competitive tool in the field of AI and natural language processing.

Other Technical Details

  • Development Collaboration: Poro is a product of collaboration between Silo AI’s SiloGen platform and the TurkuNLP group at the University of Turku, showcasing a significant partnership in the field of AI and natural language processing.
  • Ongoing Training: Training is still under way, with roughly 300 billion of the planned 1 trillion tokens processed so far, indicating a commitment to continuous improvement and expansion.

Conclusion

Poro represents a significant step in Europe’s AI landscape, offering a robust, multilingual, and open-source language model. Its development underlines the commitment to enhancing language technology across Europe and provides a platform for both commercial and academic advancements in AI. The collaboration between Silo AI, SiloGen, and the University of Turku’s TurkuNLP research group in creating Poro exemplifies the collaborative spirit driving AI innovation in Europe.

Other AI News

  • Humane Launches $699 AI-Powered Projector to Replace Phones

Humane, a startup founded by former Apple employees and backed by OpenAI’s CEO and Microsoft, has launched an innovative product called the AI Pin. This wearable device, priced at $699, functions as a projector and is designed to replace traditional smartphones. Equipped with an ultra-wide RGB camera, it can capture photos, send text messages and emails, and provide answers to questions with ChatGPT-like capabilities.

The AI Pin, while compact and all-aluminum, weighs about 55 grams and attaches to clothing with a magnetic battery pack. It can project a Laser Ink Display onto the user’s palm, navigated by hand movements and gestures. The device’s functionality includes projecting caller ID for incoming calls, summarizing emails, identifying food nutritional values, and answering context-based questions. It uses various AI tools to serve user requests, and features a built-in “Personic Speaker” for audio, Bluetooth connectivity, and a recording indicator light.

The device will be available in three color options and requires a $24 monthly subscription for cellular data, cloud storage, and unlimited voice assistant queries. Its real-world performance, particularly in terms of projection accuracy and reliability in different environments, remains to be seen.

  • Meta Introduces AI-Powered Video Editing Solutions

Meta Platforms has introduced two AI-based video editing features: Emu Video and Emu Edit. Emu Video generates four-second videos from captions, photos, or images paired with descriptions. Emu Edit allows users to easily modify videos with text prompts. Both tools build upon the parent model Emu, which generates images in response to text prompts. Meta’s foray into generative AI aligns with the broader trend of businesses exploring new capabilities and streamlining processes in the generative AI market since the launch of OpenAI’s ChatGPT. The company is actively advancing in the AI space to compete with tech giants like Microsoft, Google, and Amazon.

  • Tangram Vision launches AI-powered 3D sensor to assist computer vision in robotics

Tangram Vision, an AI startup, has developed a 3D sensor powered by artificial intelligence that has the potential to revolutionize computer vision in robotics. The sensor, known as Tangram 3D, enables robots to perceive and interact with their environments more effectively. It provides a detailed 3D understanding of the surroundings, allowing robots to navigate and interact with objects more accurately. This technology has applications in various industries, including autonomous vehicles, manufacturing, and logistics, and it could significantly enhance the capabilities of robots in these domains. Tangram Vision aims to make its AI-powered 3D sensor available to a wide range of robotics developers.

  • Google DeepMind Unveils ‘Mirasol3B’ for Advanced Video Analysis

Google DeepMind has announced a significant breakthrough in AI research with its new autoregressive model, “Mirasol3B.” This model represents a major step forward in understanding long video inputs and multimodal learning, integrating audio, video, and text data in a more efficient manner. Unlike current models that extract all information at once, Mirasol3B adopts an autoregressive approach, conditioning jointly learned video and audio representations on feature representations from previous time intervals, preserving essential temporal information.

The Mirasol3B model processes video by partitioning it into smaller chunks (4-64 frames each) and then employs a learning module called the Combiner. This module generates a joint audio and video feature representation for each chunk, compacting the most vital information. Subsequently, an autoregressive Transformer processes this joint feature representation, applying attention to previous features and generating representations for subsequent steps. This process enables the model to understand not only each video chunk but also the temporal relationship between them. The Mirasol3B model’s ability to handle diverse data types while maintaining temporal coherence makes it a substantial advancement in multimodal machine learning, delivering state-of-the-art performance more efficiently than previous models.
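
The description above maps onto a chunk-then-autoregress pattern. The toy sketch below is not DeepMind’s Mirasol3B code; it only illustrates the idea of fusing per-chunk audio and video features with a Combiner-style module and then running a causal Transformer over the sequence of chunk representations (all module sizes are arbitrary).

```python
import torch
import torch.nn as nn

class Combiner(nn.Module):
    """Toy stand-in for a Combiner: fuses per-chunk audio and video features
    into one compact joint representation per chunk."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.fuse = nn.Sequential(nn.Linear(2 * dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, video_feats, audio_feats):
        # video_feats, audio_feats: (batch, num_chunks, dim)
        return self.fuse(torch.cat([video_feats, audio_feats], dim=-1))

class ChunkedAutoregressiveModel(nn.Module):
    """Causal Transformer over joint chunk representations, so each chunk
    attends only to earlier chunks and temporal order is preserved."""
    def __init__(self, dim: int = 256, num_layers: int = 2, num_heads: int = 4):
        super().__init__()
        self.combiner = Combiner(dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, video_feats, audio_feats):
        joint = self.combiner(video_feats, audio_feats)            # (B, T, D)
        mask = nn.Transformer.generate_square_subsequent_mask(joint.size(1))
        return self.transformer(joint, mask=mask)

# Hypothetical input: 8 video chunks with pre-extracted 256-d video and audio features.
video = torch.randn(1, 8, 256)
audio = torch.randn(1, 8, 256)
print(ChunkedAutoregressiveModel()(video, audio).shape)  # torch.Size([1, 8, 256])
```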

  • Typecast’s AI Technology Enables Emotion Transfer in Speech

Typecast, an AI startup, has introduced innovative technology called Cross-Speaker Emotion Transfer, revolutionizing how generative AI can process and convey human emotions. This technology enables users to apply emotions from another person’s voice to their own while preserving their unique vocal style. This advancement addresses the challenge of expressing the wide spectrum of human emotions in AI-generated speech, which traditional text-to-speech technology has struggled with due to the complexities of emotional nuances and the requirement of large amounts of labeled data.

Typecast’s approach leverages deep neural networks and unsupervised learning algorithms to discern speaking styles and emotions from a vast database. This method allows the AI to learn from a wide range of emotional voices without the need for specific emotion labels. The technology can adapt to specific voice characteristics from just snippets of recorded voice, enabling users to express a range of emotions and intensities naturally without altering their voice identity. This breakthrough opens new possibilities in content creation, making it faster and more efficient, and has already been utilized by companies like Samsung Securities and LG Electronics. Typecast is now working to extend these speech synthesis technologies to facial expressions.

  • Microsoft Ignite 2023: Major AI and Cloud Advancements

At Microsoft Ignite 2023, Microsoft revealed several advancements in AI and cloud computing. The company introduced GPT-4 Turbo with Vision on Azure, a multimodal model capable of generating text, translating languages, writing creative content, and providing informative answers; integrated with Azure AI Vision, it can take images as input alongside text for richer text generation. In cloud computing, Microsoft announced new Azure cloud services and improvements aimed at fostering business innovation and scalability. Azure Arc received notable updates, facilitating efficient management of infrastructure and applications across various cloud environments. Additionally, Microsoft announced the general availability of Microsoft Fabric, a unified data platform combining data warehousing, big data processing, and machine learning.

Productivity and collaboration tools also saw significant enhancements. Copilot for Microsoft 365, an AI-powered assistant that provides contextual suggestions and streamlines workflows across Microsoft 365 applications, took center stage. Microsoft Teams added new features like live translation and background replacement, enriching its functionality for remote and hybrid working environments. Microsoft Loop, the company’s collaborative workspace app, reached general availability, incorporating key features from the Ignite announcements and positioning itself as a major player in workplace productivity tools.

Taken together, the announcements at Ignite 2023 underscore Microsoft’s focus on leveraging AI to transform a range of industries and mark further milestones in the company’s ongoing innovation across AI, cloud services, and productivity solutions.

  • Nurdle Emerges as AI Deployment Startup from Spectrum Labs

Nurdle, a new AI deployment startup, has emerged from Spectrum Labs, led by entrepreneur Justin Davis. This company, which was previously in stealth, is now making its presence known with the backing of notable investors like Greycroft, Intel Capital, and Twilio Ventures. Nurdle’s mission is to transform the landscape of AI deployment for enterprises, bringing innovative approaches to the field.

Nurdle’s proprietary “lookalike” data technology is aimed at changing how custom language models are developed and deployed, promising to make the creation of these models faster, cheaper, and more accurate while significantly reducing the time and cost associated with data science tasks. With this approach, Nurdle is positioning itself as a key player in AI deployment, especially for businesses looking to leverage AI across a range of applications.

  • Microsoft Debuts In-House Maia and Cobalt AI Chips

Microsoft has announced the introduction of two custom-designed chips, Azure Maia 100 and Azure Cobalt 100, as part of its strategy to redefine cloud infrastructure for the AI era. The Azure Maia 100 is an AI accelerator designed for cloud-based training and inferencing of AI workloads such as OpenAI models, Bing, GitHub Copilot, and ChatGPT. This first-generation chip in the Maia series contains 105 billion transistors and is one of the largest chips fabricated using 5nm process technology. The Maia 100’s innovations span across various aspects including silicon, software, network, racks, and cooling capabilities, thus equipping the Azure AI infrastructure with comprehensive systems optimization tailored for advanced AI applications like GPT.

In addition to Maia 100, Microsoft is launching Azure Cobalt 100, its first custom in-house central processing unit (CPU) series. Built on Arm architecture, Cobalt 100 is designed for optimal performance per watt efficiency, catering to common cloud workloads within the Microsoft Cloud. The Cobalt 100 chip, also the first generation in its series, is a 64-bit 128-core processor that delivers a performance improvement of up to 40 percent over the current generations of Azure Arm chips. This chip powers services such as Microsoft Teams and Azure SQL, showcasing Microsoft’s commitment to optimizing and innovating at every layer of its infrastructure stack.

Alongside the new chips, Microsoft highlighted networking innovations, including hollow core fiber technology and the general availability of Azure Boost. These advancements enable faster networking and storage in the cloud, delivering up to 12.5 GB/s of throughput and 650K input/output operations per second (IOPS) for remote storage in data-intensive workloads, and up to 200 Gbps of networking bandwidth for network-intensive workloads.

  • Airbnb Acquires AI Startup GamePlanner.AI

Airbnb has announced the acquisition of the AI startup GamePlanner.AI, a company co-founded by Adam Cheyer, one of the creators of Apple’s Siri. The acquisition, whose terms were not officially disclosed, is part of Airbnb’s ongoing integration of AI technologies like large language models, computer vision models, and machine learning into its services. This move reflects the broader trend in the tech industry towards leveraging AI, as seen with other travel companies like Booking.com and Expedia, which are also integrating AI to improve customer experiences and offer better travel suggestions. CNBC reports the deal is valued at just under $200 million.

  • Stanford, UC Berkeley’s S-LoRA boosts serving of fine-tuned large language models

Fine-tuning large language models (LLMs) is crucial for businesses to customize AI for specific tasks, but it’s typically resource-intensive. To reduce costs, researchers at Stanford University and UC Berkeley developed S-LoRA, enhancing the low-rank adaptation (LoRA) technique used in fine-tuning LLMs. S-LoRA significantly lowers the deployment costs of fine-tuned LLMs, enabling numerous models to run on a single GPU.

S-LoRA addresses technical challenges like memory management and batch processing in LLM servers. It uses a dynamic memory management system and a “Unified Paging” mechanism to efficiently handle multiple queries. Additionally, its “tensor parallelism” system ensures compatibility with large transformer models across multiple GPUs. This innovation has shown promising results, boosting throughput and allowing simultaneous service of thousands of adapters with minimal computational overhead. The S-LoRA code is now available on GitHub, with plans to integrate it into popular LLM-serving frameworks.
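
The core idea behind this kind of serving is to keep one shared base model in GPU memory and swap in many small low-rank adapters per request. The toy sketch below is not the S-LoRA system itself (which adds Unified Paging and custom kernels on top); it only illustrates a batch in which each request applies a different LoRA adapter over the same frozen base weight.

```python
import torch

torch.manual_seed(0)
d_in, d_out, rank = 64, 64, 8

# One shared, frozen base weight, as in LoRA-style fine-tuning.
W_base = torch.randn(d_out, d_in)

# Many cheap adapters: each is a low-rank pair (A, B), so the effective
# weight for adapter i is W_base + B_i @ A_i.
adapters = {
    name: (torch.randn(rank, d_in) * 0.01, torch.randn(d_out, rank) * 0.01)
    for name in ["legal-sft", "support-chat", "code-assist"]  # hypothetical adapter names
}

def forward(x: torch.Tensor, adapter_ids: list) -> torch.Tensor:
    """Batched forward pass in which each row of x uses its own adapter."""
    base_out = x @ W_base.T  # shared base computation for the whole batch
    lora_out = torch.stack([
        (x[i] @ adapters[aid][0].T) @ adapters[aid][1].T  # (x A^T) B^T, the low-rank update
        for i, aid in enumerate(adapter_ids)
    ])
    return base_out + lora_out

batch = torch.randn(3, d_in)
print(forward(batch, ["legal-sft", "support-chat", "code-assist"]).shape)  # torch.Size([3, 64])
```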

  • Nvidia Unveils Upgraded H200 AI Chip

Nvidia on Monday announced the H200, an upgraded version of its premier AI chip, set to be released next year and adopted by major tech companies like Amazon.com, Google, and Oracle. The H200 will surpass Nvidia’s current top chip, the H100, mainly through an enhancement in high-bandwidth memory, a critical component that determines how much data the chip can process swiftly.

The H200 is designed to significantly improve the performance of AI services, like OpenAI’s ChatGPT and other generative AI systems that produce human-like responses. The increase in high-bandwidth memory, from 80 gigabytes in the H100 to 141 gigabytes in the H200, along with a faster connection to the chip’s processing elements, will enable these AI services to deliver responses more quickly.

While Nvidia hasn’t disclosed the suppliers for the memory on the new H200 chip, Micron Technology has expressed intentions to become a supplier. Nvidia also sources memory from SK Hynix, a Korean company that noted a boost in sales due to AI chips. The H200 will be made available through various cloud service providers, including Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure, as well as specialized AI cloud providers like CoreWeave, Lambda, and Vultr.

  • Dell, Hugging Face Partner for LLM Deployment Ease

Dell and Hugging Face are joining forces to address the challenges enterprises face in adopting large language models (LLMs) and generative AI. This partnership is focused on easing the deployment of customized LLMs on-premises, a move aimed at making the most out of this evolving technology. To facilitate this, they will create a new Dell portal on the Hugging Face platform, which will include customized resources for deploying open-source models on Dell servers and data storage systems. The service, starting with Dell PowerEdge servers, will be accessible through the APEX console and is expected to extend to other Dell tools like Precision workstations.

This collaboration is set to simplify the complexities associated with generative AI, such as ensuring the security and privacy of sensitive data and managing the resource-intensive process of fine-tuning AI models. Over time, the portal will release updated containers with models optimized for Dell infrastructure, supporting new generative AI use cases and models. This initiative is part of Dell’s strategy to establish itself as a leader in generative AI and follows other moves by the company, such as expanding its AI portfolio and adding new tools geared towards AI and analytics workflows.

The partnership aims to empower organizations to become not just users but builders of AI, leveraging open-source models for customization. This approach addresses the challenges in progressing generative AI projects from proof of concept to production, ensuring data privacy, and managing ROI and cost effectively. The Dell-Hugging Face portal will offer curated sets of models selected for performance, accuracy, use cases, and licenses, enabling organizations to deploy AI within their infrastructure tailored to their specific needs.

  • Tencent Adapts to U.S. Chip Restrictions

Tencent Holdings, a major Chinese technology firm, revealed on Wednesday that it holds a significant stockpile of AI chips from U.S. manufacturer Nvidia, but expressed concern over the long-term impact of the U.S.’s expanded restrictions on high-end chip sales to China on its cloud services. During an analyst call following Tencent’s third-quarter earnings report, President Martin Lau said that the U.S.’s recent ban on exporting more AI-related chips to China will compel Tencent to use its current chip stock more efficiently and to seek domestically produced alternatives.

Despite having enough Nvidia chips for further development of its “Hunyuan” AI model for several more iterations, the restrictions are expected to affect Tencent’s ability to offer cloud services that involve reselling computing power to clients. Nvidia, which holds about 90% of the Chinese AI chip market, sees its dominance threatened as Chinese companies are increasingly turning to domestic chipmakers like Huawei Technologies.

Lau also explained that Tencent plans to optimize the use of its Nvidia H800 AI chips, designed specifically for China, by focusing them on the most crucial aspects of AI model development, such as training. Meanwhile, Nvidia is reportedly preparing to market new AI chips for China that comply with the latest U.S. export regulations, featuring most of Nvidia’s advanced AI capabilities but with reduced computing power.

  • Google Sues Over Fake Bard AI Downloads

Google has initiated a lawsuit in a California federal court against unnamed individuals for allegedly marketing counterfeit downloads of its AI chatbot, Bard, to distribute malware. The tech giant accuses these anonymous scammers of misusing its trademarks, such as “Google AI” and “AIGoogleBard,” to deceive users into installing malware that compromises social media login credentials. Google’s general counsel, Halimah DeLaine Prado, highlighted that the scammers have misled people globally, prompting the company to issue nearly 300 takedown requests.

The lawsuit explains that the scammers advertise free downloads of Bard through social media and web pages, which in reality facilitate the installation of malware. The victims, including small businesses and Facebook advertisers, unknowingly download this malware, leading to the hijacking of their social media accounts. Google asserts that these actions violate its terms of service, as the scammers impersonate the company and use Google’s platforms like Google Sites and Google Drive to host the malware. The lawsuit aims to halt this scheme, seeking the scammers’ profits and other monetary damages.

  • Adobe previews new AI-powered audio tool to revolutionize voice

Adobe has unveiled Project Sound Lift, an innovative AI-powered audio tool designed to transform audio editing and manipulation. This advanced tool provides a one-click solution that enables users to effortlessly manipulate audio recordings in various scenarios. It enhances, transforms, and controls speech and sound independently, leveraging the capabilities of artificial intelligence.

Project Sound Lift’s standout feature is its ability to separate background noise from a person’s voice within an audio recording. Users can import an audio file and select specific sounds to filter out, such as applause, laughter, or traffic noise. The tool automatically detects and isolates the chosen sound, facilitating individual editing of each audio track.

The tool represents a significant advancement in the field of audio processing, offering unprecedented precision in separating audio layers within a recording. While it is still in the “Sneak” preview stage, demonstrations have showcased its potential. Project Sound Lift is expected to enable users to concentrate on specific sounds, like voices or musical instruments, by effectively filtering out unwanted background noise.

  • Microsoft to use AI tool Be My Eyes to serve visually impaired

Microsoft has announced a partnership with the app Be My Eyes, designed to assist blind and visually impaired individuals. Be My Eyes, established in 2015, aims to connect visually impaired or blind people with sighted volunteers for assistance with various tasks. This collaboration enhances the accessibility of Microsoft’s customer support for visually impaired users.

The partnership leverages OpenAI’s GPT-4 model within the Be My Eyes app. This integration allows the app to provide written descriptions of images captured by users, facilitating better support and independence for visually impaired or blind individuals. Microsoft has utilized GPT-4’s advanced capabilities to offer virtual technical support, enabling users to feed images to the AI chatbot for assistance.
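
Functionally, an integration like this comes down to sending a captured image to a vision-capable GPT-4-class model and reading back a textual description. The snippet below is a minimal sketch using the OpenAI Python SDK; the model name, file path, and prompt are placeholders, and this is not Be My Eyes’ production code.

```python
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

with open("photo.jpg", "rb") as f:  # hypothetical image captured by the user
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",  # any vision-capable GPT-4-class model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Describe this photo for a blind user, including any readable text."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```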

Be My Eyes has previously conducted tests using AI-powered visual customer service in collaboration with Microsoft, observing that only 10% of users who interacted with the AI felt the need to escalate their queries to a human agent. This indicates the effectiveness of the AI model in addressing the needs of visually impaired users, significantly improving their daily experience through enhanced technology and support.

About The Author

Bogdan Iancu

Bogdan Iancu is a seasoned entrepreneur and strategic leader with over 25 years of experience in diverse industrial and commercial fields. His passion for AI, Machine Learning, and Generative AI is underpinned by a deep understanding of advanced calculus, enabling him to leverage these technologies to drive innovation and growth. As a Non-Executive Director, Bogdan brings a wealth of experience and a unique perspective to the boardroom, contributing to robust strategic decisions. With a proven track record of assisting clients worldwide, Bogdan is committed to harnessing the power of AI to transform businesses and create sustainable growth in the digital age.