Jensen Huang's latest CES speech: AI Agents are expected to become the next robotics industry, with a scale reaching trillions of dollars

2025-01-07 17:27:42
NVIDIA is bringing AI from the cloud to personal devices and enterprises, covering all computing needs from developers to ordinary users.

Compiled by: Youxin

At CES 2025, which opened this morning, NVIDIA founder and CEO Jensen Huang delivered a milestone keynote revealing the future of AI and computing. From the token, the core concept of generative AI, to the launch of GPUs built on the new Blackwell architecture, to an AI-driven digital future, the wide-ranging speech will profoundly shape the entire industry.

1) From Generative AI to Agentic AI: The Dawn of a New Era

  • The Birth of Tokens: As the core driving force of generative AI, tokens transform text into knowledge, breathe life into images, and open up new forms of digital expression.

  • The Evolution of AI: From perceptual AI and generative AI to Agentic AI capable of reasoning, planning, and acting, AI technology continues to reach new heights.

  • The Revolution of Transformers: Since its introduction in 2018, this technology has redefined the way we compute, completely overturning traditional tech stacks.

2) Blackwell GPU: Breaking Performance Limits

  • Next-Generation GeForce RTX 50 Series: Built on the Blackwell architecture with 92 billion transistors, it delivers 4000 TOPS of AI performance (4 PetaFLOPS of AI compute), three times that of the previous generation.

  • The Fusion of AI and Graphics: For the first time, programmable shaders are combined with neural networks, introducing neural texture compression and material shading technology, delivering stunning rendering effects.

  • Accessible High Performance: The RTX 5070 laptop achieves RTX 4090 performance at a price of $1299, promoting the democratization of high-performance computing.

3) Multi-Domain Expansion of AI Applications

  • Enterprise AI Agents: NVIDIA provides tools like NeMo and Llama Nemotron to help businesses build digital employees capable of autonomous reasoning, enabling intelligent management and services.

  • Physical AI: Through the Omniverse and Cosmos platforms, AI is being integrated into industry, autonomous driving, and robotics, redefining global manufacturing and logistics.

  • Future Computing Scenarios: NVIDIA is bringing AI from the cloud to personal devices and enterprises, covering all computing needs from developers to ordinary users.

Key Points from Jensen Huang's Speech:

This is the birthplace of intelligence, a new kind of factory: a generator of tokens. Tokens are the building blocks of AI, opening a new domain and taking the first step into an extraordinary world. Tokens transform text into knowledge and breathe life into images; they turn creativity into videos and help us navigate any environment safely; they teach robots to move like masters and inspire us to celebrate victories in new ways. In our most desperate moments, tokens can bring inner peace. They give digital data meaning, helping us better understand the world, predict potential dangers, and find ways to heal the threats within. They can make our visions come true, restoring everything we have lost.

All of this began in 1993 when NVIDIA launched its first product—the NV1. We wanted to create computers that could do things ordinary computers could not, making it possible to have gaming consoles in PCs. Then, in 1999, NVIDIA invented the programmable GPU, starting over 20 years of technological advancement, making modern computer graphics possible. Six years later, we launched CUDA, expressing the programmability of GPUs through rich algorithms. This technology was initially hard to explain, but by 2012, the success of AlexNet validated CUDA's potential, driving breakthrough developments in AI.

Since then, AI has developed at an astonishing pace. From perceptual AI to generative AI, and now to Agentic AI that can perceive, reason, plan, and act, AI's capabilities continue to grow. In 2018, Google introduced the Transformer, and the world of AI truly took off. The Transformer not only completely changed the landscape of AI but also redefined the entire computing field. We realized that machine learning is not just a new application or business opportunity but a fundamental revolution in the way we compute. Every layer of the tech stack underwent significant change, from manually written instructions to neural networks optimized by machine learning.

Today, AI applications are ubiquitous. Whether understanding text, images, or sounds, or translating amino-acid sequences and physical phenomena, AI can handle it all. Almost every AI application can be distilled into three questions: What modality of information did it learn? What modality does it translate from? What modality does it generate? This fundamental concept drives every AI-powered application.
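The three questions above amount to a simple input-to-output modality mapping. The application names and pairings below are illustrative assumptions, not a taxonomy from the speech:

```python
# Hypothetical examples: each AI application reduces to a triple of
# (modality learned, modality translated from, modality generated).
applications = {
    "chatbot":           ("text", "text", "text"),
    "image_generator":   ("image+text", "text", "image"),
    "speech_recognizer": ("audio+text", "audio", "text"),
    "protein_model":     ("amino acids", "amino acids", "structure"),
}

def describe(name):
    learned, source, generated = applications[name]
    return f"{name}: learned {learned}, translates {source} into {generated}"

print(describe("image_generator"))
```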

All these achievements are supported by GeForce. GeForce brought AI to the masses, and now AI is returning to GeForce. With real-time ray tracing, we can render graphics with stunning effects. With DLSS, AI goes beyond upscaling to frame generation, predicting future frames. Of 33 million pixels, only 2 million are computed; the rest are predicted and generated by AI. This remarkable technology showcases the power of AI, making computation more efficient and revealing the infinite possibilities of the future.
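The pixel counts quoted above can be sanity-checked with simple arithmetic: 33 million pixels is roughly an 8K frame, and 2 million roughly a 1080p frame (the frame sizes are our reading, not stated in the speech):

```python
# Back-of-the-envelope check of the DLSS claim: only ~2M of ~33M
# pixels are rendered; the rest are predicted by AI.
total_pixels = 7680 * 4320      # ~33.2M, an 8K frame
computed_pixels = 1920 * 1080   # ~2.07M, a 1080p frame

rendered_fraction = computed_pixels / total_pixels
ai_generated_fraction = 1 - rendered_fraction

# About 6% of pixels are rendered; roughly 94% are AI-generated.
print(f"rendered: {rendered_fraction:.1%}, AI-generated: {ai_generated_fraction:.1%}")
```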

This is why so many amazing things are happening now. We have driven the development of AI with GeForce, and now AI is completely revolutionizing GeForce. Today, we announce the next generation of products—the RTX Blackwell family. Let's take a look.

This is the new GeForce RTX 50 series, based on the Blackwell architecture. This GPU is a performance monster: 92 billion transistors, 4000 TOPS of AI performance, and 4 PetaFLOPS of AI compute, three times the previous Ada architecture. All of this exists to generate the stunning pixels I just showed you. It also delivers 380 ray-tracing TeraFLOPS, providing the most beautiful image quality for the pixels that must be computed, along with 125 shader TeraFLOPS. The card uses Micron's GDDR7 memory, achieving 1.8 TB per second, double the previous generation.

We can now combine AI workloads with computer graphics workloads, and one extraordinary feature of this generation of products is that programmable shaders can also handle neural networks. This has led us to invent neural texture compression and neural material shading. These technologies learn textures and compression algorithms through AI, ultimately generating stunning visual effects that only AI can achieve.

Even in mechanical design, this graphics card is a marvel. It features a dual-fan design, and the entire card acts like a giant fan, with the internal voltage regulation module being state-of-the-art. Such an exceptional design is entirely thanks to the efforts of the engineering team.

Next, let's look at performance comparisons. The familiar RTX 4090, priced at $1599, is a core investment in a home PC entertainment center. The RTX 50 series now starts at just $549, with the RTX 5070 matching the RTX 4090's performance and the RTX 5090 doubling it.

Even more astonishing, we have put this high-performance GPU into laptops. The RTX 5070 laptop, priced at $1299, delivers RTX 4090 performance. This design combines AI and computer graphics technology, achieving high efficiency and high performance.

The future of computer graphics will be neural rendering—the fusion of AI and computer graphics. The Blackwell series can even be implemented in laptops with a thickness of only 14.9 mm, with the entire product line from RTX 5070 to RTX 5090 being compatible with ultra-thin laptops.

GeForce has driven the popularization of AI, and now AI is completely transforming GeForce. This is a mutual promotion of technology and intelligence, and we are moving towards a higher realm.

Three Scaling Laws of AI

Next, let's talk about the development direction of AI.

1) Pre-training Scaling Law

The AI industry is rapidly expanding, driven by a powerful model known as the "Scaling Law." This empirical rule has been repeatedly validated by researchers and industry practitioners, indicating that the larger the scale of training data, the larger the scale of the model, and the more computational power invested, the stronger the model's capabilities will be.

The growth rate of data is accelerating exponentially. It is estimated that in the coming years, the amount of data produced by humans each year will exceed the total produced throughout human history. This data is becoming multimodal, including forms such as video, images, and sounds. This vast amount of data can be used to train the foundational knowledge system of AI, laying a solid knowledge foundation for AI.
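The "more data, bigger model, more compute" rule above can be illustrated with a power-law loss curve. The functional form and constants below follow the published Chinchilla-style fit and are purely illustrative, not figures from the speech:

```python
# Illustrative Chinchilla-style scaling law: loss falls as a power law
# in parameter count N and training tokens D. Constants are the
# published Chinchilla fit, used here only for illustration.
def loss(n_params, n_tokens, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    return E + A / n_params**alpha + B / n_tokens**beta

small = loss(1e9, 2e10)     # 1B params trained on 20B tokens
large = loss(7e10, 1.4e12)  # 70B params trained on 1.4T tokens

# Scaling up data, model size, and (implicitly) compute lowers loss.
assert large < small
```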

2) Post-training Scaling Law

In addition, two other Scaling Laws are emerging.

The second Scaling Law is the "Post-training Scaling Law," which involves technologies such as reinforcement learning and human feedback. In this way, AI generates answers based on human queries and continuously improves from human feedback. This reinforcement learning system helps AI refine skills in specific areas through high-quality prompts, such as becoming better at solving math problems or performing complex reasoning.

The future of AI is not just about perception and generation, but a process of continuous self-improvement and boundary-breaking. It is like having a mentor or coach who provides feedback after you complete a task. Through testing, feedback, and self-improvement, AI can also progress through similar reinforcement learning and feedback mechanisms. This post-training phase of reinforcement learning, combined with synthetic data generation techniques, resembles a self-practice process. AI can face complex and verifiable challenges, such as proving theorems or solving geometric problems, continuously optimizing its answers through reinforcement learning. Although this post-training requires substantial computational power, it can ultimately create extraordinary models.
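The test-feedback-improve loop described above can be sketched as a toy reinforcement process. The two "reasoning strategies" and their accuracies are invented stand-ins for sampled model behaviors, and the verifiable reward stands in for human feedback or a theorem checker:

```python
import random
random.seed(0)

# Two candidate strategies: careless (right 50% of the time) and
# careful (right 95%). Reinforcement from a verifiable reward teaches
# an epsilon-greedy policy to prefer the careful strategy.
accuracy = {"careless": 0.5, "careful": 0.95}
value = {"careless": 0.0, "careful": 0.0}   # estimated reward per strategy
counts = {"careless": 0, "careful": 0}

for step in range(5000):
    # Mostly exploit the best-known strategy, sometimes explore.
    if random.random() < 0.1:
        strategy = random.choice(list(accuracy))
    else:
        strategy = max(value, key=value.get)
    reward = 1.0 if random.random() < accuracy[strategy] else 0.0
    counts[strategy] += 1
    # Incremental mean: refine the value estimate from feedback.
    value[strategy] += (reward - value[strategy]) / counts[strategy]

assert value["careful"] > value["careless"]
```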

3) Test-time Scaling Law

The test-time Scaling Law is also gradually emerging. This law shows unique potential when AI is actually used. AI can dynamically allocate resources during inference, no longer limited to parameter optimization but focusing on computational allocation to produce the high-quality answers required.

This process is akin to reasoning thought rather than direct inference or one-time answers. AI can break down problems into multiple steps, generate multiple solutions, and evaluate them, ultimately selecting the optimal solution. This long-duration reasoning is significantly effective in enhancing model capabilities.
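This "generate several solutions, evaluate them, pick the best" pattern is essentially best-of-N sampling. In the sketch below the generator and scorer are toy stand-ins for a model and a verifier; the point is only that more inference-time compute yields better selected answers:

```python
import random
random.seed(42)

def generate_solution():
    # Stand-in for one sampled chain of thought; quality is random.
    return random.random()

def score(solution):
    # Stand-in for a verifier or reward model judging the solution.
    return solution

def answer(n_samples):
    # Test-time scaling: more compute means more candidates to pick from.
    candidates = [generate_solution() for _ in range(n_samples)]
    return max(candidates, key=score)

# Average answer quality with 1 sample vs. 16 samples per query.
cheap = sum(answer(1) for _ in range(500)) / 500
expensive = sum(answer(16) for _ in range(500)) / 500
assert expensive > cheap   # extra test-time compute buys better answers
```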

We have seen the evolution of this technology, from ChatGPT to GPT-4, and now to the current Gemini Pro, all of which are experiencing gradual development in pre-training, post-training, and test-time scaling. Achieving these breakthroughs requires immense computational power, which is the core value of NVIDIA's Blackwell architecture.

The Latest on the Blackwell Architecture

The Blackwell system is now in full production, and its performance is astonishing. Today, every cloud service provider is deploying these systems, manufactured by 45 factories worldwide, supporting up to 200 configurations, including liquid cooling, air cooling, x86 architecture, and NVIDIA Grace CPU versions.

Its core component, the NVLink system, weighs 1.5 tons and contains 600,000 parts, equivalent to the complexity of 20 cars, connected by 2 miles of copper wire and 5,000 cables. The entire manufacturing process is extremely complex, but the goal is to meet the ever-expanding demand for computing.

Compared to the previous generation architecture, Blackwell has improved performance per watt by 4 times and performance per dollar by 3 times. This means that for the same cost, the scale of training models can increase by 3 times, and the key behind these improvements is the generative AI tokens. These tokens are widely used in ChatGPT, Gemini, and various AI services, forming the foundation of future computing.

On this basis, NVIDIA has promoted a new computing paradigm: neural rendering, perfectly integrating AI with computer graphics. Under the Blackwell architecture, 72 GPUs form the world's largest single-chip system, providing up to 1.4 ExaFLOPS of AI floating-point performance, with a memory bandwidth of an astonishing 1.2 PB/s, equivalent to the total of all global internet traffic. This supercomputing capability enables AI to handle more complex reasoning tasks while significantly reducing costs, laying the foundation for more efficient computing.
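Dividing the quoted rack-scale totals by the 72 GPUs gives rough per-GPU figures; treat these as order-of-magnitude checks derived from the speech's numbers, not official specifications:

```python
# Order-of-magnitude check of the quoted 72-GPU system numbers.
gpus = 72
total_ai_flops = 1.4e18    # 1.4 ExaFLOPS of AI floating-point performance
total_bandwidth = 1.2e15   # 1.2 PB/s aggregate memory bandwidth

per_gpu_flops = total_ai_flops / gpus       # ~1.9e16, about 20 PetaFLOPS each
per_gpu_bandwidth = total_bandwidth / gpus  # ~1.7e13, about 17 TB/s each
```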

AI Agent Systems and Ecosystem

Looking ahead, the reasoning process of AI will no longer be a simple single-step response but will be closer to "internal dialogue." Future AI will not only generate answers but will also reflect, reason, and continuously optimize. As the generation rate of AI tokens increases and costs decrease, the quality of AI services will significantly improve, meeting broader application needs.

To help enterprises build AI systems with autonomous reasoning capabilities, NVIDIA provides three key tools: NVIDIA NeMo, AI microservices, and acceleration libraries. By packaging complex CUDA software and deep learning models into containerized services, enterprises can deploy these AI models on any cloud platform, quickly developing AI agents tailored to specific domains, such as service tools supporting enterprise management or digital employees for user interaction.

These models open up new possibilities for enterprises, lowering the development threshold for AI applications and pushing the entire industry to take solid steps towards Agentic AI (autonomous AI). Future AI will become digital employees, easily integrated into enterprise tools like SAP and ServiceNow, providing intelligent services to customers in various environments. This is the next milestone in AI expansion and the core vision of NVIDIA's technology ecosystem.

Then there are training and evaluation systems. In the future, these AI agents will essentially work alongside employees, completing tasks for you. Introducing these specialized agents into your company is therefore akin to onboarding new employees. We provide different toolkits to help these AI agents learn the company's unique language, vocabulary, business processes, and ways of working. You provide them with examples of desired work output, they attempt to produce it, and you give feedback, run evaluations, and so on. You also set limits: clearly defining what actions they cannot perform, what they cannot say, and what information they can access. This entire digital-employee pipeline is called NeMo. To some extent, every company's IT department will become the HR department for AI agents.
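The limit-setting step described above, defining what an agent may not do or say and what it may access, can be sketched as a simple policy check. The rule names and the `vet` interface below are hypothetical illustrations, not the actual NeMo Guardrails API:

```python
# Hypothetical guardrail check for a digital employee. Real systems
# are far richer; this only shows the shape of the idea.
FORBIDDEN_ACTIONS = {"delete_records", "send_external_email"}
ALLOWED_SOURCES = {"product_docs", "hr_handbook"}

def vet(action, sources):
    """Return True only if the agent's planned step respects its limits."""
    if action in FORBIDDEN_ACTIONS:
        return False
    # The agent may only draw on explicitly allowed information sources.
    return all(s in ALLOWED_SOURCES for s in sources)

assert vet("answer_question", ["product_docs"])
assert not vet("delete_records", ["product_docs"])
assert not vet("answer_question", ["customer_pii"])
```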

Today, IT departments manage and maintain a large number of software; in the future, they will manage, train, onboard, and improve a large number of digital agents to serve the company. Therefore, IT departments will gradually evolve into the HR departments for AI agents.

In addition, we provide many open-source blueprints for the ecosystem to use, and users can freely modify them. We have blueprints for various types of agents. Today, we are also announcing something very cool and smart: a new family of models based on Llama, the NVIDIA Llama Nemotron language foundation model series.

Llama 3.1 is a phenomenal model. Meta's Llama 3.1 has been downloaded approximately 350,650,000 times and has spawned around 60,000 other models. This is one of the core reasons driving almost all enterprises and industries to start researching AI. We realized that Llama models could be better fine-tuned for enterprise use cases. Utilizing our expertise and capabilities, we have fine-tuned it into the Llama Nemotron open model suite.

These models come in different sizes: small models that respond quickly; the mainstream Super model, a general-purpose workhorse; and the largest Ultra model, which can serve as a teacher model for evaluating other models, generating answers and judging their quality, or as a source for knowledge distillation. All of these models are online now.

These models perform excellently, ranking high in areas such as dialogue, instruction, and information retrieval, making them very suitable for AI agent functions globally.

Our collaboration with the ecosystem is also very close: with ServiceNow and SAP, and with Siemens in industrial AI. Companies like Cadence, Perplexity, and Codium are also undertaking outstanding projects; Perplexity has disrupted the search field. There are 30 million software engineers worldwide, and AI assistants will greatly enhance their productivity, making this the next huge application area for AI services. There are 1 billion knowledge workers globally, and AI agents could be the next robotics industry, with potential reaching trillions of dollars.

AI Agent Blueprints

Next, let's showcase some AI agent blueprints completed in collaboration with partners.

AI agents are the new digital workforce, capable of assisting or replacing humans in completing tasks. NVIDIA's Agentic AI building blocks, NIM pre-trained models, and the NeMo framework help organizations easily develop and deploy AI agents. These agents can be trained as domain-specific task experts.

Here are four examples:

  • Research Assistant Agent: Capable of reading complex documents such as lectures, journals, financial reports, etc., and generating interactive podcasts for easier learning;

  • Software Security AI Agent: Helps developers continuously scan for software vulnerabilities and prompts appropriate actions;

  • Virtual Laboratory AI Agent: Accelerates compound design and screening, quickly finding potential drug candidates;

  • Video Analysis AI Agent: Based on NVIDIA Metropolis blueprints, analyzes data from billions of cameras, generating interactive searches, summaries, and reports. For example, monitoring traffic flow, facility processes, and providing improvement suggestions.

The Era of Physical AI

We aim to bring AI from the cloud to every corner, including within companies and on personal PCs. NVIDIA is working to make Windows WSL 2 (Windows Subsystem for Linux 2) the preferred platform for AI. This will make it easier for developers and engineers to leverage NVIDIA's AI technology stack, including language models, image models, animation models, and more.

Additionally, NVIDIA has launched Cosmos, the first physical world foundation model development platform, focusing on understanding the dynamic characteristics of the physical world, such as gravity, friction, inertia, spatial relationships, and causality. It can generate videos and scenes that comply with physical laws, widely applied in training and validating robotics, industrial AI, and multimodal language models.

Cosmos connects with NVIDIA Omniverse for physically grounded simulation, generating realistic results. This combination is the core technology for developing robotics and industrial applications.

NVIDIA's industrial strategy is based on three computing systems:

  • DGX systems for training AI;

  • AGX systems for deploying AI;

  • Digital twin systems for reinforcement learning and AI optimization;

Through the collaborative work of these three systems, NVIDIA is driving the development of robotics and industrial AI, building the future digital world. Rather than a three-body problem, we have a "three-computer" solution.

Let me show you three examples of NVIDIA's vision for robotics.

1) Applications of Industrial Visualization

Currently, there are millions of factories and hundreds of thousands of warehouses worldwide, forming the backbone of a $50 trillion manufacturing sector. In the future, all of this will be software-defined and automated, with robotics integrated throughout. We are collaborating with KION, a leading warehouse automation solution provider, and Accenture, the world's largest professional services firm, focusing on digital manufacturing to create some very special solutions together. Our go-to-market approach is like that of other software and technology platforms: through developers and ecosystem partners. More and more ecosystem partners are connecting to the Omniverse platform, because everyone wants to visualize the future of industry. Within this $50 trillion slice of global GDP there is so much waste, and so many opportunities for automation.

Let's look at this example of the collaboration between KION and Accenture:

KION (a supply chain solutions company), Accenture (a global leader in professional services), and NVIDIA are bringing physical AI into the trillion-dollar warehouse and distribution center market. Efficient warehouse logistics requires navigating a complex web of decisions shaped by constantly changing variables: daily and seasonal demand fluctuations, space constraints, labor supply, and the integration of diverse robotics and automation systems. Today, predicting the key performance indicators (KPIs) of a physical warehouse's operations is nearly impossible.

To address these challenges, KION is adopting Mega (an NVIDIA Omniverse blueprint) to build industrial digital twins for testing and optimizing robot fleets. First, KION's warehouse management solution assigns tasks to the industrial AI brain in the digital twin, such as moving goods from buffer locations into shuttle storage. The robot fleet executes tasks in the simulated physical warehouse within Omniverse, perceiving and reasoning to plan its next actions, then acting. The digital twin uses sensor simulation so the robot brain can see the state after each task and decide what to do next. Under Mega's precise tracking, this cycle continues while operational KPIs such as throughput, efficiency, and utilization are measured, all before any change is made to the physical warehouse.
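The simulate-measure-then-deploy cycle above can be sketched as a minimal loop. The warehouse model, the congestion formula, and the candidate fleet sizes below are toy assumptions for illustration, not the Mega blueprint's actual interfaces:

```python
# Toy digital-twin loop: evaluate fleet-size changes entirely in
# simulation, measuring a throughput KPI before touching the
# physical warehouse. All numbers are invented for illustration.
def simulate_shift(n_robots, n_tasks=500):
    """Crude model: each robot moves ~60 totes per shift, but
    congestion grows once the fleet exceeds 10 robots."""
    capacity = n_robots * 60
    congestion = 1.0 - 0.02 * max(0, n_robots - 10)
    completed = min(n_tasks, int(capacity * max(congestion, 0.1)))
    return completed / n_tasks   # KPI: fraction of tasks completed

# Try candidate configurations in the digital twin and pick the best.
kpis = {n: simulate_shift(n) for n in (5, 10, 15, 20, 30)}
best_fleet = max(kpis, key=kpis.get)
```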

With NVIDIA's collaboration, KION and Accenture are redefining the future of industrial autonomy.

In the future, every factory will have a digital twin that is fully synchronized with the actual factory. You can use Omniverse and Cosmos to generate a multitude of future scenarios, and AI will determine the optimal KPI scenarios, deploying them as constraints and AI programming logic for the actual factory.

2) Autonomous Vehicles

The autonomous driving revolution has arrived. After years of development, the successes of both Waymo and Tesla have proven the maturity of autonomous driving technology. Our solutions provide three computing systems for this industry: systems for training AI (like DGX systems), systems for simulation testing and generating synthetic data (like Omniverse and Cosmos), and in-vehicle computing systems (like AGX systems). Almost all major automotive companies globally are collaborating with us, including Waymo, Zoox, Tesla, and the world's largest electric vehicle company BYD. Companies like Mercedes, Lucid, Rivian, Xiaomi, and Volvo, which are set to launch innovative models, are also involved. Aurora is using NVIDIA technology to develop autonomous trucks.

Each year, 100 million vehicles are manufactured, and there are 1 billion vehicles on the roads globally, accumulating trillions of miles driven each year. These will gradually achieve high levels of automation or full automation. This industry is expected to become the first robotic industry worth trillions of dollars.

Today, we announce the launch of the next-generation in-vehicle computer, Thor. It is a universal robotic computer capable of processing vast amounts of data from cameras, high-resolution radar, lidar, and other sensors. Thor is an upgrade to the current industry standard, Orin, with 20 times its computing power, and is now in full production. At the same time, NVIDIA's Drive OS is the first AI computing operating system certified to meet the highest functional safety standards (ISO 26262 ASIL D).

Autonomous Driving Data Factory

NVIDIA utilizes Omniverse AI models and the Cosmos platform to create an autonomous driving data factory, significantly expanding training data through synthetic driving scenarios. This includes:

  • OmniMap: Integrating maps and geospatial data to build drivable 3D environments;

  • Neural Reconstruction Engine: Generating high-fidelity 4D simulation environments from sensor logs and creating scene variants for training data;

  • Edify 3DS: Searching or generating new assets from asset libraries to create scenes for simulation.

Through these technologies, we expand thousands of driving scenarios into billions of miles of data for the development of safer and more advanced autonomous driving systems.
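The expansion from thousands of scenarios to billions of miles is multiplicative: each recorded scenario spawns many synthetic variants. The specific counts below are illustrative assumptions, not figures from the speech:

```python
# Illustrative multiplication showing how synthetic variants expand
# recorded data. All three counts are assumptions for illustration.
recorded_scenarios = 5_000
variants_per_scenario = 40_000  # weather, lighting, traffic, geometry...
miles_per_variant = 10

synthetic_miles = recorded_scenarios * variants_per_scenario * miles_per_variant
assert synthetic_miles == 2_000_000_000  # billions of miles from thousands of scenes
```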

3) General Robotics

The era of general robotics is approaching. The key to a breakthrough in this field lies in training. For humanoid robots, imitation data is relatively hard to acquire, but NVIDIA's Isaac GR00T provides a solution: it generates massive datasets through simulation and combines the multiverse simulation engines of Omniverse and Cosmos for policy training, validation, and deployment.

For example, developers can remotely operate robots using Apple Vision Pro, capturing data without physical robots and teaching task actions in a risk-free environment. Through Omniverse's domain randomization and 3D-to-real scene expansion capabilities, exponentially growing datasets are generated, providing vast resources for robot learning.
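Domain randomization, mentioned above, means perturbing simulation parameters for every training episode so a policy never overfits one particular simulated world. The parameter names and ranges below are hypothetical illustrations:

```python
import random
random.seed(7)

# Hypothetical domain randomization: each training episode draws new
# physical parameters, producing an exponentially varied dataset.
def randomize_domain():
    return {
        "friction":      random.uniform(0.4, 1.2),
        "object_mass":   random.uniform(0.1, 3.0),   # kg
        "light_level":   random.uniform(0.2, 1.0),
        "camera_jitter": random.uniform(0.0, 0.05),  # radians
    }

# Every episode presents the robot with a distinct physical world.
episodes = [randomize_domain() for _ in range(1000)]
assert all(0.4 <= e["friction"] <= 1.2 for e in episodes)
```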

In summary, whether it is industrial visualization, autonomous driving, or general robotics, NVIDIA's technology is leading the future transformation in the fields of physical AI and robotics.

Finally, I have one more important thing to show you. It stems from a project we initiated internally ten years ago, called Project DIGITS, officially the Deep Learning GPU Intelligence Training System.

Before its official release, we shortened the name to DGX to harmonize with the company's RTX, AGX, OVX, and other product lines. The advent of DGX-1 truly changed the direction of AI development, marking a milestone in NVIDIA's AI journey.

The Revolution of DGX-1

The original intention of DGX-1 was to give researchers and startups an out-of-the-box AI supercomputer. Earlier supercomputers required users to build dedicated facilities and design complex infrastructure around them; DGX-1, by contrast, was purpose-built for AI development and ready to use right out of the box.

I still remember delivering the first DGX-1 to a startup, OpenAI, in 2016. Elon Musk, Ilya Sutskever, and many NVIDIA engineers were there, and we celebrated its arrival together. That machine significantly propelled the development of AI computing.

Today, AI is everywhere. It is no longer limited to research institutions and startup labs; as I mentioned at the beginning, AI has become a new way of computing and of developing software. Every software engineer, every creative artist, and even ordinary computer users need an AI supercomputer. But I have always wished DGX-1 could be smaller.

The Latest AI Supercomputer

Here is NVIDIA's latest AI supercomputer. It still belongs to Project Digits, and we are currently looking for a better name, so suggestions are welcome. This is a truly amazing device.

This supercomputer can run NVIDIA's complete AI software stack, including DGX Cloud. It can serve as a cloud supercomputer, a high-performance workstation, or even an analysis workstation on your desktop. Most importantly, it is based on a new chip we developed in secret, codenamed GB10, our smallest Grace Blackwell.

I have a chip here to show you its internal design. This chip was developed in collaboration with the world's leading SoC company, MediaTek. This CPU SoC is custom-made for NVIDIA and connects to the Blackwell GPU using NVLink chip-to-chip interconnect technology. This small chip is now in full production. We expect this supercomputer to officially launch around May.

We even offer a "double the computing power" configuration: two of these devices can be connected via ConnectX with GPUDirect support. It is a complete supercomputing solution for AI development, analytics, and industrial applications.

Additionally, we announced the mass production of three new Blackwell system chips, the world's first physical AI foundation model, and breakthroughs in three major robotics fields—autonomous AI agent robots, humanoid robots, and autonomous vehicles.
