NTT R&D Forum 2023 Special Session 2: Nature-Inspired Intelligence and a New Paradigm for LLMs

NTT official channel
7 Dec 2023 · 35:28

TL;DR: At the NTT R&D Forum 2023, Sakana AI's founders presented their vision for a new AI paradigm inspired by nature. The company, named after the Japanese word for fish, emphasizes collective intelligence and evolution in AI. Co-founder Llion Jones, a co-creator of the Transformer architecture, advocated for character-level language modeling, highlighting its suitability for Japanese and its potential for accurate spelling in AI-generated content. The talk also argued that Japan should develop its own AI ecosystem, distinct from Western-centric models, to foster innovation and maintain cultural relevance in AI advancements.

Takeaways

  • 🐟 Sakana AI is a company founded on the principles of nature-inspired intelligence, focusing on concepts like collective intelligence and evolution.
  • 🧠 The co-founders, Llion Jones and the speaker, both have strong research backgrounds; Llion Jones is a co-inventor of the Transformer architecture.
  • 🌐 The speaker's background includes a unique journey from studying neural networks to working in investment banking, and eventually joining Google's Brain team.
  • 🏗️ Sakana AI challenges the conventional approach to AI development, advocating for a new paradigm that is more adaptive and less reliant on large, rigid models.
  • 🌟 The company's research is centered around generative AI, with a history of projects that explore evolution, multi-agent systems, and language models.
  • 🔋 Sakana AI believes that current large language models are energy-inefficient and prone to security issues, suggesting a need for a more nature-inspired approach.
  • 🌐 The speaker emphasizes the importance of Japan developing its own AI R&D ecosystem, advocating for a more diverse and decentralized AI landscape.
  • 🐠 The company's name, Sakana, symbolizes the collective behavior of fish, representing the idea of independent entities forming a coherent whole.
  • 📈 Sakana AI is exploring the potential of character-level language modeling, which could be particularly beneficial for languages like Japanese.
  • 🌱 The speaker highlights the potential of using principles from complex adaptive systems to create more resilient and adaptable AI models.

Q & A

  • What is the meaning behind the name 'Sakana AI'?

    -The name 'Sakana AI' is derived from the Japanese word 'Sakana' which means fish. The logo represents a swarm of fish forming a coherent entity from independent rules, symbolizing the company's research inspiration from nature, such as collective intelligence and evolution.

  • What is the background of the co-founder of Sakana AI mentioned in the transcript?

    -The co-founder has a research background from Google and has worked on generative AI. He studied engineering science at the University of Toronto with a focus on neural networks and later worked on Wall Street as a derivatives trader before joining Google's Brain team.

  • What is the significance of the Transformer architecture in the AI field?

    -The Transformer architecture, co-invented by Llion Jones, is significant because it powers AI systems ranging from chatbots like ChatGPT to image generation models like Stable Diffusion. It has become a foundational element in modern AI research and applications.

  • What is the main critique of current large language models according to the speaker?

    -The speaker argues that current large language models are highly energy-inefficient, rigid and therefore difficult to maintain, prone to security attacks, and in need of constant fine-tuning to avoid generating undesirable language. He suggests that a new approach inspired by complex adaptive systems is needed.

  • Why does the speaker believe that character-level language modeling is advantageous?

    -The speaker believes that character-level language modeling is advantageous because it simplifies the vocabulary creation process, reduces the number of parameters needed, and can potentially lead to more accurate spelling in generated text, especially beneficial for languages like Japanese.

  • What is the connection between the research presented at NTT R&D Forum and the concept of 'collective intelligence'?

    -The research presented at NTT R&D Forum is connected to 'collective intelligence' through the concept of training AI systems based on principles found in nature, such as swarm behavior and evolution, where multiple agents or models work together to achieve a task more effectively.

  • How does the speaker's experience in Investment Banking relate to his current work in AI?

    -The speaker's experience in Investment Banking provided him with insights into the complexities of financial markets and the limitations of models in capturing real-world intricacies, which has informed his approach to developing AI systems that can better adapt and evolve like natural systems.

  • What is the 'EvoJax' project mentioned in the transcript?

    -The 'EvoJax' project is a framework built on top of JAX that incorporates evolutionary and collective intelligence algorithms. It was used to train artificial-life creatures at Google scale, demonstrating the potential of these nature-inspired approaches in AI.

  • Why does the speaker think Japan is a good place to start an AI company?

    -The speaker believes Japan is an ideal place to start an AI company due to its supportive culture for AI, the growing interest in AI among the population, and the opportunity to develop a strong, independent AI R&D ecosystem that contributes to a more diverse global AI landscape.

  • What is the main argument for using character-level language models over word-level models in the context of Japanese language?

    -The main argument for using character-level language models over word-level models in the context of Japanese is that character-level models can more effectively handle the morphological variations inherent in the language, leading to better performance in tasks like question answering and image generation.

Outlines

00:00

🐟 Introduction to Sakana AI and Its Vision

The speaker begins by expressing gratitude for the opportunity to speak at the NTT R&D Forum and introduces Sakana AI, an AI R&D company he co-founded with his friend Llion Jones. The name 'Sakana' is derived from the Japanese word for fish, symbolizing the collective intelligence and independent thinking the company embodies. The speaker highlights the company's focus on nature-inspired research, such as collective intelligence and evolution, and the importance of non-conformity in pursuing innovative AI solutions. His background includes studying engineering science with a focus on neural networks at the University of Toronto and later working on Wall Street. He then moved to Google's Brain team, where he contributed to generative AI research, including projects on multi-agent systems, evolution, and language models. The talk covers Sakana AI's team, its technical vision, and the rationale for starting the company in Japan.

05:01

🧠 Challenging AI Norms and the Need for New Foundation Models

The speaker discusses his skepticism about the current approach to AI development, particularly the belief that simply scaling up existing models will lead to strong AI. He advocates for a new type of foundation model inspired by complex adaptive systems, which are more adaptive and more tightly integrated with their environment. He critiques the energy inefficiency and rigidity of large language models, noting that they are prone to security attacks and require constant fine-tuning. He references his own research, including a paper on 'collective intelligence for deep learning' that surveys nature-inspired systems. The speaker also mentions his work on 'weight agnostic neural networks', which demonstrated that neural network architectures can function effectively even with randomized weights, challenging the prevailing paradigm of AI model training.
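
To make the weight-agnostic result concrete, here is a minimal sketch of the evaluation protocol: fix a network topology, substitute a single shared value for every connection weight, and measure performance across a range of values. The toy XOR task and the randomly chosen topology are our illustration, not the paper's evolved architectures, so the point is the protocol rather than the score.

```python
import numpy as np

def forward(x, w, mask):
    """Two-layer net in which every active connection shares the same weight w.
    `mask` is a fixed 0/1 connectivity pattern, i.e. the 'architecture'."""
    h = np.tanh(x @ (w * mask["in_hid"]))      # hidden layer, shared weight
    return np.tanh(h @ (w * mask["hid_out"]))  # output layer, shared weight

rng = np.random.default_rng(0)
mask = {  # a fixed, randomly chosen topology (the paper evolves this instead)
    "in_hid": (rng.random((2, 4)) < 0.5).astype(float),
    "hid_out": (rng.random((4, 1)) < 0.5).astype(float),
}

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

# Evaluate the SAME topology over a range of shared weight values: in a true
# weight-agnostic architecture, performance holds up across many of them.
for w in (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0):
    mse = float(np.mean((forward(X, w, mask) - y) ** 2))
    print(f"shared weight {w:+.1f} -> MSE {mse:.3f}")
```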

10:03

🌐 Sakana AI's Focus on Japan and the Global AI Ecosystem

The speaker explains the decision to establish Sakana AI in Japan, noting the country's growing interest in AI and its potential to develop a strong AI R&D ecosystem. He expresses concern that the AI field is currently too centered on the West and China, and emphasizes the importance of Japan, as a democratic country, becoming a leader in AI. He believes Japan's economic strength and cultural support for AI provide a solid foundation for building a global AI ecosystem. He also addresses the challenge of attracting AI talent to Japan, countering the notion that it is difficult by arguing that Japan's appeal makes it easier to draw top talent; since the company's launch, they have received numerous resumes from qualified candidates worldwide, indicating strong interest in joining their efforts.

15:04

🔤 The Case for Character-Level Language Modeling

The speaker, known for his work on the Transformer model, discusses his research on character-level language modeling. He explains the concept and its advantages, such as not requiring a predefined vocabulary and potentially being more efficient for languages like Japanese. Despite the increased computational demands, character-level models can handle out-of-vocabulary words better and may be more suitable for morphologically rich languages. The speaker highlights his early work on character-level models for question answering and how pre-training these models improved their performance. He also touches on the cultural bias in AI research towards English and the need for Japanese companies to embrace character-level modeling given its benefits for the Japanese language.
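
A minimal sketch of the contrast the speaker draws, using a toy word vocabulary of our own invention: word-level tokenization needs a predefined vocabulary and maps unseen words to an unknown token, while character-level tokenization has no out-of-vocabulary problem, at the cost of longer sequences.

```python
text = "language modelling of Japanese 言語モデル"

# Word-level: the vocabulary must be fixed in advance; unseen words collapse
# to <unk>. (This three-entry vocabulary is deliberately tiny.)
word_vocab = {"language": 0, "of": 1, "<unk>": 2}
word_ids = [word_vocab.get(w, word_vocab["<unk>"]) for w in text.split()]
print(word_ids)  # 'modelling', 'Japanese', '言語モデル' all become <unk>

# Character-level: the vocabulary is just the characters themselves, so there
# is no out-of-vocabulary problem, at the cost of longer sequences.
char_vocab = {c: i for i, c in enumerate(sorted(set(text)))}
char_ids = [char_vocab[c] for c in text]
print(len(word_ids), len(char_ids))  # many more tokens, but nothing is lost
```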

20:06

📈 The Evolution and Impact of Language Model Pre-Training

The speaker delves into the history and impact of language model pre-training, starting with his 2016 paper that suggested its use for character-based models. He discusses how pre-training has become a standard practice, particularly with the success of Transformer models. The speaker's work showed significant improvements in language modeling performance when using character-level models, leading to a substantial reduction in bits per character. He emphasizes the importance of these findings, which were pivotal in the development of large-scale language models like GPT. Despite the success of word-level models, the speaker maintains that character-level modeling is a more logical and efficient approach, especially for languages with complex character systems.
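
For reference, bits per character is simply the model's average negative log-probability per character, expressed in base 2; the toy probabilities below are illustrative, not figures from the talk.

```python
import math

# Bits per character: the average negative log2-probability the model assigns
# to each observed character. Lower is better; these numbers are toy values.
probs = [0.5, 0.25, 0.9, 0.1]  # model probability for each next character
bpc = -sum(math.log2(p) for p in probs) / len(probs)
print(f"{bpc:.3f} bits per character")

# Frameworks usually report cross-entropy in nats; convert by dividing by ln 2.
nats = -sum(math.log(p) for p in probs) / len(probs)
print(f"{nats / math.log(2):.3f} bits per character (same number)")
```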

25:07

🔠 Addressing the Limitations of Word-Level Language Models

The speaker addresses the limitations of word-level language models, particularly their inability to process characters within words. He uses examples, such as the difficulty of spelling words in reverse, to illustrate the challenges these models face. The speaker argues that character-level models can overcome these issues, as they can spell and manipulate characters more accurately. He also points out that large language models, despite their impressive capabilities, still struggle with spelling in image generation tasks when using word-level tokenization. The speaker concludes by advocating for a shift to character-level modeling, which he believes is not only simpler but also a better fit for languages like Japanese and potentially others with character-based writing systems.
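
A toy illustration of the spelling argument, with made-up token ids: under word-level tokenization a word becomes one opaque id and its characters are invisible to the model, whereas at the character level a task like reversing a word's spelling operates directly on what the model sees.

```python
# Hypothetical token ids, purely for illustration.
word_vocab = {"hello": 1017, "world": 2048}

def word_tokenize(text):
    return [word_vocab[w] for w in text.split()]

def char_tokenize(text):
    return list(text)

print(word_tokenize("hello world"))  # [1017, 2048]: the characters are invisible
print(char_tokenize("hello world"))  # every character is an explicit token

# Reversing the spelling of 'hello' is a trivial sequence operation at the
# character level, because the characters are exactly what the model sees:
print(char_tokenize("hello")[::-1])  # ['o', 'l', 'l', 'e', 'h']
# At the word level the model would have to memorize the spelling of id 1017.
```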

30:07

🗣️ The Importance of Language Diversity in AI Development

In the final segment, the speaker engages in a discussion about the importance of language diversity in AI, particularly the Japanese language's role in developing AI models. He agrees with the idea that understanding Japanese can provide a different perspective on problem-solving, which is valuable for AI development. The speaker also touches on the potential for character-level modeling to be applied to other languages, such as Korean and Chinese, and the benefits of multilingual training for AI models. He concludes by suggesting that a universal language structure may exist, as evidenced by AI's ability to translate between language pairs it was not explicitly trained on, indicating a deeper understanding of language at a core level.

Keywords

💡NTT R&D Forum

The NTT R&D Forum is a research and development event organized by Nippon Telegraph and Telephone Corporation (NTT), a leading telecommunications company in Japan. The forum serves as a platform for discussing and presenting cutting-edge research and innovations in technology. In the context of the video, it is the venue where the speaker, representing Sakana AI, is invited to talk about their company and its vision for AI development.

💡Sakana AI

Sakana AI is an AI research and development company founded by the speaker and his friend, Llion Jones. The name 'Sakana' is derived from the Japanese word for fish, symbolizing the concept of collective intelligence and the independent yet coordinated behavior of fish schools. The company aims to explore nature-inspired AI concepts, such as collective intelligence and evolution, to develop new paradigms in AI technology.

💡Collective Intelligence

Collective intelligence refers to the idea that groups of individuals can exhibit higher intelligence or problem-solving capabilities when working together than they can individually. In the video, the speaker discusses how Sakana AI is inspired by this concept, aiming to develop AI systems that mimic the coordinated behavior seen in nature, such as fish schools or swarms, to create more adaptive and intelligent AI models.
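
As a rough sketch of this idea, the few lines below simulate agents that follow only simple shared rules (steer toward the group's centre, align with the average heading) and nevertheless stay together as a coherent school. The rules and parameters are illustrative simplifications; real boids-style models use nearest neighbours rather than the global mean.

```python
import numpy as np

rng = np.random.default_rng(0)
pos = rng.normal(size=(30, 2))        # 30 agents in the plane
vel = rng.normal(size=(30, 2)) * 0.1

for _ in range(100):
    cohesion = (pos.mean(axis=0) - pos) * 0.05   # steer toward the centre
    alignment = (vel.mean(axis=0) - vel) * 0.05  # match the average heading
    vel = vel + cohesion + alignment
    pos = pos + vel

spread = float(np.std(pos - pos.mean(axis=0)))
print(f"spread after 100 steps: {spread:.2f}")  # bounded: the school holds together
```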

💡Evolution

In the context of the video, evolution is discussed as a source of inspiration for AI development. The speaker mentions using evolutionary algorithms and the concept of mutation to improve AI systems, suggesting that AI can benefit from principles similar to biological evolution, where diversity and adaptation lead to more robust and capable systems.
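
A bare-bones sketch of the evolutionary loop this alludes to: keep a population of candidate solutions, mutate them, select the fittest, and repeat. The toy fitness function is ours, chosen only to make the sketch runnable.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([3.0, -2.0, 0.5])

def fitness(x):
    return -np.linalg.norm(x - target)  # best possible value is 0

pop = rng.normal(size=(64, 3))  # initial population of 64 candidate solutions
for _ in range(200):
    scores = np.array([fitness(x) for x in pop])
    elite = pop[np.argsort(scores)[-8:]]  # selection: keep the top 8
    # mutation: the next generation is noisy copies of the elite
    pop = np.repeat(elite, 8, axis=0) + rng.normal(scale=0.1, size=(64, 3))

best = max(pop, key=fitness)
print(best, float(fitness(best)))  # ends up close to the target
```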

💡Transformer Architecture

The Transformer architecture is a type of deep learning model co-invented by Llion Jones, co-founder of Sakana AI, together with colleagues at Google. It has become foundational in various AI applications, including natural language processing. The architecture is known for its ability to process sequential data effectively and is a key component of models like GPT (Generative Pre-trained Transformer).
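
The core computation of the architecture, scaled dot-product attention from the original 'Attention Is All You Need' paper, fits in a few lines; the shapes below are toy values.

```python
import numpy as np

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # similarity of every query to every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax per query
    return weights @ V             # each output is a weighted mixture of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 8)) for _ in range(3))  # 5 tokens, dimension 8
print(attention(Q, K, V).shape)  # (5, 8): one contextualised vector per token
```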

💡Generative AI

Generative AI refers to AI systems that can create new content, such as text, images, or music, based on learned patterns. The speaker discusses his work on generative AI at Google, including projects that involved training AI to generate new types of content, like novel kanji characters or sketches, showcasing the creative potential of AI.

💡Nature-Inspired AI

Nature-inspired AI is an approach to AI development that draws inspiration from natural systems and biological processes. The speaker emphasizes this approach throughout the video, highlighting projects that use principles like evolution, self-play, and multi-agent systems to develop AI that can perform tasks in new and adaptive ways.

💡Large Language Models

Large language models are AI systems trained on vast amounts of text data to understand and generate human-like text. The speaker critiques the current approach to training these models, arguing that their size and rigidity make them inefficient and difficult to maintain. He suggests that a new approach, inspired by nature, is needed to develop more adaptive and resilient AI systems.

💡Character-Level Language Modeling

Character-level language modeling is an approach in which AI models process and generate text one character at a time, rather than as words or word pieces. Llion Jones advocates for this method in the talk, arguing that it simplifies the model's vocabulary and can lead to more accurate spelling and better handling of morphologically complex languages like Japanese.

💡Deep Learning

Deep learning is a subset of machine learning that uses neural networks with many layers, or 'deep' architectures, to model and understand complex patterns in data. The speaker discusses his research in deep learning, particularly in the context of generative AI and nature-inspired algorithms, aiming to push the boundaries of what AI can achieve.

Highlights

Speaker expresses honor in being invited to speak at the NTT R&D Forum.

Introduction of Sakana AI, an AI R&D company co-founded by the speaker and Llion Jones.

The inspiration behind Sakana AI's name, signifying a swarm of fish forming a coherent entity.

The speaker's background in engineering science and early work on neural networks.

Llion Jones's contribution as a co-inventor of the Transformer architecture.

The speaker's unconventional journey from Wall Street to Google Brain.

Nature-inspired research ideas at Sakana AI, including collective intelligence and evolution.

The speaker's work on generative AI and multi-agent systems at Google.

Innovations in training AI systems to generate high-resolution images and sketches.

The belief that current machine learning models will not lead to strong AI.

The need for a new type of foundation model inspired by complex adaptive systems.

Critique of the current approach to training large language models as energy-inefficient and rigid.

The potential of collective intelligence and evolution in AI development.

The vision for Sakana AI to challenge conventional AI development methods.

The decision to start the company in Japan to contribute to a local AI R&D ecosystem.

The importance of developing an AI ecosystem in Japan amidst geopolitical trends.

The potential of Japan to become a global AI leader in a democratic context.

The advantage of character-level language modeling, especially for languages like Japanese.

The speaker's advocacy for character-level language models over word-level models.

The cultural and computational advantages of character-level models for Japanese.

The potential for character-level models to improve spelling accuracy in image generation.

The call to action for the industry to adopt character-level modeling, particularly for Japanese.