cryptonerdcn

Last week's (4.16–4.23) AI news overview:

This week was relatively quiet, with more companies entering the LLM race. Google pushed to catch up with OpenAI on the product side, and Snapchat rolled out its own AI chatbot.

Now let's review the big AI news from last week.

April 17th
Kunlun Tech launched the hundred-billion-scale large language model "Tiangong" and began internal testing.

Developed jointly by Kunlun Tech and the AI team Qidian Zhiyuan, "Tiangong" is a dual hundred-billion-scale large language model that its makers benchmark against ChatGPT. It is also Kunlun Tech's second generative AI product, following the AI drawing tool "Tiangong Qiaohui". Kunlun Tech released its AIGC series of algorithms and models in December 2022, covering multi-modal content generation such as images, music, text, and code. According to the company, the current version of "Tiangong" supports text conversations of over 10,000 words and more than 20 rounds of user interaction.


It is reported that the entire project has invested hundreds of millions of RMB and formed a research and development team of hundreds of people, and will continue to increase investment in the future.

Internal testing address:
https://tiangong.kunlun.com/

April 18th
Meta released DINOv2.


DINOv2 is a new high-performance self-supervised computer vision model (self-supervised meaning it learns from unlabeled data, without human annotations). DINOv2 achieves outstanding results on several computer vision benchmarks, such as image classification, object detection, and segmentation, thanks to a self-distillation training objective that encourages the model to focus on salient regions of an image while ignoring the background. It can learn from any collection of images and serves as a general-purpose backbone across different tasks without fine-tuning.
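
DINOv2 builds on self-distillation objectives in the DINO family, where a student network is trained to match an EMA "teacher" network's output on a different augmented view of the same unlabeled image. Below is a toy numpy sketch of that core idea only; the linear "networks", sizes, temperatures, and learning rate are all made up, and the real DINOv2 objective adds patch-level losses and other components:

```python
import numpy as np

def softmax(z, temp):
    z = (z - z.max()) / temp                  # stabilized softmax with temperature
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
dim, out = 8, 4                               # toy sizes, for illustration only
W_s = rng.standard_normal((dim, out)) * 0.1   # "student" network (a linear map here)
W_t = W_s.copy()                              # "teacher" starts as a copy of the student
lr, ema, t_temp, s_temp = 0.05, 0.99, 0.07, 0.1

losses = []
for step in range(200):
    x = rng.standard_normal(dim)              # stand-in for an unlabeled image
    v1 = x + 0.1 * rng.standard_normal(dim)   # two random augmentations of it
    v2 = x + 0.1 * rng.standard_normal(dim)
    p_t = softmax(v1 @ W_t, t_temp)           # sharpened teacher target -- no labels
    p_s = softmax(v2 @ W_s, s_temp)           # student prediction on the other view
    losses.append(-(p_t * np.log(p_s + 1e-9)).sum())   # cross-entropy between views
    grad = np.outer(v2, (p_s - p_t) / s_temp)  # d(loss)/d(student logits) = (p_s - p_t)/temp
    W_s -= lr * grad                           # only the student gets gradient updates
    W_t = ema * W_t + (1 - ema) * W_s          # teacher slowly tracks the student via EMA
```

The key point is that the supervision signal comes entirely from the teacher's view of the same image, so no human labels are ever needed.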

Demo address: https://dinov2.metademolab.com/

Paper: https://arxiv.org/abs/2304.07193

GITHUB: https://github.com/facebookresearch/dinov2

April 19th
Aydar Bulatov and colleagues released a technique that uses the Recurrent Memory Transformer (RMT) to scale Transformers beyond 1 million tokens.


This technical report introduces the use of recurrent memory to extend the context length of BERT, one of the most effective Transformer-based models in natural language processing. By leveraging the Recurrent Memory Transformer (RMT) architecture, the authors increased the model's effective context length to an unprecedented 2 million tokens while maintaining high memory-retrieval accuracy. The method stores and processes both local and global information and lets information flow between segments of the input sequence through recurrence. During inference, the model effectively uses memory across 4,096 segments with a total length of 2,048,000 tokens, far exceeding the largest reported Transformer inputs (64K tokens for CoLT5 and 32K tokens for GPT-4). In their experiments, this extension kept the base model's memory footprint at 3.6GB.
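
The segment-by-segment mechanism is easy to picture. Here is a toy numpy sketch of the idea: a long sequence is chopped into segments, and a small bank of memory vectors is read and rewritten at each segment, carrying information forward. The real RMT prepends learned memory tokens to each segment and runs a full transformer; here a fixed random projection stands in for it, and all sizes except the 4,096 × 500 = 2,048,000-token split are made up:

```python
import numpy as np

def process_with_recurrent_memory(tokens, segment_len=500, mem_size=10, dim=32, seed=0):
    """Toy sketch of RMT-style processing: thread a small memory bank
    through the segments of a very long sequence."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((dim, dim)) / np.sqrt(dim)  # stand-in "transformer" weights
    memory = np.zeros((mem_size, dim))                  # memory tokens carried across segments
    n_segments = 0
    for start in range(0, len(tokens), segment_len):
        segment = np.asarray(tokens[start:start + segment_len])
        emb = np.zeros((len(segment), dim))             # toy embedding: hash token id into dim
        emb[np.arange(len(segment)), segment % dim] = 1.0
        x = np.tanh(np.concatenate([memory, emb]) @ W)  # process memory + segment together
        memory = x[:mem_size]                           # updated memory flows to next segment
        n_segments += 1
    return memory, n_segments

# 2,048,000 tokens = 4,096 segments of 500 tokens, matching the report's inference setup
memory, n_segments = process_with_recurrent_memory(np.arange(2_048_000))
print(n_segments)  # 4096
```

Only the small memory bank (10 × 32 floats here) crosses segment boundaries, which is why memory use stays constant no matter how long the input grows.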

Paper address: https://arxiv.org/abs/2304.11062

GITHUB: https://github.com/booydar/t5-experiments/tree/scaling-report

April 20th
Stability AI released its open-source language model StableLM.

Stability AI, the company behind the well-known image generation tool Stable Diffusion, announced the launch of its language model StableLM. The alpha release comes in 3-billion- and 7-billion-parameter versions and performs well for its size (for comparison, GPT-3 has 175 billion parameters); models with 15 billion to 65 billion parameters are planned. "Developers can freely inspect, use, and adapt our StableLM base models for commercial or research purposes, subject to the terms of the CC BY-SA-4.0 license." (Note that while the base models are under that Creative Commons license, the fine-tuned models are under a non-commercial Creative Commons license, which means they cannot be used commercially.)

GITHUB: https://github.com/stability-AI/stableLM/

On the same day, Snapchat rolled out its AI chatbot to all users worldwide.


The chatbot, named My AI, lets users converse with an AI agent that can answer questions, tell jokes, play games, and send snaps. My AI also learns from users' preferences and behavior and occasionally sends snaps based on their interests. Snapchat says My AI is not meant to replace human interaction but to enhance it and make it more fun and engaging. My AI is powered by large language model technology and can generate natural-language responses and images. Snapchat states that My AI complies with privacy and data-protection laws, and users can opt out of the feature at any time.

April 21st
Google's Bard gained the ability to write code, supporting more than 20 programming languages, and can also help debug and explain code.


On the same day, Fudan University's Natural Language Processing Laboratory released a new version of its MOSS model, China's first open-source large language model with ChatGPT-style plugin-augmentation capabilities.


MOSS is an open-source dialogue language model that supports Chinese, English, and multiple plugins. The moss-moon series models have 16 billion parameters and can run on a single A100/A800 or two RTX 3090 graphics cards at FP16 precision; at INT4/INT8 precision they can run on a single RTX 3090. The MOSS base model was pre-trained on approximately 700 billion Chinese, English, and code tokens. Through dialogue instruction fine-tuning, plugin-augmented learning, and human preference training, it supports multi-turn dialogue and the use of multiple plugins.
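
Those hardware figures follow from simple weight-storage arithmetic: a model needs roughly params × bits / 8 bytes just for its weights. A quick sketch, assuming a 16-billion-parameter model and ignoring activations, KV-cache, and quantization overhead (the 24 GiB figure is the RTX 3090's memory, the 40/80 GiB figures are the A100/A800's):

```python
def model_memory_gib(n_params, bits_per_param):
    """GiB needed just to store the weights: params * bits / 8 bytes."""
    return n_params * bits_per_param / 8 / 2**30

N = 16e9  # moss-moon parameter count
for bits, name in [(16, "FP16"), (8, "INT8"), (4, "INT4")]:
    print(f"{name}: {model_memory_gib(N, bits):.1f} GiB of weights")
# FP16: 29.8 GiB -> needs an A100/A800, or split across two 24 GiB 3090s
# INT8: 14.9 GiB -> fits a single 24 GiB RTX 3090
# INT4:  7.5 GiB -> fits comfortably on one 3090
```

A 160-billion-parameter model, by the same arithmetic, would need ~298 GiB at FP16, which is why the 16-billion figure is the one consistent with the stated hardware requirements.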

The MOSS model comes from Professor Xipeng Qiu's team at Fudan University's Natural Language Processing Laboratory and is named after the AI in the movie "The Wandering Earth".

Apply for trial: https://moss.fastnlp.top

GITHUB: https://github.com/OpenLMLab/MOSS

If you found this article helpful, please subscribe and share. You can also follow me on Twitter for more news about Web3, Layer 2, AI, and Japan:


https://twitter.com/cryptonerdcn
