Author: YBB Capital Researcher Zeke
Preface
In previous articles, we have repeatedly discussed the current state of AI Memes and the future development of AI Agents. Even so, the speed at which the AI Agent narrative is developing and changing is still somewhat overwhelming. In the short two months since "Agent Summer" began, the narrative around combining AI and Crypto has shifted almost every week. Recently, the market's attention has turned to "framework" projects driven by technical narratives. In the past few weeks alone, this sub-sector has produced several dark horses with market capitalizations above 100 million or even 1 billion. These projects have also spawned a new asset issuance paradigm: a project issues a coin based on its GitHub code repository, and Agents built on the framework can issue coins of their own. The framework sits at the bottom and the Agents sit on top. It looks like an asset issuance platform, but it is in fact an infrastructure model unique to the AI era. How should we view this new trend? This article starts with an introduction to frameworks and then offers my own thinking on what AI frameworks mean for Crypto.
1. What is a framework?
By definition, an AI framework is a low-level development tool or platform that integrates a set of pre-built modules, libraries, and tools to simplify the process of building complex AI models. These frameworks usually also include functionality for processing data and training models. In short, you can think of a framework as an operating system for the AI era, like Windows and Linux on the desktop or iOS and Android on mobile. Each framework has its own strengths and weaknesses, and developers can choose freely according to their specific needs.
Although "AI framework" is still a new concept in the Crypto field, AI frameworks themselves have been developing for nearly 14 years, counting from the birth of Theano in 2010. Both academia and industry already have very mature frameworks to choose from, such as Google's TensorFlow, Meta's PyTorch, Baidu's PaddlePaddle, and ByteDance's MagicAnimate. Each has its own advantages in different scenarios.
The framework projects that have emerged in Crypto grew out of the heavy demand for Agents at the start of this AI wave, then branched into other Crypto tracks, and finally formed AI frameworks for different sub-sectors. Let's unpack this by looking at several mainstream frameworks in the space.
1.1 Eliza
First, take Eliza from ai16z. It is a multi-agent simulation framework designed for creating, deploying, and managing autonomous AI agents. It is written in TypeScript, which gives it better compatibility and easier API integration.
According to the official documentation, Eliza mainly targets social media scenarios and offers multi-platform integration. The framework provides full-featured Discord integration with voice channel support, automated accounts on X/Twitter, Telegram integration, and direct API access. For media content, it supports PDF document reading and analysis, link content extraction and summarization, audio transcription, video content processing, image analysis and description, and conversation summarization.
Eliza currently supports four main use cases:
AI assistant applications: customer support agents, community managers, personal assistants;
Social media roles: automated content creators, interactive bots, brand representatives;
Knowledge workers: research assistants, content analysts, document processors;
Interactive roles: role-playing characters, educational counselors, entertainment bots.
Eliza currently supports the following models:
Local inference with open-source models, such as Llama3, Qwen1.5, and BERT;
Cloud-based inference through OpenAI's API;
Nous Hermes Llama 3.1B as the default configuration;
Integration with Claude for complex queries.
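The model options above suggest a routing pattern: a local open-source model by default, with a cloud model reserved for harder queries. A minimal Python sketch of that idea follows. It is not Eliza's API (Eliza itself is written in TypeScript); the backend labels, the `choose_backend` function, and the word-count heuristic for "complex" are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical backend labels; they mirror the options listed above
# but are not identifiers taken from Eliza.
LOCAL_DEFAULT = "local:nous-hermes-llama-3.1"
CLOUD_COMPLEX = "cloud:claude"


@dataclass
class Query:
    text: str
    needs_tools: bool = False  # e.g. the query requires multi-step tool use


def choose_backend(query: Query, max_local_words: int = 50) -> str:
    """Pick a backend with a crude complexity heuristic (an assumption).

    Long queries, or queries that need tools, go to the cloud model;
    everything else stays on the local default.
    """
    is_long = len(query.text.split()) > max_local_words
    if is_long or query.needs_tools:
        return CLOUD_COMPLEX
    return LOCAL_DEFAULT
```

In practice the definition of a "complex" query would be a product decision; the point here is only that the choice of model can sit behind one small function.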
1.2 GAME
GAME (Generative Autonomous Multimodal Entities Framework) is a multimodal AI framework launched by Virtuals for automatically generating and managing agents. It is designed mainly for intelligent NPCs in games. Another notable feature is that it is low-code or even no-code, so even users with little technical background can use it: judging from its trial interface, users only need to adjust parameters to take part in Agent design.
In terms of project architecture, the core design of GAME is a modular design in which multiple subsystems work together. The detailed architecture is shown in the figure below.
Agent Prompting Interface: The interface for developers to interact with the AI framework. Through this interface, developers can initialize a session and specify parameters such as session ID, agent ID, user ID, etc.
Perception Subsystem: The perception subsystem is responsible for receiving input information, synthesizing it and sending it to the strategic planning engine. It also handles the response of the dialogue processing module;
Strategic Planning Engine: The strategic planning engine is the core of the entire framework and is divided into a high-level planner and a low-level policy. The high-level planner is responsible for formulating long-term goals and plans, while the low-level policy translates these plans into concrete action steps;
World Context: The world context contains data such as environment information, world state, and game state. This information is used to help the agent understand the current situation;
Dialogue Processing Module: The dialogue processing module is responsible for processing messages and responses. It can generate dialogues or reactions as output.
On Chain Wallet Operator: The on-chain wallet operator presumably covers blockchain-related application scenarios, but its specific functions are unclear;
Learning Module: The learning module learns from feedback and updates the agent’s knowledge base;
Working Memory: Working memory stores short-term information such as the agent's recent actions, results, and current plans;
Long Term Memory Processor: The long-term memory processor is responsible for extracting important information about the agent and its working memory and sorting it according to factors such as importance score, recency and relevance;
Agent Repository: The agent repository stores the agent's goals, reflections, experiences, and personality attributes.
Action Planner: The action planner generates a specific action plan based on the low-level strategy;
Plan Executor: The plan executor is responsible for executing the action plan generated by the action planner.
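The long-term memory processor is described above as sorting memories by importance score, recency and relevance. One way to picture this is a weighted sum of the three factors. The sketch below is an assumption, not GAME's implementation: the `Memory` fields, the equal weights, the exponential recency decay and the keyword-overlap relevance are all illustrative choices.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    importance: float  # assumed to be pre-scored in [0, 1]
    age_hours: float   # time since the memory was recorded


def relevance(memory: Memory, query: str) -> float:
    """Keyword overlap (Jaccard) between memory text and query; illustrative only."""
    a, b = set(memory.text.lower().split()), set(query.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def score(memory: Memory, query: str, half_life_hours: float = 24.0) -> float:
    """Equal-weight average of importance, recency and relevance (weights assumed)."""
    # Recency halves every `half_life_hours`.
    recency = math.exp(-math.log(2) * memory.age_hours / half_life_hours)
    return (memory.importance + recency + relevance(memory, query)) / 3


def rank(memories: list[Memory], query: str) -> list[Memory]:
    """Return memories sorted from highest to lowest score."""
    return sorted(memories, key=lambda m: score(m, query), reverse=True)
```

A real system would likely replace the keyword overlap with embedding similarity and tune the weights per agent.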
Workflow: The developer starts the agent through the agent prompting interface. The perception subsystem receives input and passes it to the strategic planning engine, which uses information from the memory system, world context, and agent repository to develop and execute action plans. The learning module continuously monitors the results of the agent's actions and adjusts the agent's behavior accordingly.
Application scenarios: Judging from the overall technical architecture, this framework focuses on the Agent's decision-making, feedback, perception, and personality in virtual environments. Beyond games, it is also applicable to the Metaverse. In the Virtuals list below, you can see that a large number of projects have already been built with this framework.
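The workflow can be pictured as a small pipeline: perceive, plan at a high level, translate into actions, execute, then learn from the results. The Python sketch below is a toy illustration under stated assumptions, not GAME's code or API: every function name, the `Agent` fields, and the hard-coded rules (such as reacting to the word "enemy") are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    goal: str
    working_memory: list[str] = field(default_factory=list)
    knowledge: dict[str, int] = field(default_factory=dict)  # action -> success count


def perceive(raw_input: str, world_context: dict) -> dict:
    """Perception subsystem: bundle normalized input with the world state."""
    return {"input": raw_input.strip().lower(), "world": world_context}


def high_level_plan(agent: Agent, observation: dict) -> str:
    """High-level planner: pick a task; fall back to the agent's long-term goal."""
    if "enemy" in observation["input"]:
        return "defend"
    return agent.goal


def low_level_policy(task: str) -> list[str]:
    """Low-level policy: translate a task into concrete action steps."""
    steps = {"defend": ["raise_shield", "call_guard"], "trade": ["greet", "show_goods"]}
    return steps.get(task, ["idle"])


def execute(actions: list[str]) -> list[tuple[str, bool]]:
    """Plan executor: run each action; in this toy version every action succeeds."""
    return [(action, True) for action in actions]


def learn(agent: Agent, results: list[tuple[str, bool]]) -> None:
    """Learning module: record outcomes so later plans could use them."""
    for action, ok in results:
        agent.working_memory.append(action)
        if ok:
            agent.knowledge[action] = agent.knowledge.get(action, 0) + 1


def step(agent: Agent, raw_input: str, world_context: dict) -> list[str]:
    """One pass through the pipeline: perceive, plan, act, learn."""
    observation = perceive(raw_input, world_context)
    task = high_level_plan(agent, observation)
    actions = low_level_policy(task)
    learn(agent, execute(actions))
    return actions
```

For example, an NPC created with `Agent(goal="trade")` would greet a customer by default and switch to defensive actions when the input mentions an enemy.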
1.3 Rig
Rig is an open-source tool written in Rust, designed to simplify the development of large language model (LLM) applications. By providing a unified interface, it lets developers easily work with multiple LLM service providers (such as OpenAI and Anthropic) and multiple vector databases (such as MongoDB and Neo4j).
Core Features:
Unified interface: Regardless of the LLM provider or vector storage, Rig provides consistent access, greatly reducing the complexity of integration work;
Modular architecture: The framework adopts a modular design, including key parts such as the "provider abstraction layer", "vector storage interface" and "agent system", ensuring the flexibility and scalability of the system;
Type safety: Rust's features are used to implement type-safe embedding operations, ensuring code quality and runtime security;
Efficient performance: Supports asynchronous programming mode and optimizes concurrent processing capabilities; built-in logging and monitoring functions facilitate maintenance and troubleshooting.
Workflow: When a user request enters the Rig system, it first passes through the "provider abstraction layer", which standardizes the differences between providers and ensures consistent error handling. Next, in the core layer, the agent can call various tools or query vector storage to obtain the information it needs. Finally, through advanced mechanisms such as retrieval-augmented generation (RAG), the system combines document retrieval with context understanding to generate accurate and meaningful responses and return them to the user.
Application scenarios: Rig is suitable for building question-answering systems that require fast and accurate answers, and it can also be used to create efficient document search tools, context-aware chatbots or virtual assistants, and even content creation tools that automatically generate text or other content based on existing data patterns.
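To make the flow concrete, here is a minimal sketch of a provider abstraction, a vector store, and a RAG step wired together. It is written in Python for readability and is not Rig's Rust API: `CompletionProvider`, `EchoProvider`, `InMemoryVectorStore` and `rag_answer` are hypothetical names, the "embedding" is a toy bag-of-words count, and the provider only echoes its prompt instead of calling a real LLM.

```python
import math
from collections import Counter
from typing import Protocol


class CompletionProvider(Protocol):
    """Provider abstraction: every backend exposes the same method."""

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in for a real LLM client; it just reports the prompt it received."""

    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Minimal vector store: keeps documents and returns the closest ones."""

    def __init__(self) -> None:
        self._docs: list[tuple[str, Counter]] = []

    def add(self, doc: str) -> None:
        self._docs.append((doc, embed(doc)))

    def top_k(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self._docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def rag_answer(provider: CompletionProvider, store: InMemoryVectorStore, question: str) -> str:
    """Retrieve context first, then ask the provider with that context prepended."""
    context = "\n".join(store.top_k(question, k=1))
    return provider.complete(f"Context: {context}\nQuestion: {question}")
```

Because `rag_answer` depends only on the `CompletionProvider` protocol, swapping one provider for another would not change the retrieval code, which is the benefit the unified interface is meant to deliver.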
1.4 ZerePy
ZerePy is an open source Python framework that aims to simplify the process of deploying and managing AI Agents on the X (formerly Twitter) platform. It was born out of the Zerebro project and inherits its core functionality, but in a more modular and easier to extend format. Its goal is to enable developers to easily create personalized AI agents and implement various automated tasks and content creation on X.
ZerePy provides a command line interface (CLI) that lets users manage and control their deployed AI Agents [1]. Its core architecture is modular, allowing developers to flexibly integrate different functional modules, such as:
LLM integration: ZerePy supports large language models (LLMs) from OpenAI and Anthropic, and developers can choose the model that best suits their application scenario. This enables Agents to generate high-quality text content;
X platform integration: The framework integrates directly with the X platform's API, allowing Agents to post, reply, like, repost, and so on;
Modular connection system: This system allows developers to easily add support for other social platforms or services, extending the functionality of the framework;
Memory system (future plan): Although it may not be fully implemented in the current version, the design goals of ZerePy include integrating a memory system that allows agents to remember previous interactions and contextual information, thereby generating more coherent and personalized content.
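The modular connection system described above can be pictured as a registry of interchangeable connections behind a common interface. The sketch below is an assumption for illustration and does not reproduce ZerePy's actual classes: `Connection`, `FakeXConnection` and `ConnectionRegistry` are hypothetical names, and the fake connection makes no network calls.

```python
from abc import ABC, abstractmethod


class Connection(ABC):
    """One pluggable integration (a social platform or an LLM service)."""

    name: str

    @abstractmethod
    def perform(self, action: str, **params: str) -> str:
        """Run a named action and return a human-readable result."""


class FakeXConnection(Connection):
    """Offline stand-in for an X integration; it makes no network calls."""

    name = "x"

    def perform(self, action: str, **params: str) -> str:
        if action not in {"post", "reply", "like", "repost"}:
            raise ValueError(f"unsupported action: {action}")
        return f"x:{action}:{params.get('text', '')}"


class ConnectionRegistry:
    """Stores connections by name so platforms can be added without changing the agent."""

    def __init__(self) -> None:
        self._connections: dict[str, Connection] = {}

    def register(self, connection: Connection) -> None:
        self._connections[connection.name] = connection

    def perform(self, name: str, action: str, **params: str) -> str:
        if name not in self._connections:
            raise KeyError(f"no connection registered as {name!r}")
        return self._connections[name].perform(action, **params)
```

Adding support for another platform would then mean writing one more `Connection` subclass and registering it, leaving the agent logic untouched.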
While both ZerePy and ai16z's Eliza project are dedicated to building and managing AI agents, they differ slightly in architecture and goals. Eliza focuses more on multi-agent simulation and broader AI research, while ZerePy focuses on simplifying the deployment of AI Agents on a specific social platform (X) and leans more toward practical application.
2. A replica of the BTC ecosystem
In terms of development path, AI Agents actually have a lot in common with the BTC ecosystem of late 2023 and early 2024. The BTC ecosystem's path can be summarized simply as: BRC20, then multi-protocol competition among Atomicals, Runes and others, then BTC L2, then BTCFi centered on Babylon. AI Agents have developed more rapidly because they build on the mature traditional AI technology stack, but their overall path does resemble that of the BTC ecosystem. I would summarize it as: GOAT/ACT, then social Agents and analytical AI Agents, then competition among frameworks. From a trend perspective, infrastructure projects focusing on Agent decentralization and security are likely to take over from this wave of framework enthusiasm and become the main theme of the next stage.
Will this track become homogenized and bubble-prone like the BTC ecosystem? I think not. First, the AI Agent narrative is not about replaying the history of smart contract chains. Second, whether the existing AI framework projects have real technical strength or are still stuck at the PPT stage or at ctrl c + ctrl v, they at least offer a new way of thinking about infrastructure development. Many articles compare AI frameworks to asset issuance platforms and Agents to assets. Personally, compared with Memecoin launchpads and inscription protocols, I think the AI framework is more like the public chain of the future, and the Agent is more like the Dapp of the future.
In today's Crypto, we have thousands of public chains and tens of thousands of Dapps. Among general-purpose chains there are BTC, Ethereum, and various heterogeneous chains, while application chains take more diverse forms, such as game chains, storage chains, and DEX chains. Public chains map closely onto AI frameworks, and Dapps map well onto Agents.
Crypto in the AI era is likely to move toward this form. Future debates will shift from EVM versus heterogeneous chains to framework versus framework. The more pressing question now is how to decentralize these frameworks, or put them on-chain. I think subsequent AI infrastructure projects will build on this foundation. The other question is what it means to do this on the blockchain at all.
3. What is the significance of going on-chain?
Whatever blockchain is combined with, it eventually faces one question: is it meaningful? In last year's article, I criticized GameFi for putting the cart before the horse and for building Infra too far ahead of demand. In my previous articles on AI, I also said that I am not optimistic about AI x Crypto in practical applications at this stage. After all, narrative has become a weaker and weaker driver for traditional projects, and the few traditional projects that performed well in price last year were basically those with the strength to match or exceed their token price. What can AI do for Crypto? I previously thought of fairly unoriginal but in-demand ideas, such as Agents acting on users' behalf to fulfil intents, the Metaverse, and Agents as employees. But none of these needs has to be fully on-chain, and from a business-logic point of view they cannot form a closed loop. The Agent browser mentioned in the previous article can fulfil intents and can generate demand for data labeling, inference computing power, and so on, but the two are not tightly coupled enough, and in terms of computing power, centralized providers still have the upper hand in many respects.
Looking back at the success of DeFi, the reason DeFi could take a share from traditional finance is that it offers higher accessibility, better efficiency, lower costs, and security that does not require trusting a centralized party. Following this line of thinking, I see several possible reasons to support putting Agents on-chain.
1. Can putting Agents on-chain lower usage costs and thereby bring higher accessibility and more choice? Ultimately, the AI "rental rights" now exclusive to Web2 giants could be shared by ordinary users;
2. Security. By the simplest definition, an AI that can be called an Agent should be able to interact with the virtual or real world. If an Agent can intervene in reality or in my virtual wallet, then a blockchain-based security solution becomes a hard requirement;
3. Can Agents enable a set of financial mechanics unique to blockchain? For example, LPs in an AMM let ordinary people take part in automated market making. Similarly, Agents need computing power, data labeling, and so on, and users who are optimistic about an Agent could invest in the protocol in U (stablecoins such as USDT). Agents built for different application scenarios could also give rise to new financial mechanics;
4. DeFi does not have perfect interoperability at present. If Agents combined with blockchain can achieve transparent and traceable reasoning, they may be more attractive than the agent browsers provided by the traditional Internet giants mentioned in the previous article.
4. Creativity?
Framework projects will also provide entrepreneurial opportunities similar to the GPT Store in the future. Although publishing an Agent through a framework is still very complicated for ordinary users, I think frameworks that simplify the Agent building process and offer combinations of more complex functions will gain the upper hand, forming a Web3 creator economy that is more interesting than the GPT Store.
The current GPT Store still leans toward practical use in traditional fields, and most of the popular apps are created by traditional Web2 companies, with the revenue kept exclusively by the creators. According to OpenAI's official explanation, the strategy only provides financial support, in the form of a certain amount of subsidies, to some outstanding developers in the United States.
From the perspective of demand, Web3 still has many gaps to fill, and in terms of the economic system, it can make the unfair policies of Web2 giants fairer. In addition, we can naturally introduce a community economy to make Agents more complete. The Agent creator economy will be an opportunity for ordinary people to participate, and future AI Memes will be much smarter and more interesting than the Agents released on GOAT and Clanker.
Reference articles:
1. Historical evolution and trend exploration of AI frameworks
2. Bybit: AI Rig Complex (ARC): AI agent framework
3. Deep Value Memetics: Horizontal comparison of the four major Crypto×AI frameworks: adoption status, advantages and disadvantages, and growth potential