Grass's positioning and usage scenarios

Grass Network: How to gain benefits by sharing Internet resources? Grass is a project deployed on Solana that combines AI and DePIN (decentralized physical infrastructure network) technologies and is positioned as the data layer for AI. It is a decentralized web scraping platform that uses individuals' unused Internet bandwidth to collect public web data, which companies and non-profit organizations can use to train artificial intelligence (AI) models. Users install a browser extension that performs the scraping over their idle bandwidth and are rewarded with Grass Points. By letting users share idle bandwidth and benefit directly from the network, Grass aims to redefine the Internet's incentive structure and keep the value of the Internet in users' hands. The network currently has more than 2 million users running nodes, which crawl large amounts of data for AI models.

Technical Architecture

The Grass Sovereign Data Rollup is a purpose-built network on Solana that lets the protocol handle every step from data sourcing to processing, verification, and dataset construction. The network is built around validators (which issue data collection instructions), routers (which manage the distribution of web requests), and Grass nodes (through which users contribute their idle network resources). The specific architecture is as follows:

Validator: Receives, verifies, and batches web transactions from the routers, then generates ZK proofs that attest to the session data on-chain. The on-chain proof can be referenced in a dataset to verify the source of the data and track its lineage throughout its life cycle. The validator set will transition from an initial centralized framework with a single validator to a decentralized validator committee.

Router: Connects Grass nodes to validators. Routers keep track of the node network and relay bandwidth, and their operation is rewarded in proportion to the total verified bandwidth they relay. Routers report the following metrics to validators: the size of each incoming and outgoing request (in bytes); the latency of each node and of the validator; and the network status of each connected node.

Grass Node: Utilizes users' unused bandwidth and relays traffic so that the network can scrape public web data (not users' personal data). Running a node is free, and the person who runs the node (node operator) is paid according to the data relayed through it.

ZK Processor: Batches the validity proofs of the session data for all web requests and submits them to the L1 blockchain. This permanently records every scraping action performed on the network and lays the foundation for a full understanding of where AI training data comes from.

Grass Data Ledger: This is the link between the scraped data and the L1 settlement layer. The ledger is an immutable data structure that hosts the complete datasets and links each one to its corresponding on-chain proof. It is the data repository that makes the provenance of the data verifiable.

Edge Embedding Models: These convert unstructured web data into a structured form. This covers all the preprocessing steps needed to ensure that the collected raw data is cleaned, normalized, and structured in a format that meets the requirements of AI models.
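The metrics that routers report to validators, as listed in the Router description above, could be modeled roughly as follows. This is only an illustrative sketch: the class names, field names, and status values are hypothetical and are not taken from any Grass specification.

```python
from dataclasses import dataclass
from enum import Enum


class NodeStatus(Enum):
    # Hypothetical values; the article only mentions "network status".
    ONLINE = "online"
    DEGRADED = "degraded"
    OFFLINE = "offline"


@dataclass(frozen=True)
class RouterReport:
    """One report a router might send to a validator (assumed layout)."""
    node_id: str
    request_bytes_in: int        # size of the incoming request, in bytes
    request_bytes_out: int       # size of the outgoing request, in bytes
    node_latency_ms: float       # latency of the node
    validator_latency_ms: float  # latency of the validator
    node_status: NodeStatus      # network status of the connected node

    def total_bytes(self) -> int:
        return self.request_bytes_in + self.request_bytes_out


def relayed_bytes_per_node(reports):
    """Sum relayed bytes per node, skipping reports from offline nodes."""
    totals = {}
    for report in reports:
        if report.node_status is NodeStatus.OFFLINE:
            continue
        totals[report.node_id] = totals.get(report.node_id, 0) + report.total_bytes()
    return totals
```

A validator-side aggregate such as `relayed_bytes_per_node` is one plausible input for bandwidth-proportional rewards, though the article does not describe how the real network aggregates these reports.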

Technical characteristics

In the above architecture, the Grass network sits between the client and the web server. The client makes web requests, which are sent through the validator and ultimately routed through the Grass nodes. Whichever website the client requests, that website's server responds to the request, so its data can be captured and sent back through the network. The data is then cleaned, processed, and prepared for training the next generation of AI models.

Understanding this process requires a closer look at two additional components: the Grass data ledger and the ZK processor.

The Grass data ledger is where all data is ultimately stored. It is a permanent ledger of every dataset captured by Grass, each embedded with metadata that records its lineage from the moment of origin. The metadata proof of each dataset is stored on Solana's settlement layer, and the settlement data itself is also made available through the ledger.

The purpose of the ZK processor is to help record the origin of the datasets crawled on the Grass network. The process is as follows: when a node on the network (that is, a user with the Grass extension installed) sends a web request to a given website, the website returns an encrypted response that contains all the data the node requested. This is the moment the dataset is born: the moment of origin that needs to be recorded, and therefore the moment the metadata is captured. The metadata contains fields such as the session key, the URL of the crawled website, the IP address of the target website, the transaction timestamp, and of course the data itself. With this information, datasets have a clearly documented website of origin, so AI models can be trained on data whose source is known.
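As a rough illustration of the metadata captured at the moment of origin, the sketch below builds a record from the fields named above and derives a deterministic digest from it. The names `ScrapeMetadata`, `make_metadata`, and `metadata_digest` are hypothetical, and storing a SHA-256 digest of the response body instead of the body itself is an assumption made here to keep the record small; the article does not say how Grass encodes these fields.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ScrapeMetadata:
    """Hypothetical record of one scraped response."""
    session_key: str
    url: str          # URL of the crawled website
    target_ip: str    # IP address of the target website
    timestamp: int    # transaction timestamp, Unix seconds
    data_sha256: str  # assumption: digest standing in for the data itself


def make_metadata(session_key, url, target_ip, timestamp, body: bytes) -> ScrapeMetadata:
    """Build the metadata record at the moment the response is received."""
    return ScrapeMetadata(session_key, url, target_ip, timestamp,
                          hashlib.sha256(body).hexdigest())


def metadata_digest(meta: ScrapeMetadata) -> str:
    """Hash a canonical JSON encoding so equal records give equal digests."""
    canonical = json.dumps(asdict(meta), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the encoding is canonical (sorted keys, fixed separators), changing any field or a single byte of the response body changes the digest, which is the property a lineage record needs.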

The ZK processor also keeps the data that must be settled on-chain hidden from Solana validators. In addition, the volume of web requests that Grass expects to execute will exceed the throughput that the L1 can sustain. Grass plans to scale to tens of millions of web requests per minute, and the metadata of each request will need to be settled on-chain. Submitting these transactions to the L1 is not feasible unless a ZK processor proves and batches them first. For this reason, a rollup is the only practical way to achieve the planned goals.
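The batching idea can be illustrated with a much simpler tool than a ZK proof. The sketch below commits to a batch of metadata records with a Merkle root, so that one 32-byte value is posted per batch instead of one transaction per request. A Merkle root is a plain hash commitment, not a zero-knowledge proof, and it is used here only as a stand-in to show why batching reduces on-chain load; the function names and the batch size are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Return a 32-byte Merkle root over a non-empty list of byte strings."""
    if not leaves:
        raise ValueError("cannot commit to an empty batch")
    # Prefix bytes keep leaf hashes and inner-node hashes distinct.
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def batch_commitments(records, batch_size):
    """Split records into batches and return one root per batch."""
    return [merkle_root(records[i:i + batch_size])
            for i in range(0, len(records), batch_size)]
```

As a worked example under these assumptions: at 10 million requests per minute and a batch size of 10,000, the settlement layer would receive 10,000,000 / 10,000 = 1,000 commitments per minute rather than 10 million individual transactions.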

In addition to recording the website where a dataset originated, the metadata also indicates which node on the network the request was routed through. This means that every time a node crawls the web, it can be rewarded for its contribution without revealing any identifying information about itself. It also allows Grass to reward nodes proportionally: nodes that crawl more data, and more valuable data, receive more incentives. This mechanism will significantly increase rewards in the regions of the world where demand is highest, encouraging people in those regions to sign up and increase network capacity. The larger the network grows, the more Grass can crawl and the larger the repository of web data it can store. More data means that Grass can supply more to the AI labs that need training data, which in turn gives the network an incentive to keep growing.
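A proportional reward split of the kind described above might look like the following sketch. The function name, the reward pool, and the idea of multiplying relayed bytes by a per-node "value weight" are assumptions made for illustration; the article does not give the actual formula.

```python
def split_rewards(pool: float, relayed: dict) -> dict:
    """Split a reward pool in proportion to bytes relayed times a value weight.

    relayed maps node_id -> (bytes_relayed, value_weight); both are hypothetical inputs.
    """
    scores = {node: amount * weight for node, (amount, weight) in relayed.items()}
    total = sum(scores.values())
    if total <= 0:
        # Nothing was relayed, so nobody is paid.
        return {node: 0.0 for node in relayed}
    return {node: pool * score / total for node, score in scores.items()}
```

For instance, with a pool of 100 and two nodes that relayed 300 and 100 bytes at equal weight, the split is 75 and 25. Doubling the weight of a node (for example, because it serves a high-demand region) doubles its share relative to the others.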

Grass node operation and security mechanism

Grass nodes are free to run and act as the network's gateway to the internet. Node operators (i.e. application users) are rewarded for the traffic relayed through their nodes and receive network traffic based on their reputation score and geographic demand.

Grass nodes have two main purposes: they pass along traffic (i.e. web requests) initiated by clients and directed by validators, and they return the encrypted web server responses to the designated routers.

The systems supported by the node are shown in the figure above. Running a node is simple: create an account, download the Grass desktop application, and connect to the network.

Once connected, nodes register on the network automatically. Operators are responsible for keeping their nodes online so that the nodes can forward web requests to public web servers. Each request sent to a Grass node is an encrypted data packet that reveals to the node only the destination it should be forwarded to. Requests are authenticated through digital signatures from all parties involved. These signatures verify the legitimacy of a request and determine whether it should be forwarded to the destination web server (i.e., the public website). This cryptographic process prevents data tampering and ensures that validators can accurately measure the reputation of each node.
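The verify-then-forward check described above can be sketched with Python's standard library. Note the substitution: real digital signatures are asymmetric (each party signs with a private key and others verify with the public key), whereas this sketch uses HMAC tags with shared secret keys purely as a stand-in so that it runs without third-party packages. The party names and the functions `sign` and `should_forward` are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, payload: bytes) -> bytes:
    """Produce an HMAC-SHA256 tag over the payload (stand-in for a signature)."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def should_forward(payload: bytes, signatures: dict, keys: dict) -> bool:
    """Forward only if every known party supplied a valid tag for this payload."""
    for party, key in keys.items():
        tag = signatures.get(party)
        if tag is None:
            return False  # a required party did not sign
        if not hmac.compare_digest(tag, sign(key, payload)):
            return False  # tag does not match: payload or tag was altered
    return True
```

In this sketch, a request that is modified after signing, or that lacks a tag from any one party, is rejected instead of being forwarded to the destination server.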

The node reputation score is based mainly on the following factors:

Completeness: Evaluates whether the data is complete, that is, whether the dataset contains all the data points required for the intended use case.

Consistency: Checks whether data is consistent across different datasets, or over time within the same dataset.

Timeliness: Measures whether data is current when needed.

Availability: Evaluates the data availability of each node.
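One simple way to combine the four factors above into a single score is a weighted average. The sketch below assumes each factor has already been normalized to the range 0 to 1 and uses equal weights; both the normalization and the weights are assumptions for illustration, since the article does not describe how Grass actually computes the score.

```python
# Hypothetical equal weights for the four factors named in the article.
WEIGHTS = {
    "completeness": 0.25,
    "consistency": 0.25,
    "timeliness": 0.25,
    "availability": 0.25,
}


def reputation_score(metrics: dict, weights: dict = WEIGHTS) -> float:
    """Return the weighted average of factor values, each expected in [0, 1]."""
    for name in weights:
        value = metrics[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    weighted = sum(weights[name] * metrics[name] for name in weights)
    return weighted / sum(weights.values())
```

Dividing by the sum of the weights keeps the result in the range 0 to 1 even if the weights are later changed so that they no longer add up to 1.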

In terms of security, the Grass network does not access users' devices or observe anything users do on their computers. It only routes Internet traffic through the user's IP address, independently of the user's own activity. This means that Grass has no access to users' personal data, and 100% of the data captured is public web data.

In addition, Grass uses bandwidth encryption to protect all users while they share their Internet connection. Grass has also partnered with AppEsteem, a leading cybersecurity compliance audit company, which monitors Grass's products around the clock for vulnerabilities, leaks, backdoors, and malware to ensure the security of users. AppEsteem certification enjoys a high reputation in the cybersecurity industry, and obtaining it means that Grass's products are also whitelisted by top anti-malware applications, including Avast, Microsoft Defender, McAfee, and AVG.

Functions of Grass Token

Grass token holders can participate in the Grass network in the following ways:

Transactions and repurchases: After decentralization, GRASS will be used to pay for web scraping transactions, dataset purchases, and use of LCR (Live Context Retrieval, i.e. real-time context retrieval).

Staking and rewards: Stake GRASS to routers to facilitate network traffic and receive rewards for contributing to network security.

Network governance: Participate in the development of the GRASS network, including proposing and voting for network improvements, coordinating which organizations to work with, and determining incentives for all stakeholders.

According to statistics from the Dune website, the annualized return on Grass staking is currently around 45%, about 33% of Grass tokens are staked, and the number of staked tokens exceeds 26 million.

Router Staking and Revenue

The Router acts as a decentralized hub, connecting all network nodes and managing incoming and outgoing web requests for validators. Router operation is incentivized, with rewards proportional to the amount of stake delegated to each Router. All traffic routed through the Router relay is encrypted and metered to ensure security and performance.

The figure above shows the current staked amount for each Router. Users can stake Grass to a Router to earn rewards. Commission rates differ from Router to Router.

Currently, about 1.43 million Grass is staked with DBunker's Router, the minimum staking period is 7 days, and the commission is 10%. (Data source: https://www.grassfoundation.io/stake/delegations ) Users only need to click STAKE, connect their wallet, and stake Grass to earn Router staking rewards.
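To see how a Router's commission affects what a delegator keeps, the sketch below computes a simple annual estimate. It assumes that the commission is deducted from gross staking rewards, that the roughly 45% network-wide annualized return quoted earlier applies to the Router in question, and that rewards are not compounded. None of these assumptions is confirmed by the article, and the function name is hypothetical.

```python
def net_annual_reward(stake: float, gross_rate: float, commission: float) -> float:
    """Estimate a delegator's yearly reward after the router takes its commission.

    stake: amount of GRASS delegated
    gross_rate: annualized return before commission, e.g. 0.45 for 45%
    commission: share of rewards kept by the router, e.g. 0.10 for 10%
    """
    if stake < 0 or not 0.0 <= commission <= 1.0:
        raise ValueError("stake must be non-negative and commission within [0, 1]")
    return stake * gross_rate * (1.0 - commission)
```

Under these assumptions, staking 1,000 GRASS at a 45% gross rate with a 10% commission would yield about 1,000 × 0.45 × 0.90 = 405 GRASS per year, compared with 450 GRASS before commission.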

Summary

Grass is committed to building a fair, open, and decentralized data layer. It aims to address the ethical and data quality problems of current Internet data extraction and to counter the data monopoly held by a few large companies. In terms of technical architecture and features, Grass builds a data rollup and introduces a metadata mechanism that records the source of every dataset. The ZK proofs of this metadata are stored on the L1 settlement layer, and the metadata itself is bound to its underlying dataset, because the datasets are stored on Grass's data ledger. ZK proofs therefore lay the foundation for greater transparency and for rewarding node operators in proportion to the work they perform, which is also an important factor in driving the expansion of the Grass network.

Grass focuses on data at the intersection of cryptocurrency and AI. Unlike traditional participants in closed-source, centralized AI, it positions itself as a decentralized source of AI data. As an important participant in the web3 wave, Grass uses decentralized technology to build a fair and open data layer for AI companies and protocols, taking market demand as its entry point, and its development prospects are promising.