Deserialize YAML

Prerequisites

We're using Microsoft Visual Studio Code as our text editor. Installation instructions are available for a variety of popular operating systems.

Overview

In this lesson we’re building a program to deserialize a generic subgraph manifest. Before we go any further, let's get on the same page about deserialization and subgraph manifests.

Deserialization

Deserialization: The process whereby a lower-level format (e.g. that has been transferred over a network, or stored in a data store) is translated into a readable object or other data structure.

-MDN Web Docs
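To make the definition concrete, here is a minimal sketch that hand-parses flat `key: value` text into a struct using only the standard library. The `Meta` struct and `parse_meta` function are hypothetical names made up for this illustration, and the parser handles only a tiny subset of YAML. The lesson itself relies on serde_yaml for the real work.

```rust
/// Hypothetical target type: two top-level manifest fields.
#[derive(Debug, PartialEq)]
pub struct Meta {
    pub spec_version: String,
    pub description: Option<String>,
}

/// Turn flat `key: value` lines into a `Meta`.
/// Illustrative only: no nesting, quoting, or lists are handled.
pub fn parse_meta(text: &str) -> Result<Meta, String> {
    let mut spec_version = None;
    let mut description = None;

    for line in text.lines() {
        // Split on the first ':' only, so values may contain colons (e.g. URLs).
        let mut parts = line.splitn(2, ':');
        let key = parts.next().unwrap_or("").trim();
        let value = match parts.next() {
            Some(v) => v.trim(),
            None => continue, // not a key/value line; skip it
        };
        match key {
            "specVersion" => spec_version = Some(value.to_string()),
            "description" => description = Some(value.to_string()),
            _ => {} // ignore fields this sketch does not model
        }
    }

    Ok(Meta {
        // specVersion is required; description stays optional.
        spec_version: spec_version.ok_or_else(|| "missing specVersion".to_string())?,
        description,
    })
}
```

Calling `parse_meta` on text that lacks a `specVersion` line returns an `Err`, which mirrors how a required field behaves when a real deserializer cannot find it.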

Subgraph manifests

The subgraph manifest subgraph.yaml defines the smart contracts your subgraph indexes, which events from these contracts to pay attention to, and how to map event data to entities that Graph Node stores and allows to query.

-The Graph Official Docs

For the full specification, see the Subgraph Manifest docs. Here is an abbreviated, list-based tour of subgraph manifest fields, subfields, and their types (in no particular order):

  • specVersion: A Semver version indicating which version of this API is being used.

    • String type

  • repository: An optional link to where the subgraph lives.

    • String type

  • description: An optional description of the subgraph’s purpose.

    • String type

  • schema: The GraphQL schema of this subgraph.

    • Schema type

    • Subfields

      • file: The path of the GraphQL IDL file, either local or on IPFS.

  • dataSources: Each data source spec defines the data that will be ingested as well as the transformation logic to derive the state of the subgraph’s entities based on the source data.

    • Subfields

      • kind: The type of data source. Possible values: ethereum/contract.

        • String type

      • name: The name of the source data. Will be used to generate APIs in the mapping and also for self-documentation purposes.

        • String type

      • network: For blockchains, this describes which network the subgraph targets. Developers can look for an up to date list in the graph-cli.

        • String type

      • source: The source data on a blockchain such as Ethereum.

      • mapping: The transformation logic applied to the data prior to being indexed.

Rust code

Along with previously covered Rust concepts (see other guides), here’s a quick overview of the topics we’ll encounter in this lesson:

  • define custom structs that represent the generic properties of a subgraph manifest

  • leverage the serde_yaml crate and the serde crate’s derive macro to deserialize the subgraph manifest YAML into custom structs

  • validate a program with a basic test

  1. Open your terminal/command line, create a new cargo project, then open it with VSCode

cargo new parse_subgraph_manifest
cd parse_subgraph_manifest
code .
  1. Click Cargo.toml in the VSCode Explorer, add the following dependencies below [dependencies], then save your changes

reqwest = { version = "0.11.13", features = ["json"] }
tokio = { version = "1.23.0", features = ["full"] }
serde = { version = "1.0.152",  features = ["derive"]}
serde_yaml = "0.9.16"
  • reqwest “provides a convenient, higher-level HTTP Client”

  • tokio is an “event-driven, non-blocking I/O platform for writing asynchronous applications with the Rust programming language”

  • serde is a “framework for serializing and deserializing Rust data structures efficiently and generically”

  • serde_yaml is a library for using the Serde serialization framework with data in YAML file format.

  1. Right click the src directory in the VSCode Explorer then select "New file..." and create a file called utils.rs. Add the following code to the newly created file.

use std::collections::HashMap;
// String is already in the Rust prelude, so it needs no explicit import.
use serde::Deserialize;
  1. Next, add the following struct definitions to the same file, then save

#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
pub struct SubgraphManifest {
    pub dataSources: Vec<DataSource>,
    pub description: Option<String>,
    pub repository: Option<String>,
    pub specVersion: String,
    pub schema: SchemaAddress,
}

#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
pub struct SchemaAddress {
    pub file: HashMap<String, String>,
}

#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
pub struct DataSource {
    pub kind: String,
    pub mapping: Mapping,
    pub name: String,
    pub network: String,
    pub source: Source,
}

#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
pub struct Mapping {
    pub abis: serde_yaml::Sequence,
    pub apiVersion: String,
    pub entities: serde_yaml::Sequence,
    pub eventHandlers: serde_yaml::Sequence,
    pub file: HashMap<String, String>,
    pub kind: String,
    pub language: String,
}

#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
pub struct Source {
    pub abi: String,
    pub address: String,
    pub startBlock: u32,
}

Description of structs

  • SubgraphManifest - maps to a subgraph manifest and some of its fields

  • SchemaAddress - maps to a manifest’s schema address on IPFS

  • DataSource - maps to a single entry in a manifest’s dataSources

  • Mapping - maps to the mapping field of a dataSource entry

  • Source - maps to the source field of a dataSource entry

See Subgraph Manifest docs for full specification details.
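Once a manifest has been deserialized, it is ordinary Rust data that can be walked like any other nested struct. The sketch below uses trimmed-down, hypothetical mirrors of the structs above (`MiniManifest`, `MiniDataSource`, `MiniSource`, with snake_case field names and no serde derive) so that it compiles with the standard library alone. The helper names `summarize` and `description_or_default` are also assumptions made for illustration.

```rust
// Trimmed-down, hypothetical mirrors of the lesson's structs (no serde derive).
pub struct MiniSource {
    pub address: String,
    pub start_block: u32,
}

pub struct MiniDataSource {
    pub name: String,
    pub network: String,
    pub source: MiniSource,
}

pub struct MiniManifest {
    pub description: Option<String>,
    pub data_sources: Vec<MiniDataSource>,
}

/// One line per data source: "name @ network (from block N)".
pub fn summarize(manifest: &MiniManifest) -> Vec<String> {
    manifest
        .data_sources
        .iter()
        .map(|ds| {
            format!(
                "{} @ {} (from block {})",
                ds.name, ds.network, ds.source.start_block
            )
        })
        .collect()
}

/// Optional fields come back as `Option`, so pick a fallback explicitly.
pub fn description_or_default(manifest: &MiniManifest) -> &str {
    manifest.description.as_deref().unwrap_or("(no description)")
}
```

The same access pattern should carry over to the real SubgraphManifest: required fields are read directly, while optional ones such as description and repository need a fallback or a match on `Option`.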

  1. Click on src/main.rs in the VSCode Explorer to open the file. Delete all the existing file contents then add the following use and mod statements to the file.

use std::error::Error;

mod utils;
use crate::utils::SubgraphManifest;
  • use std::error::Error - a trait representing the basic expectations for error values, i.e., values of type E in Result<T, E>.

  • mod utils - will look for a file named utils.rs and will insert its contents inside a module named utils under this scope

  • use crate::utils::SubgraphManifest - will bind full crate::utils::SubgraphManifest path to SubgraphManifest for easier access

  1. Now add a main function with the following content below the use and mod statements

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Fetch the raw manifest YAML from an IPFS gateway.
    let manifest_response =
        reqwest::get("https://ipfs.io/ipfs/QmbW34MGRyp7LWkpDyKXDLWsKrN8iqZrNAMjGTYHN2zHa1")
            .await?
            .text()
            .await?;

    // Deserialize the YAML text; `?` returns a parse error instead of panicking.
    let manifest_data: SubgraphManifest = serde_yaml::from_str(&manifest_response)?;

    println!("{:?}", manifest_data);

    Ok(())
}

Some notes:

  • our main function is async, powered by tokio

    • doesn’t return a value so we use the unit type in our result

      • also note the unit type in Ok(())

    • Boxing errors from our result with Box<dyn Error>

  • we perform a GET request to IPFS then store response text in manifest_response variable

  • we leverage serde_yaml to deserialize a reference to manifest_response into a variable manifest_data of type SubgraphManifest

  • finally we print out manifest_data to our terminal
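The notes on the unit type and Box<dyn Error> can be illustrated without any network access. The following is a minimal sketch, assuming a hypothetical helper `check_block_number` that is not part of the lesson's program: it returns `()` on success and lets `?` box two unrelated error types into the same `Box<dyn Error>`.

```rust
use std::error::Error;
use std::str;

/// Hypothetical helper: succeed only if `bytes` is UTF-8 text holding a u32.
/// Returns the unit type on success, just like the lesson's `main`.
pub fn check_block_number(bytes: &[u8]) -> Result<(), Box<dyn Error>> {
    // A Utf8Error is converted into Box<dyn Error> by `?`.
    let text = str::from_utf8(bytes)?;
    // A ParseIntError is converted into the same boxed type by `?`.
    let _block: u32 = text.trim().parse()?;
    // Nothing to hand back, so the success value is `()`.
    Ok(())
}
```

Boxing is what lets one function surface several concrete error types through a single return type, which is why main can use `?` on its fallible calls.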

  1. Save your changes then run the program from the integrated terminal in VSCode

cargo run
  1. To wrap things up let’s add a test below the main function. Check out Chapter 11 of The Rust Programming Language book for a more thorough discussion of tests in Rust. We’re leveraging tokio again to help with our async testing.

#[tokio::test]
async fn deserialize_everest_subgraph_manifest_repo() -> Result<(), Box<dyn Error>> {
    let manifest_response =
        reqwest::get("https://ipfs.io/ipfs/QmVsp1bC9rS3rf861cXgyvsqkpdsTXKSnS4729boXZvZyH")
            .await?
            .text()
            .await?;

    let manifest_data: SubgraphManifest = serde_yaml::from_str(&manifest_response)?;

    let subgraph_manifest_repository = "https://github.com/graphprotocol/everest";

    // repository is an Option<String>, so compare it as an Option<&str>.
    assert_eq!(
        manifest_data.repository.as_deref(),
        Some(subgraph_manifest_repository)
    );

    Ok(())
}

Instead of printing results to our terminal, we use the assert_eq! macro to compare the deserialized manifest repository URL with a hard-coded value we provide. Note that this test fetches the Everest subgraph's manifest rather than the manifest used in main.
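Because repository is declared as Option<String> in SubgraphManifest, it cannot be compared directly against a string literal. Here is a small sketch of one way to do the comparison, using a hypothetical `ManifestStub` type and `has_repository` method invented for this example:

```rust
/// Hypothetical stand-in holding only the optional repository field.
pub struct ManifestStub {
    pub repository: Option<String>,
}

impl ManifestStub {
    /// True only when the field is present and equal to `expected`.
    pub fn has_repository(&self, expected: &str) -> bool {
        // as_deref turns Option<String> into Option<&str>,
        // so both sides of the comparison share a type.
        self.repository.as_deref() == Some(expected)
    }
}
```

A manifest without a repository entry deserializes that field to `None`, so this check returns false for it rather than panicking.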

  1. Go ahead and run your test.

cargo test
