Working with File Types

Vendia supports all the standard JSON Schema scalar types, including strings, numbers, and Boolean values, as well as formatting extensions for specialized string types, such as URLs and date/time formats. However, multimedia files such as images and videos, large data files such as machine learning training sets, and other objects cannot be easily represented as any of these scalar types, nor can they be stored cost-effectively in a database, even a serverless NoSQL database.

To address this, Vendia provides full, built-in support for working with these objects, known as "files". The files themselves are stored in the cloud service provider's object storage service (for example, Amazon S3 on AWS), while metadata representing the object, including its provenance, is stored in the blockchain database. Despite the differences in the storage technologies used, Vendia ensures that all data appears "on chain", including a full, consistent history, built-in tamperproofing, and ownership tracking.

Vendia's approach to file storage has several benefits:

  1. Cost effective - Because the "cost per bit" of file storage is much lower than that of cloud database storage, this representation provides significant cost savings compared with storing large objects directly in a database.
  2. Simple - Because Vendia manages the files directly, no special code is required in the copying case, and few or no code changes are required to support links. File support is automatically built into all Vendia unis; no special action is required to enable it. Deletion, versioning, lineage tracing, tamperproofing and other capabilities are all automatic.
  3. Safe - Unlike manual ("off chain") management of file links in strings, using Vendia to manage files provides synchronized updates and safe, secure access to previous versions. Because the storage is automatically managed, common mistakes, such as accidental public exposure of data, are greatly reduced.

Using File Types

Files in Vendia blockchains are modeled as a "control" type that provides sufficient information to copy the object from its original source location to an "on-chain" location, and then to use it securely once there. To add a new file, issue the following mutation:

addVendia_File(
  input: {
    sourceBucket: <bucket>,
    sourceKey: <key>,
    sourceRegion: <region>,
    destinationKey: <key>,
    copyStrategy: <strategy>,
    read: [List of Nodes],
    write: [List of Nodes]
  },
  syncMode: ASYNC
) { transaction { _id } }

In addition to creating files through the addVendia_File mutation, you can update files using updateVendia_File, delete them using removeVendia_File, retrieve their metadata using getVendia_File, and list the available files using listVendia_FileItems. To learn more about the File API, methods, properties and limits, see the File API documentation.

You can also list and retrieve file objects directly from the S3 bucket created by Vendia. However, you cannot make PUT calls (or other changes) directly to the Vendia-managed S3 bucket. Instead, make those changes to an external bucket and then instruct Vendia to perform an add or update.

The S3 bucket used to store Vendia file objects is versioned. Updates and deletions preserve previous versions, all of which remain available.

Granting Permissions to the Source Bucket

When you add or update a file, Vendia copies the object from its original location, which must be in an S3 bucket you have access to, and then stores a copy of it on each participant node in your Uni. While the participant-to-participant copies are managed directly by Vendia, the initial copy (sometimes referred to as "on-chaining" the file) requires that you grant read access to the Vendia proxy account assigned to your node.

Getting the Account ID for your Uni

You can find the Account ID by using the following GraphQL query:

query getNodeAccountInfo {
  getVendia_UniInfo {
    localNodeName
    name
    nodes {
      name
      vendiaAccount {
        accountId
        csp
      }
    }
  }
}

This query returns account information for the nodes in the Uni, including the account in which your node is deployed. That account ID needs to be added as a valid reader on your bucket. Below is an example response from the getNodeAccountInfo GraphQL query that contains the account ID for a node called SomeNode.

{
  "data": {
    "getVendia_UniInfo": {
      "localNodeName": "SomeNode",
      "name": "an-example-uni.unis.vendia.net",
      "nodes": [
        {
          "name": "SomeNode",
          "vendiaAccount": {
            "accountId": "999888777666",
            "csp": "AWS"
          }
        }
      ]
    }
  }
}
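If you are scripting this step, the account ID can be pulled out of the response programmatically. The following is a minimal Python sketch, assuming the response has exactly the shape shown above; the helper name local_node_account_id is hypothetical and not part of any Vendia SDK.

```python
import json


def local_node_account_id(response: dict) -> str:
    """Return the accountId of the node whose name matches localNodeName."""
    info = response["data"]["getVendia_UniInfo"]
    local_name = info["localNodeName"]
    for node in info["nodes"]:
        if node["name"] == local_name:
            return node["vendiaAccount"]["accountId"]
    raise LookupError(f"node {local_name!r} not found in response")


# The example response from above, parsed from JSON text.
example = json.loads("""
{
  "data": {
    "getVendia_UniInfo": {
      "localNodeName": "SomeNode",
      "name": "an-example-uni.unis.vendia.net",
      "nodes": [
        {
          "name": "SomeNode",
          "vendiaAccount": {"accountId": "999888777666", "csp": "AWS"}
        }
      ]
    }
  }
}
""")

print(local_node_account_id(example))  # prints 999888777666
```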

Update the Bucket Policy on your S3 bucket

Paste that account ID into both Principal entries of the bucket policy document below, and replace your-bucket-name-here with your S3 bucket name in both Resource entries. The /* resource can be scoped to a specific prefix within the S3 bucket, such as /my-archive/photos/*.

{
  "Version": "2012-10-17",
  "Id": "AllowVendiaToReadFiles",
  "Statement": [{
      "Sid": "AllowBucketInfoRead",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::[YOUR NODE ACCOUNT ID]:root"
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetBucketVersioning"
      ],
      "Resource": "arn:aws:s3:::your-bucket-name-here"
    },
    {
      "Sid": "AllowObjectRead",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::[YOUR NODE ACCOUNT ID]:root"
      },
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion"
      ],
      "Resource": "arn:aws:s3:::your-bucket-name-here/*"
    }
  ]
}

This policy grants Vendia the permissions it needs to read from your bucket.
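To avoid hand-editing the two Principal and Resource entries, the policy can be generated. The sketch below is one way to do that in Python using only the standard library; the function name vendia_read_policy and its prefix parameter are illustrative assumptions, and the output simply mirrors the policy document shown above.

```python
import json


def vendia_read_policy(account_id: str, bucket: str, prefix: str = "") -> dict:
    """Build the bucket policy shown above for one node account and bucket."""
    principal = {"AWS": f"arn:aws:iam::{account_id}:root"}
    bucket_arn = f"arn:aws:s3:::{bucket}"
    # Optionally scope object reads to a prefix such as my-archive/photos.
    prefix = prefix.strip("/")
    object_arn = f"{bucket_arn}/{prefix}/*" if prefix else f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Id": "AllowVendiaToReadFiles",
        "Statement": [
            {
                "Sid": "AllowBucketInfoRead",
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:GetBucketLocation", "s3:GetBucketVersioning"],
                "Resource": bucket_arn,
            },
            {
                "Sid": "AllowObjectRead",
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": object_arn,
            },
        ],
    }


policy = vendia_read_policy("999888777666", "your-bucket-name-here", "/my-archive/photos/")
print(json.dumps(policy, indent=2))
```

The printed JSON can then be pasted into the bucket policy for your source bucket.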

Adding Files from Existing Buckets

As with all updates, file updates are stored in the ledger, making it easy to determine the provenance of any object over time, even if that item is changed by multiple participants.

With the bucket policy in place, files can be added via addVendia_File:

mutation NewLandscapePhoto {
  addVendia_File(
    input: {
      sourceBucket: "your-bucket-name-here",
      sourceKey: "grand-canyon.jpg",
      sourceRegion: "us-west-2",
      destinationKey: "grand-canyon.jpg"
    },
    syncMode: ASYNC  
  ) {
    transaction {
      _id
      transactionId
    }
  }
}

This mutation will copy the file from your bucket and replicate it to the storage of the Vendia Uni, making it available on chain and preserving its version information.
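When adding many files, it can help to generate the mutation text rather than write it by hand. The following Python sketch builds a JSON request body in the conventional GraphQL-over-HTTP form (a "query" field holding the mutation text). The function name add_file_request is hypothetical, the use of JSON quoting for GraphQL string values is an assumption that holds for simple ASCII bucket names and keys, and how the body is sent (endpoint URL and authentication) depends on your node and is not shown.

```python
import json


def add_file_request(source_bucket: str, source_key: str,
                     source_region: str, destination_key: str) -> str:
    """Return a JSON request body containing an addVendia_File mutation."""
    q = json.dumps  # quote values as string literals (simple ASCII values assumed)
    mutation = (
        "mutation AddFile {\n"
        "  addVendia_File(\n"
        "    input: {\n"
        f"      sourceBucket: {q(source_bucket)},\n"
        f"      sourceKey: {q(source_key)},\n"
        f"      sourceRegion: {q(source_region)},\n"
        f"      destinationKey: {q(destination_key)}\n"
        "    },\n"
        "    syncMode: ASYNC\n"
        "  ) { transaction { _id transactionId } }\n"
        "}"
    )
    return json.dumps({"query": mutation})


body = add_file_request(
    "your-bucket-name-here", "grand-canyon.jpg", "us-west-2", "grand-canyon.jpg"
)
print(json.loads(body)["query"])
```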

Querying File metadata

You can query the node for File metadata, e.g.:

query listFiles {
  listVendia_FileItems {
    Vendia_FileItems {
      createdTime
      destinationKey
      etag
      sourceBucket
      sourceKey
      sourceRegion
      sourceVersion
      temporaryUrl
      _id
    }
  }
}

This data includes the destinationKey, which can be used to retrieve the File contents from the node bucket.
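As an illustration, the destination keys can be collected from the query result with a few lines of Python. This sketch assumes the response mirrors the shape of the listFiles query above; the sample data and the helper name destination_keys are hypothetical.

```python
def destination_keys(response: dict) -> list:
    """Collect the destinationKey of every file in a listFiles response."""
    items = response["data"]["listVendia_FileItems"]["Vendia_FileItems"]
    return [item["destinationKey"] for item in items]


# Hypothetical sample response with only the fields used here.
sample = {
    "data": {
        "listVendia_FileItems": {
            "Vendia_FileItems": [
                {"_id": "hypothetical-id-1", "destinationKey": "grand-canyon.jpg"},
                {"_id": "hypothetical-id-2", "destinationKey": "yosemite.jpg"},
            ]
        }
    }
}

print(destination_keys(sample))  # prints ['grand-canyon.jpg', 'yosemite.jpg']
```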

Granting client permissions to access files in Vendia

To view a File stored in your Vendia Uni, you need to grant the client AWS account access to the Uni. You can do this using the aws_S3ReadAccounts setting and providing your AWS account ID. For example:

mutation updateSetting {
  updateVendia_Settings(
    input: {
      aws: {
        s3ReadAccounts: [
          "YOUR AWS ACCOUNT ID"
        ]
      }
    },
    syncMode: ASYNC
  ) {
    transaction {
      _id
    }
  }
}

Note: updateVendia_Settings overwrites any existing settings, so be sure to include any previous settings you want to keep. Additionally, anyone with access to the AWS account you provide will have access to the Files in your Uni node.

Now your AWS Account will be able to access the Files in your Vendia Uni. To do so, you'll also need your Node's Bucket name.

Getting your Node's Bucket

To get the Node's Bucket information, open the Uni Dashboard and select your Uni. Under your Node's Resources there is a property for S3 Bucket ARN. The portion following arn:aws:s3::: is your bucket's name.
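Extracting the bucket name from the ARN is a simple string operation. A minimal Python sketch follows; the helper name bucket_name_from_arn and the example ARN are illustrative only.

```python
S3_ARN_PREFIX = "arn:aws:s3:::"


def bucket_name_from_arn(arn: str) -> str:
    """Return the bucket name, i.e. the portion after arn:aws:s3:::."""
    if not arn.startswith(S3_ARN_PREFIX):
        raise ValueError(f"not an S3 bucket ARN: {arn!r}")
    return arn[len(S3_ARN_PREFIX):]


print(bucket_name_from_arn("arn:aws:s3:::my-node-destination-bucket"))
# prints my-node-destination-bucket
```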

You can also get this information using the share CLI and executing the command:

share uni get --uni YOUR_UNI_NAME

As part of the output, you will see a section called aws_FileStorage with a property called name. This is the Node's bucket name.

Retrieving the File contents

To retrieve the File contents, use an S3 client to perform a GetObject request using the node bucket name and the destinationKey attribute from the File metadata.

For example, using the AWS CLI:

aws s3 cp s3://my-node-destination-bucket/my-destination-key local-file

If you are using a Free Tier Uni, you must include the Requester Pays header with every request, because all Vendia Free Tier buckets are set up as Requester Pays buckets. Vendia covers all costs except those for direct access to the Uni's File bucket.
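As a sketch of how the Requester Pays requirement might look from code, the Python below builds the arguments for an S3 GetObject call. It assumes you use the boto3 library, which is not mentioned in this guide; RequestPayer="requester" is boto3's parameter for the Requester Pays header. The helper name get_object_args and its free_tier flag are hypothetical.

```python
def get_object_args(bucket: str, key: str, free_tier: bool) -> dict:
    """Build keyword arguments for an S3 GetObject request."""
    args = {"Bucket": bucket, "Key": key}
    if free_tier:
        # Acknowledge that the requester is charged for the request.
        args["RequestPayer"] = "requester"
    return args


args = get_object_args(
    "my-node-destination-bucket", "my-destination-key", free_tier=True
)
print(args)

# With boto3 installed and AWS credentials configured (not run here):
#   import boto3
#   data = boto3.client("s3").get_object(**args)["Body"].read()
```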

For non-Free Tier unis, the File metadata includes a pre-signed temporaryUrl that can be used to retrieve the object directly using any HTTP client.
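For instance, a pre-signed URL can be downloaded with Python's standard library alone. This is a minimal sketch; the function name download is hypothetical, and the URL passed to it would be the temporaryUrl value taken from the File metadata.

```python
import shutil
import urllib.request


def download(url: str, local_path: str) -> None:
    """Stream the object at url into local_path without loading it all into memory."""
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)


# Usage (not run here); temporary_url is the temporaryUrl from the File metadata:
#   download(temporary_url, "local-file")
```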

Next Steps

Developing and Using Unis

Defining Your Data Model

Integrating a Vendia chain with other Cloud, Web, and Mobile Services

Learning More

Terms and Definitions