Introduction

  /$$$$$$                                         
 /$$__  $$                                        
| $$  \ $$  /$$$$$$   /$$$$$$  /$$    /$$ /$$$$$$ 
| $$$$$$$$ /$$__  $$ |____  $$|  $$  /$$//$$__  $$
| $$__  $$| $$  \ $$  /$$$$$$$ \  $$/$$/| $$$$$$$$
| $$  | $$| $$  | $$ /$$__  $$  \  $$$/ | $$_____/
| $$  | $$|  $$$$$$$|  $$$$$$$   \  $/  |  $$$$$$$
|__/  |__/ \____  $$ \_______/    \_/    \_______/
           /$$  \ $$                              
          |  $$$$$$/                              
           \______/

The Agave Platform (https://agaveplatform.org) is an open source, science-as-a-service API platform for powering your digital lab. Agave allows you to bring together your public, private, and shared high performance computing (HPC), high throughput computing (HTC), Cloud, and Big Data resources under a single, web-friendly REST API.

Run code
Manage data
Collaborate meaningfully
Integrate anywhere

The Agave documentation site contains documentation, guides, tutorials, and lots of examples to help you build your own digital lab.

Conventions

Throughout the documentation you will regularly encounter the following variables. These represent user-specific values that should be replaced when attempting any of the calls using your account. Once you log into this site, these values will be replaced with values appropriate for you to use when copying and pasting the examples on your own.

Variable	Description	Example
${API_HOST}	Base hostname of the API.	sandbox.agaveplatform.org
${API_VERSION}	Version of the API endpoint.	v2
${API_USERNAME}	Username of the current user.	nryan
${API_KEY}	Client key used to request an access token from the Agave Auth service.	hZ_z3f4Hf3CcgvGoMix0aksN4BOD6
${API_SECRET}	Client secret used to request an access token from the Agave Auth service.	gTgpCecqtOc6Ao3GmZ_FecVSSV8a
${API_TOKEN}	Access token sent in the Authorization header of API requests.	de32225c235cf47b9965997270a1496c

JSON Notation

{
    "active": true,
    "created": "2014-09-04T16:59:33.000-05:00",
    "frequency": 60,
    "id": "0001409867973952-5056a550b8-0001-014",
    "internalUsername": null,
    "lastCheck": [
      {
        "created": "2014-10-02T13:03:25.000-05:00",
        "id": "0001412273000497-5056a550b8-0001-015",
        "message": null,
        "result": "PASSED",
        "type": "STORAGE"
      },
      {
        "created": "2014-10-02T13:03:25.000-05:00",
        "id": "0001411825368981-5056a550b8-0001-015",
        "message": null,
        "result": "FAILED",
        "type": "LOGIN"
      }
    ],
    "lastSuccess": "2014-10-02T11:03:13.000-05:00",
    "lastUpdated": "2014-10-02T13:03:25.000-05:00",
    "nextUpdate": "2014-10-02T14:03:15.000-05:00",
    "owner": "systest",
    "target": "demo.storage.example.com",
    "updateSystemStatus": false,
    "_links": {
        "checks": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/0001409867973952-5056a550b8-0001-014/checks"
        },
        "notifications": {
            "href": "https://sandbox.agaveplatform.org/notifications/v2/?associatedUuid=0001409867973952-5056a550b8-0001-014"
        },
        "owner": {
            "href": "https://sandbox.agaveplatform.org/profiles/v2/systest"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/0001409867973952-5056a550b8-0001-014"
        },
        "system": {
            "href": "https://sandbox.agaveplatform.org/systems/v2/demo.storage.example.com"
        }
    }
}

When describing the JSON objects passed back and forth with the APIs, Javascript dot notation will be used to refer to individual properties. For example, consider the JSON object above.

active refers to the top level active attribute in the response object.
lastCheck.[].result generically refers to the result attribute contained within any of the objects contained in the lastCheck array.
lastCheck.[0].result specifically refers to the result attribute contained within the first object in the lastCheck array.
_links.self.href refers to the href attribute in the self object within the _links object.
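
As a rough illustration of this notation, the Python sketch below resolves such a path against a parsed response. The `resolve` helper is hypothetical and not part of any Agave SDK, and the `monitor` dictionary is a trimmed copy of the sample object; it only handles the forms listed above.

```python
def resolve(obj, path):
    """Resolve a dot-notation path such as 'lastCheck.[0].result'."""
    current = [obj]   # working set of matched values
    many = False      # becomes True once a '[]' wildcard is used
    for part in path.split("."):
        if part == "[]":
            # wildcard: fan out over every element of each matched array
            current = [item for arr in current for item in arr]
            many = True
        elif part.startswith("[") and part.endswith("]"):
            # '[0]' style: pick one element by index
            index = int(part[1:-1])
            current = [arr[index] for arr in current]
        else:
            # plain attribute name
            current = [value[part] for value in current]
    return current if many else current[0]


# Trimmed copy of the sample monitor object
monitor = {
    "active": True,
    "lastCheck": [{"result": "PASSED"}, {"result": "FAILED"}],
    "_links": {"self": {"href": "https://sandbox.agaveplatform.org/monitor/v2/0001409867973952-5056a550b8-0001-014"}},
}

first_result = resolve(monitor, "lastCheck.[0].result")
all_results = resolve(monitor, "lastCheck.[].result")
```

A wildcard path returns a list of every match, while an indexed or plain path returns a single value.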

Versioning

The major version of an Agave API is given in the URI immediately following the API resource name. For example, if the endpoint is https://sandbox.agaveplatform.org/jobs/v2/, the API version is v2. The current major version of Agave is v2.

Slugs

In certain situations, usually where file system paths and names are involved in some way, Agave will slugify object names to make them safe to use. Slugs are created on the fly by applying the following rules:

Lowercase the string
Replace spaces with a dash
Remove any special characters and punctuation that might require encoding in the URL. Allowed characters are alphanumeric characters, underscores, and periods.
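
The rules above can be sketched in a few lines of Python. This is only an approximation of what the service does: the `slugify` function is hypothetical, it assumes the dash introduced by the second rule is kept, and it treats only ASCII letters and digits as alphanumeric.

```python
import re


def slugify(name):
    """Apply the three slug rules in order."""
    slug = name.lower()             # 1. lowercase the string
    slug = slug.replace(" ", "-")   # 2. replace spaces with a dash
    # 3. drop everything except ASCII alphanumerics, underscores, periods,
    #    and dashes (assumption: dashes survive this step)
    return re.sub(r"[^a-z0-9_.\-]", "", slug)


example = slugify("My Test Job #1.txt")
```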

Secure communication

Agave uses SSL to secure communication with the clients. If HTTPS is not specified in the request, the request will be redirected to a secure channel.

Rate limiting

To make the API fast for everybody, rate limits apply. Unsigned requests are processed at the lowest rate limit. Signed requests with a valid access token benefit from higher rate limits. This is true even if the endpoint doesn’t require an access token to be passed in the call.

Requests

The Agave API is based on REST principles: data resources are accessed via standard HTTPS requests in UTF-8 format to an API endpoint. Where possible, the API strives to use appropriate HTTP verbs for each action:

Verb	Description
GET	Used for retrieving resources.
POST	Used for creating resources.
PUT	Used for manipulating resources or collections.
DELETE	Used for deleting resources.

Standard query parameters

Several URL query parameters are common across all services. The following table lists them for reference:

Name	Values	Purpose
offset	integer (zero-based)	Skips the first offset results in the response.
limit	integer	Limits the number of responses to, at most, this number.
pretty	boolean	If true, pretty prints the response. Default false.
naked	boolean	If true, returns only the value of the result attribute in the standard response wrapper.
filter	string	A comma-delimited list of fields to return for each object in the response. Each field may be referenced using JSON notation. See Customizing Responses for more info.
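
As a small sketch of how these parameters combine in a request URL, the hypothetical `build_url` helper below assembles a collection URL with Python's standard library. The URL layout follows the examples in this documentation; the helper itself is an illustration, not an SDK function.

```python
from urllib.parse import urlencode


def build_url(host, service, version="v2", **params):
    """Build a collection URL carrying the standard query parameters."""
    query = {}
    for key, value in params.items():
        if isinstance(value, bool):
            value = "true" if value else "false"   # booleans as lowercase strings
        elif isinstance(value, (list, tuple)):
            value = ",".join(value)                # filter is comma-delimited
        query[key] = value
    url = "https://{}/{}/{}/".format(host, service, version)
    # safe="," keeps the commas of a filter list readable
    return url + ("?" + urlencode(query, safe=",") if query else "")


page_url = build_url("sandbox.agaveplatform.org", "jobs", offset=50, limit=25, pretty=True)
filter_url = build_url("sandbox.agaveplatform.org", "jobs", filter=["name", "status"])
```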

Experimental query parameters

Starting with the 2.1.10 release, two new query parameters have been introduced into the Jobs API as an experimental feature. The following table lists them for reference:

Name	Values	Purpose
sort	asc, desc	The sort order of the response. asc by default.
sortBy	string	The field by which to sort the response. Any field present in the full representation of the resource that you are querying is supported. Multiple values are not currently supported.

Responses

All data is received and returned as a JSON object. The Live Docs provide a description of all the retrievable objects.

Response Details

{
    "status": "error",
    "message": "Permission denied. You do not have permission to view this system",
    "version": "2.1.27-r8228",
    "result": {}
}

Apart from the response code, all responses from Agave are in the form of a JSON object. The object takes the following form.

Key	Value Type	Value Description
status	string	“success” if the call succeeded or “error” indicating that the call failed.
message	string	A short description of the cause of the error.
result	object, array	The JSON response object or array
version	string	The current full release version of Agave. Ex “2.2.0-r8228”

The sample response at the start of this section, for example, is what you receive when trying to fetch information for a system to which you do not have access.
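
A client will usually want to check the status attribute before touching result. The sketch below shows one way to do that in Python; `AgaveError` and `unwrap` are hypothetical names chosen for this illustration, and `error_body` repeats the sample error response.

```python
import json


class AgaveError(Exception):
    """Hypothetical exception raised when the wrapper reports an error."""


def unwrap(body):
    """Return the result of a wrapped response, or raise on error."""
    wrapper = json.loads(body)
    if wrapper.get("status") != "success":
        raise AgaveError(wrapper.get("message"))
    return wrapper["result"]


# The sample error response shown in this section
error_body = """{
    "status": "error",
    "message": "Permission denied. You do not have permission to view this system",
    "version": "2.1.27-r8228",
    "result": {}
}"""
```

Calling `unwrap(error_body)` raises `AgaveError` carrying the message text, while a body whose status is "success" yields just the result value.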

Naked Responses

In situations where you do not care to parse the wrapper for the raw response data, you may request a naked response from the API by adding naked=true to the request URL. This will return just the value of the result attribute in the response wrapper.

Formatting

By default, all responses are returned as compact serialized JSON. To receive pretty-printed JSON, add pretty=true to any query string.

Pagination

Pagination using limit and offset query parameters.

curl -sk -H \
    "Authorization: Bearer ${API_TOKEN}" \
    "https://sandbox.agaveplatform.org/jobs/v2/?offset=50&limit=25"

jobs-list -o 50 -l 25

All resource collections support paging through the dataset by taking offset and limit as query parameters.

Note that offset numbering is zero-based and that omitting the offset parameter will return the first limit elements. By default, all search and listing responses from the Science APIs are paginated in groups of 250 objects. The lone exception is the Files API, which returns all results by default.

Check the documentation for the specific endpoint to see specific information.

Timestamps

Timestamps are returned in ISO 8601 format, YYYY-MM-DDTHH:MM:SS.sss followed by a UTC offset for US Central Time (-05:00 or -06:00, depending on daylight saving time). For example: 2014-09-04T16:59:33.000-05:00.
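
Because the offset is part of the value, these timestamps can be parsed into timezone-aware objects and normalized to UTC. The following is a small Python sketch (Python 3.7 or later is assumed) using a timestamp from the sample responses in this documentation; `parse_timestamp` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse an Agave timestamp into a timezone-aware datetime."""
    return datetime.fromisoformat(value)


created = parse_timestamp("2014-09-04T16:59:33.000-05:00")
created_utc = created.astimezone(timezone.utc)   # same instant, expressed in UTC
```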

CORS

Many modern applications choose to implement client-server communication exclusively in Javascript. For this reason, Agave provides cross-origin resource sharing (CORS) support so AJAX requests from a web browser are not blocked by the browser’s same-origin policy and can safely make GET, PUT, POST, and DELETE requests to the API.

Hypermedia

{
    "associationIds": [],
    "created": "2013-11-16T11:25:38.900-06:00",
    "internalUsername": null,
    "lastUpdated": "2013-11-16T11:25:38.900-06:00",
    "name": "color",
    "owner": "nryan",
    "uuid": "0001384622738900-5056a550b8-0001-012",
    "value": "red",
    "_links": {
        "self": {
            "href": "https://sandbox.agaveplatform.org/meta/v2/data/0001384622738900-5056a550b8-0001-012"
        },
        "owner": {
            "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
        }
    }
}

Agave strives to be a fully descriptive hypermedia API. Given any endpoint, you should be able to walk the API through the links provided in the _links object in each resource representation. The user metadata object above contains two referenced objects. The first, self, is common to all objects and contains the URL of that object. The second, owner, contains the URL to the profile of the user who created the object.
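
A client that wants to follow these links only needs the href of each relation. The hypothetical `links` helper below sketches that lookup in Python, using a trimmed copy of the user metadata object.

```python
def links(resource):
    """Map each relation name in _links to its href."""
    return {rel: link["href"] for rel, link in resource.get("_links", {}).items()}


# Trimmed copy of the sample user metadata object
metadata = {
    "name": "color",
    "owner": "nryan",
    "_links": {
        "self": {"href": "https://sandbox.agaveplatform.org/meta/v2/data/0001384622738900-5056a550b8-0001-012"},
        "owner": {"href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"},
    },
}

owner_url = links(metadata)["owner"]
```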

Customizing Responses

Returns the username and email for the authenticated user

curl -sk -H \
    "Authorization: Bearer ${API_TOKEN}" \
    "https://sandbox.agaveplatform.org/profiles/v2/me?filter=username,email"

profiles-list -v --filter=username,email me

The response would look something like the following:

{
  "username": "nryan",
  "email": "nryan@rangers.mlb.com"
}

Returns the name, status, app id, and the url to the archived job output for every user job

curl -sk -H \
    "Authorization: Bearer ${API_TOKEN}" \
    "https://sandbox.agaveplatform.org/jobs/v2/?limit=2&filter=name,status,appId,_links.archiveData.href"

jobs-list -v --limit=2 --filter=name,status,appId,_links.archiveData

The response would look something like the following:

[
  {
    "name" : "demo-pyplot-demo-advanced test-1414139896",
    "status": "FINISHED",
    "appId" : "demo-pyplot-demo-advanced-0.1.0",
    "_links": {
      "archiveData": {
        "href": "https://agave.iplantc.org/jobs/v2/0001414144065563-5056a550b8-0001-007/outputs/listings"
      }
    }
  },
  {
    "name": "demo-pyplot-demo-advanced test-1414270831",
    "status": "FINISHED",
    "appId" : "demo-pyplot-demo-advanced-0.1.0",
    "_links": {
      "archiveData": {
        "href": "https://agave.iplantc.org/jobs/v2/3259859908028273126-242ac115-0001-007/outputs/listings"
      }
    }
  }
]

Returns the system id, type, whether it is your default system, and the hostname from the system’s storage config

curl -sk -H \
    "Authorization: Bearer ${API_TOKEN}" \
    "https://sandbox.agaveplatform.org/systems/v2/?limit=2&filter=id,type,default,storage.host"

systems-list -v --limit=2 --filter=id,type,default,storage.host

The response would look something like the following:

[
  {
    "id": "data.agaveplatform.org",
    "type": "STORAGE",
    "default": true,
    "storage": {
      "host": "dtn01.prod.agaveplatform.org"
    }
  },
  {
    "id": "docker.tacc.utexas.edu",
    "type": "EXECUTION",
    "default": true,
    "storage": {
      "host": "129.114.6.50"
    }
  }
]

In many situations, Agave may return too much or too little information in the response to a query. For example, when searching jobs, the inputs and parameters fields are not included in the default summary response objects. You can customize the responses you receive from all the Science APIs using the filter query parameter.

The filter query parameter takes a comma-delimited list of fields to return for each object in the response. Each field may be referenced using JSON notation similar to the search syntax (minus the .[operation] suffix). The examples above show sample requests and responses.
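
To make the effect of a filter list concrete, the Python sketch below applies one to a single object on the client side. It only illustrates the semantics described here and is not how the service implements filtering; `apply_filter` is a hypothetical helper, and the `port` and `status` values in the sample object are made up for the illustration.

```python
def apply_filter(obj, fields):
    """Keep only the requested fields of one response object."""
    result = {}
    for field in fields.split(","):
        src, dst = obj, result
        parts = field.split(".")
        for part in parts[:-1]:
            # descend into nested objects, creating them in the output as needed
            src = src[part]
            dst = dst.setdefault(part, {})
        dst[parts[-1]] = src[parts[-1]]
    return result


system = {
    "id": "data.agaveplatform.org",
    "type": "STORAGE",
    "default": True,
    "status": "UP",                                                  # made-up value
    "storage": {"host": "dtn01.prod.agaveplatform.org", "port": 22},  # port is made up
}

filtered = apply_filter(system, "id,type,default,storage.host")
```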

Status Codes

The API uses the following response status codes, as defined in RFC 2616, for successful and unsuccessful requests.

Success Codes

Response Code	Meaning	Description
200	Success	The request succeeded. Life is good.
201	Created	The request succeeded and a new resource was created. Only applicable on PUT and POST actions.
202	Accepted	The request has been accepted for processing, but the processing has not been completed. Common for all async actions such as job submissions, file transfers, etc.
206	Partial Content	The server has fulfilled the partial GET request for the resource. This will always be the return status of a request using a `Range` header.
301	Moved Permanently	The requested resource has been assigned a new permanent URI. You should follow the `Location` header, repeating the request.
304	Not Modified	You requested an action that succeeded, but did not modify the resource. Sound, fury, that whole thing.

Error Codes

Response Code	Meaning	Description
400	Bad Request	Your request was invalid
401	Unauthorized	Authentication required, but not provided
403	Forbidden	You do not have permission to access the given resource
404	Not Found	No resource was found at the given URL
405	Method Not Allowed	You tried to access a resource with an invalid method
406	Not Acceptable	You requested a response format that isn’t supported
410	Gone	The resource you requested has been removed and/or deleted
429	Too Many Requests	Curb your enthusiasm. You’re going way too fast.
500	Internal Server Error	It’s not you, it’s us. We had a problem processing your request. Try again later.
503	Service Unavailable	The service is temporarily unavailable. Please try again later.
504	Gateway Timeout	The service, while acting as a gateway or proxy, did not receive a timely response from the upstream server.
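
One way a client might act on these codes is sketched below. Which codes are worth retrying is an assumption made for this example (429, 500, 503, and 504 are treated as transient because their descriptions suggest trying again or slowing down); the `classify` function is hypothetical.

```python
RETRYABLE = (429, 500, 503, 504)   # assumption: treated as transient


def classify(status_code):
    """Return 'ok', 'follow-location', 'retry', or 'fail' for a response code."""
    if status_code in RETRYABLE:
        return "retry"
    if 200 <= status_code < 300 or status_code == 304:
        return "ok"
    if status_code == 301:
        return "follow-location"   # repeat the request at the Location header
    return "fail"


outcomes = [classify(code) for code in (200, 202, 301, 404, 429)]
```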

SDK

The Agave client SDKs make it easy to add data management, code execution, collaborative features, and third-party integrations to your application. Officially supported SDKs are available in Python, Javascript, Java, and PHP. Community-provided and autogenerated libraries are available in several other languages.

AngularJS

Install from bower, npm, or yarn

bower install agaveplatform/agave-angularjs-sdk
npm install agaveplatform/agave-angularjs-sdk
yarn add agaveplatform/agave-angularjs-sdk

Check out the source code

git clone https://github.com/agaveplatform/agave-angularjs-sdk.git

The AngularJS SDK is a native AngularJS module with complete coverage of the Agave Science API. It features individual Angular services for each API and domain objects to assist with marshalling requests and responses.

Python

Install from pip

pip install agavepy

Check out the source code

git clone https://github.com/tacc/agavepy.git

The Python SDK, agavepy, is a simple Python binding for the Agave Platform. It provides both sync and async interfaces for long-running tasks as well as advanced token management.

Java (beta)

Check out the source code

git clone https://github.com/agaveplatform/java-sdk.git
cd java-sdk
mvn clean install

Reference in your pom file

<dependency>
    <groupId>Agave</groupId>
    <artifactId>Agave</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <scope>compile</scope>
</dependency>

The Java SDK is a Java 7+ library to the Science APIs. It features a full domain model to interact with the Science APIs and support services. This is currently a preview version of the library and feedback is welcome to help improve the developer experience.

PHP (beta)

Install with composer

composer require agaveplatform/php-sdk

Check out the source code

git clone https://github.com/agaveplatform/php-sdk.git

The PHP SDK is a PHP 5.5+ library to the Science APIs. It features full coverage of the Science APIs as well as a rich object model to simplify interactions. This is currently a preview version of the library and feedback is welcome to help improve the developer experience.

Web API

The Agave Science APIs power the Science-as-a-Service functionality of the Agave Platform. These web APIs allow you to manage all aspects of your code, collaborations, data, and your digital lab.

The Science APIs follow basic REST concepts and use JSON to exchange data. Formal documentation of all endpoints is available in Swagger 2.0 format. You may access the Swagger definitions directly in JSON and YAML formats.

Interactive API Explorer

Often it is easier to explore a new API using an interactive tool rather than writing code. We provide our Live Docs, an interactive API browser based on the Swagger UI project, to help you kick the tires on the API and get example requests and responses to help with your onboarding efforts.

Visit Agave Live Docs

Guides

The Agave REST APIs enable applications to create and manage digital laboratories that span campuses, the cloud, and multiple data centers using a cohesive set of web-friendly interfaces.

Authorization

  /$$$$$$   /$$$$$$              /$$     /$$      
 /$$__  $$ /$$__  $$            | $$    | $$      
| $$  \ $$| $$  \ $$ /$$   /$$ /$$$$$$  | $$$$$$$
| $$  | $$| $$$$$$$$| $$  | $$|_  $$_/  | $$__  $$
| $$  | $$| $$__  $$| $$  | $$  | $$    | $$  \ $$
| $$  | $$| $$  | $$| $$  | $$  | $$ /$$| $$  | $$
|  $$$$$$/| $$  | $$|  $$$$$$/  |  $$$$/| $$  | $$
 \______/ |__/  |__/ \______/    \___/  |__/  |__/

Most requests to the Agave REST APIs require authorization; that is, the user must have granted permission for an application to access the requested data. To prove that the user has granted permission, the request header sent by the application must include a valid access token.

Before you can begin the authorization process, you will need to register your client application. That will give you a unique client key and secret key to use in the authorization flows.

Supported Authorization Flows

The Agave REST APIs currently support four authorization flows:

The Authorization Code flow first gets a code then exchanges it for an access token and a refresh token. Since the exchange uses your client secret key, you should make that request server-side to keep the integrity of the key. An advantage of this flow is that you can use refresh tokens to extend the validity of the access token.
The Implicit Grant flow is carried out client-side and does not involve secret keys. The access tokens that are issued are short-lived and there are no refresh tokens to extend them when they expire.
The Resource Owner Password Credentials flow is suitable for native and mobile applications as well as web services. It allows client applications to obtain an access token for a user by directly providing the user credentials in an authentication request. This flow exposes the user’s credentials to the client application and is primarily used in situations where the client application is highly trusted, such as the command line.
The Client Credentials flow enables users to interact with their own protected resources directly without requiring browser interaction. This is a critical addition for use at the command line, in scripts, and in offline programs. This flow assumes the person registering the client application and the user on whose behalf requests are made are the same person.

Flow	Can fetch a user’s data by requesting access?	Uses secret key? (key exchange must happen server-side!)	Access token can be refreshed?
Authorization Code	Yes	Yes	Yes
Implicit Grant	Yes	No	No
Resource Owner Password Credentials	Yes	Yes	Yes
Client Credentials	No	Yes	No
Unauthorized	No	No	No

Authorization Code

This method is suitable for long-running applications in which the user logs in once and the access token can be refreshed. Since the token exchange involves sending your secret key, this should happen in a secure location, like a backend service, not from a client like a browser or mobile app. This flow is described in RFC-6749. It is also the authorization flow used in our REST API Tutorial.

Authorization Code Flow Diagram

1. Your application requests authorization

A typical request will look something like this

https://sandbox.agaveplatform.org/authorize/?client_id=gTgp...SV8a&response_type=code&redirect_uri=https%3A%2F%2Fexample.com%2Fcallback&scope=PRODUCTION&state=866

The authorization process starts with your application sending a request to the Agave authorization service. (The reason your application sends this request can vary: it may be a step in the initialization of your application or in response to some user action, like a button click.) The request is sent to the /authorize endpoint of the Authorization service.

The request will include parameters in the query string:

Query parameter	Value
response_type	Required. As defined in the OAuth 2.0 specification, this field must contain the value “code”.
client_id	Required. The application’s client ID, obtained when the client application was registered with Agave (see Client Registration).
redirect_uri	Required. The URI to redirect to after the user grants/denies permission. This URI needs to have been entered in the Redirect URI whitelist that you specified when you registered your application. The value of `redirect_uri` here must exactly match one of the values you entered when you registered your application, including upper/lowercase, terminating slashes, etc.
scope	Optional. A space-separated list of scopes. Currently only PRODUCTION is supported.
state	Optional, but strongly recommended. The state can be useful for correlating requests and responses. Because your redirect_uri can be guessed, using a state value can increase your assurance that an incoming connection is the result of an authentication request. If you generate a random string or encode the hash of some client state (e.g., a cookie) in this state variable, you can validate the response to additionally ensure that the request and response originated in the same browser. This provides protection against attacks such as cross-site request forgery. See RFC-6749.
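
The request URL can be assembled from these parameters with Python's standard library, as in the sketch below. The `authorization_url` helper is hypothetical, and the client ID shown is a placeholder. A random state value is generated when none is supplied, so that it can be checked again when the user is redirected back.

```python
import secrets
from urllib.parse import urlencode


def authorization_url(client_id, redirect_uri, state=None,
                      host="sandbox.agaveplatform.org"):
    """Return the /authorize URL and the state value to remember."""
    state = state or secrets.token_urlsafe(16)   # unguessable value for CSRF protection
    query = urlencode({
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "scope": "PRODUCTION",
        "state": state,
    })
    return "https://{}/authorize/?{}".format(host, query), state


# "my-client-id" is a placeholder, not a real client key
url, state = authorization_url("my-client-id", "https://example.com/callback", state="866")
```

The application should store the returned state (for example in the user's session) before redirecting the browser to the URL.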

2. The user is asked to authorize access within the scopes

The Agave Authorization service presents details of the scopes for which access is being sought. If the user is not logged in, they are prompted to do so using their API username and password.

When the user is logged in, they are asked to authorize access to the actions and services defined in the scopes.

3. The user is redirected back to your specified URI

Let’s assume you provided the following callback URL.

https://example.com/callback

After the user accepts (or denies) your request, the Agave Authorization service redirects back to the redirect_uri. If the user has accepted your request, the response query string contains a code parameter with the authorization code you will use in the next step to retrieve an access token.

Sample success redirect back from the server

https://example.com/callback?code=Pq3S..M4sY&state=866

Query parameter	Value
code	The authorization code that can be exchanged for an access token in the next step.
state	The value of the `state` parameter supplied in the request.

If the user has denied access, there will be no authorization code and the final URL will have a query string containing the following parameters:

Sample denial redirect back from the server

https://example.com/callback?error=access_denied&state=867

Query parameter	Value
error	The reason authorization failed, for example: “access_denied”
state	The value of the state parameter supplied in the request.
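
On the server side, the callback handler has to separate these two outcomes and check the state value before trusting either. A minimal Python sketch follows; `read_callback` is a hypothetical helper and the code value `abc123` is a placeholder.

```python
from urllib.parse import urlsplit, parse_qs


def read_callback(url, expected_state):
    """Return the authorization code from a redirect URL, or raise."""
    params = parse_qs(urlsplit(url).query)
    state = params.get("state", [None])[0]
    if state != expected_state:
        # the redirect did not originate from our own authorization request
        raise ValueError("state mismatch; ignoring callback")
    if "error" in params:
        raise PermissionError(params["error"][0])
    return params["code"][0]


code = read_callback("https://example.com/callback?code=abc123&state=866", "866")
```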

4. Your application requests refresh and access tokens

POST https://sandbox.agaveplatform.org/token

When the authorization code has been received, you will need to exchange it for an access token by making a POST request to the Agave Authorization service, this time to its /token endpoint. The body of this POST request must contain the following parameters:

Request body parameter	Value
grant_type	Required. As defined in the OAuth 2.0 specification, this field must contain the value “authorization_code”.
code	Required. The authorization code returned from the initial request to the Authorization service’s `/authorize` endpoint.
redirect_uri	Required. This parameter is used for validation only (there is no actual redirection). The value of this parameter must exactly match the value of `redirect_uri` supplied when requesting the authorization code.
client_id	Required. The application’s client ID, obtained when the client application was registered with Agave (see Client Registration).
client_secret	Required. The application’s client secret key, obtained when the client application was registered with Agave (see Client Registration).
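
The same request can be prepared in Python without any third-party library. The sketch below only builds the request object and does not send it; `token_request` is a hypothetical helper and all credential values are placeholders. Because it carries the client secret, code like this belongs on the server side.

```python
from urllib.parse import urlencode
from urllib.request import Request


def token_request(code, redirect_uri, client_id, client_secret,
                  host="sandbox.agaveplatform.org"):
    """Build the POST request that exchanges a code for tokens."""
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,   # must match the value sent to /authorize
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return Request("https://{}/token".format(host), data=body, method="POST")


# Placeholder values for illustration only
req = token_request("abc123", "https://example.com/callback",
                    "my-client-id", "my-client-secret")
```

Passing `req` to `urllib.request.urlopen` would perform the exchange and return the JSON token response.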

5. The tokens are returned to your application

# An example cURL request
curl -X POST -d "grant_type=authorization_code" \
    -d "code=Pq3S..M4sY" \
    -d "client_id=gTgp...SV8a" \
    -d "client_secret=hZ_z3f...BOD6" \
    -d "redirect_uri=https%3A%2F%2Fexample.com%2Fcallback" \
    https://sandbox.agaveplatform.org/token

The response would look something like this:

{
    "access_token": "a742...12d2",
    "expires_in": 14400,
    "refresh_token": "d77c...Sacf",
    "token_type": "bearer"
}

On success, the response from the Agave Authorization service has the status code 200 OK in the response header, and a JSON object with the fields in the following table in the response body:

Key	Value type	Value description
access_token	string	An access token that can be provided in subsequent calls, for example to Agave REST APIs.
token_type	string	How the access token may be used: always “Bearer”.
expires_in	int	The time period (in seconds) for which the access token is valid. (Maximum 14400 seconds, or 4 hours.)
refresh_token	string	A token that can be sent to the Agave Authorization service in place of an authorization code. (When the access token expires, send a POST request to the Authorization service `/token` endpoint, but use this token in place of an authorization code. A new access token will be returned. A new refresh token might be returned too.)
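
Since expires_in is relative to the time the token was issued, a client needs to record that time to know when a refresh is due. The `Token` class below is a hypothetical sketch of that bookkeeping; the 60 second leeway and the token strings are assumptions made for the example.

```python
import time


class Token:
    """Holds a token response and tracks when it expires."""

    def __init__(self, response, issued_at=None):
        self.access_token = response["access_token"]
        self.refresh_token = response.get("refresh_token")
        issued_at = time.time() if issued_at is None else issued_at
        self.expires_at = issued_at + response["expires_in"]

    def is_expired(self, now=None, leeway=60):
        """True when the token has expired or is within `leeway` seconds of it."""
        now = time.time() if now is None else now
        return now >= self.expires_at - leeway

    def authorization_header(self):
        return {"Authorization": "Bearer " + self.access_token}


# Placeholder token values, issued at time 0 for a predictable example
token = Token({"access_token": "tok123", "expires_in": 14400,
               "refresh_token": "ref456", "token_type": "bearer"}, issued_at=0)
```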

6. Use the access token to access the Agave REST APIs

Make a call to the API

curl -H "Authorization: Bearer a742...12d2" \
    "https://sandbox.agaveplatform.org/profiles/v2/me?pretty=true&naked=true"

The response would look something like this:

{
    "create_time": "20140905072223Z",
    "email": "rjohnson@mlb.com",
    "first_name": "Randy",
    "full_name": "Randy Johnson",
    "last_name": "Johnson",
    "mobile_phone": "(123) 456-7890",
    "phone": "(123) 456-7890",
    "status": "Active",
    "uid": 0,
    "username": "rjohnson"
}

Once you have a valid access token, you can include it in the Authorization header for all subsequent requests to APIs in the Platform.

7. Requesting access token from refresh token

curl -sk -H "Authorization: Basic Qt3c...Rm1y=" \
    -d grant_type=refresh_token \
    -d refresh_token=d77c...Sacf \
    https://sandbox.agaveplatform.org/token

The response would look something like this.

{
    "access_token": "61e6...Mc96",
    "expires_in": 14400,
    "token_type": "bearer"
}

Access tokens are deliberately set to expire after a short time, usually 4 hours, after which new tokens may be granted by supplying the refresh token originally obtained during the authorization code exchange.

The request is sent to the token endpoint of the Agave Authorization service:

POST https://sandbox.agaveplatform.org/token

The body of this POST request must contain the following parameters:

Request body parameter	Value
grant_type	Required. Set it to “refresh_token”.
refresh_token	Required. The refresh token returned from the authorization code exchange.

The header of this POST request must contain the following parameter:

Header parameter	Value
Authorization	Required. Base 64 encoded string that contains the client ID and client secret key. The field must have the format: `Authorization: Basic <base64 encoded client_id:client_secret>`. (This can also be achieved with curl using the `-u` option and specifying the raw colon separated client_id and client_secret.)
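
The header value is the standard HTTP Basic encoding of the client ID and secret joined by a colon. A short Python sketch follows; `basic_auth_header` is a hypothetical helper and the credentials are placeholders.

```python
import base64


def basic_auth_header(client_id, client_secret):
    """Return the value of the Authorization header for client credentials."""
    raw = "{}:{}".format(client_id, client_secret).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# "key" and "secret" are placeholders for a real client key and secret
header_value = basic_auth_header("key", "secret")
```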

Implicit Grant

The Implicit Grant flow is for clients that are implemented entirely in JavaScript and run in the resource owner’s browser. You do not need any server-side code to use it. This flow is described in RFC-6749.

Implicit Flow

1. Your application requests authorization

https://sandbox.agaveplatform.org/authorize?client_id=gTgp...SV8a&redirect_uri=https%3A%2F%2Fexample.com%2Fcallback&scope=PRODUCTION&response_type=token&state=867

The flow starts off with your application redirecting the user to the /authorize endpoint of the Authorization service. The request will include parameters in the query string:

Query parameter	Value
response_type	Required. As defined in the OAuth 2.0 specification, this field must contain the value “token”.
client_id	Required. The application’s client ID, obtained when the client application was registered with Agave (see Client Registration).
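redirect_uri	Required. The URI to redirect to after the user grants/denies permission. This URI needs to have been entered in the Redirect URI whitelist that you specified when you registered your application. The value of `redirect_uri` here must exactly match one of the values you entered when you registered your application, including upper/lowercase, terminating slashes, etc.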
scope	Required. A space-separated list of scopes. Currently only PRODUCTION is supported.
state	Optional, but strongly recommended. The state can be useful for correlating requests and responses. Because your redirect_uri can be guessed, using a state value can increase your assurance that an incoming connection is the result of an authentication request. If you generate a random string or encode the hash of some client state (e.g., a cookie) in this state variable, you can validate the response to additionally ensure that the request and response originated in the same browser. This provides protection against attacks such as cross-site request forgery. See RFC-6749.
show_dialog	Optional. Whether or not to force the user to approve the app again if they’ve already done so. If `false` (default), a user who has already approved the application may be automatically redirected to the URI specified by `redirect_uri`. If `true`, the user will not be automatically redirected and will have to approve the app again.

2. The user is asked to authorize access within the scopes

The Agave Authorization service presents details of the scopes for which access is being sought. If the user is not logged in, they are prompted to do so using their API username and password.

When the user is logged in, they are asked to authorize access to the services defined in the scopes. By default, all of the Core Science APIs fall under a single scope called PRODUCTION.

3. The user is redirected back to your specified URI

Let’s assume we specified the following callback address.

https://example.com/callback

A valid success response would be

https://example.com/callback#access_token=Vr17...amUa&token_type=bearer&expires_in=3600&state=867

After the user grants (or denies) access, the Agave Authorization service redirects the user to the redirect_uri. If the user has granted access, the final URL will contain the following parameters in the URL fragment (the part after the #).

Query parameter	Value
access_token	An access token that can be provided in subsequent calls, for example to the Agave Profiles API.
token_type	Value: “bearer”
expires_in	The time period (in seconds) for which the access token is valid.
state	The value of the `state` parameter supplied in the request.

If the user has denied access, there will be no access token and the final URL will have a query string containing the following parameters:

A failed response would resemble the following:

https://example.com/callback?error=access_denied&state=867

Query parameter	Value
error	The reason authorization failed, for example: “access_denied”
state	The value of the state parameter supplied in the request.
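The redirect handling described above can be sketched as follows. This is a minimal illustration, not part of any Agave SDK: the function name `parse_callback` and the shape of the returned dictionary are hypothetical. It assumes success parameters arrive in the URL fragment and error parameters in the query string, as in the two example URLs, and it rejects any response whose `state` does not match the value sent with the request.

```python
from urllib.parse import urlsplit, parse_qs


def parse_callback(url, expected_state):
    """Extract token or error parameters from a redirect URL (hypothetical helper).

    Success parameters are read from the fragment (after '#');
    error parameters are read from the query string (after '?').
    """
    parts = urlsplit(url)
    # parse_qs returns a list per key; keep the first value of each parameter.
    raw = parse_qs(parts.fragment or parts.query)
    params = {key: values[0] for key, values in raw.items()}

    # Reject responses whose state does not match the one we sent.
    if params.get("state") != expected_state:
        raise ValueError("state mismatch: possible cross-site request forgery")

    if "error" in params:
        return {"ok": False, "error": params["error"]}

    return {
        "ok": True,
        "access_token": params["access_token"],
        "token_type": params["token_type"],
        "expires_in": int(params["expires_in"]),
    }
```

In a browser-based client the fragment never reaches the server, so the equivalent parsing would normally run in the page itself; the Python version is only meant to show the checks involved.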

4. Use the access token to access the Agave REST APIs

A call to the profiles API to fetch the profile of the authenticated user would look like the following

curl -H "Authorization: Bearer 61e6...Mc96" "https://sandbox.agaveplatform.org/profiles/v2/me?pretty=true"

profiles-list -v me

The response would look something like this:

{
    "create_time": "20140905072223Z",
    "email": "nryan@mlb.com",
    "first_name": "Nolan",
    "full_name": "Nolan Ryan",
    "last_name": "Ryan",
    "mobile_phone": "(123) 456-7890",
    "phone": "(123) 456-7890",
    "status": "Active",
    "uid": 0,
    "username": "nryan"
}

The access token allows you to make requests to any of the Agave REST APIs on behalf of the authenticated user.

Resource Owner Password Credentials

This method is suitable for scenarios where there is a high degree of trust between the end user and the client application. This could be a desktop application, shell script, or server-to-server communication where user authorization is needed. This flow is described in RFC-6749.

1. Your application requests authorization

curl -sk -H "Authorization: Basic Qt3c...Rm1y=" \
    -d grant_type=password \
    -d username=rjohnson \
    -d password=password \
    -d scope=PRODUCTION \
    https://sandbox.agaveplatform.org/token

auth-tokens-create -u rjohnson -p password

The response would look something like this:

{
    "access_token": "3Dsr...pv21",
    "expires_in": 14400,
    "refresh_token": "dyVa...MqR0",
    "token_type": "bearer"
}

The request is sent to the /token endpoint of the Agave Authentication service. The request will include the following parameters in the request body:

Request body parameter	Value
grant_type	Required. Set it to “password”.
username	Required. The username of an active API user.
password	Required. The password of an active API user.
scope	Required. A space-separated list of scopes. Currently only PRODUCTION is supported.

The header of this POST request must contain the following parameter:

Header parameter	Value
Authorization	Required. Base 64 encoded string that contains the client ID and client secret key. The field must have the format: `Authorization: Basic <base64 encoded client_id:client_secret>`. (This can also be achieved with curl using the `-u` option and specifying the raw colon-separated client_id and client_secret.)
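As an illustration of the header and body described above, the following sketch builds (but does not send) the token request using only the Python standard library. The function names are hypothetical, and the credentials in the usage note are placeholders.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def basic_auth_value(client_key, client_secret):
    """Return the Authorization header value for HTTP Basic auth."""
    raw = f"{client_key}:{client_secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_password_grant_request(token_url, client_key, client_secret,
                                 username, password, scope="PRODUCTION"):
    """Build a POST request for the password grant (hypothetical helper)."""
    # Form-encode the body parameters listed in the table above.
    body = urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
        "scope": scope,
    }).encode("ascii")
    headers = {"Authorization": basic_auth_value(client_key, client_secret)}
    return Request(token_url, data=body, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would submit it; for example, `build_password_grant_request("https://sandbox.agaveplatform.org/token", key, secret, "rjohnson", "password")` mirrors the curl call shown earlier.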

If the user has not accepted your request or an error has occurred, the response query string contains an error parameter indicating the error that occurred during login. For example:

https://example.com/callback?error=access_denied

2. Use the access token to access the Agave REST APIs

curl -H "Authorization: Bearer 3Dsr...pv21" \
    "https://sandbox.agaveplatform.org/profiles/v2/me?pretty=true"

The response would look something like this:

{
    "create_time": "20140905072223Z",
    "email": "rjohnson@mlb.com",
    "first_name": "Randy",
    "full_name": "Randy Johnson",
    "last_name": "Johnson",
    "mobile_phone": "(123) 456-7890",
    "phone": "(123) 456-7890",
    "status": "Active",
    "uid": 0,
    "username": "rjohnson"
}

The access token allows you to make requests to any of the Agave REST APIs on behalf of the authenticated user.

3. Requesting access token from refresh token

curl -sk -H "Authorization: Basic Qt3c...Rm1y=" \
    -d grant_type=refresh_token \
    -d refresh_token=dyVa...MqR0 \
    -d scope=PRODUCTION \
    https://sandbox.agaveplatform.org/token

The response would look something like this:

{
    "access_token": "8erF...NGly",
    "expires_in": 14400,
    "token_type": "bearer"
}

Access tokens are deliberately set to expire after a short time, usually 4 hours, after which new tokens may be granted by supplying the refresh token obtained during the original request.

The request is sent to the token endpoint of the Agave Authorization service. The body of this POST request must contain the following parameters:

Request body parameter	Value
grant_type	Required. Set it to “refresh_token”.
refresh_token	Required. The refresh token returned from the authorization code exchange.
scope	Required. A space-separated list of scopes. Currently only PRODUCTION is supported.

The header of this POST request must contain the following parameter:

Header parameter	Value
Authorization	Required. Base 64 encoded string that contains the client ID and client secret key. The field must have the format: `Authorization: Basic <base64 encoded client_id:client_secret>`. (This can also be achieved with curl using the `-u` option and specifying the raw colon-separated client_id and client_secret.)

Client Credentials

This method is suitable for authenticating your requests to the Agave REST API. This flow is described in RFC-6749.

1. Your application requests authorization

curl -sk -H "Authorization: Basic Qt3c...Rm1y=" \
    -d grant_type=client_credentials \
    -d scope=PRODUCTION \
    https://sandbox.agaveplatform.org/token

The response would look something like this:

{
    "access_token": "61e6...Mc96",
    "expires_in": 14400,
    "token_type": "bearer"
}

The request is sent to the /token endpoint of the Agave Authentication service. The request must include the following parameters in the request body:

Request body parameter	Value
grant_type	Required. Set it to “client_credentials”.
scope	Optional. A space-separated list of scopes. Currently only PRODUCTION is supported.

The header of this POST request must contain the following parameter:

Header parameter	Value
Authorization	Required. Base 64 encoded string that contains the client ID and client secret key. The field must have the format: `Authorization: Basic <base64 encoded client_id:client_secret>`. (This can also be achieved with curl using the `-u` option and specifying the raw colon-separated client_id and client_secret.)

2. Use the access token to access the Agave REST APIs

curl -H "Authorization: Bearer 61e6...Mc96" \
     https://sandbox.agaveplatform.org/profiles/v2/me

The response would look something like this:

{
    "email": "nryan@mlb.com",
    "firstName" : "Nolan",
    "lastName" : "Ryan",
    "position" : "null",
    "institution" : "Houston Astros",
    "phone": "(123) 456-7890",
    "fax" : null,
    "researchArea" : null,
    "department" : null,
    "city" : "Houston",
    "state" : "TX",
    "country" : "USA",
    "gender" : "M",
    "_links" : {
      "self" : {
        "href" : "https://sandbox.agaveplatform.org/profiles/v2/nryan"
      },
      "users" : {
        "href" : "https://sandbox.agaveplatform.org/profiles/v2/nryan/users"
      }
    }
}

The access token allows you to make requests to any of the Agave REST APIs on behalf of the authenticated user.

Token lifetimes

There are two kinds of tokens you will obtain: access and refresh. Access token lifetimes are configured by the organization operating each tenant and vary based on the flow used to obtain them. By default, access tokens are valid for 4 hours.

Authorization Flow	Access Token Lifetime	Refresh Token Lifetime
Authorization	4 hours	infinite
Implicit	1 hour	n/a
User Credential Password	4 hours	infinite
Client Credentials	4 hours	n/a

Token management

Agave will return a unique access token for each Client Application used to authenticate a user with a specific OAuth flow.

This means that a client application authenticating a user using an Implicit flow will receive a different access token than if it authenticated the same user using a Client Credentials flow.

It also means that a client application repeatedly authenticating a user with the same OAuth flow will receive the same access token (and refresh token, if applicable for the flow) in the response until the token expires or is manually revoked.

One implication of this behavior is that, if you have a distributed application that requires different parts to interact with Agave on behalf of a user, then it is important that you abstract out management of user tokens to a separate service to avoid refreshing the token in one of your services and simultaneously invalidating it in all the others.
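One way to centralize token handling is a single in-process manager that every component asks for the current token. The sketch below is illustrative only: `TokenManager` and its `fetch_token` callback are hypothetical names, the callback is assumed to return a dictionary shaped like the token responses shown earlier (`access_token`, `expires_in`), and a real distributed deployment would put the same logic behind a shared service rather than a local lock.

```python
import threading
import time


class TokenManager:
    """Cache one access token per user and renew it shortly before expiry."""

    def __init__(self, fetch_token, clock=time.time, margin=60.0):
        # fetch_token: hypothetical callable taking a username and returning
        # a dict with "access_token" and "expires_in" (seconds).
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = margin          # renew this many seconds early
        self._lock = threading.Lock()  # serialize refreshes across threads
        self._cache = {}               # username -> (token, expires_at)

    def get(self, username):
        """Return a valid access token, fetching a new one only when needed."""
        with self._lock:
            now = self._clock()
            entry = self._cache.get(username)
            if entry is None or now >= entry[1] - self._margin:
                payload = self._fetch_token(username)
                entry = (payload["access_token"], now + payload["expires_in"])
                self._cache[username] = entry
            return entry[0]
```

Because every caller goes through `get`, only one refresh happens per expiry window and no component holds a token that another component has just replaced.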

Revoking Tokens

curl -sku "$API_KEY:$API_SECRET" -XPOST -d "token=61e6...Mc96" https://sandbox.agaveplatform.org/revoke

auth-tokens-revoke

An empty response will be returned.

Access tokens will automatically expire after a predetermined amount of time. You may also manually revoke a token by making a POST request to the token revocation service using the same client key and secret used to obtain the token. After revocation, both the access and refresh token (if applicable) are instantly invalidated. All attempts to use them from that moment on will return a 401 response.

Clients and API Keys

  /$$$$$$  /$$ /$$                       /$$
 /$$__  $$| $$|__/                      | $$
| $$  \__/| $$ /$$  /$$$$$$  /$$$$$$$  /$$$$$$   /$$$$$$$
| $$      | $$| $$ /$$__  $$| $$__  $$|_  $$_/  /$$_____/
| $$      | $$| $$| $$$$$$$$| $$  \ $$  | $$   |  $$$$$$
| $$    $$| $$| $$| $$_____/| $$  | $$  | $$ /$$\____  $$
|  $$$$$$/| $$| $$|  $$$$$$$| $$  | $$  |  $$$$//$$$$$$$/
 \______/ |__/|__/ \_______/|__/  |__/   \___/ |_______/

By now you already have a user account. Your user account identifies you to the web applications you interact with. A username and password are sufficient for interacting with an application because the application has a user interface, so it knows that the authenticated user is the same one interacting with it. The Agave API does not have a user interface, so simply providing it a username and password is not sufficient. Agave needs to know both the user on whose behalf it is acting and the client application that is making the call.

Whereas every person has a single user account, they may leverage multiple services to do their daily work. They may start out using Agave ToGo to kick off an analysis, then switch to MyPlant to discuss some results, then receive a Slack notice that new data has been shared with them, click a PostIt link that allows them to download the data directly to their desktop, edit the file locally, and save it in a local folder that syncs with their iPlant cloud storage in the background.

In each of the above interactions, the user is the same, but the context in which they interact with Agave is different. Further, the above interactions all involved client applications developed by the same organization. The situation is further complicated when one or more third-party client applications are used to leverage the infrastructure. Agave needs to track both the users and the client applications with which it interacts. It does this through the issuance of API keys.

Agave uses OAuth2 to authenticate users and make authorization decisions about which APIs client applications have permission to access. A full discussion of OAuth2 is beyond the scope of this tutorial. You can read more about it on the OAuth2 website or from the websites of any of the many other service providers using it today. In this section, we will walk you through getting your API keys so we can stay focused on learning how to interact with Agave’s APIs.

Creating a new client application

In order to interact with any of the Agave APIs, you will need to first get a set of API keys. You can get your API keys from the Clients service. The example below shows how to get your API keys using both curl and the Agave CLI.

curl -sku "$API_USERNAME:$API_PASSWORD" -X POST -d "client_name=my_cli_app" -d "description=Client app used for scripting up cool stuff" https://sandbox.agaveplatform.org/clients/v2

clients-create -S -v -N my_cli_app -D "Client app used for scripting up cool stuff"

Note: the -S option will store the new API keys for future use so you don’t need to manually enter them when you authenticate later.

The response to this call will look something like:

{  
   "callbackUrl":"",
   "key":"gTgp...SV8a",
   "secret":"hZ_z3f...BOD6",
   "description":"Client app used for scripting up cool stuff",
   "name":"my_cli_app",
   "tier":"Unlimited",
   "_links":{  
      "self":{  
         "href":"https://sandbox.agaveplatform.org/clients/v2/my_cli_app"
      },
      "subscriber":{  
         "href":"https://sandbox.agaveplatform.org/profiles/v2/nryan"
      },
      "subscriptions":{  
         "href":"https://sandbox.agaveplatform.org/clients/v2/my_cli_app/subscriptions/"
      }
   }
}

Your API keys should be kept in a secure place and not shared with others. This will prevent other, unauthorized client applications from impersonating your application. If you are developing a web application, you should also provide a valid callbackUrl when creating your keys. This will reduce the risk of your keys being reused even if they are compromised. You should also create a unique set of API keys for each client application you develop. This will allow you to better monitor your usage on a per-application basis and reduce the possibility of inadvertently hitting usage quotas due to cumulative usage across client applications.

Listing your existing client applications

curl -sku "$API_USERNAME:$API_PASSWORD" https://sandbox.agaveplatform.org/clients/v2

clients-list -v

The response to this call will look something like:

[  
   {  
      "callbackUrl":"",
      "key":"xn8b...0y3d",
      "description":"",
      "name":"DefaultApplication",
      "tier":"Unlimited",
      "_links":{  
         "self":{  
            "href":"https://sandbox.agaveplatform.org/clients/v2/DefaultApplication"
         },
         "subscriber":{  
            "href":"https://sandbox.agaveplatform.org/profiles/v2/nryan"
         },
         "subscriptions":{  
            "href":"https://sandbox.agaveplatform.org/clients/v2/DefaultApplication/subscriptions/"
         }
      }
   },
   {  
      "callbackUrl":"",
      "key":"gTgp...SV8a",
      "description":"Client app used for scripting up cool stuff",
      "name":"my_cli_app",
      "tier":"Unlimited",
      "_links":{  
         "self":{  
            "href":"https://sandbox.agaveplatform.org/clients/v2/my_cli_app"
         },
         "subscriber":{  
            "href":"https://sandbox.agaveplatform.org/profiles/v2/nryan"
         },
         "subscriptions":{  
            "href":"https://sandbox.agaveplatform.org/clients/v2/my_cli_app/subscriptions/"
         }
      }
   }
]

Over time you may develop several client applications. Managing several sets of API keys can become tricky. You can see which applications you have created by querying the Clients service.

Deleting client registrations

curl -sku "$API_USERNAME:$API_PASSWORD" -X DELETE https://sandbox.agaveplatform.org/clients/v2/my_cli_app

clients-delete -v my_cli_app

The response to this call is simply a null result object.

At some point you may need to delete a client. You can do this by requesting a DELETE on your client in the Clients service.

Listing current subscriptions

curl -sku "$API_USERNAME:$API_PASSWORD" https://sandbox.agaveplatform.org/clients/v2/my_cli_app/subscriptions

clients-subscriptions-list -v my_cli_app

The response to this call will look something like:

[
  {
     "context":"/apps",
     "name":"Apps",
     "provider":"admin",
     "status":"PUBLISHED",
     "version":"v2",
     "tier":"Unlimited",
     "_links":{
        "api":{
           "href":"https://sandbox.agaveplatform.org/apps/v2/"
        },
        "client":{
           "href":"https://sandbox.agaveplatform.org/clients/v2/systest_test_client"
        },
        "self":{
           "href":"https://sandbox.agaveplatform.org/clients/v2/systest_test_client/subscriptions/"
        }
     }   
  },
  {
     "context":"/files",
     "name":"Files",
     "provider":"admin",
     "status":"PUBLISHED",
     "version":"v2",
     "tier":"Unlimited",
     "_links":{
        "api":{
           "href":"https://sandbox.agaveplatform.org/files/v2/"
        },
        "client":{
           "href":"https://sandbox.agaveplatform.org/clients/v2/systest_test_client"
        },
        "self":{
           "href":"https://sandbox.agaveplatform.org/clients/v2/systest_test_client/subscriptions/"
        }
     }
  },
  ...
]

When you register a new client application and get your API keys, you are given access to all the Agave APIs by default. You can see the APIs you have access to by querying the subscriptions collection of your client.

Updating client subscriptions

curl -sku "$API_USERNAME:$API_PASSWORD" -X POST \
    -d "apiName=transforms" \
    -d "apiVersion=v2" \
    -d "apiProvider=admin" \
    -d "tier=UNLIMITED" \
    https://sandbox.agaveplatform.org/clients/v2/my_cli_app/subscriptions

clients-subscriptions-update -v -N transforms -R v2 -P admin -T UNLIMITED my_cli_app

You can also use a wildcard to resubscribe to all the default science APIs to which all new clients are subscribed.

curl -sku "$API_USERNAME:$API_PASSWORD" -X POST \
    -d "apiName=*" \
    https://sandbox.agaveplatform.org/clients/v2/my_cli_app/subscriptions

clients-subscriptions-update -v -N "*" my_cli_app

The response to this call will be a JSON array identical to the one returned when listing your subscriptions.

Over time, new APIs will be deployed. When this happens you will need to subscribe to the new APIs. You can do this by POSTing a request to the subscription collection with the information about the new API.
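To decide which subscriptions to add, you can compare the parsed subscription listing with the APIs your application needs. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and it assumes the `apiName` form field corresponds to the `context` value of a subscription with its leading slash removed, which the examples above do not confirm. The form fields themselves (`apiName`, `apiVersion`, `apiProvider`, `tier`) are taken from the curl example.

```python
from urllib.parse import urlencode


def missing_apis(subscriptions, wanted):
    """Return the wanted API names that have no matching subscription.

    Assumption: an API name is the subscription's "context" without
    the leading slash (e.g. "/apps" -> "apps").
    """
    subscribed = {entry["context"].strip("/") for entry in subscriptions}
    return [name for name in wanted if name not in subscribed]


def subscription_form(api_name, api_version="v2",
                      api_provider="admin", tier="UNLIMITED"):
    """Form-encode the body for a POST to the subscriptions collection."""
    return urlencode({
        "apiName": api_name,
        "apiVersion": api_version,
        "apiProvider": api_provider,
        "tier": tier,
    })
```

Each string returned by `subscription_form` can be sent as the body of a POST to the client's subscriptions collection, one request per missing API.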

Systems

  /$$$$$$                     /$$
 /$$__  $$                   | $$
| $$  \__//$$   /$$ /$$$$$$$/$$$$$$   /$$$$$$ /$$$$$$/$$$$
|  $$$$$$| $$  | $$/$$_____|_  $$_/  /$$__  $| $$_  $$_  $$
 \____  $| $$  | $|  $$$$$$  | $$   | $$$$$$$| $$ \ $$ \ $$
 /$$  \ $| $$  | $$\____  $$ | $$ /$| $$_____| $$ | $$ | $$
|  $$$$$$|  $$$$$$$/$$$$$$$/ |  $$$$|  $$$$$$| $$ | $$ | $$
 \______/ \____  $|_______/   \___/  \_______|__/ |__/ |__/
          /$$  | $$
         |  $$$$$$/
          \______/

A system in Agave represents a server or collection of servers. A server can be physical, virtual, or a collection of servers exposed through a single hostname or IP address. Systems are identified and referenced in Agave by a unique ID unrelated to their IP address or hostname. Because of this, a single physical system may be registered multiple times. This allows different users to configure and use a system in whatever way they need to for their specific needs.

Systems come in two flavors: storage and execution. Storage systems are only used for storing and interacting with data. Execution systems are used for running apps (aka jobs or batch jobs) as well as storing and interacting with data.

The Systems service gives you the ability to add and discover storage and compute resources for use in the rest of the API. You may add as many or as few storage systems as you need to power your digital lab. When you register a system, it is private to you and you alone. Systems can also be published into the public space for all users to use. Depending on who is administering Agave for your organization, this may have already happened and you may already have one or more storage systems available to you by default.

In this tutorial we walk you through how to discover, manage, share, and configure systems for your specific needs. This tutorial is best done in a hands-on manner, so if you do not have a compute or storage system of your own to use, you can grab a VM from our sandbox.

Discovering systems

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/systems/v2/

systems-list -v

The response will be something like this:

[
  {
    "id" : "data.agaveplatform.org",
    "name" : "iPlant Data Store",
    "type" : "STORAGE",
    "description" : "The iPlant Data Store is where your data are stored. The Data Store is cloud-based and is the central repository from which data is accessed by all of iPlant's technologies.",
    "status" : "UP",
    "public" : true,
    "default" : true,
    "_links" : {
      "self" : {
        "href" : "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
      }
    }
  },
  {
    "id" : "docker.iplantcollaborative.org",
    "name" : "Demo Docker VM",
    "type" : "EXECUTION",
    "description" : "Atmosphere VM used for Docker demonstrations and tutorials.",
    "status" : "UP",
    "public" : true,
    "default" : false,
    "_links" : {
      "self" : {
        "href" : "https://sandbox.agaveplatform.org/systems/v2/docker.iplantcollaborative.org"
      }
    }
  }
]

The Systems service allows you to list and search for systems you have registered and systems that have been shared with you. To get a list of all your systems, make a GET request on the Systems collection.

System descriptions can get rather verbose, so a summary object is returned when listing a resource collection. The summary object contains the most critical fields in order to reduce response size when retrieving a user’s systems. You can customize this behavior using the filter query parameter.

Filtering results

Only storage systems

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/systems/v2/?type=storage

systems-list -v -S

Only execution systems

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/systems/v2/?type=execution

systems-list -v -E

Only public systems

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/systems/v2/?publicOnly=true

systems-list -v -P

Only private systems

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/systems/v2/?privateOnly=true

systems-list -v -Q

Only return default systems

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/systems/v2/?default=true

systems-list -v -D

You can further filter the results by type, scope, and default status. See the search section for further filtering options.
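The filters shown above can be combined in a single request. The following sketch only assembles the URL; the function name and keyword arguments are hypothetical, while the query parameter names (`type`, `publicOnly`, `privateOnly`, `default`) are taken from the curl examples. Whether the service accepts every combination is not confirmed here.

```python
from urllib.parse import urlencode


def systems_list_url(host, system_type=None, public_only=False,
                     private_only=False, default_only=False):
    """Build the Systems collection URL with optional filters."""
    if public_only and private_only:
        raise ValueError("public_only and private_only are mutually exclusive")

    params = {}
    if system_type is not None:
        params["type"] = system_type       # "storage" or "execution"
    if public_only:
        params["publicOnly"] = "true"
    if private_only:
        params["privateOnly"] = "true"
    if default_only:
        params["default"] = "true"

    url = f"https://{host}/systems/v2/"
    # Append the query string only when at least one filter is set.
    return url + ("?" + urlencode(params) if params else "")
```

For example, `systems_list_url("sandbox.agaveplatform.org", system_type="storage")` reproduces the URL used in the storage-only example.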

System details

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org

systems-list -v data.agaveplatform.org

The response will be something like this:

{
  "site": "agaveplatform.org",
  "id": "data.agaveplatform.org",
  "revision": 4,
  "default": true,
  "lastModified": "2016-09-30T21:43:11.000-05:00",
  "status": "UP",
  "description": "Cloud storage system for the Agave Public tenant",
  "name": "Agave Cloud Storage",
  "owner": "dooley",
  "_links": {
    "roles": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org/roles"
    },
    "credentials": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org/credentials"
    },
    "self": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
    },
    "metadata": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/data/?q=%7B%22associationIds%22%3A%224602981590618992154-242ac116-0001-006%22%7D"
    }
  },
  "globalDefault": true,
  "available": true,
  "uuid": "4602981590618992154-242ac116-0001-006",
  "public": true,
  "type": "STORAGE",
  "storage": {
    "mirror": false,
    "port": 22,
    "homeDir": "/home",
    "protocol": "SFTP",
    "host": "corral.tacc.utexas.edu",
    "publicAppsDir": "/apps",
    "proxy": null,
    "rootDir": "/gpfs/corral3/repl/projects/agave/root",
    "auth": {
      "type": "SSHKEYS"
    }
  }
}

To query for detailed information about a specific system, add the system id to the URL and make another GET request.

This time, the response will be a JSON object with a full system description. The following is the description of a storage system. In the next section we talk more about storage systems and how to register one of your own.

Storage systems

A storage system can be thought of as an individual data repository that you want to access through Agave. The following JSON object shows how a basic storage system is described.

{
   "id":"sftp.storage.example.com",
   "name":"Example SFTP Storage System",
   "type":"STORAGE",
   "description":"My example storage system using SFTP to store data for testing",
   "storage":{
      "host":"storage.example.com",
      "port":22,
      "protocol":"SFTP",
      "rootDir":"/",
      "homeDir":"/home/systest",
      "auth":{
         "username":"systest",
         "password":"changeit",
         "type":"PASSWORD"
      }
   }
}

The first four attributes are common to both storage and execution systems. The storage attribute describes the connectivity and authentication information needed to connect to the remote system. Here we describe an SFTP server accessible on port 22 at host storage.example.com. We specify that we want the rootDir, or virtual system root exposed through Agave, to be the system’s physical root directory, and we want the authenticated user’s home directory to be the homeDir, or virtual home directory and base of all relative paths given to Agave. Finally, we tell Agave to use password-based authentication and provide the necessary credentials.
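To make the relationship between rootDir and homeDir concrete, here is a rough model of how a path given to Agave might map to a physical path on the remote system. This is an assumption for illustration, not Agave's actual implementation: it assumes absolute paths are resolved against the virtual root, relative paths against the virtual home, and that homeDir is itself expressed relative to rootDir. The function name is hypothetical.

```python
import posixpath


def resolve_path(root_dir, home_dir, path):
    """Map a virtual path to a physical path (illustrative model only)."""
    # Relative paths start from the virtual home directory.
    if not path.startswith("/"):
        path = posixpath.join(home_dir, path)

    # Normalize inside the virtual root; ".." cannot climb above "/".
    virtual = posixpath.normpath("/" + path.lstrip("/"))

    # Graft the virtual path onto the physical root directory.
    return posixpath.normpath(posixpath.join(root_dir, virtual.lstrip("/")))
```

With the SFTP example above (rootDir `/`, homeDir `/home/systest`), the relative path `data/in.txt` maps to `/home/systest/data/in.txt` under this model.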

The full list of storage system attributes is described in the following table.

Attribute	Type	Description
available	boolean	Whether the system is currently available for use in the API. Unavailable systems will not be visible to anyone but the owner. This differs from the `status` attribute in that a system may be UP, but not available for use in Agave. Defaults to true
description	string	Verbose description of this system.
id	string	Required: A unique identifier you assign to the system. A system id must be globally unique across a tenant and cannot be reused once deleted.
name	string	Required: Common display name for this system.
site	string	The site associated with this system. Primarily for logical grouping.
status	UP, DOWN, MAINTENANCE, UNKNOWN	The functional status of the system. Systems must be in UP status to be used.
storage	JSON Object	Required: Storage configuration describing how to connect to this system for data staging.
type	STORAGE, EXECUTION	Required: Must be STORAGE.
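Before submitting a definition, it can help to check it locally against the table above. The following sketch is a hypothetical pre-flight check, not a replacement for the validation the Systems service performs; it covers only the required attributes and the allowed `status` and `type` values listed in the table.

```python
ALLOWED_STATUS = {"UP", "DOWN", "MAINTENANCE", "UNKNOWN"}
REQUIRED_ATTRIBUTES = ("id", "name", "type", "storage")


def validate_storage_system(definition):
    """Return a list of problems found in a storage system definition."""
    problems = []

    # Every attribute marked "Required" in the table must be present.
    for key in REQUIRED_ATTRIBUTES:
        if key not in definition:
            problems.append(f"missing required attribute: {key}")

    if "type" in definition and definition["type"] != "STORAGE":
        problems.append("type must be STORAGE")

    if "status" in definition and definition["status"] not in ALLOWED_STATUS:
        problems.append("status must be one of: " + ", ".join(sorted(ALLOWED_STATUS)))

    if "storage" in definition and not isinstance(definition["storage"], dict):
        problems.append("storage must be a JSON object")

    return problems
```

An empty list means the definition passed these basic checks; the service may still reject it for other reasons, such as an id that is already in use.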

Supported data and authentication protocols

The example above described a system accessible by SFTP. Agave supports many different data and authentication protocols for interacting with your data. Sample configurations for many protocol combinations are given below.

Sample storage system definition with each supported data protocol and authentication configuration.

{
   "id":"sftp.storage.example.com",
   "name":"Example SFTP Storage System",
   "status":"UP",
   "type":"STORAGE",
   "description":"My example storage system using SFTP to store data for testing",
   "site":"example.com",
   "storage":{
      "host":"storage.example.com",
      "port":22,
      "protocol":"SFTP",
      "rootDir":"/",
      "homeDir":"/home/systest",
      "auth":{
         "username":"systest",
         "password":"changeit",
         "type":"PASSWORD"
      }
   }
}

{
   "id":"sftp.storage.example.com",
   "name":"Example SFTP Storage Host",
   "status":"UP",
   "type":"STORAGE",
   "description":"My example storage system using SFTP to store data for testing",
   "site":"example.com",
   "storage":{
      "host":"texas.rangers.mlb.com",
      "port":22,
      "protocol":"SFTP",
      "rootDir":"/",
      "homeDir":"/home/nryan",
      "auth":{
         "username":"nryan",
         "publicKey": "ssh-rsa AAAAB3NzaC1yc2EBBAADAQABMQPRgQChJ6bzejqSuJdTi+VwMif8qotuSSlYwrVt0EWVduKZHpzOnS1zlknAyYXmQQFcaJ+vNAQayVMTqv+A+1lzxppTdgZ0Dn42EOYWRa6B/IEMPzDuKb7F0qNFiH9m+OZJDYdIWS1rlN1oK32jHUi0xV8kM3KOLf2TIjDBUyZRpMGyQ== Generated by Nova",
         "privateKey": "-----BEGIN RSA PRIVATE KEY-----nMIVCXAIBAAKBgQRhJ6bzejqSuJdTi+VwMif8qoyuSSlYwrVt0EWVduKZHpzOnSManlknAyYXmQQFcaJ+vNAQayVqTqv+A+1lzxppTdgZ0Dn42EOYWRa6B/IEMPzDuKb7Fn0uNFiH9x+OZJDYdIWS1rN1oK4DjHUi0xV8kMN3OPSIU23asx1UyZRpMGyQIDAQABnAoGATrW4NAkJ3Kltt6+HQ1Ir95sxFNrE6AZJaLYllke3iwPJpCX1dDdpDcXa8AGbVnjFXJUGA+dPrJqbyGCHA7E3H342837k/twSRGkcCNpRx/MMdWnw3asea/K5L4XVeunXAn79vo/e28D4Uue62dSwIvDJKIFWMSAgUoD53ImushqlLUCQQDPkObaowzkboLCnv3Nyj16KFZ5Lp7r5q5MYfRxO7t53Z7AWoflr++KrAT3UbSKtqmC68CqbPzxSd6qHnbnkWaD0HAkEAxsJZh7xorwAtdYznMFOsO0w5HDHOB7MuAnjwUvYZVaM0wA7HkE4rnH5SFAwEMlwx82OJxv83CnkRdlXOexn95rwJBALd8cnboGCd/AZzCvX2R+5K5lZtvnhLvczkWho3qrcoG/aUw4l1K78h4VFOFKMJOwv53BXQisF9kW6+qY3/XM49UCQHqDn4AYQOALvPBZCdVtPqFGg6W8csCAE7a5ud8zbj8A+6swcEB0+YcyEkvzID8en1ekmno/ET1wwRnhH6g/tdJlcCQM55QS4Z7rR4psgFDkFvA+wmxlqTGsXJD32sw15g4A0bmzSXnbfFg8TBAjGTDW7l0P8prFrtQ8Wml14390b29l1ptAyE=n-----END RSA PRIVATE KEY-----",
         "type": "SSHKEYS"
      }
   }
}

{
   "id":"sftp.storage.example.com",
   "name":"Example SFTP Tunnel Storage Host",
   "status":"UP",
   "type":"STORAGE",
   "description":"My example storage system using SFTP via an ssh tunnel to store data for testing",
   "site":"example.com",
   "storage":{
      "host":"storage.example.com",
      "port":22,
      "protocol":"SFTP",
      "rootDir":"/",
      "homeDir":"/home/nryan",
      "auth":{
         "username":"systest",
         "password":"changeit",
         "type":"PASSWORD"
      },
      "proxy":{
         "name":"My gateway proxy server",
         "host":"proxy.example.com",
         "port":22
      }
   }
}

{
   "id":"irods.storage.example.com",
   "name":"Example IRODS Storage Host",
   "status":"UP",
   "type":"STORAGE",
   "description":"My example storage system using IRODS to store data for testing",
   "site":"example.com",
   "storage":{
      "host":"storage.example.com",
      "port":1247,
      "protocol":"IRODS",
      "homeDir":"/systest",
      "rootDir":"/demoZone/home",
      "auth":{
         "username":"systest",
         "password":"changeit",
         "type":"PASSWORD"
      },
      "resource":"demoResc",
      "zone":"demoZone"
   }
}

{
   "id":"irods.storage.example.com",
   "name":"Example IRODS Storage Host",
   "status":"UP",
   "type":"STORAGE",
   "description":"My example storage system using IRODS with PAM authentication to store data for testing",
   "site":"example.com",
   "storage":{
      "host":"storage.example.com",
      "port":1247,
      "protocol":"IRODS",
      "homeDir":"/systest",
      "rootDir":"/demoZone/home",
      "auth":{
         "username":"systest",
         "password":"changeit",
         "type":"PAM"
      },
      "resource":"demoResc",
      "zone":"demoZone"
   }
}

{
   "id":"irods.storage.example.com",
   "name":"Example IRODS Storage Host",
   "status":"UP",
   "type":"STORAGE",
   "description":"My example storage system using IRODS to store data for testing",
   "site":"example.com",
   "storage":{
      "host":"storage.example.com",
      "port":1247,
      "protocol":"IRODS",
      "homeDir":"/systest",
      "rootDir":"/demoZone/home",
      "auth":{
         "username":"systest",
         "password":"changeit",
         "type":"X509",
         "server":{
            "name":"IRODS MyProxy Server",
            "endpoint":"myproxy.example.com",
            "port":7512,
            "protocol":"MYPROXY"
         }
      },
      "resource":"demoResc",
      "zone":"demoZone"
   }
}

{
   "id":"irods.storage.example.com",
   "name":"Example IRODS Storage Host",
   "status":"UP",
   "type":"STORAGE",
   "description":"My example storage system using IRODS to store data for testing",
   "site":"example.com",
   "storage":{
      "host":"storage.example.com",
      "port":1247,
      "protocol":"IRODS4",
      "homeDir":"/systest",
      "rootDir":"/demoZone/home",
      "auth":{
         "username":"systest",
         "password":"changeit",
         "type":"PASSWORD"
      },
      "resource":"demoResc",
      "zone":"demoZone"
   }
}

{
  "id": "demo.storage.example.com",
  "name": "Demo GRIDFTP demo vm",
  "status": "UP",
  "type": "STORAGE",
  "description": "My example storage system using GridFTP to store data for testing",
  "site": "example.com",
  "storage": {
    "host": "gridftp.example.com",
    "port": 2811,
    "protocol": "GRIDFTP",
    "rootDir": "/",
    "homeDir": "/home/systest",
    "auth": {
      "credential": "-----BEGIN CERTIFICATE-----nMIIDqjCCApKgAwIBAgIDJSFGMA0GCSqGSIb3DQEBBQUAMHsxCzAJBgNVBAYTAlVTnMTgwNgYDVQQKEy9OYXRpb25hbCBDZW50ZXIgZm9yIFN1cGVyY29tcHV0aW5nIEFwncGxpY2F0aW9uczEgMB4GA1UECxMXQ2VydGlmaWNhdGUgQXV0aG9yaXRpZXMxEDAOnBgNVBAMTB015UHJveHkwHhcNMTMxMDE0MDcyMjE4WhcNMTMxMDE0MTkyNzE4WjBnnMQswCQYDVQQGEwJVUzE4MDYGA1UEChMvTmF0aW9uYWwgQ2VudGVyIGZvciBTdXBlncmNvbXB1dGluZyBBcHBsaWNhdGlvbnMxHjAcBgNVBAMTFWlwbGFudCBDb21tdW5pndHkgVXNlcjCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEAwfHbmtmJ1OUVwgDdn5oA8EsqihwRAi2xhZJYG/FFmOs38+0y7wTfORhVX/79XQMD3NqRJN8xhHQpmuoRynH9l9sbA9gbKaQsrpIYyExygrJ+qaZY0PccD+VAyPDjdLD86316AzWltEdV2E9b+OnCVioz62esJWSqOho8wya4Vo5svUCAwEAAaOBzjCByzAOBgNVHQ8BAf8EBAMCBLAwnHQYDVR0OBBYEFIJXT/jYmxaRywDbZudb1EXbxla5MB8GA1UdIwQYMBaAFNf8pQJ2nOvYT+iuh4OZQNccjx3tRMAwGA1UdEwEB/wQCMAAwNAYDVR0gBC0wKzAMBgorBgEEnAaQ+ZAIFMAwGCiqGSIb3TAUCAgMwDQYLKoZIhvdMBQIDAgEwNQYDVR0fBC4wLDAqnoCigJoYkaHR0cDovL2NhLm5jc2EudWl1Yy5lZHUvZjJlODlmZTMuY3JsMA0GCSqGnSIb3DQEBBQUAA4IBAQBDyW3FJ0xEIXEqk2NtiMqOM99MgufDPL0bxrR8CvPY5GRNn58EXU8RnSSJIuxL95PKclRPPOhGdB48eeF2H1MusOEUEEnHwzrZ1OUFUEpwKuqG6n0h411l3niRRx9wdJL4YITzAWZwpadzwj3d8aO9O/ttVJjGRc8A93I/d3fFAvHyvKnmlEaDrQZNBp1EtClW8xuxsfeUmyXkFlkRiKwqjkJGB8xBuzr8DfLomWq/mXaOkHznCo9nQxAs3gntszLOh+8U9aMxaeCsychRWxG3Y6Z33hrE0yz4AaVonVXu3Z7M+EN+nKbSVRblAzeKfQYYDOgsoFrugYbR9klv1so3Dt+n6n-----END CERTIFICATE-----n-----BEGIN RSA PRIVATE KEY-----nMIICWwIBAAKBgQDB8dua2YnU5RXCAN3mgDwSyqKHBECLbGFklgb8UWY6zfz7TLvBnN85GFVf/v1dAwPc2pEk3zGEdCma6hHIf2X2xsD2BsppCyukhjITHKCsn6ppljQ9xnwP5UDI8ON0sPzrfXoDNaW0R1XYT1v44JWKjPrZ6wlZKo6GjzDJrhWjmy9QIDAQABnAoGAcjrJZYMLM2FaV1G7YK/Wshq3b16JxZSoKF5U7vfihnAcuMaRL1R3IcAgfHlunIq2E7aIFnd+6sygVKXYo4alv5denekiucvKAyXK9F/VTTtLtajUnrvekLvSycKiEnbN9IgQ0ABCnlWyjgQMf64UUYBQtvU+lbRCs4jbuHxuyn5WECQQD8fJhlBHgA49hjnZBKnU9Xb+LEKhWDCEyIiOMMGY+2XhrGVvGF5KqJVusZEv8lbXNjzgSQFgLohEXVzn9v8tDFMzAkEAxKS5qCYHsTfgPlw3l1DLJRmG3SXrpevXSccBGpXQiUne9gfc9mlgnVTr5QQCXvvI673Y2LnNcnd94KEgvSrzhNwJACeS38/1g1mgXKo3ZTUUztBLinQ7sn463sQHsI6U8xGCbm/n8LMrxA8CsJadg6A6J3vdLpnm2U3YbZm1mqVhGNkQJAdsxxnoUVAdm8kWWhK6W6VG9e9I1OqdrXxfY/tecsyjg6D1a1Qb8mfuj4DoaKjCme69To8nZ3moZXRBWkypzYQopwJAB/zr1UpFz6vY4sIm3Gw3ll/ruNGCr2dzjTyLSGglCOf0nUljJ1FGLyW647JzGPMLcfdb0iEexzCEii9YUFUN1Ow==n-----END RSA PRIVATE KEY-----",
      "type": "X509"
    }
  }
}

{
  "id": "demo.storage.example.com",
  "name": "Demo GRIDFTP + MyProxy Storage System",
  "status": "UP",
  "type": "STORAGE",
  "description": "My example storage system using GridFTP with MyProxy to store data for testing",
  "site": "example.com",
  "storage": {
    "host": "gridftp.example.com",
    "port": 2811,
    "protocol": "GRIDFTP",
    "rootDir": "/",
    "homeDir": "/home/systest",
    "auth": {
      "username": "systest",
      "password": "changeit",
      "type": "X509",
      "server": {
        "name": "XSEDE MyProxy Server",
        "endpoint": "myproxy.example.com",
        "port": 7512,
        "protocol": "MYPROXY"
      }
    }
  }
}

{
  "id": "demo.storage.example.com",
  "name": "Demo GRIDFTP + MyProxy Storage System",
  "status": "UP",
  "type": "STORAGE",
  "description": "My example storage system using GridFTP with MyProxy to store data for testing",
  "site": "example.com",
  "storage": {
    "host": "gridftp.example.com",
    "port": 2811,
    "protocol": "GRIDFTP",
    "rootDir": "/",
    "homeDir": "/home/systest",
    "auth": {
      "type": "X509",
      "server": {
        "name": "My Trusted MPG Server",
        "endpoint": "https://api.example.com/myproxy/v2/",
        "port": 443,
        "protocol": "MPG"
      }
    }
  }
}

{
   "id":"demo.storage.example.com",
   "name":"Example Amazon S3 Storage System",
   "status":"UP",
   "type":"STORAGE",
   "description":"My example storage system using Amazon S3 to store data for testing",
   "site":"aws.amazon.com",
   "storage":{
      "host": "s3-website-us-east-1.amazonaws.com",
      "port": 443,
      "protocol": "S3",
      "homeDir": "/",
      "rootDir": "/",
      "container": "mybucket",
      "auth": {
          "publicKey": "AKCA...1RCF",
          "privateKey": "8xj3...g/4+",
          "type": "APIKEYS"
      }
   }
}

{
   "id":"local.storage.example.com",
   "name":"Example LOCAL Storage Host",
   "status":"UP",
   "type":"STORAGE",
   "description":"My example storage system using the local file system to store data for testing",
   "site":"example.com",
   "storage":{
      "host":"localhost",
      "protocol":"LOCAL",
      "rootDir":"/",
      "homeDir":"/home/systest"
   }
}

In each of the examples above, the storage object was slightly different, reflecting the protocol used. Every attribute of the storage object and its children is described in the following tables.

storage attributes give basic connectivity information describing things like how to connect to the system and on what port.

Attribute	Type	Description
auth	JSON object	Required: A JSON object describing the default authentication credential for this system.
container	string	The container to use when interacting with an object store. Specifying a container provides isolation when exposing your cloud storage accounts so users do not have access to your entire storage account. This should be used in combination with delegated cloud credentials such as an AWS IAM user credential.
homeDir	string	The path on the remote system, relative to `rootDir`, to use as the virtual home directory for all API requests. This will be the base of any requested paths that do not begin with a ’/’. Defaults to ’/’, making it equivalent to `rootDir`.
host	string	Required: The hostname or IP address of the storage server.
port	int	Required: The port number of the storage server.
mirror	boolean	Whether the permissions set on the server should be pushed to the storage system itself. Currently, this only applies to IRODS systems.
protocol	FTP, GRIDFTP, IRODS, IRODS4, LOCAL, S3, SFTP	Required: The protocol used to authenticate to the storage server.
publicAppsDir	string	The path on the remote system where apps will be stored if this system is used as the default public storage system.
proxy	JSON Object	The proxy server through which Agave will tunnel when submitting jobs. Currently proxy servers will use the same authentication mechanism as the target server.
resource	string	The name of the default resource to use when defining an IRODS system.
rootDir	string	The path on the remote system to use as the virtual root directory for all API requests. Defaults to ’/’.
zone	string	The name of the default zone to use when defining an IRODS system.
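
The way `rootDir` and `homeDir` combine can be sketched in a few lines of Python. This is only an illustration of the rules in the table above, not Agave's implementation: the function name `resolve_remote_path` is hypothetical, and the sketch makes no attempt to stop ".." segments from escaping `rootDir`.

```python
import posixpath


def resolve_remote_path(root_dir, home_dir, requested):
    """Map a path from an API request onto the remote file system."""
    root = root_dir or "/"  # rootDir defaults to "/"
    home = home_dir or "/"  # homeDir defaults to "/", i.e. rootDir itself
    if requested.startswith("/"):
        # Absolute request paths are relative to the virtual root.
        virtual = requested
    else:
        # Everything else is relative to the virtual home directory.
        virtual = posixpath.join("/", home, requested)
    # Strip the leading slash so join() keeps the rootDir prefix.
    return posixpath.normpath(posixpath.join(root, virtual.lstrip("/")))


# Values taken from the IRODS examples above.
print(resolve_remote_path("/demoZone/home", "/systest", "data/in.txt"))
print(resolve_remote_path("/demoZone/home", "/systest", "/shared/in.txt"))
```

With the IRODS example values, a relative request lands under `/demoZone/home/systest`, while an absolute request lands directly under `/demoZone/home`.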

storage.auth attributes give authentication information describing how to authenticate to the system specified in the storage config above.

Attribute	Type	Description
credential	string	The credential used to authenticate to the remote system. Depending on the authentication protocol of the remote system, this could be an OAuth token or an X.509 certificate.
internalUsername	string	The username of the internal user associated with this credential.
password	string	The password on the remote system used to authenticate.
privateKey	string	The private ssh key used to authenticate to the remote system.
publicKey	string	The public ssh key used to authenticate to the remote system.
server	JSON object	A JSON object describing the authentication server from which a valid credential may be obtained. Currently only auth type X509 supports this attribute.
type	APIKEYS, LOCAL, PAM, PASSWORD, SSHKEYS, or X509	Required: The type of authentication used to connect to the remote system.
username	string	The remote username used to authenticate.
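
As a rough aid when writing an auth object, the sketch below checks which attributes appear to be missing for a given auth type. The mapping from type to attributes is inferred from the sample system definitions in this section rather than taken from a published schema, so treat it as an assumption; `missing_auth_fields` and `EXPECTED_AUTH_FIELDS` are hypothetical names.

```python
# ASSUMPTION: field sets inferred from the examples in this section.
EXPECTED_AUTH_FIELDS = {
    "PASSWORD": {"username", "password"},
    "PAM": {"username", "password"},
    "SSHKEYS": {"username", "publicKey", "privateKey"},
    "APIKEYS": {"publicKey", "privateKey"},
    "LOCAL": set(),
}


def missing_auth_fields(auth):
    """Return the attribute names that look absent from an auth object."""
    auth_type = auth.get("type")
    if auth_type is None:
        return ["type"]
    if auth_type == "X509":
        # The examples supply either an inline credential or a server to fetch one from.
        if "credential" in auth or "server" in auth:
            return []
        return ["credential or server"]
    if auth_type not in EXPECTED_AUTH_FIELDS:
        raise ValueError(f"unknown auth type: {auth_type}")
    return sorted(EXPECTED_AUTH_FIELDS[auth_type] - set(auth))


print(missing_auth_fields({"type": "SSHKEYS", "username": "nryan"}))
```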

storage.auth.server attributes give information about how to obtain a credential that can be used in the authentication process. Currently only systems using the X509 authentication can leverage this feature to communicate with MyProxy and MyProxy Gateway servers.

Attribute	Type	Description
name	string	A descriptive name given to the credential server.
endpoint	string	Required: The endpoint of the authentication server.
port	integer	Required: The port on which to connect to the server.
protocol	MPG, MYPROXY	Required: The protocol with which to obtain an authentication credential.

system.proxy configuration attributes give information about how to connect to a remote system through a proxy server. This is often needed when the target system is behind a firewall or NAT. Currently proxy servers can only reuse the authentication configuration provided by the target system.

Attribute	Type	Description
name	string	Required: A descriptive name given to the proxy server.
host	string	Required: The hostname of the proxy server.
port	integer	Required: The port on which to connect to the proxy server. If null, the port in the parent storage config is used.

Creating a new storage system

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -F "fileToUpload=@sftp-password.json" https://sandbox.agaveplatform.org/systems/v2

systems-addupdate -v -F sftp-password.json

The response from the service will be similar to the following:

{
  "site": null,
  "id": "sftp.storage.example.com",
  "revision": 1,
  "default": false,
  "lastModified": "2016-09-06T17:46:42.621-05:00",
  "status": "UP",
  "description": "My example storage system using SFTP to store data for testing",
  "name": "Example SFTP Storage System",
  "owner": "nryan",
  "globalDefault": false,
  "available": true,
  "uuid": "4036169328045649434-242ac117-0001-006",
  "public": false,
  "type": "STORAGE",
  "storage": {
    "mirror": false,
    "port": 22,
    "homeDir": "/home/systest",
    "protocol": "SFTP",
    "host": "storage.example.com",
    "publicAppsDir": null,
    "proxy": null,
    "rootDir": "/",
    "auth": {
      "type": "PASSWORD"
    }
  },
  "_links": {
    "roles": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/sftp.storage.example.com/roles"
    },
    "owner": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
    },
    "credentials": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/sftp.storage.example.com/credentials"
    },
    "self": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/sftp.storage.example.com"
    },
    "metadata": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/data/?q=%7B%22associationIds%22%3A%224036169328045649434-242ac117-0001-006%22%7D"
    }
  }
}

Congratulations, you just added your first system. This storage system can now be used by the Files service to manage data, the Transfer service as a source or destination of data movement, the Apps service as an application repository, and the Jobs service as both a staging and archiving destination.

Notice that the JSON returned from the Systems service differs from what was submitted. Several fields have been added, and several others have been removed. The uuid of the system has been added; this is the same UUID that is used in notifications and metadata references. The status value was added and assigned a default value since we did not specify it. The same goes for the site attribute.

Three more fields were added: revision, public, and lastModified. revision is the number of times this system has been updated. This being our first time registering the system, it is set to 1. public tells whether this system is published as a shared resource for all users. We will cover this more in the section on System scope. lastModified is a timestamp of the last time the system was updated.

In the storage object, the publicAppsDir and mirror fields were both added and set to their default values. In this example we are not using a proxy server, so it was defaulted to null. Last, and most important, all credential information has been omitted from the response object; only the auth type remains. Regardless of the authentication type, no user credential information will ever be returned once it is stored.
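
One quick way to see these differences for yourself is to compare the keys of the document you submitted with the keys of the response. The sketch below does that with plain dictionaries. The two documents are abridged, hypothetical stand-ins for the submitted file and the response shown above, and `compare_fields` is an illustrative helper, not part of any Agave SDK.

```python
def compare_fields(submitted, returned):
    """Return (added, removed) key names between two JSON objects."""
    added = sorted(set(returned) - set(submitted))
    removed = sorted(set(submitted) - set(returned))
    return added, removed


# Abridged, hypothetical stand-ins for the submitted and returned documents.
submitted = {
    "id": "sftp.storage.example.com",
    "type": "STORAGE",
    "storage": {
        "host": "storage.example.com",
        "auth": {"username": "systest", "password": "changeit", "type": "PASSWORD"},
    },
}
returned = {
    "id": "sftp.storage.example.com",
    "type": "STORAGE",
    "uuid": "4036169328045649434-242ac117-0001-006",
    "revision": 1,
    "storage": {
        "host": "storage.example.com",
        "mirror": False,
        "auth": {"type": "PASSWORD"},
    },
}

top_added, top_removed = compare_fields(submitted, returned)
auth_added, auth_removed = compare_fields(
    submitted["storage"]["auth"], returned["storage"]["auth"]
)
print("added at top level:", top_added)
print("removed from auth:", auth_removed)
```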

Execution Systems

In contrast to storage systems, execution systems specify compute resources where application binaries can be run. In addition to the storage attribute found in storage systems, execution systems also have a login attribute describing how to connect to the remote system to submit jobs as well as several other attributes that allow Agave to determine how to stage data and run software on the system. The full list of execution system attributes is given in the following tables.

Name	Type	Description
available	boolean	Whether the system is currently available for use in the API. Unavailable systems will not be visible to anyone but the owner. This differs from the `status` attribute in that a system may be UP, but not available for use in Agave. Defaults to true.
description	string	Verbose description of this system.
environment	string	List of key-value pairs that will be added to the environment prior to execution of any command.
executionType	HPC, CONDOR, CLI	Required: Specifies how jobs should go into the system. HPC and CONDOR will leverage a batch scheduler. CLI will fork processes.
id	string	Required: A unique identifier you assign to the system. A system id must be globally unique across a tenant and cannot be reused once deleted.
maxSystemJobs	integer	Maximum number of jobs that can be queued or running on a system across all queues at a given time. Defaults to unlimited.
maxSystemJobsPerUser	integer	Maximum number of jobs that can be queued or running on a system for an individual user across all queues at a given time. Defaults to unlimited.
name	string	Required: Common display name for this system.
queues	JSON Array	An array of batch queue definitions providing descriptive and quota information about the queues you want to expose on your system. If not specified, no other system queues will be available to jobs submitted using this system.
scheduler	LSF, LOADLEVELER, PBS, SGE, CONDOR, FORK, COBALT, TORQUE, MOAB, SLURM, CUSTOM_LSF, CUSTOM_LOADLEVELER, CUSTOM_PBS, CUSTOM_GRIDENGINE, CUSTOM_CONDOR, CUSTOM_COBALT, CUSTOM_TORQUE, CUSTOM_MOAB, CUSTOM_SLURM, UNKNOWN	Required: The type of batch scheduler available on the system. This only applies to systems with executionType HPC and CONDOR. The CUSTOM_* version of each scheduler provides a mechanism for you to override the default scheduler directives added by Agave and explicitly add your own through the customDirectives field in each of the batch queue definitions for your system.
scratchDir	string	Path to use for a job scratch directory. This value is the first choice for creating a job's working directory at runtime. The path will be resolved relative to the `rootDir` value in the storage config if it begins with a “/”, and relative to the system `homeDir` otherwise.
site	string	The site associated with this system. Primarily for logical grouping.
startupScript	string	Path to a script that will be run prior to execution of any command on this system. The path will be a standard path on the remote system. A limited set of system macros is supported in this field. They are rootDir, homeDir, systemId, and workDir. The standard set of runtime job attributes is also supported. Between the two sets of macros, you should be able to construct distinct paths per job, user, and app. Any environment variables defined in the system description will be added after this script is sourced. If this script fails, output will be logged to the .agave.log file in your job directory. Job submission will still continue regardless of the exit code of the script.
status	UP, DOWN, MAINTENANCE, UNKNOWN	The functional status of the system. Systems must be in UP status to be used.
storage	JSON Object	Required: Storage configuration describing how to connect to this system for data staging.
type	STORAGE, EXECUTION	Required: Must be EXECUTION.
workDir	string	Path to use for a job working directory. This value will be used if no `scratchDir` is given. The path will be resolved relative to the `rootDir` value in the storage config if it begins with a “/”, and relative to the system `homeDir` otherwise.
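
The order in which `scratchDir` and `workDir` are consulted, and how the chosen path is resolved, can be sketched as follows. The function name `job_base_directory` is hypothetical. The table does not say what happens when neither attribute is set, so falling back to `homeDir` here is an assumption made only to keep the sketch total.

```python
import posixpath


def job_base_directory(system):
    """Pick the directory under which a job's working directory would be created."""
    storage = system.get("storage", {})
    root = storage.get("rootDir") or "/"
    home = storage.get("homeDir") or "/"

    # scratchDir is the first choice; workDir is used only when it is absent.
    chosen = system.get("scratchDir") or system.get("workDir")
    if not chosen:
        chosen = ""  # ASSUMPTION: fall back to homeDir when neither is given

    if chosen.startswith("/"):
        virtual = chosen  # relative to rootDir
    else:
        virtual = posixpath.join("/", home, chosen)  # relative to homeDir
    return posixpath.normpath(posixpath.join(root, virtual.lstrip("/")))


system = {
    "workDir": "jobs",
    "storage": {"rootDir": "/data", "homeDir": "/home/systest"},
}
print(job_base_directory(system))
```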

Startup scripts

Every time Agave establishes a connection to an execution system, local or remote, it will attempt to source the startupScript provided in your system definition. The value of startupScript may be an absolute path on the system (e.g., “/usr/local/bin/common_aliases.sh”, “/home/nryan/.bashrc”) or a path relative to the physical home directory of the account used to authenticate to the system (e.g., “.bashrc”, “.profile”, “agave/scripts/startup.sh”).

The startupScript field supports the use of template variables which Agave will resolve at runtime before establishing a connection. If you would prefer to specify the startup script as a virtualized path on the system, prepend ${SYSTEM_ROOT_DIR} to the path. If the system will be made public, you can specify a file relative to the home directory of the calling user by prefixing your startupScript value with ${SYSTEM_ROOT_DIR}/${SYSTEM_HOME_DIR}/${USERNAME}. A full list of the variables available is given in the following table.

Variable	Description
SYSTEM_ID	ID of the system (ex. ssh.execute.example.com)
SYSTEM_UUID	The UUID of the system
SYSTEM_STORAGE_PROTOCOL	The protocol used to move data to and from this system
SYSTEM_STORAGE_HOST	The storage host for this system
SYSTEM_STORAGE_PORT	The storage port for this system
SYSTEM_STORAGE_RESOURCE	The system resource for iRODS systems
SYSTEM_STORAGE_ZONE	The system zone for iRODS systems
SYSTEM_STORAGE_ROOTDIR	The virtual root directory exposed on this system
SYSTEM_STORAGE_HOMEDIR	The home directory on this system relative to the SYSTEM_STORAGE_ROOTDIR
SYSTEM_STORAGE_AUTH_TYPE	The storage authentication method for this system
SYSTEM_STORAGE_CONTAINER	The object store bucket in which the rootDir resides.
SYSTEM_LOGIN_PROTOCOL	The protocol used to establish a session with this system (e.g., SSH, GSISSH)
SYSTEM_LOGIN_HOST	The login host for this system
SYSTEM_LOGIN_PORT	The login port for this system
SYSTEM_LOGIN_AUTH_TYPE	The login authentication method for this system
SYSTEM_OWNER	The username of the user who created the system.
AGAVE_JOB_NAME	The slugified version of the name of the job. See the section on Conventions for more information about slugs.
AGAVE_JOB_ID	The unique identifier of the job.
AGAVE_JOB_APP_ID	The appId for which the job was requested.
AGAVE_JOB_BATCH_QUEUE	The batch queue on the AGAVE_JOB_EXECUTION_SYSTEM to which the job was submitted.
AGAVE_JOB_EXECUTION_SYSTEM	The Agave execution system id where this job is running.
AGAVE_JOB_ARCHIVE_PATH	The path on the archiveSystem where the job output will be copied if archiving is enabled.
AGAVE_JOB_OWNER	The username of the job owner.
AGAVE_JOB_TENANT	The id of the tenant to which the job was submitted.
MONITOR_ID	The ID of the monitor.
MONITOR_CHECK_ID	The ID of the monitor check making the request.
MONITOR_OWNER	The username of the user who created the monitor.
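
To preview what a templated startupScript value might resolve to, you can mimic the substitution locally. The sketch below uses Python's `string.Template`, which happens to share the `${NAME}` syntax. It only approximates the behavior described above: the helper name `resolve_startup_script` is hypothetical, and leaving unknown variables untouched is an assumption about how unresolved names are handled.

```python
from string import Template


def resolve_startup_script(template, values):
    """Substitute ${NAME} placeholders, leaving unknown names in place."""
    return Template(template).safe_substitute(values)


# Variable names come from the table above; the values are made up.
values = {
    "SYSTEM_ID": "ssh.execute.example.com",
    "SYSTEM_STORAGE_ROOTDIR": "/opt/agave",
    "SYSTEM_OWNER": "nryan",
}
print(resolve_startup_script("${SYSTEM_STORAGE_ROOTDIR}/scripts/${SYSTEM_ID}.sh", values))
```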

Schedulers and system execution types

Agave supports job execution both interactively and through batch queueing systems (aka schedulers). We cover the mechanics of job submission in the Job Management tutorial. Here we just point out that regardless of how your job is actually run on the underlying system, the process of submitting, monitoring, sharing, and otherwise interacting with your job through Agave is identical. Describing the scheduler and execution types for your system is really just a matter of picking the most efficient and/or available mechanism for running jobs on your system.

As you saw in the table above, executionType refers to the classification of jobs going into the system and scheduler refers to the type of batch scheduler used on a system. These two fields help limit the range of job submission options used on a specific system. For example, it is not uncommon for an HPC system to accept jobs from both a Condor scheduler and a batch scheduler. It is also possible, though generally discouraged, to fork jobs directly on the command line. With so many options, how would users publishing apps on such a system know what mechanism to use? Specifying the execution type and scheduler helps narrow down the options to a single execution mechanism.

Thankfully, picking the right combination is pretty simple. The following table illustrates the available combinations.

`executionType`	`scheduler`	Description
HPC	LSF, LOADLEVELER, PBS, SGE, TORQUE, MOAB, SLURM, CUSTOM_LSF, CUSTOM_LOADLEVELER, CUSTOM_PBS, CUSTOM_GRIDENGINE, CUSTOM_SLURM	Jobs will be submitted to the local scheduler using the appropriate scheduler commands. Systems with this execution type will not allow forked jobs.
CONDOR	CONDOR, CUSTOM_CONDOR	Jobs will be submitted to the Condor scheduler running locally on the remote system. Agave will not do any installation for you, so the setup and administration of the Condor server is up to you.
CLI	FORK	Jobs will be started as a forked process and monitored using the system process id.
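
If you generate system definitions programmatically, a small lookup can catch mismatched pairs before you submit them. The sketch below encodes the combinations from the table above, reading the Condor row as executionType CONDOR with schedulers CONDOR and CUSTOM_CONDOR. The names `ALLOWED_SCHEDULERS` and `is_valid_combination` are hypothetical, and the service itself may accept values not listed here.

```python
# Combinations transcribed from the table above.
ALLOWED_SCHEDULERS = {
    "HPC": {
        "LSF", "LOADLEVELER", "PBS", "SGE", "TORQUE", "MOAB", "SLURM",
        "CUSTOM_LSF", "CUSTOM_LOADLEVELER", "CUSTOM_PBS",
        "CUSTOM_GRIDENGINE", "CUSTOM_SLURM",
    },
    "CONDOR": {"CONDOR", "CUSTOM_CONDOR"},
    "CLI": {"FORK"},
}


def is_valid_combination(execution_type, scheduler):
    """Check an executionType/scheduler pair against the table."""
    allowed = ALLOWED_SCHEDULERS.get(execution_type.upper(), set())
    return scheduler.upper() in allowed


print(is_valid_combination("HPC", "SLURM"))
print(is_valid_combination("CLI", "SLURM"))
```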

Defining system queues

Agave supports the notion of multiple submit queues. On HPC systems, queues should map to actual batch scheduler queues on the target server. Additionally, queues are used by Agave as a mechanism for implementing quotas on job throughput in a given queue or across an entire system. Queues are defined as a JSON array of objects assigned to the queues attribute. The following table summarizes all supported queue parameters.

Name	Type	Description
name	string	Arbitrary name for the queue. This will be used in the job submission process, so it should line up with the name of an actual queue on the execution system.
maxJobs	integer	Maximum number of jobs that can be queued or running within this queue at a given time. Defaults to 10. Use -1 for no limit.
maxUserJobs	integer	Maximum number of jobs that can be queued or running by any single user within this queue at a given time. Defaults to 10. Use -1 for no limit.
maxNodes	integer	Maximum number of nodes that can be requested for any job in this queue. Use -1 for no limit.
maxProcessorsPerNode	integer	Maximum number of processors per node that can be requested for any job in this queue. Use -1 for no limit.
maxMemoryPerNode	string	Maximum memory per node for jobs submitted to this queue in ###.#[E|P|T|G]B format.
maxRequestedTime	string	Maximum run time for any job in this queue given in hh:mm:ss format.
customDirectives	string	Arbitrary text that will be appended to the end of the scheduler directives in a batch submit script. This could include a project number, system-specific directives, etc.
default	boolean	True if this is the default queue for the system, false otherwise.
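
The string formats used by `maxMemoryPerNode` and `maxRequestedTime` are easy to get wrong, so a short parsing sketch may help. It converts both formats to numbers and checks a resource request against a queue definition. All function names are hypothetical, treating each unit step as a factor of 1024 is an assumption, and a missing limit or -1 is treated as "no limit".

```python
import re

# ASSUMPTION: binary multiples (1 TB = 1024 GB).
_UNIT_IN_GB = {"G": 1, "T": 1024, "P": 1024 ** 2, "E": 1024 ** 3}


def parse_memory_gb(text):
    """Convert a ###.#[E|P|T|G]B string to gigabytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([EPTG])B", text.strip().upper())
    if match is None:
        raise ValueError(f"not a valid memory value: {text!r}")
    return float(match.group(1)) * _UNIT_IN_GB[match.group(2)]


def parse_run_time_seconds(text):
    """Convert an hh:mm:ss string to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def fits_queue(queue, nodes, processors_per_node, memory, run_time):
    """Return True if the request stays within every limit the queue defines."""
    def within(limit, value):
        return limit is None or limit == -1 or value <= limit

    max_memory = queue.get("maxMemoryPerNode")
    max_time = queue.get("maxRequestedTime")
    return (
        within(queue.get("maxNodes"), nodes)
        and within(queue.get("maxProcessorsPerNode"), processors_per_node)
        and (max_memory is None or parse_memory_gb(memory) <= parse_memory_gb(max_memory))
        and (max_time is None
             or parse_run_time_seconds(run_time) <= parse_run_time_seconds(max_time))
    )


queue = {"name": "short_job", "maxNodes": 32, "maxProcessorsPerNode": 12,
         "maxMemoryPerNode": "64GB", "maxRequestedTime": "00:15:00"}
print(fits_queue(queue, 2, 12, "32GB", "00:10:00"))
```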

Configuring quotas

Sample batch queue definitions specifying various use cases.

{
    "name":"short_job",
    "mappedName": null,
    "maxJobs":100,
    "maxUserJobs":10,
    "maxNodes":32,
    "maxMemoryPerNode":"64GB",
    "maxProcessorsPerNode":12,
    "maxRequestedTime":"00:15:00",
    "customDirectives":null,
    "default":true
}

# Restrict the queue to having at most 10 total jobs in it at once. Jobs may run for no more than an hour.
{
    "name":"small_q",
    "mappedName": null,
    "maxJobs":10,
    "maxUserJobs":1,
    "maxNodes":32,
    "maxMemoryPerNode":"64GB",
    "maxProcessorsPerNode":12,
    "maxRequestedTime":"01:00:00",
    "customDirectives":null,
    "default":true
}

# Restrict the queue to only running single node jobs.
{
    "name":"short_job",
    "mappedName": null,
    "maxJobs":100,
    "maxUserJobs":10,
    "maxNodes":1,
    "maxMemoryPerNode":"64GB",
    "maxProcessorsPerNode":12,
    "maxRequestedTime":"24:00:00",
    "customDirectives":null,
    "default":true
}

# Create two queues.
# - "big_mem" allows single node jobs with memory up to 1TB.
# - "big_compute" allows jobs with up to 256 nodes, and 16GB of memory per node.
[
  {
    "name":"big_mem",
    "mappedName": null,
    "maxJobs":10,
    "maxUserJobs":1,
    "maxNodes":1,
    "maxMemoryPerNode":"1TB",
    "maxProcessorsPerNode":12,
    "maxRequestedTime":"12:00:00",
    "customDirectives":null,
    "default":true
  },
  {
    "name":"big_compute",
    "mappedName": null,
    "maxJobs":10,
    "maxUserJobs":10,
    "maxNodes":256,
    "maxMemoryPerNode":"16GB",
    "maxProcessorsPerNode":12,
    "maxRequestedTime":"24:00:00",
    "customDirectives":null,
    "default":false
  }
]

In the batch queues table above, several attributes exist to specify limits on the number of total jobs and user jobs in a given queue. Corresponding attributes exist in the execution system to specify limits on the number of total and user jobs across an entire system. These attributes, when used appropriately, can be used to tell Agave how to enforce limits on the concurrent activity of any given user. They can also ensure that Agave will not unfairly monopolize your systems as your application usage grows.

If you have ever used a shared HPC system before, you should be familiar with batch queue quotas. If not, the important thing to understand is that they are a critical tool to ensure fair usage of any shared resource. As the owner/administrator for your registered system, you can use the batch queues you define to enforce whatever usage policy you deem appropriate.

Consider an example where you are using a VM to run image analysis routines on demand through Agave. Your server will become memory bound and experience performance degradation if too many processes are running at once. To avoid this, you can define a batch queue that limits the number of tasks that can run simultaneously on your server.

Quotas can also help you partition your system resources. Consider a user analyzing unstructured data. The problem is computationally and memory intensive. To preserve resources, you could create one queue with a moderate value of `maxJobs` and conservative `maxMemoryPerNode`, `maxProcessorsPerNode`, and `maxNodes` values to allow good throughput of small jobs. You could then create another queue with large `maxMemoryPerNode`, `maxProcessorsPerNode`, and `maxNodes` values while only allowing a single job to run at a time. This gives you both high throughput and high capacity on a single system.
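
The interaction between queue-level and system-level limits can be sketched as a single admission check. This is an illustration of the rules described above, not Agave's scheduling code: `can_accept_job`, `under_limit`, and the keys of the `counts` dictionary are hypothetical. Queue limits default to 10 and system limits to unlimited, as in the attribute tables, and any negative value is treated as "no limit".

```python
def under_limit(limit, current):
    """True if one more job fits under the limit (None or negative means unlimited)."""
    return limit is None or limit < 0 or current < limit


def can_accept_job(system, queue, counts):
    """Check system-wide and per-queue quotas for one user's new job."""
    return (
        under_limit(system.get("maxSystemJobs"), counts["system_total"])
        and under_limit(system.get("maxSystemJobsPerUser"), counts["system_user"])
        and under_limit(queue.get("maxJobs", 10), counts["queue_total"])
        and under_limit(queue.get("maxUserJobs", 10), counts["queue_user"])
    )


system = {"maxSystemJobs": 100, "maxSystemJobsPerUser": 5}
big_mem = {"name": "big_mem", "maxJobs": 10, "maxUserJobs": 1}
counts = {"system_total": 20, "system_user": 2, "queue_total": 3, "queue_user": 1}

# The user already has one job in big_mem, so maxUserJobs=1 blocks another.
print(can_accept_job(system, big_mem, counts))
```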

The sample queue definitions at the start of this section illustrate some other interesting use cases.

Customizing scheduler directives

Pseudocode for generating scheduler directives for each scheduler type


#!/bin/bash
#BSUB -J <& Slug.slugify(job.name) &>
#BSUB -oo <& Slug.slugify(job.name) + "-" + job.uuid &>.out
#BSUB -e <& Slug.slugify(job.name) + "-" + job.uuid &>.err
#BSUB -W <& roundToMinute(job.maxRunTime) &>
#BSUB -q <& job.batchQueue.mappedName &>
#BSUB -L bash

<& if (job.app.parallelism == ParallelismType.PTHREAD) { &>
  <& "#BSUB -n " + job.nodeCount &>
  <& "#BSUB -R 'span[ptile=1]'" &>
<& } else if (job.app.parallelism == ParallelismType.SERIAL) { &>
  <& "#BSUB -n " + job.nodeCount &>
  <& "#BSUB -R 'span[ptile=1]'" &>
<& } else { &>
  <& "#BSUB -n " + (job.nodeCount * job.processorsPerNode) &>
  <& "#BSUB -R 'span[ptile=" + job.processorsPerNode + "]'" &>
<& } &>

#BSUB <& job.batchQueue.customDirectives &>


#!/bin/bash
#PBS -N <& Slug.slugify(job.name) &>
#PBS -o <& Slug.slugify(job.name) + "-" + job.uuid &>.out
#PBS -e <& Slug.slugify(job.name) + "-" + job.uuid &>.err
#PBS -l cput=<& job.maxRunTime &>
#PBS -l walltime=<& job.maxRunTime &>
#PBS -q <& job.batchQueue.mappedName &>
#PBS -l nodes=<& job.nodeCount &>:ppn=<& job.processorsPerNode &>
#PBS <& job.batchQueue.customDirectives &>


#!/bin/bash
#@ - <& Slug.slugify(job.name) &>
#@ environment = COPY_ALL
#@ output = <& Slug.slugify(job.name) + "-" + job.uuid &>.out
#@ error = <& Slug.slugify(job.name) + "-" + job.uuid &>.err
#@ class = NORMAL
#@ acct_no = NONE
#@ wall_clock_limit = <& job.maxRunTime &>

<& if (job.app.parallelism == ParallelismType.PTHREAD) { &>
  #@ job_type = MPICH
  #@ nodes = 1
  #@ tasks_per_node = <& job.processorsPerNode &>
<& } else if (job.app.parallelism == ParallelismType.SERIAL) { &>
  #@ job_type = MPICH
  #@ nodes = 1
  #@ tasks_per_node = <& job.processorsPerNode &>
<& } else { &>
  <& "#@ -n " + (job.nodeCount * job.processorsPerNode) &>
  <& "#@ -R 'span[ptile=" + job.processorsPerNode + "]'" &>
<& } &>

#@ <& job.batchQueue.customDirectives &>


#!/bin/bash
#PBS -N <& Slug.slugify(job.name) &>
#PBS -o <& Slug.slugify(job.name) + "-" + job.uuid &>.out
#PBS -e <& Slug.slugify(job.name) + "-" + job.uuid &>.err
#PBS -l cput=<& job.maxRunTime &>
#PBS -l walltime=<& job.maxRunTime &>
#PBS -q <& job.batchQueue.mappedName &>
#PBS -l nodes=<& job.nodeCount &>:ppn=<& job.processorsPerNode &>
#PBS <& job.batchQueue.customDirectives &>


#!/bin/bash
#$ -N <& Slug.slugify(job.name) &>
#$ -cwd
#$ -V
#$ -o <& Slug.slugify(job.name) + "-" + job.uuid &>.out
#$ -e <& Slug.slugify(job.name) + "-" + job.uuid &>.err
#$ -l h_rt=<& job.maxRunTime &>
#$ -pe <& job.nodeCount &> way <& job.processorsPerNode &>
#$ -q <& job.batchQueue.mappedName &>
#$ <& job.batchQueue.customDirectives &>


#!/bin/bash
#SBATCH -J <& Slug.slugify(job.name) &>
#SBATCH -o <& Slug.slugify(job.name) + "-" + job.uuid &>.out
#SBATCH -e <& Slug.slugify(job.name) + "-" + job.uuid &>.err
#SBATCH -t <& job.maxRunTime &>
#SBATCH -q <& job.batchQueue.mappedName &>
#SBATCH -N <& job.nodeCount &> -p <& job.processorsPerNode &>
#SBATCH <& job.batchQueue.customDirectives &>

If your system definition is configured to use a scheduler, Agave will automatically inject the appropriate default scheduler directives into the header of your wrapper template prior to submission. Pseudocode showing how the headers are generated for each scheduler type is given above.
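
To make the LSF pseudocode above concrete, here is a runnable Python rendering of the same branching logic. It is a sketch under several assumptions: `slugify` is a simple stand-in for Agave's `Slug.slugify`, `roundToMinute` is taken to round up and emit `H:MM`, the keys of the `job` dictionary are hypothetical, and the custom directives line is skipped when nothing is defined.

```python
import re


def slugify(name):
    """Hypothetical stand-in for Slug.slugify: lowercase, dashes for other characters."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def lsf_walltime(max_run_time):
    """ASSUMPTION: round hh:mm:ss up to whole minutes and format as H:MM."""
    hours, minutes, seconds = (int(part) for part in max_run_time.split(":"))
    total_minutes = hours * 60 + minutes + (1 if seconds else 0)
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


def render_lsf_header(job):
    """Build the #BSUB header following the LSF pseudocode."""
    slug = slugify(job["name"])
    log_base = f"{slug}-{job['uuid']}"
    nodes = job["nodeCount"]
    ppn = job["processorsPerNode"]
    lines = [
        "#!/bin/bash",
        f"#BSUB -J {slug}",
        f"#BSUB -oo {log_base}.out",
        f"#BSUB -e {log_base}.err",
        f"#BSUB -W {lsf_walltime(job['maxRunTime'])}",
        f"#BSUB -q {job['queue']}",
        "#BSUB -L bash",
    ]
    if job["parallelism"] in ("PTHREAD", "SERIAL"):
        # Threaded and serial apps get one slot per node.
        lines += [f"#BSUB -n {nodes}", "#BSUB -R 'span[ptile=1]'"]
    else:
        # Parallel apps get every requested processor on every node.
        lines += [f"#BSUB -n {nodes * ppn}", f"#BSUB -R 'span[ptile={ppn}]'"]
    if job.get("customDirectives"):
        lines.append(f"#BSUB {job['customDirectives']}")
    return "\n".join(lines)


job = {"name": "My Test Job", "uuid": "0001", "maxRunTime": "01:00:30",
       "queue": "normal", "parallelism": "PARALLEL", "nodeCount": 2,
       "processorsPerNode": 12, "customDirectives": None}
print(render_lsf_header(job))
```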

You may add additional scheduler directives on a queue-by-queue basis in your system definition. If you need a higher degree of customization, update your system definition, prefixing your existing scheduler value with “CUSTOM_”. This will tell Agave to use a minimal set of scheduler directives any time it finds a value defined for the queue’s customDirectives. To allow you the highest degree of customization, the customDirectives value will be filtered, resolving the following macros with the runtime values for the job.

Variable	Description
JOB_APP_ID	The id of the app being run.
JOB_ARCHIVE	Whether Agave will attempt to archive the job. Values “true” or “false”.
JOB_ARCHIVE_PATH	The path on the archive system where the job output will be staged.
JOB_ARCHIVE_SYSTEM	The Agave storage system id to which the job output will be archived. This will be NULL if the job is not archived.
JOB_ARCHIVE_URL	The Agave URL for the archived data.
JOB_BATCH_QUEUE	The batch queue of the JOB_EXECUTION_SYSTEM on which the job is assigned.
JOB_ID	The unique id used to reference the job within Agave.
JOB_EXECUTION_SYSTEM	The Agave execution system id on which the job will run.
JOB_MAX_RUNTIME	The max job run time from the job request in HH:MM:SS format.
JOB_MAX_RUNTIME_MILLISECONDS	The max job run time from the job request converted to milliseconds.
JOB_MAX_RUNTIME_SECONDS	The max job run time from the job request converted to seconds.
JOB_MEMORY_PER_NODE	The memory requested per node in the job request in GB.
JOB_NAME	The job name converted to a slug
JOB_NAME_RAW	The user-supplied name of the job
JOB_NODE_COUNT	The number of nodes from the job request.
JOB_OWNER	The username of the user who submitted the job request.
JOB_PARAMETERS	The serialized JSON object representing the job parameters.
JOB_PROCESSORS_PER_NODE	The processors per node from the job request.
JOB_SYSTEM	ID of the job execution system (ex. ssh.execute.example.com)
JOB_TENANT	The code of the tenant to which the job was submitted.
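
The macro substitution described above can be sketched in a few lines of Python. This is only an illustration, not Agave's implementation: it assumes macros are written as `${NAME}` inside the customDirectives value, and the function names, the sample job values, and the Slurm-style directive are all hypothetical. It also shows how the three JOB_MAX_RUNTIME variants relate to one another.

```python
import re

# Assumption: macros appear as ${NAME}. Adjust the pattern if your tenant differs.
MACRO_PATTERN = re.compile(r"\$\{([A-Z_]+)\}")


def runtime_to_seconds(hhmmss):
    """Convert an HH:MM:SS runtime string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def resolve_macros(directives, values):
    """Replace each known macro with its value; leave unknown macros untouched."""
    def substitute(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)
    return MACRO_PATTERN.sub(substitute, directives)


# Hypothetical runtime values for a single job.
job = {
    "JOB_NAME": "my-test-job",
    "JOB_NODE_COUNT": 2,
    "JOB_MAX_RUNTIME": "01:30:00",
}
job["JOB_MAX_RUNTIME_SECONDS"] = runtime_to_seconds(job["JOB_MAX_RUNTIME"])
job["JOB_MAX_RUNTIME_MILLISECONDS"] = job["JOB_MAX_RUNTIME_SECONDS"] * 1000

template = "#SBATCH -J ${JOB_NAME} -N ${JOB_NODE_COUNT} -t ${JOB_MAX_RUNTIME}"
print(resolve_macros(template, job))
```

Leaving unknown macros in place, rather than raising an error, makes a typo in a directive easy to spot in the generated header.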

Supported login protocols

> Sample execution system login configurations for supported authentication mechanisms.

{
  "host": "execute.example.com",
  "port": 22,
  "protocol": "SSH",
  "auth": {
    "username": "systest",
    "password": "changeit",
    "type": "PASSWORD"
  }
}

{
  "host": "execute.example.com",
  "port": 22,
  "protocol": "SSH",
  "auth": {
    "username":"nryan",
     "publicKey": "ssh-rsa AAAAB3NzaC1yc2EBBAADAQABMQPRgQChJ6bzejqSuJdTi+VwMif8qotuSSlYwrVt0EWVduKZHpzOnS1zlknAyYXmQQFcaJ+vNAQayVMTqv+A+1lzxppTdgZ0Dn42EOYWRa6B/IEMPzDuKb7F0qNFiH9m+OZJDYdIWS1rlN1oK32jHUi0xV8kM3KOLf2TIjDBUyZRpMGyQ== Generated by Nova",
     "privateKey": "-----BEGIN RSA PRIVATE KEY-----nMIVCXAIBAAKBgQRhJ6bzejqSuJdTi+VwMif8qoyuSSlYwrVt0EWVduKZHpzOnSManlknAyYXmQQFcaJ+vNAQayVqTqv+A+1lzxppTdgZ0Dn42EOYWRa6B/IEMPzDuKb7Fn0uNFiH9x+OZJDYdIWS1rN1oK4DjHUi0xV8kMN3OPSIU23asx1UyZRpMGyQIDAQABnAoGATrW4NAkJ3Kltt6+HQ1Ir95sxFNrE6AZJaLYllke3iwPJpCX1dDdpDcXa8AGbVnjFXJUGA+dPrJqbyGCHA7E3H342837k/twSRGkcCNpRx/MMdWnw3asea/K5L4XVeunXAn79vo/e28D4Uue62dSwIvDJKIFWMSAgUoD53ImushqlLUCQQDPkObaowzkboLCnv3Nyj16KFZ5Lp7r5q5MYfRxO7t53Z7AWoflr++KrAT3UbSKtqmC68CqbPzxSd6qHnbnkWaD0HAkEAxsJZh7xorwAtdYznMFOsO0w5HDHOB7MuAnjwUvYZVaM0wA7HkE4rnH5SFAwEMlwx82OJxv83CnkRdlXOexn95rwJBALd8cnboGCd/AZzCvX2R+5K5lZtvnhLvczkWho3qrcoG/aUw4l1K78h4VFOFKMJOwv53BXQisF9kW6+qY3/XM49UCQHqDn4AYQOALvPBZCdVtPqFGg6W8csCAE7a5ud8zbj8A+6swcEB0+YcyEkvzID8en1ekmno/ET1wwRnhH6g/tdJlcCQM55QS4Z7rR4psgFDkFvA+wmxlqTGsXJD32sw15g4A0bmzSXnbfFg8TBAjGTDW7l0P8prFrtQ8Wml14390b29l1ptAyE=n-----END RSA PRIVATE KEY-----",
     "type": "SSHKEYS"
  }
}

{
  "host":"execute.example.com",
  "port":22,
  "protocol":"SSH",
  "auth":{
    "username":"systest",
    "password":"changeit",
    "type":"PASSWORD"
  },
  "proxy":{
    "name":"My gateway proxy server",
    "host":"proxy.example.com",
    "port":"22"
  }
}

{
  "host": "localhost",
  "protocol": "LOCAL",
  "auth": {
    "type": "LOCAL"
  }
}

{
   "host":"execute.example.com",
   "port":2222,
   "protocol":"GSISSH",
   "auth":{
      "credential": "-----BEGIN CERTIFICATE-----nMIIDqjCCApKgAwIBAgIDJSFGMA0GCSqGSIb3DQEBBQUAMHsxCzAJBgNVBAYTAlVTnMTgwNgYDVQQKEy9OYXRpb25hbCBDZW50ZXIgZm9yIFN1cGVyY29tcHV0aW5nIEFwncGxpY2F0aW9uczEgMB4GA1UECxMXQ2VydGlmaWNhdGUgQXV0aG9yaXRpZXMxEDAOnBgNVBAMTB015UHJveHkwHhcNMTMxMDE0MDcyMjE4WhcNMTMxMDE0MTkyNzE4WjBnnMQswCQYDVQQGEwJVUzE4MDYGA1UEChMvTmF0aW9uYWwgQ2VudGVyIGZvciBTdXBlncmNvbXB1dGluZyBBcHBsaWNhdGlvbnMxHjAcBgNVBAMTFWlwbGFudCBDb21tdW5pndHkgVXNlcjCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEAwfHbmtmJ1OUVwgDdn5oA8EsqihwRAi2xhZJYG/FFmOs38+0y7wTfORhVX/79XQMD3NqRJN8xhHQpmuoRynH9l9sbA9gbKaQsrpIYyExygrJ+qaZY0PccD+VAyPDjdLD86316AzWltEdV2E9b+OnCVioz62esJWSqOho8wya4Vo5svUCAwEAAaOBzjCByzAOBgNVHQ8BAf8EBAMCBLAwnHQYDVR0OBBYEFIJXT/jYmxaRywDbZudb1EXbxla5MB8GA1UdIwQYMBaAFNf8pQJ2nOvYT+iuh4OZQNccjx3tRMAwGA1UdEwEB/wQCMAAwNAYDVR0gBC0wKzAMBgorBgEEnAaQ+ZAIFMAwGCiqGSIb3TAUCAgMwDQYLKoZIhvdMBQIDAgEwNQYDVR0fBC4wLDAqnoCigJoYkaHR0cDovL2NhLm5jc2EudWl1Yy5lZHUvZjJlODlmZTMuY3JsMA0GCSqGnSIb3DQEBBQUAA4IBAQBDyW3FJ0xEIXEqk2NtiMqOM99MgufDPL0bxrR8CvPY5GRNn58EXU8RnSSJIuxL95PKclRPPOhGdB48eeF2H1MusOEUEEnHwzrZ1OUFUEpwKuqG6n0h411l3niRRx9wdJL4YITzAWZwpadzwj3d8aO9O/ttVJjGRc8A93I/d3fFAvHyvKnmlEaDrQZNBp1EtClW8xuxsfeUmyXkFlkRiKwqjkJGB8xBuzr8DfLomWq/mXaOkHznCo9nQxAs3gntszLOh+8U9aMxaeCsychRWxG3Y6Z33hrE0yz4AaVonVXu3Z7M+EN+nKbSVRblAzeKfQYYDOgsoFrugYbR9klv1so3Dt+n6n-----END CERTIFICATE-----n-----BEGIN RSA PRIVATE 
KEY-----nMIICWwIBAAKBgQDB8dua2YnU5RXCAN3mgDwSyqKHBECLbGFklgb8UWY6zfz7TLvBnN85GFVf/v1dAwPc2pEk3zGEdCma6hHIf2X2xsD2BsppCyukhjITHKCsn6ppljQ9xnwP5UDI8ON0sPzrfXoDNaW0R1XYT1v44JWKjPrZ6wlZKo6GjzDJrhWjmy9QIDAQABnAoGAcjrJZYMLM2FaV1G7YK/Wshq3b16JxZSoKF5U7vfihnAcuMaRL1R3IcAgfHlunIq2E7aIFnd+6sygVKXYo4alv5denekiucvKAyXK9F/VTTtLtajUnrvekLvSycKiEnbN9IgQ0ABCnlWyjgQMf64UUYBQtvU+lbRCs4jbuHxuyn5WECQQD8fJhlBHgA49hjnZBKnU9Xb+LEKhWDCEyIiOMMGY+2XhrGVvGF5KqJVusZEv8lbXNjzgSQFgLohEXVzn9v8tDFMzAkEAxKS5qCYHsTfgPlw3l1DLJRmG3SXrpevXSccBGpXQiUne9gfc9mlgnVTr5QQCXvvI673Y2LnNcnd94KEgvSrzhNwJACeS38/1g1mgXKo3ZTUUztBLinQ7sn463sQHsI6U8xGCbm/n8LMrxA8CsJadg6A6J3vdLpnm2U3YbZm1mqVhGNkQJAdsxxnoUVAdm8kWWhK6W6VG9e9I1OqdrXxfY/tecsyjg6D1a1Qb8mfuj4DoaKjCme69To8nZ3moZXRBWkypzYQopwJAB/zr1UpFz6vY4sIm3Gw3ll/ruNGCr2dzjTyLSGglCOf0nUljJ1FGLyW647JzGPMLcfdb0iEexzCEii9YUFUN1Ow==n-----END RSA PRIVATE KEY-----",
      "type": "X509"
   }
}

{
   "host":"execute.example.com",
   "port":2222,
   "protocol":"GSISSH",
   "auth":{
      "username":"systest",
      "password":"changeit",
      "credential":"",
      "type":"X509",
      "server":{
        "name":"IRODS MyProxy Server",
        "endpoint":"myproxy.example.com",
        "port":7512,
        "protocol":"MYPROXY"
      }
   }
}

{
   "host":"execute.example.com",
   "port":2222,
   "protocol":"GSISSH",
   "auth":{
      "username":"systest",
      "type":"X509",
      "server": {
        "name": "My Trusted MPG Server",
        "endpoint": "https://api.example.com/myproxy/v2/",
        "port": 443,
        "protocol": "MPG"
      }
   }
}

As with storage systems, Agave supports several different protocols and mechanisms for job submission. We already covered scheduler and queue support. Here we illustrate the different login configurations possible. For brevity, only the value of the login JSON object is shown.

The full list of login configuration options is given in the following table. We omit the `login.auth` and `login.proxy` attributes as they are identical to those used in the storage config.

Attribute	Type	Description
auth	JSON object	Required: A JSON object describing the default login authentication credential for this system.
host	string	Required: The hostname or IP address of the server where the job will be submitted.
port	int	The port number of the server where the job will be submitted. Defaults to the default port of the protocol used.
protocol	SSH, GSISSH, LOCAL	Required: The protocol used to submit jobs for execution.
proxy	JSON object	The proxy server through which Agave will tunnel when submitting jobs. Currently proxy servers will use the same authentication mechanism as the target server.
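
Before registering a system, it can help to check a login object against the table above on the client side. The following Python sketch does only that. The function name and error messages are hypothetical, and it makes no claim about how the Systems service itself validates input.

```python
SUPPORTED_PROTOCOLS = {"SSH", "GSISSH", "LOCAL"}
REQUIRED_ATTRIBUTES = ("auth", "host", "protocol")


def validate_login(login):
    """Return a list of problems found in a login config dict (empty if none)."""
    problems = [
        f"missing required attribute: {name}"
        for name in REQUIRED_ATTRIBUTES
        if name not in login
    ]
    protocol = login.get("protocol")
    if protocol is not None and protocol not in SUPPORTED_PROTOCOLS:
        problems.append(f"unsupported protocol: {protocol}")
    port = login.get("port")
    if port is not None and not isinstance(port, int):
        problems.append("port must be an integer")
    return problems


login = {
    "host": "execute.example.com",
    "port": 22,
    "protocol": "SSH",
    "auth": {"username": "systest", "password": "changeit", "type": "PASSWORD"},
}
print(validate_login(login))
```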

Scratch and work directories

In the Job Management tutorial we will dive into how Agave manages the end-to-end lifecycle of running a job. Here we point out two relevant attributes that control where data is staged and where your job will physically run. The `scratchDir` and `workDir` attributes control where the working directories for each job will be created on an execution system. The following table summarizes the decision making process Agave uses to determine where the working directories should be created.

`rootDir` value	`homeDir` value	`scratchDir` value	Effective system path for job working directories
/	/	—	/
/	/	/	/
/	/	/scratch	/scratch
/	/home/nryan	—	/home/nryan
/	/home/nryan	/	/
/	/home/nryan	/scratch	/scratch
/home/nryan	/	—	/home/nryan
/home/nryan	/	/	/home/nryan
/home/nryan	/	/scratch	/home/nryan/scratch
/home/nryan	/home	—	/home/nryan/home
/home/nryan	/home	/	/home/nryan
/home/nryan	/home	/scratch	/home/nryan/scratch
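
The rows of the table above follow one rule: the effective path is `rootDir` joined with `scratchDir` when one is given, and with `homeDir` otherwise. The Python sketch below reproduces the table under that reading. The function name is hypothetical and the code is inferred from the table rather than taken from Agave.

```python
def job_work_root(root_dir, home_dir, scratch_dir=None):
    """Return the physical path under which job working directories are created."""
    # Use scratchDir when it is set, otherwise fall back to homeDir.
    virtual = scratch_dir if scratch_dir is not None else home_dir
    # Split on "/" and drop empty segments so repeated slashes collapse cleanly.
    segments = [seg for seg in (root_dir + "/" + virtual).split("/") if seg]
    return "/" + "/".join(segments)


print(job_work_root("/home/nryan", "/home", "/scratch"))
print(job_work_root("/", "/home/nryan"))
```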

While it is not required, it is a best practice to always specify `scratchDir` and `workDir` values for your execution systems and, whenever possible, to place them outside of the system `homeDir` to ensure data privacy. The file system available on many servers is actually a combination of physically attached storage, mounted volumes, and network mounts. Often, your home directory will have a very conservative quota while the mounted storage will be essentially quota free. As the above table shows, when you do not specify a `scratchDir` or `workDir`, Agave will attempt to create your job work directories in your system `homeDir`. It is very likely that, in the course of running simulations, you will reach the quota on your home directory, thereby causing that job and all future jobs to fail on the system until you free up more space. To avoid this, we recommend specifying a location with sufficient available space to handle the work you want to do.

Another common error that arises from not specifying thoughtful `scratchDir` and `workDir` values for your execution systems is jobs failing due to “permission denied” errors. This often happens when your `scratchDir` and/or `workDir` resolve to the actual system root. Usually the account you are using to access the system will not have permission to write to `/`, so all attempts to create a job working directory fail, accurately, due to a “permission denied” error.

Creating a new execution system

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -F "fileToUpload=@ssh-password.json" https://sandbox.agaveplatform.org/systems/v2

systems-addupdate -v -F ssh-password.json

The response from the server will be similar to the following.

{
   "id":"demo.execute.example.com",
   "uuid":"0001323106792914-5056a550b8-0001-006",
   "name":"Example SSH Execution Host",
   "status":"UP",
   "type":"EXECUTION",
   "description":"My example system using ssh to submit jobs used for testing.",
   "site":"example.com",
   "revision":1,
   "public":false,
   "lastModified":"2013-07-02T10:16:11.000-05:00",
   "executionType":"HPC",
   "scheduler":"SGE",
   "environment":null,
   "startupScript":"./bashrc",
   "maxSystemJobs":100,
   "maxSystemJobsPerUser":10,
   "workDir":"/work",
   "scratchDir":"/scratch",
   "queues":[
      {
         "name":"normal",
         "maxJobs":100,
         "maxUserJobs":10,
         "maxNodes":32,
         "maxMemoryPerNode":"64GB",
         "maxProcessorsPerNode":12,
         "maxRequestedTime":"48:00:00",
         "customDirectives":null,
         "default":true
      },
      {
         "name":"largemem",
         "maxJobs":25,
         "maxUserJobs":5,
         "maxNodes":16,
         "maxMemoryPerNode":"2TB",
         "maxProcessorsPerNode":4,
         "maxRequestedTime":"96:00:00",
         "customDirectives":null,
         "default":false
      }
   ],
   "login":{
      "host":"texas.rangers.mlb.com",
      "port":22,
      "protocol":"SSH",
      "proxy":null,
      "auth":{
         "type":"PASSWORD"
      }
   },
   "storage":{
      "host":"texas.rangers.mlb.com",
      "port":22,
      "protocol":"SFTP",
      "rootDir":"/home/nryan",
      "homeDir":"",
      "proxy":null,
      "auth":{
         "type":"PASSWORD"
      }
   }
}

Disabling a system

Disable a system

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X PUT --data-binary '{"action": "disable"}' \
    https://sandbox.agaveplatform.org/systems/v2/$SYSTEM_ID

systems-disable $SYSTEM_ID

The response will look something like the following:

{
  "site": null,
  "id": "sftp.storage.example.com",
  "revision": 1,
  "default": false,
  "lastModified": "2016-09-06T17:46:42.621-05:00",
  "status": "UP",
  "description": "My example storage system using SFTP to store data for testing",
  "name": "Example SFTP Storage System",
  "owner": "nryan",
  "globalDefault": false,
  "available": false,
  "uuid": "4036169328045649434-242ac117-0001-006",
  "public": false,
  "type": "STORAGE",
  "storage": {
    "mirror": false,
    "port": 22,
    "homeDir": "/home/systest",
    "protocol": "SFTP",
    "host": "storage.example.com",
    "publicAppsDir": null,
    "proxy": null,
    "rootDir": "/",
    "auth": {
      "type": "PASSWORD"
    }
  },
  "_links": {
    "roles": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/sftp.storage.example.com/roles"
    },
    "owner": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
    },
    "credentials": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/sftp.storage.example.com/credentials"
    },
    "self": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/sftp.storage.example.com"
    },
    "metadata": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/data/?q=%7B%22associationIds%22%3A%224036169328045649434-242ac117-0001-006%22%7D"
    }
  }
}

There may be times when you need to disable a system. If your system has scheduled maintenance periods, you may want to disable the system until the maintenance period ends. You can do this by making a PUT request on the system with the action field set to “disable”, or by simply updating the status to “MAINTENANCE”. While the system is disabled, all of its apps and jobs are disabled, and all file operations are rejected. Once the system is restored, all operations will pick back up.
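
The management calls in this section share one shape: a PUT on the system URL with a small JSON body naming the action. As a rough sketch, the same request can be built in Python with the standard library. The helper name and the token value are hypothetical, and the action string follows the curl example. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_action_request(base_url, system_id, action, access_token):
    """Build (but do not send) a PUT request that applies an action to a system."""
    body = json.dumps({"action": action}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/systems/v2/{system_id}",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


request = build_action_request(
    "https://sandbox.agaveplatform.org",
    "sftp.storage.example.com",
    "disable",
    "my-access-token",  # placeholder, substitute a real token
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```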

Enabling a system

Enable a system

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X PUT --data-binary '{"action": "enable"}' \
    https://sandbox.agaveplatform.org/systems/v2/$SYSTEM_ID

systems-enable $SYSTEM_ID

The response will look something like the following:

{
  "site": null,
  "id": "sftp.storage.example.com",
  "revision": 1,
  "default": false,
  "lastModified": "2016-09-06T17:46:42.621-05:00",
  "status": "UP",
  "description": "My example storage system using SFTP to store data for testing",
  "name": "Example SFTP Storage System",
  "owner": "nryan",
  "globalDefault": false,
  "available": true,
  "uuid": "4036169328045649434-242ac117-0001-006",
  "public": false,
  "type": "STORAGE",
  "storage": {
    "mirror": false,
    "port": 22,
    "homeDir": "/home/systest",
    "protocol": "SFTP",
    "host": "storage.example.com",
    "publicAppsDir": null,
    "proxy": null,
    "rootDir": "/",
    "auth": {
      "type": "PASSWORD"
    }
  },
  "_links": {
    "roles": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/sftp.storage.example.com/roles"
    },
    "owner": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
    },
    "credentials": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/sftp.storage.example.com/credentials"
    },
    "self": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/sftp.storage.example.com"
    },
    "metadata": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/data/?q=%7B%22associationIds%22%3A%224036169328045649434-242ac117-0001-006%22%7D"
    }
  }
}

Similarly, to enable a system, make a PUT request with the action field set to “enable”. Once re-enabled, the system is again available for app, job, and file operations.

Deleting systems

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X DELETE https://sandbox.agaveplatform.org/systems/v2/$SYSTEM_ID

systems-delete $SYSTEM_ID

The call will return an empty result.

In the event you wish to delete a system, you can make a DELETE request on the system URL. Deleting a system will disable the system and all applications published on that system from use. Any running jobs will continue to run, but all pending, archiving, paused, and staged jobs will be killed, and any data archived on that system will no longer be available. Restoring a deleted system requires intervention from your tenant admin. Once deleted, the system id cannot be reused at a later time. Use this operation with care.

Multi-user environments

If your application supports a multi-user environment and those users do not have API accounts, then you may run into a situation where you are juggling multiple user credentials for a single system. Agave has a solution for this problem in the form of its Internal User feature. You can map your application users into a private user store Agave provides you and assign those users credentials on your systems. This allows you to move seamlessly from community users to private users and back without having to alter your application code. For a deep discussion on the mechanics and implications of credential management with internal users, see the Internal User Credential Management guide.

System roles

Systems you register are private to you and you alone. You can, however, allow other Agave clients to utilize the system you define by granting them a role on the system using the systems roles services. The available roles are given in the table below.

Role	Description
GUEST	Gives any authenticated user readonly access to the system. No file operations or job executions are allowed for users with GUEST access.
USER	Gives a user the ability to run jobs and access data on the system.
PUBLISHER	All the rights of USER as well as the ability to publish applications listing the system as an execution host.
ADMIN	All the rights of PUBLISHER as well as the ability to edit and grant roles on the system details. Admins may use the system to access data and run jobs using the default credential assigned to the system, but they may not view or update any of the credentials stored by the system owner. It is not possible for anyone but the system owner to assign or leverage internal user credentials on a system.
OWNER	Reserved for the user that originally created the system. This role is non-revocable.
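
Because each role in the table builds on the one before it, the roles can be modeled as an ordered enumeration. The Python sketch below is one way an application might mirror that ordering locally. The class and helper names are hypothetical, and placing OWNER above ADMIN is an assumption that the owner holds every ADMIN right.

```python
from enum import IntEnum


class SystemRole(IntEnum):
    """System roles ordered from least to most privileged."""
    GUEST = 1
    USER = 2
    PUBLISHER = 3
    ADMIN = 4
    OWNER = 5  # assumption: the owner has at least every ADMIN right


def can_run_jobs(role):
    """USER and above may run jobs and access data."""
    return role >= SystemRole.USER


def can_publish_apps(role):
    """PUBLISHER and above may publish apps that list the system."""
    return role >= SystemRole.PUBLISHER


def can_grant_roles(role):
    """ADMIN and above may grant roles on the system."""
    return role >= SystemRole.ADMIN


print(can_run_jobs(SystemRole.GUEST), can_publish_apps(SystemRole.ADMIN))
```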

System scope

Throughout these tutorials and Beginner’s Guides, we have referred to both public and private systems. In addition to roles, systems have a concept of scope associated with them. Not to be confused with OAuth scope mentioned in the Authentication Guide, system scope refers to the availability of a system to the general user community. The following table lists the available scopes and their meanings.

Scope	Required role	Description
private	Admin	System is visible and available for use to the owner and to anyone whom they grant a role.
read only	Tenant admin	Storage system is visible and available for data browsing and download by any API user. Write access is restricted unless explicitly granted to a specific user.
public	Tenant admin	System is visible and available to all users for reading and writing. Virtual user home directories are enforced and write access outside of a user’s home directory is restricted unless explicitly granted by a system admin.

Private systems

All systems are private by default. This means that no one can use a system you register without you or another user with “admin” permissions granting them a role on that system. Most of the time, unless you are configuring a tenant for your organization, all the systems you register will stay private. Do not mistake the term private for isolated. Private simply means not public. Another way to think of private systems is as “invitation only.” You are free to share your system with as many or as few people as you want and it will still remain a private system.

Readonly systems

Readonly systems are systems that have granted a GUEST role to the world group. Once this grant is made, any user will be able to browse the system’s entire file system regardless of individual permissions. Be careful when making a system readonly. Usually, the only reason you would do this is because you have configured the system rootDir to point to a dataset or volume that you want to publish for others to use. Carelessly making systems readonly can expose personal data stored on the system to every other API user. While your intentions may be pure, theirs may not be, so think through the implications of this action before you take it.

Public systems

Public systems are available for use by every API user within your tenant. Once public, systems inherit specific behavior unique to their type. We will cover each system type in turn.

Public Storage Systems

Public storage systems enforce a virtual user home directory with implied user permissions. The following table gives a brief summary of the permission implications. You can read more about data permissions in the Data Permissions tutorial.

`rootDir`	`homeDir`	URL path	User permission
/	/home	—	READ
/	/home	/	READ
/	/home	/var	READ
/	/home	systest	ALL
/	/home	systest/some/subdir	ALL
/	/home	rjohnson	NONE

Notice in the above example that on public systems, users will have implied ownership of a folder matching their username in the system’s homeDir. In the table, this means that user “systest” will have ownership of the physical home directory /home/systest on the system after it’s public. It is important that, before publishing a system, you make sure that the account used to access the system can actually write to these folders. Otherwise, users will not be able to access their data on the system you make public.
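
The implied permissions in the table can be expressed as a small function. The Python sketch below is inferred from the table rows only. It assumes that every first-level folder under homeDir is some user's virtual home, it ignores `..` segments, and the function name is hypothetical.

```python
def _segments(path):
    """Split a path on "/" and drop empty segments."""
    return [seg for seg in path.split("/") if seg]


def implied_permission(home_dir, url_path, username):
    """Return the implied permission ("READ", "ALL", or "NONE") on a public storage system."""
    # Relative paths are resolved against homeDir; absolute paths are used as given.
    if url_path.startswith("/"):
        resolved = _segments(url_path)
    else:
        resolved = _segments(home_dir + "/" + url_path)
    home = _segments(home_dir)
    inside_home = resolved[:len(home)] == home and len(resolved) > len(home)
    if not inside_home:
        return "READ"
    # The folder directly under homeDir names the user whose virtual home this is.
    return "ALL" if resolved[len(home)] == username else "NONE"


print(implied_permission("/home", "systest/some/subdir", "systest"))
print(implied_permission("/home", "rjohnson", "systest"))
```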

Public Execution Systems

Public execution systems do not share the same behavior as public storage systems. Unless explicit permission has been given, public execution systems are not accessible for data access by non-privileged users. This is because public systems allow all users to run applications on them and granting public access to the file system would expose user job data to all users. If you do need to expose the data on a public execution system, either register it again as a storage system (using an appropriate rootDir outside of the system scratchDir and workDir paths), or grant specific users a role on the system.

Publishing a system

To publish a system and make it public, you make a PUT request on the system’s url.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X PUT \
     --data-binary '{"action":"publish"}' \
     https://sandbox.agaveplatform.org/systems/v2/$SYSTEM_ID

systems-publish -v $SYSTEM_ID

The response from the service will be the same system description we saw before, this time with the public attribute set to true.

Unpublishing a system

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X PUT \
     --data-binary '{"action":"unpublish"}' \
     https://sandbox.agaveplatform.org/systems/v2/$SYSTEM_ID

systems-unpublish -v $SYSTEM_ID

The response from the service will be the same system description we saw before, this time with the public attribute set to false.

To unpublish a system, make the same request with the action attribute set to unpublish.

Default systems

As you continue to use Agave over time, it will not be uncommon for you to accumulate additional storage and execution systems through both self-registration and other people sharing their systems with you. It may even be the case that you have multiple public systems available to you. In this situation, it is helpful for both you and your users to specify what the default systems should be.

Default systems are the systems that are used when the user does not specify a system to use when performing a remote action in Agave. For example, specifying an archivePath in a job request, but no archiveSystem, or specifying a deploymentPath in an app description, but no deploymentSystem. In these situations, Agave will use the user’s default storage system.

Four types of default systems are possible. The following table describes them.

Type	Scope	Role needed to set	Description
storage	user default	USER	Default storage system for an individual user. This takes priority over any global defaults and will be used in all data operations in lieu of a system being specified for this user.
storage	global default	Tenant admin	Default storage system for an entire tenant. This will be used as the default storage system whenever a user has not explicitly specified another. Only public systems may be made the global default.
execution	user default	USER	Default execution system for an individual user. This takes priority over any global defaults and will be used in all app and job operations in lieu of an execution system being specified for this user. In the case of app registration, normal user role requirements apply.
execution	global default	Tenant admin	Default execution system for an entire tenant. This will be used as the default execution system whenever a user has not explicitly specified another. Only public systems may be made the global default.
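
The precedence described above (an explicitly named system first, then the user default, then the global default) can be sketched as a simple fallback chain. The function and parameter names in this Python example are hypothetical, and the system ids are sample values.

```python
def effective_system(requested=None, user_default=None, global_default=None):
    """Pick the system to use: explicit request, then user default, then global default."""
    for candidate in (requested, user_default, global_default):
        if candidate:
            return candidate
    raise ValueError("no system specified and no default system available")


# A job request that names an archivePath but no archiveSystem falls back to defaults.
print(effective_system(None, "sftp.storage.example.com", "data.agaveplatform.org"))
print(effective_system(None, None, "data.agaveplatform.org"))
```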

Setting user default system

To set a system as the user’s default, you make a PUT request on the system’s url. Only systems the user has access to may be used as their default.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X PUT \
     --data-binary '{"action":"setDefault"}' \
     https://sandbox.agaveplatform.org/systems/v2/$SYSTEM_ID

systems-setdefault -v $SYSTEM_ID

The response from the service will be the same system description we saw before, this time with the default attribute set to true.

Unsetting user default system

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X PUT \
     --data-binary '{"action":"unsetDefault"}' \
     https://sandbox.agaveplatform.org/systems/v2/$SYSTEM_ID

systems-unsetdefault -v $SYSTEM_ID

The response from the service will be the same system description we saw before, this time with the default attribute set to false.

To remove a system as the user’s default, make the same request with the action attribute set to unsetDefault. Keep in mind that you cannot remove the global default system from being the user’s default. You can only set a different one to replace it.

Setting global default system

Tenant administrators may wish to set default storage and execution systems for an entire tenant. These are called global default systems. There may be at most one system of each type set as a global default. To set a global default system, first make sure that the system is public. Only public systems may be set as a global default. Next, make sure you have administrator permissions for your tenant. Only tenant admins may publish systems and manage the global defaults. Lastly, make a PUT request on the system’s URL with an action attribute in the body set to setGlobalDefault.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X PUT \
     --data-binary '{"action":"setGlobalDefault"}' \
     https://sandbox.agaveplatform.org/systems/v2/$SYSTEM_ID

systems-setdefault -v -G $SYSTEM_ID

The response from the service will be the same system description we saw before, this time with both the default and public attributes set to true.

To remove a system from being the global default, make the same request with the action attribute set to unsetGlobalDefault.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X PUT \
     --data-binary '{"action":"unsetGlobalDefault"}' \
     https://sandbox.agaveplatform.org/systems/v2/$SYSTEM_ID

systems-unsetdefault -v -G $SYSTEM_ID

This time the response from the service will have default set to false and public set to true.

Files

 /$$$$$$$$ /$$ /$$
| $$_____/|__/| $$
| $$       /$$| $$  /$$$$$$   /$$$$$$$
| $$$$$   | $$| $$ /$$__  $$ /$$_____/
| $$__/   | $$| $$| $$$$$$$$|  $$$$$$
| $$      | $$| $$| $$_____/ \____  $$
| $$      | $$| $$|  $$$$$$$ /$$$$$$$/
|__/      |__/|__/ \_______/|_______/

The Agave Files service allows you to manage data across multiple storage systems using multiple protocols. It supports the traditional file operations common to most file services, such as directory listing, renaming, copying, deleting, and upload/download. It also supports file importing from arbitrary locations, metadata assignment, and a full access control layer allowing you to keep your data private, share it with your colleagues, or make it publicly available.

Files service URL structure

Canonical URL for all file items accessible in the Platform

https://sandbox.agaveplatform.org/files/v2/media/system/$SYSTEM_ID/$PATH

Every file and directory referenced through the Files service has a canonical URL, shown in the first example. The following table defines each component:

Token	Description
$SYSTEM_ID	The id of the system where the file or directory lives. These correspond to the ids returned from the Systems service.
$PATH	(Optional:) The path on the remote system. By default, all paths are relative to the home directory defined in the system description. To specify an absolute path, prefix the path with a `/`. For more on path resolution, see the next section.

Agave also supports the concept of default systems. If you omit the /system/$SYSTEM_ID segments from the above URL, the Files service will automatically assume you are referencing your default storage system. Thus, if your default system were data.agaveplatform.org, the following two examples would be identical.

If data.agaveplatform.org is your default storage system then

https://sandbox.agaveplatform.org/files/v2/media/shared

is equivalent to this:

https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/shared

This comes in especially handy when referencing your default system paths in other contexts such as job requests and when interacting with the Agave CLI. A good example of this situation is when you have a global default storage system accessible to all your users. In this case, most users will use that for all of their data staging and archiving needs. These users may find it easier not to even think about the system they are using. The default system support in the Files service allows them to do just that.
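
To make the two URL forms concrete, here is a small Python sketch that builds a Files service URL with or without the system segments. The helper name is hypothetical, the base URL is the sandbox host used in the examples, and only relative paths are handled.

```python
BASE_URL = "https://sandbox.agaveplatform.org/files/v2/media"


def files_url(path="", system_id=None):
    """Build a Files service URL; omit system_id to address the default storage system."""
    if path.startswith("/"):
        raise ValueError("this sketch only handles paths relative to homeDir")
    parts = [BASE_URL]
    if system_id is not None:
        parts += ["system", system_id]
    if path:
        parts.append(path)
    return "/".join(parts)


print(files_url("shared"))
print(files_url("shared", system_id="data.agaveplatform.org"))
```

When data.agaveplatform.org is your default storage system, both printed URLs refer to the same directory.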

Understanding file paths

One powerful, but potentially confusing, feature of Agave is its support for virtualizing system paths. Every registered system specifies both a root directory, rootDir, and a home directory, homeDir, in its storage configuration. rootDir tells Agave the absolute path on the remote system that it should treat as /. As with the Linux chroot command, no request made to Agave will ever resolve to a location outside of rootDir.

Type of storage system	Examples of rootDir values
Linux	Actual system root directory, `/`; a RAID array physically attached to the system; an NFS mount you want to share; an arbitrary file path, such as your `$HOME` directory, from which you want to serve application data.
Cloud	A bucket on S3; a folder/marker file in your object store.
iRODS	A specific resource or zone you want to expose; a collection you want to publish for use; your personal home folder.

homeDir specifies the path, relative to rootDir, that Agave should use for relative paths. Since Agave is stateless, there is no concept of a current working directory. Thus, when you specify a path to Agave that does not begin with a /, Agave will always prefix the path with the value of homeDir. The following table gives several examples of how different combinations of rootDir, homeDir, and URL paths will be resolved by Agave. For a deeper dive into this subject, please see the Understanding Agave File Paths section.

“rootDir” value	“homeDir” value	Agave URL path	Resolved path on system
/	/	–	/
/	/	..	/
/	/	home	/home
/	/	/home	/home
/	/home/nryan	–	/home/nryan
/	/home/nryan	/	/
/	/home/nryan	..	/home
/	/home/nryan	nryan	/home/nryan/nryan
/	/home/nryan	/nryan	/nryan
/home/nryan	/	–	/home/nryan
/home/nryan	/	..	/home/nryan
/home/nryan	/home	/	/home/nryan
/home/nryan	/home	..	/home/nryan
/home/nryan	/home	home	/home/nryan/home/home
/home/nryan	/home	/bgibson	/home/nryan/bgibson

Transferring data

Before we talk about how to do basic operations on your data, let’s first talk about how you can move your data around. You already have a storage system available to you, so we will start with the “hello world” of data movement, uploading a file.

Uploading data

Uploading a file

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -X POST \
    -F "fileToUpload=@files/picksumipsum.txt" \
    https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan

files-upload -v -F files/picksumipsum.txt -S data.agaveplatform.org nryan

The response will look something like this:

{
    "internalUsername": null,
    "lastModified": "2014-09-03T10:28:09.943-05:00",
    "name": "picksumipsum.txt",
    "nativeFormat": "raw",
    "owner": "nryan",
    "path": "/home/nryan/picksumipsum.txt",
    "source": "http://127.0.0.1/picksumipsum.txt",
    "status": "STAGING_QUEUED",
    "systemId": "data.agaveplatform.org",
    "uuid": "0001409758089943-5056a550b8-0001-002",
    "_links": {
        "history": {
            "href": "https://sandbox.agaveplatform.org/files/v2/history/system/data.agaveplatform.org/nryan/picksumipsum.txt"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/picksumipsum.txt"
        },
        "system": {
            "href": "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
        }
    }
}

You may upload data to a remote system by performing a multipart POST on the Files service. If you are using the Agave CLI, you can perform recursive directory uploads. If you are manually calling curl or building an app with the Agave SDK, you will need to implement the recursion yourself. You can take a look in the files-upload script to see how this is done. The example in this section uploads a file that we will use in the remainder of this tutorial.

You will see a progress bar while the file uploads, followed by a response from the server with a description of the uploaded file. Agave does not block during data movement operations, so it may be just a moment before the file physically shows up on the remote system.
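
If you are calling the API directly, the recursion mentioned above can be planned before any request is sent. The sketch below walks a local directory and yields the (local file, remote directory) pairs that would each need one multipart POST. `plan_recursive_upload` is a hypothetical helper, not part of the Agave CLI or SDK, and the upload call itself is deliberately left out. It also assumes the remote directories exist or are created before the files are uploaded.

```python
import os
import posixpath
from typing import Iterator, Tuple


def plan_recursive_upload(local_dir: str, remote_dir: str) -> Iterator[Tuple[str, str]]:
    """Yield (local_file, remote_directory) pairs for every file under local_dir."""
    for current, dirnames, filenames in os.walk(local_dir):
        dirnames.sort()  # makes the traversal order deterministic
        rel = os.path.relpath(current, local_dir)
        # Remote paths use "/" separators regardless of the local OS.
        if rel == ".":
            target = remote_dir
        else:
            target = posixpath.join(remote_dir, *rel.split(os.sep))
        for name in sorted(filenames):
            yield os.path.join(current, name), target


for local_file, target in plan_recursive_upload("files", "nryan/files"):
    print(local_file, "->", target)
```

Separating the plan from the upload keeps the traversal easy to test and lets the caller decide whether to upload sequentially or in parallel.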

Importing data

You can also have Agave download data from an external URL. Rather than making a multipart file upload request, you can pass in a JSON object with the URL and an optional target file name, type, and array of notification subscriptions. The following table lists the protocols Agave supports for ingestion.

Scheme	Details
http	Supported with and without user info
https	Supported with and without user info
ftp	Anonymous FTP only
sftp	User info required in URL
agave	No user info supported.
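
A client can check a source URL against these rules before submitting an import request. The following sketch transcribes the table into a lookup; `check_import_url` is a hypothetical helper, and the reading of "anonymous FTP" as "no user info, or the literal user `anonymous`" is an assumption rather than documented behavior.

```python
from urllib.parse import urlsplit

# User info (user[:password]@host) policy per scheme, taken from the table.
USERINFO_RULES = {
    "http": "optional",
    "https": "optional",
    "ftp": "anonymous",
    "sftp": "required",
    "agave": "forbidden",
}


def check_import_url(url: str) -> bool:
    """Return True if the URL satisfies the scheme rules in the table."""
    parts = urlsplit(url)
    rule = USERINFO_RULES.get(parts.scheme)
    if rule is None:
        return False  # scheme is not listed as supported
    has_user = parts.username is not None
    if rule == "required":
        return has_user
    if rule == "forbidden":
        return not has_user
    if rule == "anonymous":
        # Assumption: either no user info or the conventional "anonymous" user.
        return not has_user or parts.username == "anonymous"
    return True  # "optional": valid with or without user info


print(check_import_url("agave://stampede.tacc.utexas.edu//etc/motd"))
```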

To demonstrate how this works, we will import a README.md file from the Agave Samples git repository on GitHub.

Download a file from a web accessible URL

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data '{"url":"https://github.com/agavetraining/science-api-samples/raw/master/README.md"}' \
    https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan

files-import -v -U "https://github.com/agavetraining/science-api-samples/raw/master/README.md" \
    -S data.agaveplatform.org nryan

The response will look something like this:

{
    "name" : "README.md",
    "uuid" : "0001409758713912-5056a550b8-0001-002",
    "owner" : "nryan",
    "internalUsername" : null,
    "lastModified" : "2014-09-10T20:00:55.266-05:00",
    "source" : "https://github.com/agavetraining/science-api-samples/raw/master/README.md",
    "path" : "/home/nryan/README.md",
    "status" : "STAGING_QUEUED",
    "systemId" : "data.agaveplatform.org",
    "nativeFormat" : "raw",
    "_links" : {
      "self" : {
        "href" : "https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/README.md"
      },
      "system" : {
        "href" : "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
      },
      "history" : {
        "href" : "https://sandbox.agaveplatform.org/files/v2/history/system/data.agaveplatform.org/nryan/README.md"
      }
    }
}

Downloading data from a third party is done offline as an asynchronous activity, so the response from the server will come right away. One thing worth noting is that the file length given in the response will always be -1. This is because, generally speaking, Agave does not know what the actual source file size is until after the response is sent back. The file size will be updated as the download progresses. You can track the progress by querying the destination file item’s history. An entry will be present showing the progress of the download.

For this exercise, the file we just downloaded is just a few KB, so you should see it appear in your home folder on data.agaveplatform.org almost immediately. If you were importing larger datasets, the transfer could take significantly longer depending on the network quality between Agave and the source location. In this case, you would see the file size continue to increase until it completed. In the event of a failed transfer, Agave will retry several times before canceling the transfer.

Transferring data

Transferring data between systems

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data-binary '{"url":"agave://stampede.tacc.utexas.edu//etc/motd"}' \
    https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan

files-import -v -U "agave://stampede.tacc.utexas.edu//etc/motd" -S data.agaveplatform.org nryan

The response from the service will be the same as the one we received importing a file.

Much like downloading data, Agave can manage the transfer of data between registered systems. This is, in fact, how data is staged prior to running a simulation. Data transfers are carried out asynchronously, so you can simply start a transfer and go about your business. Agave will ensure it completes. If you would like a notification when the transfer completes or reaches a certain stage, you can subscribe to one or more email, webhook, and/or realtime notifications, and Agave will send them as the transfer progresses. The following table lists the available file events. For more information about the events and notifications systems, please see the Notifications Guide and Event Reference.

In this example, we transfer a file from stampede.tacc.utexas.edu to data.agaveplatform.org. While the request looks pretty basic, there is a lot going on behind the scenes. Agave will authenticate to both systems, check permissions, stream data out of Stampede using GridFTP, and proxy it into data.agaveplatform.org using the SFTP protocol, adjusting the transfer buffer size along the way to optimize throughput. Doing this by hand is both painful and error-prone. Doing it with Agave is nearly identical to copying a file from one directory to another on your local system.

One of the benefits of the Files service is that it frees you up to work in parallel and scale with your application demands. In the next example we will use the Files service to create redundant archives of a shared project directory.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data-binary '{"url":"agave://data.agaveplatform.org/nryan/foo_project"}' \
    https://sandbox.agaveplatform.org/files/v2/media/system/nryan.storage1/

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data-binary '{"url":"agave://data.agaveplatform.org/nryan/foo_project"}' \
    https://sandbox.agaveplatform.org/files/v2/media/system/nryan.storage2/

files-import -v -U "agave://data.agaveplatform.org/nryan/foo_project" -S nryan.storage1

files-import -v -U "agave://data.agaveplatform.org/nryan/foo_project" -S nryan.storage2
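
When the same source needs to go to several systems, the request bodies can be generated rather than written by hand. The sketch below only builds the (endpoint, JSON body) pairs matching the curl calls above; sending them is left to whatever HTTP client you use. `build_transfer_requests` is a hypothetical helper, and the base URL is the sandbox host used throughout this guide.

```python
import json
from typing import Iterable, List, Tuple

BASE_URL = "https://sandbox.agaveplatform.org/files/v2/media/system"


def build_transfer_requests(
    source_url: str, dest_systems: Iterable[str], dest_path: str = ""
) -> List[Tuple[str, str]]:
    """Return one (endpoint, JSON body) pair per destination system."""
    # Every destination receives the same body; only the endpoint differs.
    body = json.dumps({"url": source_url})
    return [(f"{BASE_URL}/{system_id}/{dest_path}", body) for system_id in dest_systems]


requests_to_send = build_transfer_requests(
    "agave://data.agaveplatform.org/nryan/foo_project",
    ["nryan.storage1", "nryan.storage2"],
)
for endpoint, body in requests_to_send:
    print("POST", endpoint, body)
```

Because each transfer request returns immediately, the pairs can be submitted one after another without waiting for the previous transfer to finish.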

Basic data operations

Now that we understand how to move data into, out of, and between systems, we will look at how to perform file operations on the data. Again, remember that the Files service gives you a common REST interface to all your storage and execution systems regardless of the authentication mechanism or protocol they use. The examples below will use your default public storage system, but they would work identically with any storage system you have access to.

Directory listing

Listing a file or directory

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    https://sandbox.agaveplatform.org/files/v2/listings/system/data.agaveplatform.org/nryan

files-list -v -S data.agaveplatform.org nryan

The response would look something like this:

[
    {
        "format": "folder",
        "lastModified": "2012-08-03T06:30:12.000-05:00",
        "length": 0,
        "mimeType": "text/directory",
        "name": ".",
        "path": "nryan",
        "permissions": "ALL",
        "system": "data.agaveplatform.org",
        "type": "dir",
        "_links": {
            "self": {
                "href": "https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan"
            },
            "system": {
                "href": "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
            }
        }
    },
    {
        "format": "raw",
        "lastModified": "2014-09-10T19:47:44.000-05:00",
        "length": 3235,
        "mimeType": "text/plain",
        "name": "picksumipsum.txt",
        "path": "nryan/picksumipsum.txt",
        "permissions": "ALL",
        "system": "data.agaveplatform.org",
        "type": "file",
        "_links": {
            "self": {
                "href": "https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/picksumipsum.txt"
            },
            "system": {
                "href": "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
            }
        }
    }
]

Obtaining a directory listing, or information about a specific file, is done by making a GET request on the /files/v2/listings/ resource.

The response to this contains a summary listing of the contents of your home directory on data.agaveplatform.org. Appending a file path to your commands above would give information on a specific file.
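
A listing response is easy to post-process on the client. The sketch below groups entries by their `type` field and skips the `.` entry that describes the listed directory itself. It assumes the response is the bare JSON array shown above; `split_listing` is a hypothetical helper name.

```python
import json
from typing import Dict, List


def split_listing(listing_json: str) -> Dict[str, List[str]]:
    """Group the paths in a listing response by their "type" field."""
    groups: Dict[str, List[str]] = {"dir": [], "file": []}
    for item in json.loads(listing_json):
        # The entry named "." describes the listed directory itself.
        if item["name"] == ".":
            continue
        groups.setdefault(item["type"], []).append(item["path"])
    return groups


sample = """[
    {"name": ".", "path": "nryan", "type": "dir", "length": 0},
    {"name": "picksumipsum.txt", "path": "nryan/picksumipsum.txt", "type": "file", "length": 3235}
]"""
print(split_listing(sample))
```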

Move, copy, rename, delete

Basic file operations are available by sending a POST request to the /files/v2/media/ collection with the following parameters.

Attribute	Description
action	The action you want to perform. Select one of “move”, “copy”, “rename”, “mkdir”.
path	Full path to the destination file or folder. This may be the name of a new directory or renamed file, or an absolute or relative Agave path where the file or directory should be copied/moved.
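
When building these requests in code, serializing the body with a JSON library avoids quoting mistakes in hand-written strings. A minimal sketch follows, assuming only the two attributes in the table; `file_action_body` is a hypothetical helper name.

```python
import json

# Actions listed in the table above.
FILE_ACTIONS = {"move", "copy", "rename", "mkdir"}


def file_action_body(action: str, path: str) -> str:
    """Serialize the JSON body for a file management request."""
    if action not in FILE_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # json.dumps takes care of quoting and escaping the path.
    return json.dumps({"action": action, "path": path})


print(file_action_body("copy", "nryan/backup/picksumipsum.txt"))
```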

Copying files and directories

Copy a file item within the same system.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data-binary "{\"action\":\"copy\",\"path\":\"$DESTPATH\"}" \
    https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/$FILE_PATH

files-copy -D $DESTPATH -S data.agaveplatform.org $FILE_PATH

The response from a copy operation will be a JSON object describing the new file or folder.

Copying can be performed on any remote system. Unlike the Unix cp command, all copy invocations in Agave will overwrite the destination target if it exists. In the event of a directory collision, the contents of the two directory trees will be merged with the source overwriting the destination. Any overwritten files will maintain their provenance records and have an additional entry added to record the copy operation.

Moving files and directories

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data-binary "{\"action\":\"move\",\"path\":\"$DESTPATH\"}" \
    https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/$FILE_PATH

files-move -D $DESTPATH -S data.agaveplatform.org $FILE_PATH

The response will reflect the new file item.

Moving can be performed on any remote system. Moving a file or directory will overwrite the destination target if it exists. Unlike copy operations, the destination will be completely replaced by the source in the event of a collision. No merge will take place. Further, the provenance of the source will replace that of the target.

Renaming files and directories

Renaming a file item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data-binary "{\"action\":\"rename\",\"path\":\"$NEWNAME\"}" \
    https://sandbox.agaveplatform.org/files/v2/media/system/$SYSTEM_ID/$FILE_PATH

files-rename -N $NEWNAME -S $SYSTEM_ID $FILE_PATH

The response will reflect the renamed file item.

Renaming, like copying and moving, is only applicable within the context of a single system. Unlike on Unix systems, renaming and moving are not synonymous. When specifying a new name for a file or directory, the new name is relative to the parent directory of the original file or directory. Also, if a file or directory already exists with that name, the operation will fail and an error message will be returned. All provenance information will follow the renamed file or directory.

Creating a new directory

Creating a new directory

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data-binary "{\"action\":\"mkdir\",\"path\":\"$NEWDIR\"}" \
    https://sandbox.agaveplatform.org/files/v2/media/system/$SYSTEM_ID/$FILE_PATH

files-mkdir -N $NEWDIR -S $SYSTEM_ID $FILE_PATH

The response will reflect the new directory.

Creating a new directory is a recursive action in Agave. If the parent directories do not exist, they will be created on the fly. If a file or directory already exists with that name, the operation will fail and an error message will be returned.

Deleting a file item

Deleting a file item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -X DELETE \
    https://sandbox.agaveplatform.org/files/v2/media/system/$SYSTEM_ID/$FILE_PATH

files-delete -S $SYSTEM_ID $FILE_PATH

A standard Agave response with an empty result value will be returned.

As with creating a directory, deleting a file or directory is a recursive action in Agave. No prompt or warning will be given once the request is sent. It is up to you to implement such checks in your application logic and/or user interface.
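
One way to add such a check is to route every delete through a small guard that asks for confirmation first. The sketch below is a pattern only: `guarded_delete` is a hypothetical helper, and both `confirm` and `delete` are callables you supply (for example, a UI prompt and a function that issues the DELETE request).

```python
from typing import Callable


def guarded_delete(
    path: str,
    confirm: Callable[[str], bool],
    delete: Callable[[str], None],
) -> bool:
    """Call delete(path) only when confirm(prompt) returns True."""
    prompt = f"Recursively delete '{path}' and everything under it?"
    if not confirm(prompt):
        return False  # nothing was sent to the API
    delete(path)
    return True


removed = []
guarded_delete("nryan/old_project", lambda prompt: True, removed.append)
print(removed)
```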

File history

A full history of changes, permissions changes, and access events made through the Files API is recorded for every file and folder on registered Agave systems. The recorded history events represent a subset of the events thrown by the Files API. Generally speaking, the events saved in a file item’s history represent mutations on the physical file item or its metadata.

Direct vs indirect events

Agave will record both direct and indirect events made on a file item. Examples of direct events include transferring a directory from one system to another and renaming a file. An example of an indirect event is a user manually deleting a file from the command line. The table below contains a list of all the provenance actions recorded.

Event	Description
CREATED	File or directory was created
DELETED	The file was deleted
RENAME	The file was renamed
MOVED	The file was moved to another path
OVERWRITTEN	The file was overwritten
PERMISSION_GRANT	A user permission was added
PERMISSION_REVOKE	A user permission was deleted
STAGING_QUEUED	File/folder queued for staging
STAGING	File or directory is currently in flight
STAGING_FAILED	Staging failed
STAGING_COMPLETED	Staging completed successfully
PREPROCESSING	Preparing file for processing
TRANSFORMING_QUEUED	File/folder queued for transform
TRANSFORMING	Transforming file/folder
TRANSFORMING_FAILED	Transform failed
TRANSFORMING_COMPLETED	Transform completed successfully
UPLOADED	New content was uploaded to the file.
CONTENT_CHANGED	Content changed within this file/folder. If a folder, this event will be thrown whenever content changes in any file within this folder at most one level deep.

Out of band file system changes

Agave does not own the storage and execution systems you access through the Science APIs, so it cannot guarantee that every possible change made to the file system is recorded. Thus, Agave takes a best-effort approach to provenance, allowing you to choose, through your own use of best practices, how thorough you want the provenance trail of your data to be.

Listing file history

List the history of a file item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    https://sandbox.agaveplatform.org/files/v2/history/nryan/picksumipsum.txt

files-history -v nryan/picksumipsum.txt

The response contains a listing of the history events recorded for the file item:

[
  {
    "status": "DOWNLOAD",
    "created": "2016-09-20T19:47:56.000-05:00",
    "createdBy": "public",
    "description": "File was downloaded"
  },
  {
    "status": "STAGING_QUEUED",
    "created": "2016-09-20T19:48:12.000-05:00",
    "createdBy": "nryan",
    "description": "File/folder queued for staging"
  },
  {
    "status": "STAGING_COMPLETED",
    "created": "2016-09-20T19:48:16.000-05:00",
    "createdBy": "nryan",
    "description": "Staging completed successfully"
  },
  {
    "status": "TRANSFORMING_COMPLETED",
    "created": "2016-09-20T19:48:17.000-05:00",
    "createdBy": "nryan",
    "description": "Your scheduled transfer of http://129.114.97.92/picksumipsum.txt completed staging. You can access the raw file on iPlant Data Store at /home/nryan/picksumipsum.txt or via the API at https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org//nryan/picksumipsum.txt."
  }
]

Basic paginated listing of file item history events is available as shown in the example. Currently, the file history service is read-only. The only way to erase the history on a file item is to delete the file item through the API.
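
The history is also a convenient way to check on an asynchronous upload or import. The sketch below reduces a history listing to a single staging state using the event names from the provenance table. `staging_state` is a hypothetical helper, and it assumes events arrive oldest first, as in the sample response.

```python
from typing import Dict, List


def staging_state(history: List[Dict[str, str]]) -> str:
    """Summarize staging progress from a file item's history events."""
    # Scan from the most recent event backwards and report the first
    # staging-related status found.
    for event in reversed(history):
        status = event["status"]
        if status == "STAGING_COMPLETED":
            return "completed"
        if status == "STAGING_FAILED":
            return "failed"
        if status in ("STAGING_QUEUED", "STAGING"):
            return "in progress"
    return "unknown"


sample_history = [
    {"status": "DOWNLOAD"},
    {"status": "STAGING_QUEUED"},
    {"status": "STAGING_COMPLETED"},
    {"status": "TRANSFORMING_COMPLETED"},
]
print(staging_state(sample_history))
```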

Searching file history

Search a file item’s history

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    'https://sandbox.agaveplatform.org/files/v2/history/nryan/picksumipsum.txt?limit=2&offset=1&createdBy.like=*ryan'

files-history-search -v -l 2 -o 1 -S data.agaveplatform.org nryan/picksumipsum.txt 'createdBy.like=*ryan'

The response is a JSON array of actions performed on the file by users whose username ends in ryan, limited here to two results after skipping the first match.

[
  {
    "status": "STAGING_QUEUED",
    "created": "2016-09-20T19:48:12.000-05:00",
    "createdBy": "nryan",
    "description": "File/folder queued for staging"
  },
  {
    "status": "STAGING_COMPLETED",
    "created": "2016-09-20T19:48:16.000-05:00",
    "createdBy": "nryan",
    "description": "Staging completed successfully"
  }
]

File histories can get rather lengthy over time. Full text search is available on the file history service using the standard search syntax.
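
Search URLs like the one above can be assembled programmatically so that the query string is always well formed. This sketch covers only the parameters used in the example (`limit`, `offset`, and a search term); `history_search_url` is a hypothetical helper, and the base URL is the sandbox host used in this guide.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

HISTORY_URL = "https://sandbox.agaveplatform.org/files/v2/history"


def history_search_url(
    path: str, limit: int, offset: int, filters: Optional[Dict[str, str]] = None
) -> str:
    """Build a history search URL from pagination values and search terms."""
    params = {"limit": limit, "offset": offset}
    params.update(filters or {})
    # Keep "*" literal so wildcard terms look like the documented example.
    query = urlencode(params, safe="*")
    return f"{HISTORY_URL}/{path}?{query}"


print(history_search_url("nryan/picksumipsum.txt", 2, 1, {"createdBy.like": "*ryan"}))
```

When passing the resulting URL to curl on the command line, quote it so the shell does not interpret `&` or `*`.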

File metadata management

In many systems, the concept of metadata is directly tied to the notion of a file system. Agave takes a broader view of metadata and supports it as its own first class resource in the REST API. For more information on how to leverage metadata in Agave, please consult the Metadata Guide. In there we cover all aspects of how to manage, search, validate, and associate metadata across your entire digital lab.

File permissions

Agave has a fine-grained permission model supporting use cases from creating and exposing read-only storage systems to sharing individual files and folders with one or more users. The permissions available for file items are listed in the following table. Please note that a user must have WRITE permissions to grant or revoke permissions on a file item.

Name	Description
READ	User can view, but not edit or execute the resource
WRITE	User can edit, but not view or execute the resource
EXECUTE	User can execute, but not view or edit the resource
READ_WRITE	User can view and write the resource, but not execute
READ_EXECUTE	User can view and execute the resource, but not edit it
WRITE_EXECUTE	User can edit and execute the resource, but not view it
ALL	User has full control over the resource
NONE	User has all permissions revoked on the given resource
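
Permission responses report three booleans (`read`, `write`, `execute`) rather than the names in this table. The lookup below sketches the correspondence between the two forms; `permission_flags` is a hypothetical helper name.

```python
from typing import Dict

# (read, write, execute) for each permission name in the table above.
PERMISSIONS = {
    "READ": (True, False, False),
    "WRITE": (False, True, False),
    "EXECUTE": (False, False, True),
    "READ_WRITE": (True, True, False),
    "READ_EXECUTE": (True, False, True),
    "WRITE_EXECUTE": (False, True, True),
    "ALL": (True, True, True),
    "NONE": (False, False, False),
}


def permission_flags(name: str) -> Dict[str, bool]:
    """Expand a permission name into read/write/execute flags."""
    read, write, execute = PERMISSIONS[name]  # KeyError for unknown names
    return {"read": read, "write": write, "execute": execute}


print(permission_flags("READ_WRITE"))
```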

Listing all permissions

List the permissions on a file item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    'https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt?pretty=true'

files-pems-list \
    -S data.agaveplatform.org \
    nryan/picksumipsum.txt

The response will look something like the following:

[
  {
    "username": "nryan",
    "internalUsername": null,
    "permission": {
      "read": true,
      "write": true,
      "execute": true
    },
    "recursive": true,
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt?username.eq=nryan"
      },
      "file": {
        "href": "https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/picksumipsum.txt"
      },
      "profile": {
        "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
      }
    }
  }
]

To list all permissions for a file item, make a GET request on the file item’s permission collection.

List permissions for a specific user

List the permissions on a file item for a given user

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    'https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt?username=rclemens'

files-pems-list \
    -u rclemens \
    -S data.agaveplatform.org \
    nryan/picksumipsum.txt

The response will look something like the following:

{
  "username":"rclemens",
  "permission":{
    "read":true,
    "write":true
  },
  "_links":{
    "self":{
      "href":"https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt?username=rclemens"
    },
    "parent":{
      "href":"https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt"
    },
    "profile":{
      "href":"https://sandbox.agaveplatform.org/profiles/v2/rclemens"
    }
  }
}

Checking permissions for a single user is done using Agave URL query search syntax.

Grant permissions

Grant read access to a file item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data '{"username":"rclemens", "permission":"READ"}' \
    https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt

files-pems-update \
    -u rclemens \
    -p READ \
    -S data.agaveplatform.org \
    nryan/picksumipsum.txt

Grant read and write access to a file item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data '{"username":"rclemens", "permission":"READ_WRITE"}' \
    https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt

files-pems-addupdate \
    -u rclemens \
    -p READ_WRITE \
    -S data.agaveplatform.org \
    nryan/picksumipsum.txt

The response will look something like the following:

[
  {
    "username": "rclemens",
    "internalUsername": null,
    "permission": {
      "read": true,
      "write": true,
      "execute": false
    },
    "recursive": false,
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt?username.eq=rclemens"
      },
      "file": {
        "href": "https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/picksumipsum.txt"
      },
      "profile": {
        "href": "https://sandbox.agaveplatform.org/profiles/v2/rclemens"
      }
    }
  }
]

To grant another user read access to your file item, assign them READ permission. To enable another user to update a file item, grant them READ_WRITE or ALL access.

Delete single user permissions

Delete permission for single user on a file item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X POST \
     --data '{"username":"rclemens", "permission":"NONE"}' \
     https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt

files-pems-update  \
    -u rclemens \
    -p 'NONE' \
    -S data.agaveplatform.org \
    nryan/picksumipsum.txt

A response similar to the following will be returned:

[
  {
    "username": "rclemens",
    "internalUsername": null,
    "permission": {
      "read": false,
      "write": false,
      "execute": false
    },
    "recursive": false,
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt?username.eq=rclemens"
      },
      "file": {
        "href": "https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/picksumipsum.txt"
      },
      "profile": {
        "href": "https://sandbox.agaveplatform.org/profiles/v2/rclemens"
      }
    }
  }
]

Permissions may be revoked for a single user by updating that user's permission on the file item to NONE, as in the example. This will immediately revoke all permissions to the file item for that user.

Deleting all permissions

Delete all permissions on a file item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X POST \
     --data '{"username":"*", "permission":"NONE"}' \
     https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -X DELETE \
     https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt

files-pems-delete \
    -S data.agaveplatform.org \
    nryan/picksumipsum.txt

An empty response will be returned from the service.

Permissions may be cleared for all users on a file item by making a DELETE request on the file item permission collection.

Recursive operations

Recursively delete all permissions on a directory

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X POST \
     --data '{"username":"*", "permission":"READ_WRITE", "recursive": true}' \
     https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -X DELETE \
     'https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt?recursive=true'

files-pems-delete \
    --recursive \
    -S data.agaveplatform.org \
    nryan/picksumipsum.txt

An empty response will be returned from the service on delete. Update will return something like the following.

[
  {
    "username": "nryan",
    "internalUsername": null,
    "permission": {
      "read": true,
      "write": true,
      "execute": true
    },
    "recursive": true,
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt?username.eq=nryan"
      },
      "file": {
        "href": "https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/picksumipsum.txt"
      },
      "profile": {
        "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
      }
    }
  }
]

When dealing with directories, the permission operations you perform will apply only to the directory item itself. Permissions will not automatically propagate to the directory contents. In cases where you want to recursively apply permissions to the entire directory tree, you can do so by including the recursive attribute in your permission objects, or by adding it to your URL query parameters when making a DELETE request.
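
Permission objects are small, but hand-written JSON is easy to get wrong (a comma in place of a colon is enough to make the body invalid). The sketch below builds the object with a JSON library and only adds the `recursive` attribute when asked. `permission_body` is a hypothetical helper; the attribute names are the ones used in the examples in this section.

```python
import json


def permission_body(username: str, permission: str, recursive: bool = False) -> str:
    """Serialize a permission object for a POST to a file item's pems collection."""
    body = {"username": username, "permission": permission}
    if recursive:
        # Only sent when the grant should apply to the whole directory tree.
        body["recursive"] = True
    return json.dumps(body)


print(permission_body("rclemens", "READ_WRITE"))
print(permission_body("public", "READ", recursive=True))
```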

Publishing data

Agave provides multiple ways to share your data with your colleagues and the general public. In addition to the standard permission model enabling you to share your data with one or more authenticated users within the Platform, you also have the ability to publish your data and make it available via an unauthenticated public URL. Unlike traditional web and cloud hosting, your data remains in its original location and is served in situ by Agave upon user request.

Publishing a file or folder is simply a matter of granting the special public user READ permission on it. Similar to the way listings and permissions are exposed through unique paths in the Files API, published data is served from a custom /files/v2/download path. The public data URLs have the following structure:

https://sandbox.agaveplatform.org/files/v2/download/<username>/system/<system_id>/<path>

Notice two things. First, a username is inserted after the download path element. This is needed because there is no authorized user for whom to validate system or file ownership on a public request. The username gives the context by which to verify the availability of the system and file item being requested. Second, the system_id is mandatory in public data requests. This ensures that the public URL remains the same even when the default storage system of the user who published it changes.
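
The URL structure above is simple enough to generate in code. The following sketch fills in the three placeholders and percent-encodes each component; `public_download_url` is a hypothetical helper, and the base URL is the sandbox host used in this guide.

```python
from urllib.parse import quote

DOWNLOAD_URL = "https://sandbox.agaveplatform.org/files/v2/download"


def public_download_url(username: str, system_id: str, path: str) -> str:
    """Build the unauthenticated download URL for a published file item."""
    user = quote(username, safe="")
    system = quote(system_id, safe="")
    # "/" stays unescaped in the path so nested directories remain intact.
    file_path = quote(path.lstrip("/"))
    return f"{DOWNLOAD_URL}/{user}/system/{system}/{file_path}"


print(public_download_url("nryan", "data.agaveplatform.org", "nryan/picksumipsum.txt"))
```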

The following sections give examples of publishing files and folders in the Agave Platform.

Publishing individual files

Publish file item on your default storage system for public access

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data '{"username":"public", "permission":"READ"}' \
    https://sandbox.agaveplatform.org/files/v2/pems/nryan/picksumipsum.txt

files-pems-addupdate \
    -u public \
    -p READ \
    nryan/picksumipsum.txt

Publish file item on a named system for public access

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data '{"username":"public", "permission":"READ"}' \
    https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt

files-pems-addupdate \
    -u public \
    -p READ \
    -S data.agaveplatform.org \
    nryan/picksumipsum.txt

The response will look something like the following:

{
  "username": "public",
  "permission": {
    "read": true,
    "write": false,
    "execute": false
  },
  "recursive": false,
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt?username.eq=public"
    },
    "file": {
      "href": "https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/picksumipsum.txt"
    },
    "profile": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/public"
    }
  }
}

Publishing a file or folder is simply a matter of giving the special public user READ permission on it. Once published, the file will be available at the following URL:

https://sandbox.agaveplatform.org/files/v2/download/nryan/system/data.agaveplatform.org/nryan/picksumipsum.txt

Publishing directories

Publish directory on your default storage system for public access

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data '{"username":"public", "permission":"READ", "recursive": true}' \
    https://sandbox.agaveplatform.org/files/v2/pems/nryan/public

files-pems-addupdate \
    --recursive \
    -u public \
    -p READ \
    nryan/public

Publish directory on a named system for public access

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X POST \
    --data '{"username":"public", "permission":"READ", "recursive": true}' \
    https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/public

files-pems-addupdate \
    --recursive \
    -u public \
    -p READ \
    -S data.agaveplatform.org \
    nryan/public

The response will look something like the following:

{
  "username": "public",
  "permission": {
    "read": true,
    "write": false,
    "execute": false
  },
  "recursive": true,
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/public?username.eq=public"
    },
    "file": {
      "href": "https://sandbox.agaveplatform.org/files/v2/pems/system/data.agaveplatform.org/nryan/public"
    },
    "profile": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/public"
    }
  }
}

Publishing an entire directory is identical to publishing a single file item. To make all the contents of the directory public as well, include a recursive field in your request with a value of true. Once published, the directory and all its contents will be available for download. The above example will make every file and folder in the “nryan/public” directory of “data.agaveplatform.org” available for download at the following URL:

https://sandbox.agaveplatform.org/files/v2/download/nryan/system/data.agaveplatform.org/nryan/public
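The request shown in the curl examples above can also be assembled programmatically. The following is a minimal sketch in Python that only builds the URL and JSON body; it does not send anything. The function name build_publish_request and its arguments are hypothetical and are not part of any Agave SDK. The URL layout and field names are taken from the curl examples.

```python
import json

def build_publish_request(path, system_id=None, recursive=True,
                          base_url="https://sandbox.agaveplatform.org"):
    """Return (url, body) for granting the public user READ access to a path.

    Hypothetical helper: it mirrors the curl examples and sends nothing.
    """
    if system_id:
        # Paths on a named system are addressed as system/<system id>/<path>.
        url = f"{base_url}/files/v2/pems/system/{system_id}/{path}"
    else:
        # Without a system id the request targets the default storage system.
        url = f"{base_url}/files/v2/pems/{path}"
    body = json.dumps({"username": "public",
                       "permission": "READ",
                       "recursive": recursive})
    return url, body

url, body = build_publish_request("nryan/public",
                                  system_id="data.agaveplatform.org")
print(url)
print(body)
```

Serializing the body with json.dumps avoids hand-written JSON mistakes such as a comma where a colon belongs.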

Publishing considerations

Publishing data through Agave can be a great way to share and access data, but there are situations in which it may not be the ideal choice. Below are several pitfalls users run into when publishing their data.

Large file publishing

Before publishing large datasets, take a step back and consider how you might leverage the Files or Transfers API to reliably serve up your data. HTTP is not the fastest way to serve the data, and it may not be the best usage pattern for applications hoping to consume it. Thinking through your use case is well worth the time, even if publishing ends up being the best approach.

Static website hosting

Website hosting is a fairly common use case for data publishing. The challenge is that your assets are still hosted remotely from our API servers and fetched on demand. This can create some heavy latency when serving up lots of assets. Depending on the nature of your backend storage solution, it may not easily handle access patterns common to the web. In those situations, you may see some files fail to load from time to time. If your site has many files, even a small failure rate can keep your site from reliably loading.

If you are going to use the file publishing service for web hosting, the following tips can help improve your overall experience.

Whenever possible, reference versions of your css, fonts, and javascript dependencies hosted on a public CDN. CloudFlare, Google, and Amazon all host public mirrors of the most popular javascript libraries and frameworks. Linking to those can greatly speed up your load time.
Use a technology like Webpack to reduce the number of files needed to serve your application.
Lazy load your assets with oclazyload or requirejs, or by including async attributes on your <script> elements.
Store your assets on a storage system with as little connection and protocol overhead as possible. That means avoiding tape archives, gridftp, overprovisioned shared resources, and systems only accessible through a proxied connection. While the service will still work in all of these situations, it is common for the overhead involved in establishing a connection and authenticating to take longer than the actual file transfer when the file is small. Simply avoiding slower storage protocols can greatly speed up your application’s load time.

Apps

  /$$$$$$
 /$$__  $$
| $$  \ $$ /$$$$$$  /$$$$$$  /$$$$$$$
| $$$$$$$$/$$__  $$/$$__  $$/$$_____/
| $$__  $| $$  \ $| $$  \ $|  $$$$$$
| $$  | $| $$  | $| $$  | $$\____  $$
| $$  | $| $$$$$$$| $$$$$$$//$$$$$$$/
|__/  |__| $$____/| $$____/|_______/
         | $$     | $$
         | $$     | $$
         |__/     |__/

An app, in the context of Agave, is executable code available for invocation through the Agave Jobs service on a specific execution system. Put another way, an app is a piece of code that you can run on a specific system. If a single code needs to be run on multiple systems, each combination of code and system needs to be defined as a separate app.

Apps are language agnostic and may or may not carry with them their own dependencies. (More on bundling your app in a moment.) Any code that can be forked at the command line or submitted to a batch scheduler can be registered as an Agave app and run through the Jobs service.

The Apps service is the central registry for all Agave apps. The Apps service provides permissions, validation, archiving, and revision information about each app in addition to the usual discovery capability. The rest of this tutorial explains in detail how to register an app to the Apps service, how to manage and share apps, and what the different application scopes mean.

Discovering apps

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/apps/v2/

apps-list -v

The response will be something like this:

[
  {
      "id": "demo-pyplot-demo-0.1.0u3",
      "name": "demo-pyplot-demo",
      "version": "0.1.0",
      "revision": 3,
      "executionSystem": "docker.tacc.utexas.edu",
      "shortDescription": "Advanced demo plotting app",
      "isPublic": true,
      "label": "PyPlot Demo Advanced",
      "lastModified": "2017-11-03T18:05:33.000-05:00",
      "_links": {
        "self": {
          "href": "https://sandbox.agaveplatform.org/apps/v2/demo-pyplot-demo-0.1.0u3"
        }
      }
    }, {
      "id": "cloud-runner-0.1.0u1",
      "name": "cloud-runner",
      "version": "0.1.0",
      "revision": 1,
      "executionSystem": "docker.tacc.utexas.edu",
      "shortDescription": "Generic template for running arbitrary code in Agave's Dockerized cloud.",
      "isPublic": true,
      "label": "Run your code in the cloud",
      "lastModified": "2016-11-01T02:07:22.000-05:00",
      "_links": {
        "self": {
          "href": "https://sandbox.agaveplatform.org/apps/v2/cloud-runner-0.1.0u1"
        }
      }
    }
]

The Apps service allows you to list and search for apps you have registered and apps that have been shared with you. To get a list of all your apps, make a GET request on the Apps collection.

Filtering apps

List apps returning only the id, shortDescription, and executionType fields

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" 'https://sandbox.agaveplatform.org/apps/v2/?filter=id,shortDescription,executionType'

apps-list -v

The response will be something like this:

[
  {
    "id": "demo-pyplot-demo-0.1.0u3",
    "executionType": "CLI",
    "shortDescription": "Advanced demo plotting app"
  }, {
    "id": "cloud-runner-0.1.0u1",
    "executionType": "CLI",
    "shortDescription": "Generic template for running arbitrary code in Agave's Dockerized cloud."
  }
]

App descriptions can get rather verbose, so a summary object is returned when listing the apps collection. The summary object contains only the most critical fields in order to reduce response size when retrieving a user’s apps. You can customize which fields are returned using the filter query parameter.
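As a rough illustration of the filter query parameter, the sketch below builds the listing URL from a list of field names using only the Python standard library. The helper name apps_listing_url is hypothetical; the base URL and the comma-separated filter format come from the curl example above.

```python
from urllib.parse import urlencode

def apps_listing_url(fields,
                     base_url="https://sandbox.agaveplatform.org/apps/v2/"):
    """Build an apps listing URL that returns only the named fields.

    Hypothetical helper; it constructs the URL and sends nothing.
    """
    # safe="," keeps the commas between field names unescaped.
    query = urlencode({"filter": ",".join(fields)}, safe=",")
    return f"{base_url}?{query}"

print(apps_listing_url(["id", "shortDescription", "executionType"]))
```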

Searching apps

Only public apps

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" 'https://sandbox.agaveplatform.org/apps/v2/?public=true'

apps-search -v public=true

Only private apps

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" 'https://sandbox.agaveplatform.org/apps/v2/?public=false'

apps-search -v public=false

Only apps with “plot” in the name

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" 'https://sandbox.agaveplatform.org/apps/v2/?name.like=*plot*'

apps-search -v 'name.like=*plot*'

Only apps that run on execution system “docker.tacc.utexas.edu”

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" 'https://sandbox.agaveplatform.org/apps/v2/?executionSystem.eq=docker.tacc.utexas.edu'

apps-search -v executionSystem.eq=docker.tacc.utexas.edu

You can directly search the app collection by any field in the app description using Agave’s search syntax. Multiple fields can be included to further refine the query. See the section on Search for more details.
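To show how several search terms combine into one query, here is a small Python sketch that turns a dictionary of search expressions into a URL. The helper name apps_search_url is hypothetical; the field and operator names (name.like, public) are the ones used in the examples above.

```python
from urllib.parse import urlencode

def apps_search_url(criteria,
                    base_url="https://sandbox.agaveplatform.org/apps/v2/"):
    """Build an apps search URL from {"field.operator": value} pairs.

    Hypothetical helper; it constructs the URL and sends nothing.
    """
    # safe="*" keeps wildcard characters readable in the query string.
    query = urlencode(criteria, safe="*")
    return f"{base_url}?{query}"

# Public apps with "plot" in the name.
print(apps_search_url({"name.like": "*plot*", "public": "true"}))
```

Building the query string in code also sidesteps shell globbing of the * and ? characters.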

App details

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/apps/v2/cloud-runner-0.1.0u1

apps-list -v cloud-runner-0.1.0u1

The response will be something like this:

{
  "id": "cloud-runner-0.1.0u1",
  "name": "cloud-runner",
  "icon": null,
  "uuid": "3058779360820391450-242ac115-0001-005",
  "parallelism": "SERIAL",
  "defaultProcessorsPerNode": 1,
  "defaultMemoryPerNode": 1,
  "defaultNodeCount": 1,
  "defaultMaxRunTime": null,
  "defaultQueue": null,
  "version": "0.1.0",
  "revision": 1,
  "isPublic": true,
  "helpURI": "https://agaveplatform.org/contact-us",
  "label": "Run your code in the cloud",
  "owner": "dooley",
  "shortDescription": "Generic template for running arbitrary code in Agave's Dockerized cloud.",
  "longDescription": "Generic template for running an arbitrary application in Agave's hosted Docker cloud. Apps should be a gzipped archive.",
  "tags": [
    "docker",
    "demo",
    "awesome"
  ],
  "ontology": [],
  "executionType": "CLI",
  "executionSystem": "docker.tacc.utexas.edu",
  "deploymentPath": "/apps/cloud-runner-0.1.0u1.zip",
  "deploymentSystem": "data.agaveplatform.org",
  "templatePath": "wrapper.sh",
  "testPath": "test/test.sh",
  "checkpointable": false,
  "lastModified": "2016-11-01T02:07:22.000-05:00",
  "modules": [],
  "available": true,
  "inputs": [
    {
      "id": "dockerFile",
      "value": {
        "validator": null,
        "visible": true,
        "required": false,
        "order": 0,
        "enquote": false,
        "default": null
      },
      "details": {
        "label": "Dockerfile",
        "description": "Dockerfile to build the container that will be run as the executable. This is optional. Only include if you need to build a new container that is not present in the Docker central index and your app bundle does not already have a Dockerfile in it.",
        "argument": null,
        "showArgument": false,
        "repeatArgument": false
      },
      "semantics": {
        "minCardinality": 0,
        "maxCardinality": 1,
        "ontology": [],
        "fileTypes": []
      }
    },
    {
      "id": "appBundle",
      "value": {
        "validator": "([^\\s]+(\\.(?i)(zip|gz|tgz|tar.gz|bz2|rar))$)",
        "visible": true,
        "required": false,
        "order": 0,
        "enquote": false,
        "default": null
      },
      "details": {
        "label": "Application bundle",
        "description": "Compressed work folder containing application and binaries to be run in the Docker container. zip, gz.",
        "argument": null,
        "showArgument": false,
        "repeatArgument": false
      },
      "semantics": {
        "minCardinality": 0,
        "maxCardinality": 1,
        "ontology": [],
        "fileTypes": []
      }
    }
  ],
  "parameters": [
    {
      "id": "command",
      "value": {
        "visible": true,
        "required": false,
        "type": "string",
        "order": 0,
        "enquote": false,
        "default": "python",
        "validator": null
      },
      "details": {
        "label": "Command to run",
        "description": "This is the actual executable needed to run your program in the Docker container. ex. Rscript, python, java, mvn, php, sh",
        "argument": null,
        "showArgument": false,
        "repeatArgument": false
      },
      "semantics": {
        "minCardinality": 0,
        "maxCardinality": 1,
        "ontology": []
      }
    },
    {
      "id": "unpackInputs",
      "value": {
        "visible": true,
        "required": false,
        "type": "flag",
        "order": 0,
        "enquote": false,
        "default": true,
        "validator": null
      },
      "details": {
        "label": "Unpack input files",
        "description": "If true, any compressed input files will be expanded prior to execution on the remote system.",
        "argument": "1",
        "showArgument": true,
        "repeatArgument": false
      },
      "semantics": {
        "minCardinality": 0,
        "maxCardinality": 1,
        "ontology": []
      }
    },
    {
      "id": "commandArgs",
      "value": {
        "visible": true,
        "required": false,
        "type": "string",
        "order": 0,
        "enquote": false,
        "default": "main.py",
        "validator": null
      },
      "details": {
        "label": "Command arguments",
        "description": "This is a string representing the command line needed to run your code. ex. main.r, main.py, -cp $CLASSPATH:lib, exec:java, -f main.php, -c main.sh ",
        "argument": null,
        "showArgument": false,
        "repeatArgument": false
      },
      "semantics": {
        "minCardinality": 0,
        "maxCardinality": 1,
        "ontology": []
      }
    },
    {
      "id": "dockerImage",
      "value": {
        "visible": true,
        "required": true,
        "type": "string",
        "order": 0,
        "enquote": false,
        "default": "agaveplatform/scipy-matplot-2.7",
        "validator": null
      },
      "details": {
        "label": "Image name",
        "description": "Container image from the Docker central repo or name of the image created by building the dockerFile",
        "argument": null,
        "showArgument": false,
        "repeatArgument": false
      },
      "semantics": {
        "minCardinality": 1,
        "maxCardinality": 1,
        "ontology": []
      }
    }
  ],
  "outputs": [],
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/apps/v2/cloud-runner-0.1.0u1"
    },
    "executionSystem": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/docker.tacc.utexas.edu"
    },
    "storageSystem": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
    },
    "history": {
      "href": "https://sandbox.agaveplatform.org/apps/v2/cloud-runner-0.1.0u1/history"
    },
    "metadata": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/data/?q=%7B%22associationIds%22%3A%223058779360820391450-242ac115-0001-005%22%7D"
    },
    "owner": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/dooley"
    },
    "permissions": {
      "href": "https://sandbox.agaveplatform.org/apps/v2/cloud-runner-0.1.0u1/pems"
    }
  }
}

To query for detailed information about a specific app, add the app id to the base collection url and make another GET request.

This time, the response is a JSON object containing the full app description. The example above shows the description of the public cloud-runner-0.1.0u1 app. The next section covers the different parts of an app definition and how to register one of your own.
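Once an app description has been fetched, a client will often want to know which fields a job request must supply. The sketch below is one way to read that from the description. It uses a trimmed, hand-copied subset of the cloud-runner description rather than a live response, and the function name required_fields is hypothetical.

```python
def required_fields(app):
    """Return the ids of inputs and parameters whose value.required is true."""
    fields = app.get("inputs", []) + app.get("parameters", [])
    return [f["id"] for f in fields if f.get("value", {}).get("required")]

# Trimmed subset of the cloud-runner-0.1.0u1 description.
app = {
    "id": "cloud-runner-0.1.0u1",
    "inputs": [
        {"id": "dockerFile", "value": {"required": False}},
        {"id": "appBundle", "value": {"required": False}},
    ],
    "parameters": [
        {"id": "command", "value": {"required": False, "default": "python"}},
        {"id": "dockerImage", "value": {"required": True}},
    ],
}

print(required_fields(app))  # ['dockerImage']
```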

Defining apps

In this section we take a detailed look at the inputs and parameters sections of your app descriptions. Each of these sections takes an array of JSON objects. Each JSON object represents either a data source that needs staging in prior to job execution or a primary value passed into your app as a parameter. In either case, the JSON object only requires an id by which to reference the object in a job request, and a type field indicating primary type if the object represents a parameter.

In practice, you will want to add some descriptive information, constraints, and runtime validation checks to reduce the number of errors users run into when attempting to run your app. The full lists of app input and parameter attributes are provided in their respective sections below. However, before we dive deeper into the next section on app inputs, let’s first get a big picture view of what we are doing when we define our app’s inputs and parameters.

Input and Parameter Information Flow

When a user submits a job request in step 1, they specify the inputs and parameters needed to run that job. Those attributes are defined in your app description. The Jobs service will use your app description to validate the values in the job request and either reject it with a descriptive error message as in step 2, or accept it as in step 4. Once the job request is accepted, the values provided for the inputs and parameters in the job request are used to replace their corresponding template placeholders in the wrapper script. For example, suppose the job request assigns a value of foo to the input with id equal to input1. Before submitting the job to the remote system, the Jobs service will replace all occurrences of ${input1} in the app wrapper script with foo. The same will happen with param1 and param2. All occurrences of ${param1} will be replaced with bar and all occurrences of ${param2} will be replaced with 2, just as specified in the job request.
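The placeholder replacement described above can be pictured with a few lines of Python. This is only an illustration of the idea, not the Jobs service implementation: the function render_wrapper, the sample wrapper line, and the choice to leave unknown placeholders untouched are all assumptions made for the example.

```python
import re

def render_wrapper(template, values):
    """Replace each ${id} placeholder with the matching job request value.

    Illustrative only. Placeholders with no matching value are left as-is,
    which is an assumption of this sketch.
    """
    def substitute(match):
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)

    return re.sub(r"\$\{(\w+)\}", substitute, template)

# Hypothetical wrapper script line and job request values.
template = "mytool --in ${input1} --name ${param1} --count ${param2}"
job_values = {"input1": "foo", "param1": "bar", "param2": 2}

print(render_wrapper(template, job_values))
# mytool --in foo --name bar --count 2
```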

As we look at how to define inputs and parameters for your app, keep this big picture in mind. The purpose of inputs is to specify data that need to be staged prior to your job running and to tell your wrapper script about them. The purpose of parameters is to specify variables that need to be passed to your wrapper script. To do this, we only need a simple id by which to reference the values in a job request. The rest of what we will discuss in this tutorial is the mechanism that Agave provides for you to validate, describe, discover, and restrict application inputs and parameters to provide better user and developer experiences using your app.

App inputs

Minimal app input definition

{
  "id": "input1"
}

The inputs attribute of your app description contains a JSON array of input objects. An input represents one or more pieces of data that your app will use at runtime. That data can be a single file, a directory, or a response from a web service. It can reside on a system that Agave knows about, or at a publicly accessible URL. Regardless of where it lives and what it is, Agave will grab the data (recursively if need be) and copy it to your job’s working directory just before execution.

A minimal input object contains a single inputs.[].id attribute that uniquely identifies it within the context of your app. Any alphanumeric value under 64 characters can be an identifier, but it must be unique among all the inputs and parameters in that app.
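The identifier rules above (alphanumeric, under 64 characters, unique across inputs and parameters) can be checked before registering an app. The following Python sketch is one hypothetical way to do that on the client side; check_ids is not an Agave function, and treating "alphanumeric" as ASCII letters and digits only is an assumption.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
ID_PATTERN = re.compile(r"[A-Za-z0-9]+")

def check_ids(inputs, parameters):
    """Return a list of (id, problem) tuples for ids that break the rules."""
    problems = []
    seen = set()
    for item in list(inputs) + list(parameters):
        item_id = item.get("id", "")
        if not ID_PATTERN.fullmatch(item_id):
            problems.append((item_id, "not alphanumeric"))
        if len(item_id) >= 64:
            problems.append((item_id, "64 characters or longer"))
        if item_id in seen:
            problems.append((item_id, "duplicate id"))
        seen.add(item_id)
    return problems

print(check_ids([{"id": "input1"}],
                [{"id": "input1"}, {"id": "bad id"}]))
```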

Most of the time, such a minimal definition is not helpful. At the very least, you would want some descriptive information, a restriction on the cardinality, and potentially a default value. This can be achieved with the details, semantics, and value objects. The full list of input attributes is shown in the following table. We cover each attribute in the corresponding section below.

Name	Type	Description
id	String	Required: The textual id of this input. This value must be unique among all inputs and parameters in an app description.
details	JSON object	Describes how the input is presented in different contexts.
details.argument	string	A command line argument or flag to be prepended before the input value.
details.description	string	Human-readable description of the input. Often used to create contextual help in automatically generated UI.
details.label	string	Human-readable label for the input. Often implemented as text label next to the field in automatically generated UI.
details.showArgument	boolean	Whether to include the argument value for this input when performing the template variable replacement during job submission. If true, the `details.argument` value will be prepended, without spaces, to the actual input value(s).
details.repeatArgument	boolean	When multiple values are provided for this input, this attribute determines whether to include the argument value before each user-supplied value when performing the template variable replacement during job submission. The `details.showArgument` value must be true for this value to be applied.
semantics	JSON object	Describes the semantic definition of this input and the file types it represents. Multiple ontologies and values are supported.
semantics.fileTypes	JSON array	Array of string values describing the file types represented by this input. The types correspond to values from the Transforms service. Use “raw-0” for the time being.
semantics.minCardinality	integer	Minimum number of values this input must have.
semantics.maxCardinality	integer	Maximum number of values this input can have. A null value or value of -1 indicates no limit.
semantics.ontology	JSON array	List of ontology terms (or URIs pointing to ontology terms) applicable to the input. We recommend at least specifying an XML Schema (XSD) simple type.
value	JSON object	A description of the anticipated value and the situations when it is required.
value.default	string, JSON array	The default value for this input. This value is optional except when `value.required` is true and `value.visible` is false. Values may be absolute or relative paths on the user’s default storage system, an Agave URI, or any valid URL with a supported scheme.
value.order	integer	The order in which this input should appear when auto-generating a command line invocation.
value.required	boolean	Required: Is specification of this input mandatory to run a job?
value.validator	string	Perl-formatted regular expression to restrict valid values.
value.visible	boolean	When automatically generating a UI, should this field be visible to end users? If false, users will not be able to set this value in their job request.
value.enquote	boolean	Should the value be surrounded in quotation marks prior to injecting into the wrapper template at job runtime?

Input details section

The inputs.[].details object contains information specifying how to describe an input in different contexts. The description and label values provide human-readable information appropriate for a tool tip and form label respectively. Neither attribute is required, but both dramatically improve the readability of your app description.

Often you will need to translate your input value into actual command line arguments. By default, Agave will replace all occurrences of your attribute inputs.[].id in your wrapper script with the value of that attribute in your job description. That means that you are responsible for inserting any command line flags or arguments into the wrapper script yourself. This is a pretty straightforward process. However, when an input is optional, the resulting command line could be broken if the user does not specify an input value in their job request. One way to work around this is to add a conditional check to the variable assignment and exclude the command line flag or argument if it does not have a value set. Another is to use the inputs.[].details.argument attribute.

The inputs.[].details.argument value describes the command line argument that corresponds to this input, and the inputs.[].details.showArgument attribute specifies whether the inputs.[].details.argument value should be injected into the wrapper template in front of the actual runtime value. The following table illustrates the result of these attributes in different scenarios.

`argument`	`showArgument`	Input value from job request	Value injected into wrapper template
(none)	true	/etc/motd	/etc/motd
-f	true	/etc/motd	-f/etc/motd
-f (trailing space)	true	/etc/motd	-f /etc/motd
-f	false	/etc/motd	/etc/motd
--filename	true	/etc/motd	--filename/etc/motd
--filename=	true	/etc/motd	--filename=/etc/motd
--filename	false	/etc/motd	/etc/motd
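The behaviour in the table can be summarised in a short Python sketch. It models a single input value only and is an illustration of the rule rather than Agave's implementation; the function name inject is hypothetical.

```python
def inject(value, argument=None, show_argument=False):
    """Return the text substituted into the wrapper template for one value.

    The argument is prepended exactly as written, with no separator added,
    and only when show_argument is true.
    """
    if show_argument and argument:
        return f"{argument}{value}"
    return value

print(inject("/etc/motd", "-f", True))           # -f/etc/motd
print(inject("/etc/motd", "-f ", True))          # -f /etc/motd
print(inject("/etc/motd", "--filename=", True))  # --filename=/etc/motd
print(inject("/etc/motd", "-f", False))          # /etc/motd
```

Because no separator is inserted automatically, any space or equals sign must be part of the argument string itself.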

Input semantics section

The inputs.[].semantics object contains semantic information about the input. The minCardinality attribute specifies the minimum number of data sources that can be specified for the input, and maxCardinality specifies the maximum. Both are used to validate the value(s) provided for the input in a job request. The ontology attribute specifies a JSON array of URLs pointing to the ontology definitions of this file type. (We recommend at least specifying an XML Schema (XSD) simple type.) Finally, the fileTypes attribute contains a JSON array of file type strings as specified in the Transforms service. (In most situations you will leave the fileTypes attribute null or specify RAW-0 as the single file type in the array.)

Input value section

The inputs.[].value object contains the information needed to validate user-supplied input values in a job request. The validator attribute accepts a Perl regular expression which will be applied to the input value(s). Any submissions that do not match the validator expression will be rejected.

The default attribute allows you to specify a default value for the input. This will be used in lieu of a user-supplied value if the input is required, but not visible. All default values must match the validator expression, if provided.

The required attribute specifies whether the input must be specified during a job submission.

The visible attribute takes a boolean value specifying whether the input should be accepted as a user-supplied value in a job request. If false, any user-supplied value will be ignored at job submission and the default value will be used instead. Whenever visible is set to false, required must be true.

The order attribute is used to specify the order in which inputs should be listed in the response from the API and in command-line generation. By default, order is set to zero. Thus, providing a value greater than zero is sufficient to force any single input to be listed last.
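A quick way to see the effect of order is to sort a list of definitions by it. The Python sketch below treats a missing order as zero, as described above. How Agave breaks ties between equal order values is not stated here, so keeping the original sequence for ties is an assumption of this example, and the helper name by_order is hypothetical.

```python
def by_order(items):
    """Sort input or parameter definitions by value.order (default 0).

    sorted() is stable, so items with equal order keep their original
    sequence; that tie-breaking rule is an assumption of this sketch.
    """
    return sorted(items, key=lambda item: item.get("value", {}).get("order", 0))

inputs = [
    {"id": "last", "value": {"order": 1}},
    {"id": "first"},
    {"id": "second", "value": {"order": 0}},
]

print([item["id"] for item in by_order(inputs)])
# ['first', 'second', 'last']
```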

Validating inputs

The previous section covered different ways you can tell Agave to validate and restrict the data inputs to your app. When a user submits a job request, the checks are applied in the following order.

visible
required
minCardinality
maxCardinality
validator

Once an input passes these tests, Agave will check that it exists and that the user has permission to access the data. Assuming everything passes, the input is accepted and scheduled for staging.
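The order of checks listed above can be sketched as a single Python function. This is a simplified model under several assumptions: the names validate_input and as_list are hypothetical, Python's re module stands in for Perl regular expressions (the two dialects differ in places), and the final existence and permission checks are not modelled.

```python
import re

def as_list(value):
    """Normalize None, a single value, or a sequence into a list."""
    if value is None:
        return []
    return list(value) if isinstance(value, (list, tuple)) else [value]

def validate_input(definition, submitted=None):
    """Apply the checks in order; return accepted values or raise ValueError."""
    value = definition.get("value", {})
    semantics = definition.get("semantics", {})
    input_id = definition["id"]

    # 1. visible: hidden inputs ignore the request and use the default.
    values = as_list(submitted)
    if not value.get("visible", True):
        values = as_list(value.get("default"))

    # 2. required
    if not values:
        if value.get("required", False):
            raise ValueError(f"{input_id}: a value is required")
        return values

    # 3. minCardinality
    minimum = semantics.get("minCardinality") or 0
    if len(values) < minimum:
        raise ValueError(f"{input_id}: at least {minimum} value(s) needed")

    # 4. maxCardinality (null or -1 means no limit)
    maximum = semantics.get("maxCardinality")
    if maximum is not None and maximum != -1 and len(values) > maximum:
        raise ValueError(f"{input_id}: at most {maximum} value(s) allowed")

    # 5. validator
    pattern = value.get("validator")
    if pattern:
        for item in values:
            if not re.search(pattern, item):
                raise ValueError(f"{input_id}: {item!r} fails the validator")
    return values

# Hypothetical input definition for illustration.
definition = {
    "id": "input1",
    "value": {"visible": True, "required": True, "validator": r"\.txt$"},
    "semantics": {"minCardinality": 1, "maxCardinality": 2},
}
print(validate_input(definition, "a.txt"))  # ['a.txt']
```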

App parameters

Minimal app parameter definition

{
  "id": "parameter1",
  "value": {
    "type": "string"
  }
}

The parameters attribute of your app description contains a JSON array of parameter objects. A parameter represents one or more arguments that your app will use at runtime. Those arguments can be more or less anything you want them to be. If, for some reason, your app handles data staging on its own and you do not want Agave to move the data on your behalf, but you do need a data reference passed in, you can define it as a parameter rather than an input.

A minimal parameter object contains a single id attribute that uniquely identifies it within the context of your app and a value.type attribute specifying the primary type of the parameter. Any alphanumeric value under 64 characters can be an identifier, but it must be unique among all the inputs and parameters in that app. The parameter type is restricted to a handful of primary types listed in the table below.

In most situations you will want some descriptive information and validation of the user-supplied values for this parameter. As with your app inputs, app parameters have details, semantics, and value objects that allow you to do just that. The full list of parameter attributes is shown in the following table. We cover each attribute in the corresponding section below.

Name	Type	Description
id	String	Required: The textual id of this parameter. This value must be unique among all inputs and parameters in an app description.
details	JSON object	Describes how the parameter is presented in different contexts.
details.argument	string	A command line argument or flag to be prepended before the parameter value.
details.description	string	Human-readable description of the parameter. Often used to create contextual help in automatically generated UI.
details.label	string	Human-readable label for the parameter. Often implemented as text label next to the field in automatically generated UI.
details.showArgument	boolean	Whether to include the argument value for this parameter when performing the template variable replacement during job submission. If true, the `details.argument` value will be prepended, without spaces, to the actual parameter value(s).
details.repeatArgument	boolean	When multiple values are provided for this parameter, this attribute determines whether to include the argument value before each user-supplied value when performing the template variable replacement during job submission. The `details.showArgument` value must be true for this value to be applied.
semantics	JSON object	Describes the semantic definition of this parameter. Multiple ontologies and values are supported.
semantics.minCardinality	integer	Minimum number of values this parameter must have.
semantics.maxCardinality	integer	Maximum number of values this parameter can have. A null value or value of -1 indicates no limit.
semantics.ontology	JSON array	List of ontology terms (or URIs pointing to ontology terms) applicable to the parameter. We recommend at least specifying an XML Schema (XSD) simple type.
value	JSON object	A description of the anticipated value and the situations when it is required.
value.default	string, JSON array	The default value for this parameter. This value is optional except when `value.required` is true and `value.visible` is false. If the `value.type` of this parameter is enumeration, this value must be one of the specified `value.enumValues`. If the `value.type` of this parameter is bool or flag, only boolean values are accepted here.
value.enumValues	JSON array	An array of values specifying the possible values this parameter may have when `value.type` is enumeration. Both JSON objects and strings are supported in the array. If a JSON object is given, the object must contain a single attribute. The key will be the value passed into the wrapper template. The value will be the display value shown when auto-generating the option element in the select box representing this parameter.
value.order	integer	The order in which this parameter should appear when auto-generating a command line invocation.
value.required	boolean	Required: Is specification of this parameter mandatory to run a job?
value.type	string, number, enumeration, bool, flag	JSON type for this parameter (used to generate and validate UI).
value.validator	string	Perl-formatted regular expression to restrict valid values.
value.visible	boolean	When automatically generating a UI, should this field be visible to end users? If false, users will not be able to set this value in their job request.
value.enquote	boolean	Should the value be surrounded in quotation marks prior to injecting into the wrapper template at job runtime?

Parameter details section

The parameters.[].details object contains information specifying how to describe a parameter in different contexts and is identical to the inputs.[].details object.

Parameter semantics section

The parameters.[].semantics object contains semantic information about the parameter. Unlike the inputs.[].semantics object, it has no fileTypes attribute. The minCardinality and maxCardinality attributes bound the number of values the parameter may have, and the ontology attribute specifies a JSON array of URLs pointing to the ontology definitions of this parameter type. (We recommend at least specifying an XML Schema (XSD) simple type.)

Parameter value section

Example enumValues definition specifying just values for the enumeration.

[
  "red",
  "white",
  "green",
  "black"
]

Example enumValues definition specifying both a value and a label for each enumerated value.

[
  { "red": "Deep Cherry Red" },
  { "white": "Bright White" },
  { "green": "Black Forest Green" },
  { "black": "Brilliant Black Crystal Pearl" }
]

The parameters.[].value object contains the information needed to validate user-supplied parameter values in a job request. The type attribute defines the primary type of this parameter’s values. The available types are:

number: any real number
string: any json-escaped alphanumeric string.
bool: true or false
flag: true or false. Identical to bool, but only the `argument` value will be inserted into the wrapper template.
enumeration: a JSON array of string values or JSON objects representing the acceptable values for this parameter. If an array of JSON objects is given, each object should have a single attribute with the key being a desired enumeration value, and the value being a human-readable descriptive name for the enumerated value. The advantage of objects over strings is that objects provide a way to create more descriptive user interfaces by customizing both the content and value of an HTML select box’s option elements. Examples of both forms are given above.

The validator attribute accepts a Perl regular expression which will be applied to the parameter value(s). Any submissions that do not match the validator expression will be rejected. This attribute is available to parameters of type number and string. It is not available to bool or flag parameter types, or to enumeration parameters, as they require the enumValues attribute instead.

The default attribute allows you to specify a default value for the parameter. This will be used in lieu of a user-supplied value if the parameter is required, but not visible. All default values must match the appropriate validator if type is number or string, or be one of the values in the enumValues array if type is enumeration.

The enumValues attribute is a JSON array of alphanumeric values specifying the acceptable values for this input. This attribute only exists for enumeration parameter types.
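
As a rough illustration of the two enumValues forms, the following Python sketch normalizes either form into (value, label) pairs, such as a client might use to fill a select box. The function name normalize_enum_values is hypothetical and is not part of Agave.

```python
def normalize_enum_values(enum_values):
    """Return a list of (value, label) pairs from either enumValues form."""
    pairs = []
    for entry in enum_values:
        if isinstance(entry, dict):
            # Object form: a single attribute whose key is the value
            # and whose value is the human-readable label.
            if len(entry) != 1:
                raise ValueError("each enumValues object must have exactly one attribute")
            (value, label), = entry.items()
        else:
            # String form: the value doubles as its own label.
            value, label = entry, entry
        pairs.append((value, label))
    return pairs
```

Calling it with the string form yields identical values and labels, while the object form keeps the short value and the descriptive label apart.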

The required attribute specifies whether the parameter must be specified during a job submission.

The visible attribute takes a boolean value specifying whether the parameter should be accepted as a user-supplied value in a job request. If false, any value supplied at job submission will be ignored and the default value will be used instead. Whenever visible is set to false, required must be true.

The order attribute is used to specify the order in which parameters should be listed in the response from the API and in command-line generation. By default, order is set to 0. Thus, providing a value greater than zero is sufficient to force any single parameter to be listed last.
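
A minimal sketch of this ordering in Python is shown here. It assumes parameters are plain dictionaries shaped like the app description JSON, and that parameters sharing the same order keep their original relative position; the tie-breaking rule is an assumption, not documented Agave behavior.

```python
def sort_parameters(parameters):
    """Sort parameter definitions by value.order, defaulting to 0."""
    # sorted() is stable, so parameters with equal order keep their
    # original relative position (an assumption about tie-breaking).
    return sorted(parameters, key=lambda p: p.get("value", {}).get("order", 0))
```

With every other parameter left at the default of 0, giving one parameter an order of 1 moves it to the end of the list.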

Validating inputs

The previous section covered the different ways you can tell Agave to validate and restrict the parameters to your app. When a user submits a job request, the checks are applied in the following order.

visible
required
type
validator / enumValues
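
The order above can be sketched in Python as follows. This is an illustration under stated assumptions rather than Agave’s implementation: resolve_parameter is a hypothetical name, visible is assumed to default to true, Python’s re module stands in for Perl regular expressions, and the validator is applied to the string form of the value.

```python
import re

# Type checks for each parameter type (bool is excluded from number).
_TYPE_CHECKS = {
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    "bool": lambda v: isinstance(v, bool),
    "flag": lambda v: isinstance(v, bool),
    "enumeration": lambda v: isinstance(v, str),
}


def resolve_parameter(value_def, job_parameters, param_id):
    """Apply visible, required, type, then validator/enumValues checks."""
    # 1. visible: hidden parameters ignore user input and use the default.
    if not value_def.get("visible", True):
        return value_def.get("default")

    # 2. required: a missing value is an error only for required parameters.
    if param_id not in job_parameters:
        if value_def.get("required", False):
            raise ValueError(f"{param_id}: a value is required")
        return value_def.get("default")
    value = job_parameters[param_id]

    # 3. type: the submitted value must have the declared primary type.
    param_type = value_def["type"]
    if not _TYPE_CHECKS[param_type](value):
        raise ValueError(f"{param_id}: expected a value of type {param_type}")

    # 4. validator (number, string) or enumValues (enumeration).
    if param_type in ("number", "string"):
        validator = value_def.get("validator")
        if validator and not re.search(validator, str(value)):
            raise ValueError(f"{param_id}: value does not match {validator!r}")
    elif param_type == "enumeration":
        allowed = [next(iter(e)) if isinstance(e, dict) else e
                   for e in value_def.get("enumValues", [])]
        if value not in allowed:
            raise ValueError(f"{param_id}: value must be one of {allowed}")
    return value
```

Because the visible check runs first, a hidden parameter never reaches the later checks; its default is returned regardless of what the job request contains.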

App outputs

App outputs are not currently supported as first class objects in the app or job lifecycle. Their primary purpose is as metadata for use in client-side workflows and post-processing tasks. While not required, it is considered a best practice to define a list of the outputs expected when running the app. In doing so, an app can “advertise” to its consumers what it expects as the result of a run, thereby allowing apps to be chained together in a machine-readable fashion.

Outputs are defined similarly to inputs. The full list of output attributes is shown in the following table.

Name	Type	Description
id	String	Required: The textual id of this output.
details	JSON object
details.argument	string	A command line argument or flag to be prepended before the output value.
details.description	string	Human-readable description of the output. Often used to create contextual help in automatically generated UI.
details.label	string	Human-readable label for the output. Often implemented as text label next to the field in automatically generated UI.
details.showArgument	boolean	Whether to include the argument value for this output when performing the template variable replacement during job submission. If true, the `details.argument` value will be prepended, without spaces, to the actual output value(s).
details.repeatArgument	boolean	When multiple values are provided for this output, this attribute determines whether to include the argument value before each user-supplied value when performing the template variable replacement during job submission. The `details.showArgument` value must be true for this value to be applied.
semantics	JSON object	Describes the semantic definition of this output and the filetypes it represents. Multiple ontologies and values are supported.
semantics.fileTypes	JSON array	Array of string values describing the file types represented by this output. The types correspond to values from the Transforms service. Use "raw-0" for the time being.
semantics.minCardinality	integer	Minimum number of values this output must have.
semantics.maxCardinality	integer	Maximum number of values this output can have. A null value or value of -1 indicates no limit.
semantics.ontology	JSON array	List of ontology terms (or URIs pointing to ontology terms) applicable to the output. We recommend at least specifying an XML Schema (XSD) simple type.
value	JSON object	A description of the anticipated value and the situations when it is required.
value.default	string, JSON array	The default value for this output. This value is optional except when `value.required` is true and `value.visible` is false. Values may be absolute or relative paths on the user’s default storage system, an Agave URI, or any valid URL with a supported scheme.
value.order	integer	The order in which this output should appear when auto-generating a command line invocation.
value.required	boolean	Required: Is specification of this output mandatory to run a job?
value.validator	string	Perl-formatted regular expression to restrict valid values.
value.visible	boolean	When automatically generating a UI, should this field be visible to end users? If false, users will not be able to set this value in their job request.
value.enquote	boolean	Should the value be surrounded in quotation marks prior to injecting it into the wrapper template at job runtime?

Defining app wrapper templates

Example wrapper script that prints out all of Agave’s available runtime job macros and runs a user-supplied string defined as the command argument in the app description.

date

echo "Printing Agave job template variables..."

echo 'IPLANT_JOB_NAME="${IPLANT_JOB_NAME}"'
echo 'AGAVE_JOB_NAME="${AGAVE_JOB_NAME}"'
echo 'AGAVE_JOB_ID="${AGAVE_JOB_ID}"'
echo 'AGAVE_JOB_APP_ID="${AGAVE_JOB_APP_ID}"'
echo 'AGAVE_JOB_EXECUTION_SYSTEM="${AGAVE_JOB_EXECUTION_SYSTEM}"'
echo 'AGAVE_JOB_BATCH_QUEUE="${AGAVE_JOB_BATCH_QUEUE}"'
echo 'AGAVE_JOB_SUBMIT_TIME="${AGAVE_JOB_SUBMIT_TIME}"'
echo 'AGAVE_JOB_ARCHIVE_SYSTEM="${AGAVE_JOB_ARCHIVE_SYSTEM}"'
echo 'AGAVE_JOB_ARCHIVE_PATH="${AGAVE_JOB_ARCHIVE_PATH}"'
echo 'AGAVE_JOB_NODE_COUNT="${AGAVE_JOB_NODE_COUNT}"'
echo 'IPLANT_CORES_REQUESTED="${IPLANT_CORES_REQUESTED}"'
echo 'AGAVE_JOB_PROCESSORS_PER_NODE="${AGAVE_JOB_PROCESSORS_PER_NODE}"'
echo 'AGAVE_JOB_MEMORY_PER_NODE="${AGAVE_JOB_MEMORY_PER_NODE}"'
echo 'AGAVE_JOB_ARCHIVE_URL="${AGAVE_JOB_ARCHIVE_URL}"'
echo 'AGAVE_JOB_OWNER="${AGAVE_JOB_OWNER}"'
echo 'AGAVE_JOB_TENANT="${AGAVE_JOB_TENANT}"'
echo 'AGAVE_JOB_ARCHIVE="${AGAVE_JOB_ARCHIVE}"'
echo 'AGAVE_JOB_MAX_RUNTIME="${AGAVE_JOB_MAX_RUNTIME}"'
echo 'AGAVE_JOB_MAX_RUNTIME_MILLISECONDS="${AGAVE_JOB_MAX_RUNTIME_MILLISECONDS}"'
echo "Printing runtime environment..."

env

CALLBACK=$(${command})

${AGAVE_JOB_CALLBACK_NOTIFICATION|CALLBACK}

sleep 3

In order to run your application, you will need to create a wrapper template that calls your executable code. The wrapper template is a simple script that Agave will filter and execute to start your app. The filtering Agave applies to your wrapper script is to inject runtime values from a job request into the script to replace the template variables representing the inputs and parameters of your app.

The order in which wrapper templates are processed in HPC and Condor apps is as follows.

environment variables injected.
startupScript run.
Scheduler directives prepended to the wrapper template.
additionalDirectives concatenated after the scheduler directives.
Custom modules concatenated after the additionalDirectives.
inputs and parameters template variables replaced with values from the job request.
Blacklist commands, if present, are disabled in the scripts.
Resulting script is written to the remote job execution folder and executed.

The order in which wrapper templates are processed in CLI apps is as follows.

Shell environment sourced
environment variables injected
startupScript run
Custom modules prepended to the top of the wrapper
inputs and parameters template variables replaced with values from the job request
Blacklist commands, if present, are disabled in the scripts.
Resulting script is forked into the background immediately.

Environment

The runtime environment comes from the system definition. If you cannot change the system definition to suit your needs, handle the differences in your wrapper script and ship whatever you need with your app’s assets.

Using modules in wrapper templates

See more about Modules and Lmod. Modules can be used to customize your environment, locate your application, and improve portability between systems. Agave does not install or manage the module installation on a particular system; however, it does know how to interact with it. Specifying the modules needed to run your app, either in your wrapper template or in your system definition, can greatly help you during the development process.

Available wrapper template runtime macros

Agave provides information about the job, system, and user as predefined macros you can use in your wrapper templates. The full list of runtime job macros is given in the following table.

Variable	Description
AGAVE_JOB_APP_ID	The appId for which the job was requested.
AGAVE_JOB_ARCHIVE	Binary boolean value indicating whether the current job will be archived after the wrapper template exits.
AGAVE_JOB_ARCHIVE_SYSTEM	The system to which the job will be archived after the wrapper template exits.
AGAVE_JOB_ARCHIVE_URL	The fully qualified URL to the archive folder where the job output will be copied if archiving is enabled, or the URL of the output listing
AGAVE_JOB_ARCHIVE_PATH	The path on the archiveSystem where the job output will be copied if archiving is enabled.
AGAVE_JOB_BATCH_QUEUE	The batch queue on the AGAVE_JOB_EXECUTION_SYSTEM to which the job was submitted.
AGAVE_JOB_EXECUTION_SYSTEM	The Agave execution system id where this job is running.
AGAVE_JOB_ID	The unique identifier of the job.
AGAVE_JOB_MAX_RUNTIME	The max job run time from the job request in HH:MM:SS format.
AGAVE_JOB_MAX_RUNTIME_MILLISECONDS	The max job run time from the job request converted to milliseconds.
AGAVE_JOB_MEMORY_PER_NODE	The amount of memory per node requested at submit time.
AGAVE_JOB_NAME	The slugified version of the name of the job. See the section on Conventions for more information about slugs.
AGAVE_JOB_NAME_RAW	The name of the job as given at submit time.
AGAVE_JOB_NODE_COUNT	The number of nodes requested at submit time.
AGAVE_JOB_OWNER	The username of the job owner.
AGAVE_JOB_PROCESSORS_PER_NODE	The number of cores requested at submit time.
AGAVE_JOB_SUBMIT_TIME	The time at which the job was submitted in ISO-8601 format.
AGAVE_JOB_TENANT	The id of the tenant to which the job was submitted.
AGAVE_JOB_ARCHIVE_URL	The Agave url to which the job will be archived after the job completes.
AGAVE_JOB_CALLBACK_RUNNING	Represents a call back to the API stating the job has started.
AGAVE_JOB_CALLBACK_CLEANING_UP	Represents a call back to the API stating the job is cleaning up.
AGAVE_JOB_CALLBACK_ALIVE	Represents a call back to the API stating the job is still alive. This will essentially update the timestamp on the job and add an entry to the job’s history record.
AGAVE_JOB_CALLBACK_NOTIFICATION	Represents a call back to the API telling it to forward a notification to the registered endpoint for that job. If no notification is registered, this will be ignored.
AGAVE_JOB_CALLBACK_FAILURE	Represents a call back to the API stating the job failed. Use this with caution as it will tell the API the job failed even if it has not yet completed. Upon receiving this callback, Agave will abandon the job and skip any archiving that may have been requested. Think of this as kill -9 for the job lifecycle.

Handling app inputs

Agave will stage the files and folders you specify as inputs to your app. These will be available in the top level of your job directory at runtime. Additionally, the names of each of the inputs will be injected into your wrapper template for you to use in your application logic. Please be aware that Agave will not attempt to resolve namespace conflicts between your app inputs. That means that if a job specifies two inputs with the same name, one will overwrite the other during the input staging phase of the job and, though the variable names will be correctly injected to the wrapper script, your job will most likely fail due to missing data.
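
Since Agave will not resolve these conflicts for you, a client can check for them before submitting a job. The following Python sketch flags inputs that would be staged under the same file name. The function name and the example hosts in the usage note are hypothetical, and the staged name is assumed to be the last path segment of each input value.

```python
import posixpath
from collections import defaultdict
from urllib.parse import urlparse


def find_input_name_conflicts(inputs):
    """Map each staged file name to the input ids that would collide on it."""
    by_name = defaultdict(list)
    for input_id, values in inputs.items():
        # An input may carry a single value or a list of values.
        if isinstance(values, str):
            values = [values]
        for value in values:
            # Assumption: the staged name is the last path segment.
            name = posixpath.basename(urlparse(value).path.rstrip("/"))
            by_name[name].append(input_id)
    return {name: ids for name, ids in by_name.items() if len(ids) > 1}
```

For example, passing {"query1": "agave://data.example.org/nryan/reads.fq", "query2": "https://example.com/other/reads.fq"} reports that both inputs would land on reads.fq in the job directory.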

Handling app parameters

If you refer back to the app definition we used in the App Management Tutorial, you will see there are multiple inputs and parameters defined for that app. Each input and parameter object had an id attribute. That id value is the attribute name you use to associate runtime values with app inputs and parameters. When a job is submitted to Agave, prior to physically running the wrapper template, all instances of that id are replaced with the actual value from the job request. The example below shows our app description, a job request, and the resulting wrapper template at run time.

Variable type casting

During the job submission process, Agave will store your inputs and parameters as serialized JSON. At the point that variable injection occurs, Agave will replace all occurrences of your input and parameter ids with the values provided in the job request. In order for Agave to properly identify your input and parameter ids, wrap them in curly braces and prepend a dollar sign. For example, if you have a parameter with id param1, you would include it in your wrapper script as ${param1}. Case sensitivity is honored at all times.
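
The replacement step can be approximated with the short Python sketch below. It is an illustration, not Agave’s code: render_wrapper is a hypothetical name, and leaving unrecognized ${...} names untouched (so ordinary shell variables survive) is an assumption.

```python
import re


def render_wrapper(template, job_values):
    """Replace ${id} placeholders whose id appears in job_values."""
    def substitute(match):
        key = match.group(1)
        # Ids are case sensitive; names not in the job request are left
        # as-is so that ordinary shell variables remain in the script.
        return str(job_values[key]) if key in job_values else match.group(0)

    return re.sub(r"\$\{([^}|]+)\}", substitute, template)
```

Rendering the template "wc -l ${query1} > ${OUT}; echo ${Query1}" with a job request that only supplies query1 replaces just the first placeholder, because neither OUT nor the differently cased Query1 matches a submitted id.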

Handling boolean values

Boolean values are injected as truthy values: true is replaced with 1 and false is replaced with an empty string.

Using flag parameters

If your parameter was of type “flag”, Agave will replace all occurrences of the template variable with the value you provided for the argument field.
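
The two substitution rules for bool and flag parameters can be summarized in a small Python sketch. The function names are hypothetical, and the assumption that an unset flag is replaced with an empty string is not stated in this section.

```python
def render_bool(value):
    """bool parameters: true becomes "1", false becomes an empty string."""
    return "1" if value else ""


def render_flag(value, argument):
    """flag parameters: the argument text is inserted when the flag is set."""
    # Assumption: an unset flag contributes nothing to the command line.
    return argument if value else ""
```

In a wrapper template this means a bool parameter can be tested for emptiness, while a flag parameter can be placed directly on the command line where its argument should appear.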

App permissions

Apps have fine-grained permissions similar to those found in the Jobs and Files services. Using these, you can share your app with other Agave users. Apps are private by default, so when you first POST your app to the Apps service, you are the only one who can see it. You may share your app with other users by granting them varying degrees of permissions. The full list of app permission values is given in the following table.

Permission	Description
READ	Gives the ability to view the app description.
WRITE	Gives the ability to update the app.
EXECUTE	Gives the ability to submit jobs using the app.
ALL	Gives full READ, WRITE, and EXECUTE permissions to the user.
READ_WRITE	Gives full READ and WRITE permissions to the user.
READ_EXECUTE	Gives full READ and EXECUTE permissions to the user.
WRITE_EXECUTE	Gives full WRITE and EXECUTE permissions to the user.
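
Permission responses from the service report read, write, and execute as separate booleans. The Python sketch below shows how the permission names in the table map onto those three flags. The lookup table and function name are hypothetical helpers; NONE is included because the service reports it for users without any permission.

```python
# Each permission name expands to the set of flags it grants.
_PERMISSION_FLAGS = {
    "NONE": (),
    "READ": ("read",),
    "WRITE": ("write",),
    "EXECUTE": ("execute",),
    "READ_WRITE": ("read", "write"),
    "READ_EXECUTE": ("read", "execute"),
    "WRITE_EXECUTE": ("write", "execute"),
    "ALL": ("read", "write", "execute"),
}


def expand_permission(name):
    """Return the read/write/execute booleans for a permission name."""
    granted = _PERMISSION_FLAGS[name.upper()]
    return {flag: flag in granted for flag in ("read", "write", "execute")}
```

For instance, expand_permission("READ_EXECUTE") marks read and execute as granted and write as not granted.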

App permissions are distinct from all other roles and permissions and do not have implications outside the Apps service. This means that if you want to allow someone to run a job using your app, it is not sufficient to grant them READ_EXECUTE permissions on your app. They must also have an appropriate user role on the execution system on which the app will run. Similarly, if you do not have the right to publish on the executionSystem or access the deploymentPath on the deploymentSystem in your app description, you will not be able to publish your app.

Listing app permissions

App permissions are managed through a set of URLs consistent with the permission operations elsewhere in the API. To query for a user’s permission for an app, perform a GET on the user’s unique app permissions url.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems/$USERNAME

apps-pems-list -v -u $USERNAME $APP_ID

The response from the service will be a JSON object representing the user permission. If the user does not have a permission for that app, the permission value will be NONE. By default, only you have permission to your private apps. Public apps will return a single permission for the public meta user rather than a permission for every user.

{
    "_links": {
        "app": {
            "href": "https://sandbox.agaveplatform.org/apps/v2/$APP_ID"
        },
        "profile": {
            "href": "https://sandbox.agaveplatform.org/profiles/v2/systest"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems/systest"
        }
    },
    "permission": {
        "execute": true,
        "read": true,
        "write": true
    },
    "username": "systest"
}

You can also query for all permissions granted on a specific app by making a GET request on the app’s permission collection.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems

apps-pems-list -v $APP_ID

This time the service will respond with a JSON array of permission objects.

[
   {
      "_links":{
         "app":{
            "href":"https://sandbox.agaveplatform.org/apps/v2/$APP_ID"
         },
         "profile":{
            "href":"https://sandbox.agaveplatform.org/profiles/v2/systest"
         },
         "self":{
            "href":"https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems/systest"
         }
      },
      "permission":{
         "execute":true,
         "read":true,
         "write":true
      },
      "username":"systest"
   }
]

Adding and updating app permissions

Setting permissions is done by posting a JSON object containing a permission and username. Alternatively, you can POST just the permission and append the username to the URL.

# Standard syntax to grant permissions to a specific user
curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST -d "username=bgibson&permission=READ" https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems

# Abbreviated POST data to grant permission to a single user
curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST -d "permission=READ" https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems/bgibson

apps-pems-update -v -u bgibson -p READ $APP_ID

The response will contain a JSON object representing the permission that was just created.

{
    "_links": {
        "app": {
            "href": "https://sandbox.agaveplatform.org/apps/v2/$APP_ID"
        },
        "profile": {
            "href": "https://sandbox.agaveplatform.org/profiles/v2/bgibson"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems/bgibson"
        }
    },
    "permission": {
        "execute": false,
        "read": true,
        "write": false
    },
    "username": "bgibson"
}
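
For clients that do not shell out to curl, the abbreviated grant request above could be assembled in Python roughly as follows. This sketch only builds the request and does not send it. The function name is hypothetical, and the base URL is the sandbox host used in the examples.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://sandbox.agaveplatform.org/apps/v2"


def build_grant_request(app_id, username, permission, access_token):
    """Build (but do not send) the POST that sets one user's app permission."""
    # The username is appended to the app's permission collection URL.
    url = f"{BASE_URL}/{quote(app_id, safe='')}/pems/{quote(username, safe='')}"
    # The body is form encoded, matching curl's -d "permission=READ".
    body = urlencode({"permission": permission}).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {access_token}"},
    )
```

The returned object can be passed to urllib.request.urlopen to perform the call; the response body would then be the JSON permission object shown above.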

Deleting app permissions

Permissions can be deleted on a user-by-user basis, or all at once. To delete an individual user permission, make a DELETE request on the user’s app permission URL.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X DELETE https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems/bgibson

apps-pems-delete -u bgibson $APP_ID

The response will be an empty result object.

You can accomplish the same thing by updating the user permission to an empty value.

# Delete permission for a single user by updating with the NONE permission value
curl -sk -H "Authorization: Bearer $ACCESS_TOKEN"  \
     -X POST -d "username=bgibson" -d "permission=NONE" \
     https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems

# Delete permission for a single user by updating with an empty permission value
curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -X POST -d "permission=" \
     https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems/bgibson

apps-pems-update -v -u bgibson $APP_ID

Since this is an update operation, the resulting JSON permission object will be returned showing the user has no permissions to the app anymore.

{
    "_links": {
        "app": {
            "href": "https://sandbox.agaveplatform.org/apps/v2/$APP_ID"
        },
        "profile": {
            "href": "https://sandbox.agaveplatform.org/profiles/v2/bgibson"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems/bgibson"
        }
    },
    "permission": {
        "execute": false,
        "read": false,
        "write": false
    },
    "username": "bgibson"
}

To delete all permissions for an app, make a DELETE request on the app’s permissions collection.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -X DELETE \
     https://sandbox.agaveplatform.org/apps/v2/$APP_ID/pems

apps-pems-delete $APP_ID

The response will be an empty result object.

App scope

In addition to traditional permissions, apps also have a concept of scope. Unless otherwise configured, apps are private to the owner and the users to whom they grant permission. Applications can, however, move from the private space into the public space for use by anyone. Moving an app into the public space is called publishing. Publishing an app gives it much greater exposure and can result in increased usage by the user community. It also comes with increased responsibilities for the original owner as well as the API administrators. Several of these are listed below:

Public apps must run on public systems. This makes the app available to everyone.
Public apps must be vetted for performance, reliability, and security by the API administrators.
The original app author must remain available via email for ongoing support.
Public apps must be copied into a public repository and checksummed.
Updates to public apps must result in a snapshot of the original app being created and stored with its resulting checksum in a separate location.
API administrators must maintain and support the app throughout its lifetime.

Publishing an app

Publishing an app.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X PUT \
     --data-binary '{"action":"publish","executionSystem":"condor.opensciencegrid.org"}' \
     https://sandbox.agaveplatform.org/apps/v2/wc-osg-1.00

apps-publish -e condor.opensciencegrid.org wc-osg-1.00

The response from the service will resemble the following:

{
  "id": "wc-osg-1.00u1",
  "name": "wc-osg",
  "icon": null,
  "uuid": "8734854070765284890-242ac116-0001-005",
  "parallelism": "SERIAL",
  "defaultProcessorsPerNode": 1,
  "defaultMemoryPerNode": 1,
  "defaultNodeCount": 1,
  "defaultMaxRunTime": null,
  "defaultQueue": null,
  "version": "1.00",
  "revision": 1,
  "isPublic": false,
  "helpURI": "http://www.gnu.org/s/coreutils/manual/html_node/wc-invocation.html",
  "label": "wc condor",
  "shortDescription": "Count words in a file",
  "longDescription": "",
  "tags": [
    "gnu",
    "textutils"
  ],
  "ontology": [
    "http://sswapmeet.sswap.info/algorithms/wc"
  ],
  "executionType": "CONDOR",
  "executionSystem": "condor.opensciencegrid.org",
  "deploymentPath": "/agave/apps/wc-1.00",
  "deploymentSystem": "public.storage.agave",
  "templatePath": "/wrapper.sh",
  "testPath": "/wrapper.sh",
  "checkpointable": true,
  "lastModified": "2016-09-15T04:48:17.000-05:00",
  "modules": [
    "load TACC",
    "purge"
  ],
  "available": true,
  "inputs": [
    {
      "id": "query1",
      "value": {
        "validator": "",
        "visible": true,
        "required": false,
        "order": 0,
        "enquote": false,
        "default": [
          "read1.fq"
        ]
      },
      "details": {
        "label": "File to count words in: ",
        "description": "",
        "argument": null,
        "showArgument": false,
        "repeatArgument": false
      },
      "semantics": {
        "minCardinality": 1,
        "maxCardinality": -1,
        "ontology": [
          "http://sswapmeet.sswap.info/util/TextDocument"
        ],
        "fileTypes": [
          "text-0"
        ]
      }
    }
  ],
  "parameters": [],
  "outputs": [
    {
      "id": "outputWC",
      "value": {
        "validator": "",
        "order": 0,
        "default": "wc_out.txt"
      },
      "details": {
        "label": "Text file",
        "description": "Results of WC"
      },
      "semantics": {
        "minCardinality": 1,
        "maxCardinality": 1,
        "ontology": [
          "http://sswapmeet.sswap.info/util/TextDocument"
        ],
        "fileTypes": []
      }
    }
  ],
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/apps/v2/wc-osg-1.00u1"
    },
    "executionSystem": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/condor.opensciencegrid.org"
    },
    "storageSystem": {
      "href": "https://sandbox.agaveplatform.org/systems/v2/public.storage.agave"
    },
    "history": {
      "href": "https://sandbox.agaveplatform.org/apps/v2/wc-osg-1.00u1/history"
    },
    "metadata": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/data/?q=%7B%22associationIds%22%3A%228734854070765284890-242ac116-0001-005%22%7D"
    },
    "owner": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
    },
    "permissions": {
      "href": "https://sandbox.agaveplatform.org/apps/v2/wc-osg-1.00u1/pems"
    }
  }
}

To publish an app, make a PUT request on the app resource. In this example, we publish the wc-osg-1.00 app. Notice a few things about the response.

Both the executionSystem and deploymentSystem have changed. Public apps must run and store their assets on public systems.
We did not specify the deploymentSystem where the public app assets should be stored, so Agave placed them on the default public storage system, public.storage.agave.
We did not specify the deploymentPath where the public app assets should be stored, so Agave placed them in the publicAppsDir of the deploymentPath.
The deploymentPath is now a zip archive rather than a folder. Agave does this because, once published, the app can no longer be updated, so the assets are frozen and stored in a separate location, removed from user access.
The id of the app has changed. It now has a u1 appended to the original app id. This indicates that it is a public app and that it has been updated a single time. If we were to publish the app again, the resulting id would be wc-osg-1.00u2. This differs from unpublished apps whose revision number increments without impacting the app id. Every time you publish an app, the id of the resulting public app will change.
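
The naming pattern seen in the response can be expressed as a pair of small Python helpers. Both function names are hypothetical, and the pattern is inferred from the example ids in this section rather than taken from a formal specification.

```python
import re


def published_app_id(name, version, publication_count):
    """Return the public id: name-version followed by u<count>."""
    if publication_count < 1:
        raise ValueError("publication_count starts at 1")
    return f"{name}-{version}u{publication_count}"


def parse_published_app_id(app_id):
    """Split a public id into its base id and publication count."""
    match = re.fullmatch(r"(.+)u(\d+)", app_id)
    if match is None:
        raise ValueError(f"{app_id!r} does not look like a published app id")
    return match.group(1), int(match.group(2))
```

Publishing wc-osg version 1.00 for the first time gives wc-osg-1.00u1, and a second publication gives wc-osg-1.00u2.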

Unpublishing an app

Unpublishing a public app is equivalent to disabling it.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X PUT \
     --data-binary '{"action":"disable"}' \
     https://sandbox.agaveplatform.org/apps/v2/wc-osg-1.00u1

apps-disable -v wc-osg-1.00u1

The response will look identical to before, but with available set to false.

Unlike systems, it is not possible to unpublish an app. Once published, a deep copy of the app is stored in an external location with its own provenance trail. If you would like to remove a published app from further use, simply disable it.

Cloning an app

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -X POST -d "action=clone" \
     -d "name=my-pyplot-demo" \
     -d "version=2.2" \
     -d "deploymentSystem=sftp.storage.example.com" \
     -d "deploymentPath=/apps/" \
     "https://sandbox.agaveplatform.org/apps/v2/demo-pyplot-demo-advanced-0.1.0?pretty=true"

apps-clone -N my-pyplot-demo -V 2.2 demo-pyplot-demo-advanced-0.1.0

Often you will want to copy an existing app for use on another system, or simply to obtain a private copy of the app for your own use. This can be done using the clone functionality in the Apps service. The examples above show how to do this using the unix curl command as well as with the Agave CLI.

Disabling an app

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -X PUT -d action=disable \
     https://sandbox.agaveplatform.org/apps/v2/my-pyplot-demo-2.2

apps-disable -v my-pyplot-demo-2.2

Disabling an app makes it unavailable for use. This means new job requests will fail due to the app being disabled. Existing jobs queued up to run will be held until the app becomes available. Running jobs will continue as normal, but any retries will be held until the app is re-enabled.

To disable an app, make a PUT request on the app’s URL with action=disable as the body.

Enabling an app

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -X PUT -d action=enable \
     https://sandbox.agaveplatform.org/apps/v2/my-pyplot-demo-2.2

apps-enable -v my-pyplot-demo-2.2

Enabling an app instantly returns it to service. Any pending jobs will immediately start processing according to the queuing policy and quotas in place when the app is enabled.

To enable an app, make a PUT request on the app URL with action=enable as the body.

Deleting an app

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -X DELETE \
     https://sandbox.agaveplatform.org/apps/v2/my-pyplot-demo-2.2

apps-delete -v my-pyplot-demo-2.2

Deleting an app is done by making an HTTP DELETE request on the app’s URL. Note that deleting an app does not make its id available for reuse.

App history

A full history of changes to an app’s definition, permissions, and availability is recorded for every app. The recorded history events represent a subset of the events thrown by the Apps API. Generally speaking, the events saved in an app’s history represent mutations to the app’s definition and state, not its assets. For further details about history associated with file items, see the section on File history.

Direct vs indirect events

Agave will record all the direct events related to an app. Examples of direct events are enabling and updating an app. Indirect events such as submitting a job, or deleting a system to which the app is associated, will not be recorded.

Publishing history

App is published for the first time

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X PUT --data-binary '{"action":"publish","name":"demo-pyplot-demo"}' \
    https://sandbox.agaveplatform.org/apps/v2/demo-pyplot-demo-basic-0.1.0

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X PUT --data-binary '{"action":"publish","name":"demo-pyplot-demo"}' \
    https://sandbox.agaveplatform.org/apps/v2/demo-pyplot-demo-intermediate-0.1.0

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -X PUT --data-binary '{"action":"publish","name":"demo-pyplot-demo"}' \
    https://sandbox.agaveplatform.org/apps/v2/demo-pyplot-demo-advanced-0.1.0

apps-publish -v -N demo-pyplot-demo demo-pyplot-demo-basic-0.1.0
apps-publish -v -N demo-pyplot-demo demo-pyplot-demo-intermediate-0.1.0
apps-publish -v -N demo-pyplot-demo demo-pyplot-demo-advanced-0.1.0

The following (abbreviated) entries are written to the app histories.

Original app demo-pyplot-demo-basic-0.1.0

[
  {
      "status": "PUBLISHED",
      "description": "App was published by nryan as demo-pyplot-demo-0.1.0u1. The published asset checksum is 6aea5f80fa7c1af9a945fc5d1cfc8c95",
      ...
   }
]

Original app demo-pyplot-demo-intermediate-0.1.0

[
  {
      "status": "PUBLISHED",
      "description": "App was published by nryan as demo-pyplot-demo-0.1.0u2. The published asset checksum is 9b7f72f279ad41a993fe9b1eaca87e3a",
      ...
   }
]

Original app demo-pyplot-demo-advanced-0.1.0

[
  {
      "status": "PUBLISHED",
      "description": "App was published by nryan as demo-pyplot-demo-0.1.0u3. The published asset checksum is f4193325b37b879e7218dcd81c81c614",
      ...
   }
]

Published app demo-pyplot-demo-0.1.0u1

[
  {
    "status": "CREATED",
    "description": "App was created by nryan as a result of publishing demo-pyplot-demo-basic-0.1.0. The asset checksum is 6aea5f80fa7c1af9a945fc5d1cfc8c95",
    ...
  },
  {
    "status": "REPUBLISHED",
    "description": "A new version of this app, demo-pyplot-demo-0.1.0u2, was created by nryan as a result of publishing demo-pyplot-demo-intermediate-0.1.0. The published asset checksum is 9b7f72f279ad41a993fe9b1eaca87e3a",
    ...
  },
  {
    "status": "REPUBLISHED",
    "description": "A new version of this app, demo-pyplot-demo-0.1.0u3, was created by nryan as a result of publishing demo-pyplot-demo-advanced-0.1.0. The published asset checksum is f4193325b37b879e7218dcd81c81c614",
    ...
  }
]

Published app demo-pyplot-demo-0.1.0u2

[
  {
    "status": "CREATED",
    "description": "App was created by nryan as a result of publishing demo-pyplot-demo-intermediate-0.1.0. The asset checksum is 9b7f72f279ad41a993fe9b1eaca87e3a",
    ...
  },
  {
    "status": "REPUBLISHED",
    "description": "A new version of this app, demo-pyplot-demo-0.1.0u3, was created by nryan as a result of publishing demo-pyplot-demo-advanced-0.1.0. The published asset checksum is f4193325b37b879e7218dcd81c81c614",
    ...
  } 
]

Published app demo-pyplot-demo-0.1.0u3

[
  {
    "status": "CREATED",
    "description": "App was created by nryan as a result of publishing demo-pyplot-demo-advanced-0.1.0. The asset checksum is f4193325b37b879e7218dcd81c81c614",
    ...
   }
]

When publishing an app, the published name and the original name may not be the same. Thus, there may be multiple apps from which a public app was created over time. Further, since every app publication results in a new public app id, there may not be any outwardly apparent relationship between the current and previous versions of a published app. To help track the changes of a published app over time, Agave will propagate and record REPUBLISHED events through the entire ancestry of a published app. The pyplot example above illustrates this behavior.

We see that republishing an app records the change in the history of every app in the published app’s ancestry. This makes it easy to track down changes that occur both in the past and future of a given published app.
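The propagation described here can be modelled in a few lines. The sketch below is an illustration only, not Agave's implementation: the `publish` function, the in-memory `history` and `ancestors` structures, and the id `demo-pyplot-demo-0.1.0u1` for the first published app are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical in-memory model: app id -> list of (status, description) events.
history = defaultdict(list)
# Hypothetical ancestry record: published app id -> earlier published apps it descends from.
ancestors = {}


def publish(source_id, published_id, previous_id=None):
    """Record a publication and propagate REPUBLISHED to every ancestor."""
    lineage = []
    if previous_id is not None:
        # The new app inherits the previous app's lineage plus that app itself.
        lineage = ancestors.get(previous_id, []) + [previous_id]
    ancestors[published_id] = lineage
    history[published_id].append(("CREATED", f"published from {source_id}"))
    for app_id in lineage:
        history[app_id].append(
            ("REPUBLISHED", f"new version {published_id} published from {source_id}")
        )


# "demo-pyplot-demo-0.1.0u1" is an assumed id for the first published version.
publish("demo-pyplot-demo-basic-0.1.0", "demo-pyplot-demo-0.1.0u1")
publish("demo-pyplot-demo-intermediate-0.1.0", "demo-pyplot-demo-0.1.0u2",
        previous_id="demo-pyplot-demo-0.1.0u1")
publish("demo-pyplot-demo-advanced-0.1.0", "demo-pyplot-demo-0.1.0u3",
        previous_id="demo-pyplot-demo-0.1.0u2")

for app_id, events in history.items():
    print(app_id, [status for status, _ in events])
```

Each publication adds one CREATED event to the new app and one REPUBLISHED event to every earlier app in its lineage, which mirrors the three history listings in the example.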

Listing app history

List the history of an app

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    https://sandbox.agaveplatform.org/apps/v2/history/demo-pyplot-demo-advanced-0.1.0

app-history -v demo-pyplot-demo-advanced-0.1.0

The response contains a summary listing of all recorded events in the app history.

[
  {
    "status": "DOWNLOAD",
    "created": "2016-09-20T19:47:56.000-05:00",
    "createdBy": "public",
    "description": "File was downloaded"
  },
  {
    "status": "STAGING_QUEUED",
    "created": "2016-09-20T19:48:12.000-05:00",
    "createdBy": "nryan",
    "description": "File/folder queued for staging"
  },
  {
    "status": "STAGING_COMPLETED",
    "created": "2016-09-20T19:48:16.000-05:00",
    "createdBy": "nryan",
    "description": "Staging completed successfully"
  },
  {
    "status": "TRANSFORMING_COMPLETED",
    "created": "2016-09-20T19:48:17.000-05:00",
    "createdBy": "nryan",
    "description": "Your scheduled transfer of http://129.114.97.92/picksumipsum.txt completed staging. You can access the raw file on iPlant Data Store at /home/nryan/picksumipsum.txt or via the API at https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org//nryan/picksumipsum.txt."
  }
]

Basic paginated listing of app history events is available as shown in the example. Currently, the app history service is read-only. The only way to erase the history of an app is to delete the app through the API.

Jobs

    /$$$$$         /$$
   |__  $$        | $$
      | $$ /$$$$$$| $$$$$$$  /$$$$$$$
      | $$/$$__  $| $$__  $$/$$_____/
 /$$  | $| $$  \ $| $$  \ $|  $$$$$$
| $$  | $| $$  | $| $$  | $$\____  $$
|  $$$$$$|  $$$$$$| $$$$$$$//$$$$$$$/
 \______/ \______/|_______/|_______/

The Jobs service is a basic execution service that allows you to run applications registered with the Apps service across multiple, distributed, heterogeneous systems through a common REST interface. The service manages all aspects of execution and job management from data staging, job submission, monitoring, output archiving, event logging, sharing, and notifications. The Jobs service also provides a persistent reference to your job’s output data and a mechanism for sharing all aspects of your job with others. Each feature will be described in more detail below.

Job submission

Job submission is a term recycled from shared batch computing environments where a user would submit a request for a unit of computational work (called a Job) to the batch scheduler, then head home for dinner while waiting for the computer to complete the job they gave it.

Originally the batch scheduler was a person and the term batch came from their ability to process several submissions together. Later on, as human schedulers were replaced by software, the term stuck even though the process remained unchanged. Today the term job submission is essentially unchanged.

A user submits a request for a unit of work to be done. The primary difference is that today the wait time between submission and execution is often considerably shorter. On shared systems, such as many of the HPC systems originally targeted by Agave, waiting for your job to start is the price you pay for the incredible performance you get once your job starts.

Agave, too, adopts the concept of job submission, though it is not in and of itself a scheduler. In the context of Agave’s Job service, the process of running an application registered with the Apps service is referred to as submitting a job.

Unlike in the batch scheduling world where each scheduler has its own job submission syntax and its own idiosyncrasies, the mechanism for submitting a job to Agave is consistent regardless of the application or system on which you run. An HTML form or a JSON object is posted to the Jobs service. The submission is validated, and the job is forwarded to the scheduling and execution services for processing.

Because Agave takes an app-centric view of science, execution does not require knowing about the underlying systems on which an application runs. Simply knowing the parameters and inputs you want to use when running an app is sufficient to define a job. Agave will handle the rest.

As mentioned previously, jobs are submitted by making an HTTP POST request with either an HTML form or a JSON object to the Jobs service. All job submissions must include a few mandatory values that are used to define a basic unit of work. Table 1 lists the optional and required attributes of all job submissions.

Name	Value(s)	Description
name	string	Descriptive name of the job. This will be slugified and used as one component of directory names in certain situations.
appId	string	The unique name of the application being run by this job. This must be a valid application that the calling user has permission to run.
batchQueue	string	The batch queue on the execution system to which this job is submitted. Defaults to the app’s defaultQueue property if specified. Otherwise a best-fit algorithm is used to match the job parameters to a queue on the execution system with sufficient capabilities to run the job.
nodeCount	integer	The number of nodes to use when running this job. Defaults to the app’s defaultNodes property or 1 if no default is specified.
processorsPerNode	integer	The number of processors this application should utilize while running. Defaults to the app’s defaultProcessorsPerNode property or 1 if no default is specified. If the application is not of executionType PARALLEL, this should be 1.
memoryPerNode	string	The maximum amount of memory needed per node for this application to run given in ####.#[E|P|T|G]B format. Defaults to the app’s defaultMemoryPerNode property if it exists. GB are assumed if no magnitude is specified.
maxRunTime	string	The estimated compute time needed for this application to complete given in hh:mm:ss format. This value must be less than or equal to the max run time of the queue to which this job is assigned.
notifications*	JSON array	An array of one or more JSON objects describing an event and url which the service will POST to when the given event occurs. For more on Notifications, see the section on webhooks below.
archive*	boolean	Whether the output from this job should be archived. If true, all new files created by this application’s execution will be archived to the archivePath in the user’s default storage system.
archiveSystem*	string	System to which the job output should be archived. Defaults to the user’s default storage system if not specified.
archivePath*	string	Location where the job output should be archived. A relative path or absolute path may be specified. If not specified, a unique folder will be created in the user’s home directory of the archiveSystem at ‘archive/jobs/job-$JOB_ID’

Table 1. The optional and required attributes common to all job submissions. Optional fields are marked with an asterisk.
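The memoryPerNode and maxRunTime formats in Table 1 are easy to check on the client before submitting. The following is a minimal sketch, not the service's own validation: the function names are hypothetical, and the use of binary multiples (1 TB = 1024 GB) is an assumption, since the table does not state which base applies.

```python
import re

# Assumption: binary multiples (1 TB = 1024 GB); Table 1 does not say which base is used.
_GB_PER_UNIT = {"G": 1, "T": 1024, "P": 1024 ** 2, "E": 1024 ** 3}
_MEMORY = re.compile(r"^(\d+(?:\.\d+)?)\s*(?:([EPTG])B)?$", re.IGNORECASE)
_RUNTIME = re.compile(r"^(\d+):([0-5]\d):([0-5]\d)$")


def parse_memory_gb(value):
    """Convert a memoryPerNode string such as '2GB' or '1.5TB' to gigabytes."""
    match = _MEMORY.match(value.strip())
    if match is None:
        raise ValueError(f"invalid memoryPerNode value: {value!r}")
    amount, unit = match.groups()
    # GB are assumed when no magnitude is given, as stated in Table 1.
    return float(amount) * _GB_PER_UNIT[(unit or "G").upper()]


def parse_runtime_seconds(value):
    """Convert a maxRunTime string in hh:mm:ss format to seconds."""
    match = _RUNTIME.match(value.strip())
    if match is None:
        raise ValueError(f"invalid maxRunTime value: {value!r}")
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def fits_queue(max_run_time, queue_max_run_time):
    """True when the requested run time does not exceed the queue's limit."""
    return parse_runtime_seconds(max_run_time) <= parse_runtime_seconds(queue_max_run_time)


print(parse_memory_gb("1.5TB"), parse_runtime_seconds("01:00:00"))
```

Checking `fits_queue` before submission catches the one constraint Table 1 states explicitly: maxRunTime must be less than or equal to the max run time of the assigned queue.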

In addition to the standard fields for all jobs, the application you specify in the appId field will also have its own set of inputs and parameters specified during registration that are unique to that app. (For more information about app registration and descriptions, see the App Management Tutorial).

The following snippet shows a sample JSON job request that could be submitted to the Jobs service to run the pyplot-0.1.0 app from the Advanced App Example tutorial.

{
  "name":"pyplot-demo test",
  "appId":"demo-pyplot-demo-advanced-0.1.0",
  "inputs":{
    "dataset":[
      "agave://$PUBLIC_STORAGE_SYSTEM/$API_USERNAME/inputs/pyplot/testdata.csv",
      "agave://$PUBLIC_STORAGE_SYSTEM/$API_USERNAME/inputs/pyplot/testdata2.csv"
    ]
  },
  "archive":false,
  "parameters":{
    "unpackInputs":false,
    "chartType":[
      "bar",
      "line"
    ],
    "width":1024,
    "height":512,
    "background":"#d96727",
    "showYLabel":true,
    "ylabel":"The Y Axis Label",
    "showXLabel":true,
    "xlabel":"The X Axis Label",
    "showLegend":true,
    "separateCharts":false
  },
  "notifications":[
    {
      "url":"$API_EMAIL",
      "event":"RUNNING"
    },
    {
      "url":"$API_EMAIL",
      "event":"FINISHED"
    },
    {
      "url":"http://requestbin.agaveplatform.org/o1aiawo1?job_id=${JOB_ID}&status=${JOB_STATUS}",
      "event":"*",
      "persistent":true
    }
  ]
}

Notice that this example specifies a single input attribute, dataset. The pyplot-0.1.0 app definition specified that the dataset input attribute could accept more than one value (maxCardinality = 2). In the job request object, that translates to an array of string values. Each string represents a piece of data that Agave will transfer into the job work directory prior to job execution. Any value accepted by the Files service when importing data is accepted here. Some examples of valid values are given in the following table.

Name	Description
inputs/pyplot/testdata.csv	A relative path on the user’s default storage system.
/home/apiuser/inputs/pyplot/testdata.csv	An absolute path on the user’s default storage system.
agave://$PUBLIC_STORAGE_SYSTEM/$API_USERNAME/inputs/pyplot/testdata.csv	An Agave URL explicitly specifying a source system and relative path.
agave://$PUBLIC_STORAGE_SYSTEM//home/apiuser/$API_USERNAME/inputs/pyplot/testdata.csv	An Agave URL explicitly specifying a source system and absolute path.
http://example.com/inputs/pyplot/testdata.csv	Standard url with any supported transfer protocol.

Table 2. Examples of the different syntaxes in which input values can be specified in the job request object. Here we assume that the validator for the input field is such that these would pass.
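To make the distinctions in Table 2 concrete, the sketch below labels an input value by the syntax it uses. It is illustrative only: `classify_input` and its return labels are hypothetical names, and the rule that a doubled slash after the system id marks an absolute path is inferred from the examples in the table.

```python
from urllib.parse import urlparse


def classify_input(value):
    """Label a job input value according to the syntaxes listed in Table 2."""
    parsed = urlparse(value)
    if parsed.scheme == "agave":
        # agave://<system>//abs/path keeps a doubled slash at the start of the path.
        return "agave-absolute" if parsed.path.startswith("//") else "agave-relative"
    if parsed.scheme:
        # Any other scheme is treated as a standard URL.
        return "url"
    return "absolute-path" if value.startswith("/") else "relative-path"


for example in (
    "inputs/pyplot/testdata.csv",
    "/home/apiuser/inputs/pyplot/testdata.csv",
    "agave://data.agaveplatform.org/nryan/inputs/pyplot/testdata.csv",
    "http://example.com/inputs/pyplot/testdata.csv",
):
    print(classify_input(example), example)
```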

The example job request also specifies a parameters object with the parameters defined in the pyplot-0.1.0 app description. Notice that the parameter type value specified in the app description is reflected here. Numbers are given as numbers, not strings. Boolean and flag attributes are given as boolean true and false values. As with the input section, there is also a parameter, chartType, that accepts multiple values. In this case that translates to an array of string values. Had the parameter type required another primitive type, that would be used in the array instead.

Finally, we see a notifications array specifying that we want Agave to send three notifications related to this job. The first is a one-time email when the job starts running. The second is a one-time email when the job reaches a terminal state. The third is a webhook to the URL we specified. More on notifications in the section on monitoring below.
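A job request like the one above can also be assembled programmatically. The sketch below is one way to do it, under stated assumptions: the helper names `notification` and `build_job_request` are hypothetical, the email address is a placeholder, and only a subset of the sample's parameters is shown.

```python
import json


def notification(url, event, persistent=False):
    """Build one notification object; url may be an email address or a webhook URL."""
    entry = {"url": url, "event": event}
    if persistent:
        entry["persistent"] = True
    return entry


def build_job_request(name, app_id, inputs, parameters, notifications, archive=False):
    """Assemble a job request dictionary ready to be serialized as JSON."""
    return {
        "name": name,
        "appId": app_id,
        "inputs": inputs,
        "archive": archive,
        "parameters": parameters,
        "notifications": notifications,
    }


request = build_job_request(
    name="pyplot-demo test",
    app_id="demo-pyplot-demo-advanced-0.1.0",
    inputs={"dataset": ["inputs/pyplot/testdata.csv"]},
    # Numbers and booleans keep their native JSON types; multi-valued parameters are arrays.
    parameters={"width": 1024, "height": 512, "showLegend": True, "chartType": ["bar", "line"]},
    notifications=[
        notification("nryan@example.com", "RUNNING"),  # hypothetical address
        notification("nryan@example.com", "FINISHED"),
        notification("http://example.com/?job_id=${JOB_ID}&status=${JOB_STATUS}", "*", persistent=True),
    ],
)
body = json.dumps(request, indent=2)
print(body)
```

Building the request as a dictionary and serializing it with `json.dumps` keeps numbers and booleans typed correctly, which is what the parameters section above asks for.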

Job submission validation

Submit the job request to the Jobs service using curl or the CLI, as in the commands below. If everything went well, you will receive a response that looks something like the JSON object that follows.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST -F "fileToUpload=@job.json" "https://sandbox.agaveplatform.org/jobs/v2?pretty=true"

jobs-submit -F job.json

{
  "status" : "success",
  "message" : null,
  "version" : "2.1.0-r6d11c",
  "result" : {
    "id" : "0001414144065563-5056a550b8-0001-007",
    "name" : "demo-pyplot-demo-advanced test-1414139896",
    "owner" : "$API_USERNAME",
    "appId" : "demo-pyplot-demo-advanced-0.1.0",
    "executionSystem" : "$PUBLIC_EXECUTION_SYSTEM",
    "batchQueue" : "debug",
    "nodeCount" : 1,
    "processorsPerNode" : 1,
    "memoryPerNode" : 1.0,
    "maxRunTime" : "01:00:00",
    "archive" : false,
    "retries" : 0,
    "localId" : "10321",
    "outputPath" : null,
    "status" : "FINISHED",
    "submitTime" : "2014-10-24T04:48:11.000-05:00",
    "startTime" : "2014-10-24T04:48:08.000-05:00",
    "endTime" : "2014-10-24T04:48:15.000-05:00",
    "inputs" : {
      "dataset" : "agave://$PUBLIC_STORAGE_SYSTEM/$API_USERNAME/inputs/pyplot/testdata.csv"
    },
    "parameters" : {
      "chartType" : "bar",
      "height" : "512",
      "showLegend" : "false",
      "xlabel" : "Time",
      "background" : "#FFF",
      "width" : "1024",
      "showXLabel" : "true",
      "separateCharts" : "false",
      "unpackInputs" : "false",
      "ylabel" : "Magnitude",
      "showYLabel" : "true"
    },
    "_links" : {
      "self" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007"
      },
      "app" : {
        "href" : "https://sandbox.agaveplatform.org/apps/v2/demo-pyplot-demo-advanced-0.1.0"
      },
      "executionSystem" : {
        "href" : "https://sandbox.agaveplatform.org/systems/v2/$PUBLIC_EXECUTION_SYSTEM"
      },
      "archiveData" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/outputs/listings"
      },
      "owner" : {
        "href" : "https://sandbox.agaveplatform.org/profiles/v2/$API_USERNAME"
      },
      "permissions" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/pems"
      },
      "history" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/history"
      },
      "metadata" : {
        "href" : "https://sandbox.agaveplatform.org/meta/v2/data/?q={\"associationIds\":\"0001414144065563-5056a550b8-0001-007\"}"
      },
      "notifications" : {
        "href" : "https://sandbox.agaveplatform.org/notifications/v2/?associatedUuid=0001414144065563-5056a550b8-0001-007"
      }
    }
  }
}
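Client code usually needs only a few fields from this response. The following sketch shows one way to extract them; `job_summary` is a hypothetical helper, and treating any top-level status other than "success" as a failure is an assumption, since only a successful response is shown here.

```python
def job_summary(response):
    """Pull the job id, status, and history link out of a Jobs service response."""
    # Assumption: any top-level status other than "success" signals a rejected request.
    if response.get("status") != "success":
        raise RuntimeError(response.get("message") or "job submission failed")
    result = response["result"]
    return {
        "id": result["id"],
        "status": result["status"],
        "history": result["_links"]["history"]["href"],
    }


# Abbreviated copy of the sample response above.
sample = {
    "status": "success",
    "message": None,
    "result": {
        "id": "0001414144065563-5056a550b8-0001-007",
        "status": "FINISHED",
        "_links": {
            "history": {
                "href": "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/history"
            }
        },
    },
}
print(job_summary(sample))
```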

Job monitoring

Once you submit your job request, the job will be handed off to Agave’s back end execution service. Your job may run right away, or it may wait in a batch queue on the execution system until the required resources are available. Either way, the execution process occurs completely asynchronously from the submission process. To monitor the status of your job, Agave supports two different mechanisms: polling and webhooks.

Polling

If you have ever taken a long road trip with children, you are probably painfully aware of how polling works. Starting several minutes from the time you leave the house, a child asks, “Are we there yet?” You reply, “No.” Several minutes later the child again asks, “Are we there yet?” You again reply, “No.” This process continues until you finally arrive at your destination. This is called polling, and polling is bad.

Polling for your job status works the same way. After submitting your job, you start a while loop that queries the Jobs service for your job status until it detects that the job is in a terminal state. The following three URLs all return the status of your job. The first will result in a list of abbreviated job descriptions. The second will result in a full description of the job with the given $JOB_ID, exactly like that returned when submitting the job. The third will result in a much smaller response object that contains only the $JOB_ID and status.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/jobs/v2
curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007
curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/status

Sample response snippet

{
  "status" : "success",
  "message" : null,
  "version" : "2.1.0-r6d11c",
  "result" : {
    "id" : "0001414144065563-5056a550b8-0001-007",
    "status" : "FINISHED",
    "_links" : {
      "self" : {
        "href" : "$API_BASE_URL/jobs/v2/0001414144065563-5056a550b8-0001-007"
      }
    }
  }
}

The list of all possible job statuses is given in Table 3.

Event	Description
CREATED	The job was created
UPDATED	The job was updated
DELETED	The job was deleted
PERMISSION_GRANT	User permission was granted
PERMISSION_REVOKE	Permission was removed for a user on this job
PENDING	Job accepted and queued for submission.
STAGING_INPUTS	Transferring job input data to execution system
CLEANING_UP	Job completed execution
ARCHIVING	Transferring job output to archive system
STAGING_JOB	Job inputs staged to execution system
FINISHED	Job complete
KILLED	Job execution killed at user request
FAILED	Job failed
STOPPED	Job execution intentionally stopped
RUNNING	Job started running
PAUSED	Job execution paused by user
QUEUED	Job successfully placed into queue
SUBMITTING	Preparing job for execution and staging binaries to execution system
STAGED	Job inputs staged to execution system
PROCESSING_INPUTS	Identifying input files for staging
ARCHIVING_FINISHED	Job archiving complete
ARCHIVING_FAILED	Job archiving failed
HEARTBEAT	Job heartbeat received

Table 3. Job events and statuses reported by the Jobs service.

Polling is an incredibly effective approach, but it is bad practice for two reasons. First, it does not scale well. Querying for one job status every few seconds does not take much effort, but querying for 100 takes quite a bit of time and puts unnecessary load on Agave’s servers. Second, polling provides what is effectively a binary response. It tells you whether a job is done or not done; it does not give you any information on what is actually going on with the job or where it is in the overall execution process.

The job history URL provides much more detailed information on the various state changes, system messages, and progress information associated with data staging. The syntax of the job history URL is as follows:

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/history

Sample response snippet

{
  "status":"success",
  "message":null,
  "version":"2.1.0-r6d11c",
  "result":[
    {
      "created":"2014-10-24T04:47:45.000-05:00",
      "status":"PENDING",
      "description":"Job accepted and queued for submission."
    },
    {
      "created":"2014-10-24T04:47:47.000-05:00",
      "status":"PROCESSING_INPUTS",
      "description":"Attempt 1 to stage job inputs"
    },
    {
      "created":"2014-10-24T04:47:47.000-05:00",
      "status":"PROCESSING_INPUTS",
      "description":"Identifying input files for staging"
    },
    {
      "created":"2014-10-24T04:47:48.000-05:00",
      "status":"STAGING_INPUTS",
      "description":"Staging agave://$PUBLIC_STORAGE_SYSTEM/$API_USERNAME/inputs/pyplot/testdata.csv to remote job directory"
    },
    {
      "progress":{
        "averageRate":0,
        "totalFiles":1,
        "source":"agave://$PUBLIC_STORAGE_SYSTEM/$API_USERNAME/inputs/pyplot/testdata.csv",
        "totalActiveTransfers":0,
        "totalBytes":3212,
        "totalBytesTransferred":3212
      },
      "created":"2014-10-24T04:47:48.000-05:00",
      "status":"STAGING_INPUTS",
      "description":"Copy in progress"
    },
    {
      "created":"2014-10-24T04:47:50.000-05:00",
      "status":"STAGED",
      "description":"Job inputs staged to execution system"
    },
    {
      "created":"2014-10-24T04:47:55.000-05:00",
      "status":"SUBMITTING",
      "description":"Preparing job for submission."
    },
    {
      "created":"2014-10-24T04:47:55.000-05:00",
      "status":"SUBMITTING",
      "description":"Attempt 1 to submit job"
    },
    {
      "created":"2014-10-24T04:48:08.000-05:00",
      "status":"RUNNING",
      "description":"Job started running"
    },
    {
      "created":"2014-10-24T04:48:12.000-05:00",
      "status":"CLEANING_UP"
    },
    {
      "created":"2014-10-24T04:48:15.000-05:00",
      "status":"FINISHED",
      "description":"Job completed. Skipping archiving at user request."
    }
  ]
}
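One practical use of the history response is to see how long a job spent in each state. The sketch below is a hypothetical helper, not part of any Agave client: it assumes the events arrive in chronological order, as they do in the sample, and measures the gap between consecutive events. The event list is abbreviated from the sample response.

```python
from datetime import datetime


def time_in_each_status(events):
    """Return (status, seconds) pairs for the gap between consecutive history events."""
    stamps = [(event["status"], datetime.fromisoformat(event["created"])) for event in events]
    durations = []
    for (status, start), (_, end) in zip(stamps, stamps[1:]):
        durations.append((status, (end - start).total_seconds()))
    return durations


# Abbreviated from the "result" array of the sample history response.
events = [
    {"created": "2014-10-24T04:47:45.000-05:00", "status": "PENDING"},
    {"created": "2014-10-24T04:47:50.000-05:00", "status": "STAGED"},
    {"created": "2014-10-24T04:48:08.000-05:00", "status": "RUNNING"},
    {"created": "2014-10-24T04:48:15.000-05:00", "status": "FINISHED"},
]
for status, seconds in time_in_each_status(events):
    print(f"{status}: {seconds:.0f}s")
```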

Depending on the nature of your job and the reliability of the underlying systems, the response from this service can grow rather large, so it is important to be aware that this query can be an expensive call for your client application to make. Everything we said before about polling job status applies to polling job history with the additional caveat that you can chew through quite a bit of bandwidth polling this service, so keep that in mind if your application is bandwidth starved.

Often, however, polling is unavoidable. In these situations, we recommend using an exponential backoff to check job status. An exponential backoff is an algorithm that increases the time between retries as the number of failures increases.
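A minimal sketch of polling with exponential backoff follows. It makes several assumptions that should be adjusted to your own client: `fetch_status` is a caller-supplied function that returns the current status string (for example by calling the job status URL), the set of terminal statuses is a guess based on the status table, and the initial delay, factor, and cap are arbitrary defaults. Here a "failure" is any check that finds the job not yet finished.

```python
import time

# Assumption: these statuses are treated as terminal for polling purposes.
TERMINAL = {"FINISHED", "KILLED", "FAILED", "STOPPED"}


def backoff_delays(initial=1.0, factor=2.0, maximum=60.0):
    """Yield an endless series of delays that grow geometrically up to a cap."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, maximum)


def poll_until_terminal(fetch_status, max_attempts=10, sleep=time.sleep, **backoff):
    """Call fetch_status() until it reports a terminal status, waiting longer each time."""
    for _, delay in zip(range(max_attempts), backoff_delays(**backoff)):
        status = fetch_status()
        if status in TERMINAL:
            return status
        # The job is still in progress, so wait before asking again.
        sleep(delay)
    raise TimeoutError(f"job not in a terminal state after {max_attempts} checks")
```

Passing `sleep` as a parameter makes the loop easy to test without waiting, and the cap keeps a long-running job from being checked less than once a minute.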

Webhooks

Webhooks are the alternative, preferred way for your application to monitor the status of asynchronous actions in Agave. If you are a Gang of Four disciple, webhooks are a mechanism for implementing the Observer Pattern. They are widely used across the web and chances are that something you’re using right now is leveraging them. In the context of Agave, a webhook is a URL that you give to Agave in advance of an event which it later POSTs a response to when that event occurs. A webhook can be any web accessible URL.

The Jobs service provides several template variables for constructing dynamic URLs. Template variables can be included anywhere in your URL by surrounding the variable name in the following manner ${VARIABLE_NAME}. When an event of interest occurs, the variables will be resolved and the resulting URL called. Several example urls are given below.

http://example.com/?job_id=${JOB_ID}&job_status=${EVENT}

http://example.com/trigger/job/${JOB_NAME}/${EVENT}

http://example.com/webhooks/?nonce=sdfkajerouiwe234289fahlkqr&id=${JOB_ID}&status=${EVENT}&start=${JOB_START_TIME}&end=${JOB_END_TIME}&url=${JOB_ARCHIVE_URL}

The full list of template variables is given in the following table.

Variable	Description
UUID	The UUID of the job
EVENT	The name of the event which occurred.
OWNER	The username of the user who triggered the event.
JOB_STATUS	The status of the job at the time the event occurs
JOB_URL	The url of the job within the API
JOB_ID	The unique id used to reference the job within Agave.
JOB_SYSTEM	ID of the job execution system (ex. ssh.execute.example.com)
JOB_NAME	The user-supplied name of the job
JOB_START_TIME	The time when the job started running in ISO8601 format.
JOB_END_TIME	The time when the job stopped running in ISO8601 format.
JOB_SUBMIT_TIME	The time when the job was submitted to Agave for execution by the user in ISO8601 format.
JOB_ARCHIVE_PATH	The path on the archive system where the job output will be staged.
JOB_ARCHIVE_URL	The Agave URL for the archived data.
JOB_ERROR	The error message explaining why a job failed. Null if completed successfully.
JOB_APP_ID	The id of the app being run.
JOB_BATCH_QUEUE	The batch queue of the JOB_EXECUTION_SYSTEM on which the job is assigned.
JOB_CREATED	The time when the job request was initially received in ISO8601 format.
JOB_EXECUTION_SYSTEM	The agave execution system id on which the job will run.
JOB_INPUTS	The serialized JSON object representing the job inputs.
JOB_LOCAL_ID	The id of the job on the JOB_EXECUTION_SYSTEM. This will be the id assigned by the batch scheduler or condor schedd, or the system PID, depending on the system scheduler type.
JOB_MAX_RUNTIME	The max job run time from the job request in HH:MM:SS format.
JOB_MAX_RUNTIME_MILLISECONDS	The max job run time from the job request converted to milliseconds.
JOB_MAX_RUNTIME_SECONDS	The max job run time from the job request converted to seconds.
JOB_MEMORY_PER_NODE	The memory requested per node in the job request in GB.
JOB_NODE_COUNT	The number of nodes from the job request.
JOB_OWNER	The username of the user who submitted the job request.
JOB_OUTPUT_PATH	The absolute path to the job directory on the remote system.
JOB_PARAMETERS	The serialized JSON object representing the job parameters.
JOB_PROCESSORS_PER_NODE	The processors per node from the job request.
JOB_STATUS	The current job status.
JOB_START_TIME	The time when the job moved to a “RUNNING” status in ISO8601 format.
JOB_TENANT	The code of the tenant to which the job was submitted.
JOB_URL	The canonical Agave URL of the job.
JOB_ARCHIVE	Whether Agave will attempt to archive the job. Values “true” or “false”.
JOB_ARCHIVE_SYSTEM	The Agave storage system id to which the job output will be archived. This will be NULL if the job is not archived.
JOB_ERROR	The current debug or error message set for the job.
JOB_JSON	The serialized JSON object representing the job. This is identical to what would come back if you made a naked GET request on the job url.
RAW_JSON	The serialized JSON event payload. This is identical to what is sent in the body of a webhook POST callback.

Table 4. Template variables available for use when defining webhooks for your job.
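To see what a resolved webhook URL looks like before registering it, the substitution can be approximated locally. This is a sketch of the idea rather than the service's resolver: `resolve_template` is a hypothetical function, and both the URL-encoding of values and the choice to leave unknown variables untouched are assumptions.

```python
import re
from urllib.parse import quote

# Matches ${VARIABLE_NAME} placeholders such as ${JOB_ID} or ${EVENT}.
_TEMPLATE = re.compile(r"\$\{([A-Z_]+)\}")


def resolve_template(url, values):
    """Replace ${NAME} placeholders in url with URL-encoded entries from values."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            # Assumption: unknown variables are left as written.
            return match.group(0)
        return quote(str(values[name]), safe="")
    return _TEMPLATE.sub(substitute, url)


resolved = resolve_template(
    "http://example.com/?job_id=${JOB_ID}&job_status=${EVENT}",
    {"JOB_ID": "0001414144065563-5056a550b8-0001-007", "EVENT": "FINISHED"},
)
print(resolved)
```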

Email

In situations where you do not have a persistent web address, or access to a backend service, you may find it more convenient to subscribe to email notifications rather than providing a webhook. Agave supports email notifications as well. Simply specify a valid email address in the url field in your job submission notification object and an email will be sent to that address when a relevant event occurs. A sample email message is given below.

The status of job 0001414144065563-5056a550b8-0001-007, "demo-pyplot-demo-advanced test-1414139896," has changed to FINISHED.

Name: demo-pyplot-demo-advanced test-1414139896
URL: https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007
Message: Job completed successfully.
Submit Time: 2014-10-24T04:48:11.000-05:00
Start Time: 2014-10-24T04:48:08.000-05:00
End Time: 2014-10-24T04:48:15.000-05:00
Output Path: $API_USERNAME/archive/jobs/job-0001414144065563-5056a550b8-0001-007
Output URL: https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/outputs

Websockets

Websockets are a realtime approach to monitoring where your client application listens on a dedicated channel for notification messages from Agave. Simply subscribe to Agave’s websocket server (https://realtime.agaveplatform.org) and listen on the channel matching the job id.

/agave.prod/$API_USERNAME/$JOB_ID

Stopping

Once your job is submitted, you have the ability to stop the job. This will kill the job on the system on which it is running.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST -d "action=kill" https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID

jobs-stop $JOB_ID

{
  "id" : "$JOB_ID",
  "name" : "demo-pyplot-demo-advanced test-1414139896",
  "owner" : "$API_USERNAME",
  "appId" : "demo-pyplot-demo-advanced-0.1.0",
  "executionSystem" : "$PUBLIC_EXECUTION_SYSTEM",
  "batchQueue" : "debug",
  "nodeCount" : 1,
  "processorsPerNode" : 1,
  "memoryPerNode" : 1.0,
  "maxRunTime" : "01:00:00",
  "archive" : false,
  "retries" : 0,
  "localId" : "10321",
  "outputPath" : null,
  "status" : "STOPPED",
  "submitTime" : "2014-10-24T04:48:11.000-05:00",
  "startTime" : "2014-10-24T04:48:08.000-05:00",
  "endTime" : null,
  "inputs" : {
    "dataset" : "agave://$PUBLIC_STORAGE_SYSTEM/$API_USERNAME/inputs/pyplot/testdata.csv"
  },
  "parameters" : {
    "chartType" : "bar",
    "height" : "512",
    "showLegend" : "false",
    "xlabel" : "Time",
    "background" : "#FFF",
    "width" : "1024",
    "showXLabel" : "true",
    "separateCharts" : "false",
    "unpackInputs" : "false",
    "ylabel" : "Magnitude",
    "showYLabel" : "true"
  },
  "_links" : {
    "self" : {
      "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007"
    },
    "app" : {
      "href" : "https://sandbox.agaveplatform.org/apps/v2/demo-pyplot-demo-advanced-0.1.0"
    },
    "executionSystem" : {
      "href" : "https://sandbox.agaveplatform.org/systems/v2/$PUBLIC_EXECUTION_SYSTEM"
    },
    "archiveData" : {
      "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/outputs/listings"
    },
    "owner" : {
      "href" : "https://sandbox.agaveplatform.org/profiles/v2/$API_USERNAME"
    },
    "permissions" : {
      "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/pems"
    },
    "history" : {
      "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/history"
    },
    "metadata" : {
      "href" : "https://sandbox.agaveplatform.org/meta/v2/data/?q={\"associationIds\":\"0001414144065563-5056a550b8-0001-007\"}"
    },
    "notifications" : {
      "href" : "https://sandbox.agaveplatform.org/notifications/v2/?associatedUuid=0001414144065563-5056a550b8-0001-007"
    }
  }
}

Deleting

Deleting a job

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X DELETE https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID

jobs-delete $JOB_ID

Over time the number of jobs you have run can grow rather large. You can delete jobs to remove them from your listing results.

Resubmitting

Resubmitting a job

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST -d "action=resubmit" https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID

jobs-resubmit $JOB_ID

Often you will want to rerun a previous job as part of a pipeline, an automation, or a check that earlier results were valid. In this situation, it is convenient to use the resubmit feature of the Jobs service.

Resubmission provides you the options to enforce as much or as little rigor as you desire with respect to reproducibility in the job submission process. The following options are available to you for configuring a resubmission according to your requirements.

Field	Type	Description
ignoreInputConflicts	boolean	Whether to ignore discrepancies in the previous app inputs for the resubmitted job. If true, the resubmitted job will make a best-fit attempt at migrating the inputs.
ignoreParameterConflicts	boolean	Whether to ignore discrepancies in the previous app parameters for the resubmitted job. If true, the resubmitted job will make a best-fit attempt at migrating the parameters.
preserveNotifications	boolean	Whether to recreate the notification of the original job for the resubmitted job.
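The table does not show how these options are passed. The sketch below assumes they are sent as additional form fields alongside `action=resubmit`, in the same way the curl example sends the action; treat that as an assumption to verify against the service. `resubmit_body` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def resubmit_body(ignore_input_conflicts=False, ignore_parameter_conflicts=False,
                  preserve_notifications=False):
    """Build a form-encoded body for a resubmit request."""
    # Assumption: the options travel as form fields next to the action field.
    fields = {
        "action": "resubmit",
        "ignoreInputConflicts": str(ignore_input_conflicts).lower(),
        "ignoreParameterConflicts": str(ignore_parameter_conflicts).lower(),
        "preserveNotifications": str(preserve_notifications).lower(),
    }
    return urlencode(fields)


print(resubmit_body(ignore_input_conflicts=True, preserve_notifications=True))
```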

Outputs

Throughout the lifecycle of a job, your inputs, application assets, and outputs are copied from and shuffled between several different locations. Though it is possible in many instances to explicitly locate and view all the moving pieces of your job through the Files service, resolving where those pieces are given the status, execution system, storage systems, data protocols, login protocols, and execution mechanisms of your job at a given time is…challenging. It is important, however, that you have the ability to monitor your job’s output throughout the lifetime of the job.

To make tracking the output of a specific job easier, the Jobs service provides a special URL for referencing individual job outputs:

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID/outputs/listings/$PATH
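When building this URL in code, the path portion should be URL-encoded so that file names with spaces or other special characters are handled correctly. The helper below is a small sketch of that; `job_output_listing_url` is a hypothetical name, and the base URL is passed in rather than assumed.

```python
from urllib.parse import quote


def job_output_listing_url(base_url, job_id, path=""):
    """Build the output listing URL for a job, encoding the optional path."""
    url = f"{base_url.rstrip('/')}/jobs/v2/{quote(job_id, safe='')}/outputs/listings"
    if path:
        # Keep "/" separators intact while encoding everything else in the path.
        url += "/" + quote(path.lstrip("/"))
    return url


print(job_output_listing_url(
    "https://sandbox.agaveplatform.org",
    "0001414144065563-5056a550b8-0001-007",
    "/output",
))
```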

The syntax of this service is consistent with the Files service syntax, as is the JSON response from the service. The response would be similar to the following:

{
  "status" : "success",
  "message" : null,
  "version" : "2.1.0-r6d11c",
  "result" : [ {
    "name" : "output",
    "path" : "/output",
    "lastModified" : "2014-11-06T13:34:35.000-06:00",
    "length" : 0,
    "permission" : "NONE",
    "mimeType" : "text/directory",
    "format" : "folder",
    "type" : "dir",
    "_links" : {
      "self" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/outputs/media/output"
      },
      "system" : {
        "href" : "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
      },
      "parent" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007"
      }
    }
  }, {
    "name" : "demo-pyplot-demo-advanced-test-1414139896.err",
    "path" : "/demo-pyplot-demo-advanced-test-1414139896.err",
    "lastModified" : "2014-11-06T13:34:27.000-06:00",
    "length" : 442,
    "permission" : "NONE",
    "mimeType" : "application/octet-stream",
    "format" : "unknown",
    "type" : "file",
    "_links" : {
      "self" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/outputs/media/demo-pyplot-demo-advanced-test-1414139896.err"
      },
      "system" : {
        "href" : "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
      },
      "parent" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007"
      }
    }
  }, {
    "name" : "demo-pyplot-demo-advanced-test-1414139896.out",
    "path" : "/demo-pyplot-demo-advanced-test-1414139896.out",
    "lastModified" : "2014-11-06T13:34:30.000-06:00",
    "length" : 1396,
    "permission" : "NONE",
    "mimeType" : "application/octet-stream",
    "format" : "unknown",
    "type" : "file",
    "_links" : {
      "self" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/outputs/media/demo-pyplot-demo-advanced-test-1414139896.out"
      },
      "system" : {
        "href" : "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
      },
      "parent" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007"
      }
    }
  }, {
    "name" : "demo-pyplot-demo-advanced-test-1414139896.pid",
    "path" : "/demo-pyplot-demo-advanced-test-1414139896.pid",
    "lastModified" : "2014-11-06T13:34:33.000-06:00",
    "length" : 6,
    "permission" : "NONE",
    "mimeType" : "application/octet-stream",
    "format" : "unknown",
    "type" : "file",
    "_links" : {
      "self" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/outputs/media/demo-pyplot-demo-advanced-test-1414139896.pid"
      },
      "system" : {
        "href" : "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
      },
      "parent" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007"
      }
    }
  }, {
    "name" : "testdata.csv",
    "path" : "/testdata.csv",
    "lastModified" : "2014-11-06T13:34:42.000-06:00",
    "length" : 3212,
    "permission" : "NONE",
    "mimeType" : "application/octet-stream",
    "format" : "unknown",
    "type" : "file",
    "_links" : {
      "self" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/outputs/media/testdata.csv"
      },
      "system" : {
        "href" : "https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
      },
      "parent" : {
        "href" : "https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007"
      }
    }
  } ]
}

To download a file, use the following syntax:

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID/outputs/media/$PATH

Regardless of job status, the above services will always point to the most recent location of the job data. If you choose to have the Jobs service archive your job after completion, the URL will point to the archive folder of the job. If you do not choose to archive your data, or if archiving fails, the URL will point to the execution folder created for your job at runtime. Because Agave does not own any of the underlying hardware, it cannot guarantee that those locations will always exist. If, for example, the execution system enforces a purge policy, the output data may be deleted by the system administrators. Agave will let you know if the data is no longer present, but it cannot prevent the data from being deleted. This is another reason to archive any data you expect to need in the future.
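As a rough illustration of the listing and download URLs described above, the following Python sketch builds both URL forms and pulls the downloadable files out of a listing response. The helper names (job_output_url, downloadable_files) are hypothetical and not part of any Agave SDK, and the sample response is a trimmed copy of the listing shown earlier.

```python
from urllib.parse import quote

# Base URL of the Jobs service, taken from the curl examples above.
JOBS_BASE = "https://sandbox.agaveplatform.org/jobs/v2"


def job_output_url(job_id, path="", kind="listings"):
    """Build a job output URL (hypothetical helper, not an Agave SDK call).

    kind is "listings" to list a directory or "media" to download a file.
    """
    if kind not in ("listings", "media"):
        raise ValueError("kind must be 'listings' or 'media'")
    # Drop the leading slash to avoid a double slash, and percent-encode
    # characters that are not URL-safe (slashes are kept as-is).
    clean_path = quote(path.lstrip("/"))
    return f"{JOBS_BASE}/{job_id}/outputs/{kind}/{clean_path}"


def downloadable_files(listing_response):
    """Return (name, download_href) pairs for every file in a listing response."""
    return [
        (item["name"], item["_links"]["self"]["href"])
        for item in listing_response.get("result", [])
        if item.get("type") == "file"
    ]


# Trimmed copy of the listing response shown above.
JOB = "0001414144065563-5056a550b8-0001-007"
sample = {
    "status": "success",
    "result": [
        {"name": "output", "type": "dir",
         "_links": {"self": {"href": f"{JOBS_BASE}/{JOB}/outputs/media/output"}}},
        {"name": "testdata.csv", "type": "file",
         "_links": {"self": {"href": f"{JOBS_BASE}/{JOB}/outputs/media/testdata.csv"}}},
    ],
}

for name, href in downloadable_files(sample):
    print(name, href)
```

Because the self link of each file already points at the outputs/media URL, a client can follow it directly instead of rebuilding the path.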

Job Lifecycle Management

Pseudocode for job work directory naming

if (executionSystem.scratchDir exists)
    $jobDir = executionSystem.scratchDir
else if (executionSystem.workDir exists)
    $jobDir = executionSystem.workDir
else
    $jobDir = executionSystem.storage.homeDir
endif

$jobDir = $jobDir + "/" + job.owner + "/job-" + job.uuid
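The same fallback order can be sketched in Python. This is only an illustration of the pseudocode above: the dictionary layout and the job_work_dir name are assumptions, and treating an empty string as "does not exist" is also an assumption.

```python
def job_work_dir(execution_system, job):
    """Pick the job work directory using the fallback order in the pseudocode.

    execution_system and job are plain dicts here (an assumed layout that
    mirrors the field names used in the pseudocode).
    """
    storage = execution_system.get("storage", {})
    # scratchDir wins, then workDir, then the storage home directory.
    base = (
        execution_system.get("scratchDir")
        or execution_system.get("workDir")
        or storage.get("homeDir")
    )
    if not base:
        raise ValueError("no scratchDir, workDir, or storage.homeDir defined")
    # Trailing slashes are trimmed so the result never contains "//".
    return f"{base.rstrip('/')}/{job['owner']}/job-{job['uuid']}"


system = {"scratchDir": "", "workDir": "/work", "storage": {"homeDir": "/home/nryan"}}
print(job_work_dir(system, {"owner": "nryan", "uuid": "0001"}))
```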

Agave handles all of the end-to-end details involved with managing a job lifecycle for you. This can seem like black magic at times, so here we detail the overall lifecycle process every job goes through.

Job request is made, validated, and saved.
Job is queued up for execution. The job stays in a pending state until there are resources to run it. This means that the target execution system is online, the storage system with the app assets is online, and neither the user nor the system is over quota. If resources do not become available within 7 days, the job is killed.
When resources are available to run the job on the execution system, a work directory is created on the execution system. The job work directory is named according to the pseudocode above.
The job inputs are staged to the job work directory and the job status is updated to “INPUTS_STAGING”.
- If all inputs transfer successfully, the job is updated to “STAGED”.
- If one or more inputs fail to transfer, the job status is set back to “PENDING” and staging will be attempted up to 2 more times.
- If the user does not have permission to access one or more inputs, the job is set to “FAILED” and exits.
The job again waits until the resources are available to run the job. Usually this is immediately after the inputs finish staging. If resources do not become available within 7 days, the job is killed.
The app deploymentPath is copied from the app.deploymentSystem to a temp dir on the API server. The Jobs API then processes the app.deploymentDir + “/” + app.templatePath file to create the .ipcexe file. The process goes as follows:
1. Script headers are written. This includes scheduler directives for a batch system, or a shebang for a forked app.
2. Additional executionSystem[job.batchQueue].customDirectives are written
3. “RUNNING” callback written
4. Module commands are written
5. executionSystem.environment is written
6. wrapper script is filtered
  - blacklisted commands are removed
  - app parameter template variables are resolved against job parameter values.
  - app input template variables are resolved against job input values
  - blacklisted commands are removed again
7. “CLEANING_UP” callback written
8. All template macros are resolved.
9. job.name.slugify + ".ipcexe" file written to temp directory
App assets with wrapper template are copied to remote job work directory.
Directory listing of job work directory is written to a .agave.archive manifest file in the remote job work directory.
Command line is generated to invoke the *.ipcexe file by the appropriate method for the execution system.
Command line is run on the remote system. If the command succeeds, the scheduler, process, or other remote job id is captured and stored with the job record. If the command fails, the job status is updated to “STAGED”, and submission will be attempted up to 2 more times.
Job is updated to “QUEUED”
Job waits for a “RUNNING” callback and adds a background process to monitor the job in case the callback never comes.
The monitoring process checks the job status according to the following schedule:
- every 30 seconds for the first 5 minutes
- every minute for the next 30 minutes
- every 5 minutes for the next hour
- every 15 minutes for the next 12 hours
- every 30 minutes for the next 24 hours
- every hour for the next 14 days
Job either calls back with a “CLEANING_UP” status update or the monitoring process discovers the job no longer exists on the remote system.
If job.archive is true, the job is sent to the archiving queue to stage outputs to job.archiveSystem. If resources do not become available within 7 days, the job is killed.
Read the .agave.archive manifest file from the job work directory
Begin a breadth first directory traversal of the job work directory
If a file/folder is not in the .agave.archive manifest, copy it to the job.archivePath on the job.archiveSystem
Delete the job work directory
Update job status to “FINISHED”
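To make the monitoring schedule in the lifecycle above concrete, here is a minimal Python sketch that returns the polling interval for a given elapsed time. It assumes each phase begins where the previous one ends and that time is measured from the moment monitoring starts; the names MONITOR_PHASES and polling_interval are hypothetical and do not come from the Agave code base.

```python
# (phase length in seconds, polling interval in seconds), in the order
# the schedule is listed in the lifecycle steps.
MONITOR_PHASES = [
    (5 * 60, 30),            # every 30 seconds for the first 5 minutes
    (30 * 60, 60),           # every minute for the next 30 minutes
    (60 * 60, 5 * 60),       # every 5 minutes for the next hour
    (12 * 3600, 15 * 60),    # every 15 minutes for the next 12 hours
    (24 * 3600, 30 * 60),    # every 30 minutes for the next 24 hours
    (14 * 86400, 60 * 60),   # every hour for the next 14 days
]


def polling_interval(elapsed_seconds):
    """Return the polling interval (seconds) in effect after elapsed_seconds.

    Returns None once the whole schedule has been exhausted.
    """
    phase_end = 0
    for length, interval in MONITOR_PHASES:
        phase_end += length  # phases are assumed to be cumulative
        if elapsed_seconds < phase_end:
            return interval
    return None


# Two hours into monitoring, the job is in the 15-minute phase.
print(polling_interval(2 * 3600))
```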

As with the Systems, Apps, and Files services, your jobs have their own set of access controls. Using these, you can share your job and its data with other Agave users. Job permissions are private by default. The permissions you give a job apply to the job, its outputs, its metadata, and the permissions themselves. Thus, by sharing a job with another user, you share all aspects of that job.

Job permissions are managed through a set of URLs consistent with the permissions URLs elsewhere in the API.

Granting

# General grant
curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X POST --data-binary "{\"permission\":\"READ\",\"username\":\"$USERNAME\"}" \
     https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID/pems

# Custom url grant
curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X POST --data-binary '{"permission":"READ"}' \
     https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID/pems/$USERNAME

jobs-pems-update -u $USERNAME $JOB_ID

{
  "username": "$USERNAME",
  "internalUsername": null,
  "permission": {
    "read": true,
    "write": false
  },
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID/pems/$USERNAME"
    },
    "parent": {
      "href": "https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID"
    },
    "profile": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/$USERNAME"
    }
  }
}

Granting permissions is simply a matter of issuing a POST with the desired permission object to the job’s pems collection.

The available permission values are listed in Table 2.

Permission	Description
READ	Gives the ability to view the job status and output data.
WRITE	Gives the ability to perform actions, manage metadata, and set permissions.
ALL	Gives full READ and WRITE permissions to the user.
READ_WRITE	Synonymous with ALL. Gives full READ and WRITE permissions to the user.

Table 2. Supported job permission values.
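The relationship between the permission values in Table 2 and the read/write flags returned in a permission response can be sketched as follows. The READ and READ_WRITE rows match the responses shown in this section; the WRITE row (write without read) is an assumption, and permission_flags and grant_payload are hypothetical helper names.

```python
import json

# Permission value -> flags as they appear in the "permission" object
# of a response. The WRITE entry is assumed, not confirmed by the docs.
PERMISSION_FLAGS = {
    "READ": {"read": True, "write": False},
    "WRITE": {"read": False, "write": True},
    "ALL": {"read": True, "write": True},
    "READ_WRITE": {"read": True, "write": True},
}


def permission_flags(value):
    """Translate a permission value from Table 2 into read/write flags."""
    try:
        return dict(PERMISSION_FLAGS[value.upper()])
    except KeyError:
        raise ValueError(f"unsupported permission value: {value!r}") from None


def grant_payload(username, permission):
    """Build the JSON body POSTed to the job's pems collection."""
    permission_flags(permission)  # raises ValueError on unknown values
    return json.dumps({"permission": permission.upper(), "username": username})


print(grant_payload("collaborator", "read"))
```

Validating the value on the client side before sending it avoids a round trip for an obviously malformed request.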

Job permissions are distinct from file permissions. In many instances, your job output will be accessible via the Files and Jobs services simultaneously. Granting a user permissions on a job output file through the Files service does not alter the accessibility of that file through the Jobs service. It is important, then, that you consider to whom you grant permissions, and the implications of that decision in all areas of your application.

Listing

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    "https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID/pems/"

jobs-pems-list -V $JOB_ID

[
  {
    "username": "$API_USERNAME",
    "internalUsername": null,
    "permission": {
      "read": true,
      "write": true
    },
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/jobs/v2/6608339759546166810-242ac114-0001-007/pems/$API_USERNAME"
      },
      "parent": {
        "href": "https://sandbox.agaveplatform.org/jobs/v2/6608339759546166810-242ac114-0001-007"
      },
      "profile": {
        "href": "https://sandbox.agaveplatform.org/profiles/v2/$API_USERNAME"
      }
    }
  },
  {
    "username": "$USERNAME",
    "internalUsername": null,
    "permission": {
      "read": true,
      "write": false
    },
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID/pems/$USERNAME"
      },
      "parent": {
        "href": "https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID"
      },
      "profile": {
        "href": "https://sandbox.agaveplatform.org/profiles/v2/$USERNAME"
      }
    }
  }
]

To find the permissions for a given job, make a GET on the job’s pems collection. Here we see that both the job owner and the user we just granted permission to appear in the response.
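Once you have the listing, a few lines of Python are enough to answer questions such as "who can modify this job?". The sketch below assumes the listing has already been parsed into a list of dicts shaped like the response above; users_with is a hypothetical helper and the usernames are placeholders.

```python
def users_with(pems_listing, flag):
    """Return the usernames whose permission object has the given flag set.

    flag is "read" or "write", matching the keys in each "permission" object.
    """
    if flag not in ("read", "write"):
        raise ValueError("flag must be 'read' or 'write'")
    return [
        entry["username"]
        for entry in pems_listing
        if entry["permission"].get(flag)
    ]


# Shaped like the listing response above, with placeholder usernames.
listing = [
    {"username": "nryan", "permission": {"read": True, "write": True}},
    {"username": "collaborator", "permission": {"read": True, "write": False}},
]

print(users_with(listing, "write"))
```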

Updating

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -H "Content-Type: application/json" \
     -X POST --data-binary '{"permission":"READ_WRITE"}' \
     https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID/pems/$USERNAME

jobs-pems-update -u $USERNAME -p READ_WRITE $JOB_ID

{
  "username": "$USERNAME",
  "internalUsername": null,
  "permission": {
    "read": true,
    "write": true
  },
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID/pems/$USERNAME"
    },
    "parent": {
      "href": "https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID"
    },
    "profile": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/$USERNAME"
    }
  }
}

Updating is exactly like granting permissions. Just POST to the same job’s pems collection.

Deleting

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -X DELETE \
     https://sandbox.agaveplatform.org/jobs/v2/$JOB_ID/pems/$USERNAME

jobs-pems-update -u $USERNAME -p '' $JOB_ID

To delete a permission, you can issue a DELETE request on the user permission resource we’ve been using, or update with an empty permission value.

Notifications

 /$$$$$$$          /$$       /$$$$$$          /$$
| $$__  $$        | $$      /$$__  $$        | $$
| $$  \ $$/$$   /$| $$$$$$$| $$  \__//$$   /$| $$$$$$$
| $$$$$$$| $$  | $| $$__  $|  $$$$$$| $$  | $| $$__  $$
| $$____/| $$  | $| $$  \ $$\____  $| $$  | $| $$  \ $$
| $$     | $$  | $| $$  | $$/$$  \ $| $$  | $| $$  | $$
| $$     |  $$$$$$| $$$$$$$|  $$$$$$|  $$$$$$| $$$$$$$/
|__/      \______/|_______/ \______/ \______/|_______/

Under the covers, the Agave API is an event-driven distributed system implemented on top of a reliable, cloud-based messaging system. This means that every action either observed or taken by Agave is tied to an event. The changing of a job from one status to another is an event. The granting of permissions on a file is an event. Editing a piece of metadata is an event, and to be sure, the moment you created an account with Agave was an event. You get the idea.

Having such a fine-grained event system is helpful for the same reason that having a fine-grained permission model is helpful. It affords you the highest degree of flexibility and control possible to achieve the behavior you desire. With Agave’s event system, you have the ability to alert your users (or yourself) the instant something occurs. You can be proactive rather than reactive, and you can begin orchestrating your complex tasks in a loosely coupled, asynchronous way.

Subscriptions

Example notification subscription request

{
  "associatedUuid": "0001409758089943-5056a550b8-0001-002",
  "event": "OVERWRITTEN",
  "persistent": true,
  "url": "nryan@rangers.mlb.com"
}

As consumers of Agave, you have the ability to subscribe to events occurring on any resource to which you have access. By that we mean, for example, you could subscribe to events on your job and a job that someone shared with you, but you could not subscribe to events on a job submitted by someone else who has not shared the job with you. Basically, if you can see a resource, you can subscribe to its events.

The Notifications service is the primary mechanism by which you create and manage your event subscriptions. A typical use case is a user subscribing for an email alert when her job completes. The example subscription request above shows the JSON object used to request such a notification.

The associatedUuid value is the UUID of the resource being watched, such as her job. Here, we have given the UUID of the picsumipsum.txt file we uploaded in the Files API guide. The event value is the name of the event about which she wants to be notified. This example asks for an email to be sent whenever the file is overwritten. She could just as easily have specified an event of DELETED or RENAME to be notified when the file was deleted or renamed.

The persistent value specifies whether the notification should fire more than once. By default, all event subscriptions are transient. This is because the events themselves are transient. An event occurs, then it is over. There are, however, many situations where events could occur over and over again. Permission events, changes to metadata and data, application registrations on a system, job submissions to a system or queue, etc., all are transient events that can potentially occur many, many times. In these cases it is either not possible or highly undesirable to constantly resubscribe for the same event. The persistent attribute tells the notification service to keep a subscription alive until it is explicitly deleted.

Continuing to work through the example, the url value specifies where the notification should be sent. In this example, our example user specified that she would like to be notified via email. Agave supports both email and webhook notifications. If you are unfamiliar with webhooks, take a moment to glance at the webhooks.org page for a brief overview. If you are a Gang of Four disciple, webhooks are a mechanism for implementing the Observer Pattern. Webhooks are widely used across the web and chances are that something you’re using right now is leveraging them.
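The fields discussed above can be assembled programmatically. The following Python sketch builds a subscription object and makes a simple guess at whether the target is an email address or a webhook. The classification rule (an http or https scheme means webhook, a scheme-less string containing "@" means email) is an assumption made for illustration, not Agave's own validation logic, and both function names are hypothetical.

```python
from urllib.parse import urlparse


def notification_target_type(url):
    """Classify a notification target as "webhook" or "email" (assumed rule)."""
    scheme = urlparse(url).scheme.lower()
    if scheme in ("http", "https"):
        return "webhook"
    if not scheme and "@" in url:
        return "email"
    raise ValueError(f"not an email address or http(s) URL: {url!r}")


def build_subscription(associated_uuid, event, url, persistent=False):
    """Build a subscription request body with the fields described above.

    persistent defaults to False because subscriptions are transient
    unless stated otherwise.
    """
    notification_target_type(url)  # raises ValueError on unusable targets
    return {
        "associatedUuid": associated_uuid,
        "event": event,
        "persistent": persistent,
        "url": url,
    }


subscription = build_subscription(
    "0001409758089943-5056a550b8-0001-002",
    "OVERWRITTEN",
    "nryan@rangers.mlb.com",
    persistent=True,
)
print(subscription)
```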

URL Macros

Receive a callback when a new user is created that includes the new user’s information

https://example.com/sendWelcome.php?username=${USERNAME}&email=${EMAIL}&firstName=${FIRST_NAME}&lastName=${LAST_NAME}&src=agaveplatform.org&nonce=1234567

Receive self-describing job status updates

http://example.com/job/${JOB_ID}?status=${JOB_STATUS}&lastUpdated=${JOB_START_TIME}

Get notified on all jobs going into and out of queues

http://example.com/system/${JOB_EXECUTION_SYSTEM}/queue/${JOB_BATCH_QUEUE}?action=add
http://example.com/system/${JOB_EXECUTION_SYSTEM}/queue/${JOB_BATCH_QUEUE}?action=subtract

Use plus mailing to route job notifications to different folders

nryan+${JOB_EXECUTION_SYSTEM}+${JOB_ID}@gmail.com

In the context of Agave, a webhook is a URL to which Agave will send a POST request when a subscribed event occurs. A webhook can be any web accessible URL. While you cannot customize the POST content that Agave sends (it is unique to the event), you can take advantage of the many template variables that Agave provides to customize the URL at run time. The following tables show the webhook template variables available for each resource type, in this order: apps, jobs, files, metadata, metadata schemas, monitors, notifications, PostIts, profiles, systems, tags, and transfers.

Variable	Description
UUID	The UUID of the app.
EVENT	The event which occurred
OWNER	The username of the user who triggered the event.
APP_ID	The id of the app.
APP_NAME	The name of the app.
APP_VERSION	The version of the app.
APP_OWNER	The username of the user who created or published the app.
APP_SHORT_DESCRIPTION	The short textual app description.
APP_UUID	The uuid of the app.
APP_IS_PUBLIC	Whether the app is public or private. Values are “true” and “false”.
APP_LABEL	The display label of the app.
APP_LONG_DESCRIPTION	The full textual app description.
APP_AVAILABLE	Whether the app is available. Values are “true” and “false”.
APP_CHECKPOINTABLE	Whether the app is checkpointable. Values are “true” and “false”.
APP_DEFAULT_MAX_RUN_TIME	The default max runtime of a job running this app. This is the value used by the job service if no maxRunTime is specified in the job request.
APP_DEFAULT_MEMORY_PER_NODE	The default memory of a job running this app. This is the value used by the job service if no memoryPerNode is specified in the job request.
APP_DEFAULT_PROCESSORS_PER_NODE	The default processors per node of a job running this app. This is the value used by the job service if no processorsPerNode is specified in the job request.
APP_DEFAULT_NODE_COUNT	The default node count of a job running this app. This is the value used by the job service if no nodeCount is specified in the job request.
APP_DEFAULT_QUEUE	The name of the default batch queue of a job running this app. This is the value used by the job service if no batchQueue is specified in the job request.
APP_DEPLOYMENT_PATH	The default deployment path of the app assets on the remote deploymentSystem.
APP_DEPLOYMENT_SYSTEM	The id of the Agave system on which the app assets are stored.
APP_HELP_URI	The help URL of the app.
APP_URL	The canonical URL of the app.
APP_JSON	The serialized JSON representation of the resource. This is what would be returned if you made a naked GET request to the API for the resource details.
RAW_JSON	The serialized JSON event payload. This is identical to what is sent in the body of a webhook POST callback.

Variable	Description
UUID	The UUID of the job
EVENT	The name of the event which occurred.
OWNER	The username of the user who triggered the event.
JOB_STATUS	The status of the job at the time the event occurs
JOB_URL	The url of the job within the API
JOB_ID	The unique id used to reference the job within Agave.
JOB_SYSTEM	ID of the job execution system (ex. ssh.execute.example.com)
JOB_NAME	The user-supplied name of the job
JOB_START_TIME	The time when the job started running in ISO8601 format.
JOB_END_TIME	The time when the job stopped running in ISO8601 format.
JOB_SUBMIT_TIME	The time when the job was submitted to Agave for execution by the user in ISO8601 format.
JOB_ARCHIVE_PATH	The path on the archive system where the job output will be staged.
JOB_ARCHIVE_URL	The Agave URL for the archived data.
JOB_ERROR	The error message explaining why a job failed. Null if completed successfully.
JOB_APP_ID	The id of the app being run.
JOB_BATCH_QUEUE	The batch queue of the JOB_EXECUTION_SYSTEM on which the job is assigned.
JOB_CREATED	The time when the job request was initially received in ISO8601 format.
JOB_EXECUTION_SYSTEM	The agave execution system id on which the job will run.
JOB_INPUTS	The serialized JSON object representing the job inputs.
JOB_LOCAL_ID	The id of the job on the JOB_EXECUTION_SYSTEM. This will be the id assigned by the batch scheduler, condor schedd, or system PID depending on the system scheduler type.
JOB_MAX_RUNTIME	The max job run time from the job request in HH:MM:SS format.
JOB_MAX_RUNTIME_MILLISECONDS	The max job run time from the job request converted to milliseconds.
JOB_MAX_RUNTIME_SECONDS	The max job run time from the job request converted to seconds.
JOB_MEMORY_PER_NODE	The memory requested per node in the job request in GB.
JOB_NODE_COUNT	The number of nodes from the job request.
JOB_OWNER	The username of the user who submitted the job request.
JOB_OUTPUT_PATH	The absolute path to the job directory on the remote system.
JOB_PARAMETERS	The serialized JSON object representing the job parameters.
JOB_PROCESSORS_PER_NODE	The processors per node from the job request.
JOB_STATUS	The current job status.
JOB_START_TIME	The time when the job moved to a “RUNNING” status in ISO8601 format.
JOB_TENANT	The code of the tenant to which the job was submitted.
JOB_URL	The canonical Agave URL of the job.
JOB_ARCHIVE	Whether Agave will attempt to archive the job. Values “true” or “false”.
JOB_ARCHIVE_SYSTEM	The Agave storage system id to which the job output will be archived. This will be NULL if the job is not archived.
JOB_ERROR	The current debug or error message set for the job.
JOB_JSON	The serialized JSON object representing the job. This is identical to what would come back if you made a naked GET request on the job url.
RAW_JSON	The serialized JSON event payload. This is identical to what is sent in the body of a webhook POST callback.

Variable	Description
UUID	The UUID of the file
EVENT	The name of the event which occurred.
OWNER	The username of the user who triggered the event.
FILE_UUID	The file item UUID.
FILE_NAME	The file item name.
FILE_OWNER	The file item owner.
FILE_LASTMODIFIED	The file item last modified timestamp in ISO8601 format.
FILE_PATH	The agave path of the file item on the agave system.
FILE_STATUS	The status of the file item at the time of the event.
FILE_SYSTEMID	The id of the agave system on which the file item resides.
FILE_TYPE	The agave file type of the file item.
FILE_PERMISSIONS	The native file system permissions of the file item on the remote system.
FILE_LENGTH	The size of the file item in bytes.
FILE_URL	The canonical URL of the file item.
FILE_MIMETYPE	The mimetype of the file item.
FILE_JSON	The serialized JSON representation of the resource. This is what would be returned if you made a naked GET request to the API for the resource details.
RAW_JSON	The serialized JSON event payload. This is identical to what is sent in the body of a webhook POST callback.

Variable	Description
UUID	The UUID of the metadata item.
EVENT	The name of the event which occurred.
OWNER	The username of the user who triggered the event.
METADATA_ID	The id of the metadata item.
METADATA_SCHEMAID	The id of the metadata schema used to validate this metadata item. NULL if none was assigned.
METADATA_VALUE	The raw value of the metadata item.
METADATA_ASSOCIATIONIDS	The serialized JSON array of AGAVE UUID representing resources associated with this metadata item.
METADATA_LASTUPDATED	The last time this metadata item was updated in ISO8601 format.
METADATA_CREATED	The creation timestamp of this metadata item in ISO8601 format.
RAW_JSON	The serialized JSON event payload. This is identical to what is sent in the body of a webhook POST callback.

Variable	Description
UUID	The UUID of the schemata object.
EVENT	The name of the event which occurred.
OWNER	The username of the user who triggered the event.
METADATA_SCHEMA_ID	The id of the metadata schema.
METADATA_SCHEMA_OWNER	The username of the user who created the metadata schema.
METADATA_SCHEMA_SCHEMA	The serialized JSON schema definition.
METADATA_SCHEMA_LASTUPDATED	The last time this metadata schema was updated in ISO8601 format.
METADATA_SCHEMA_CREATED	The creation timestamp of this metadata schema in ISO8601 format.
RAW_JSON	The serialized JSON event payload. This is identical to what is sent in the body of a webhook POST callback.

Variable	Description
UUID	The uuid of the monitor.
EVENT	The name of the event which occurred.
OWNER	The owner of the monitor.
ID	The ID of the monitor.
TARGET	The system to which the monitor applies.
ACTIVE	Whether the monitor is active or inactive.
UPDATE_SYSTEM_STATUS	Whether the system status will be updated with the check results.
INTERNAL_USERNAME	The internal user associated with the status check.
CREATED	The time the monitor was created in ISO8601 format.
LAST_SUCCESS	The time the monitor last successfully ran in ISO8601 format.
LAST_UPDATED	The time the monitor last ran in ISO8601 format.
NEXT_CHECK	The time the monitor will run in ISO8601 format.
LAST_CHECK_ID	The id of the last check. **Only present when monitoring check events fire.
LAST_MESSAGE	The message returned from the check. **Only present when monitoring check events fire.
TYPE	Type of the monitoring check run: EXECUTION, STORAGE. **Only present when monitoring check events fire.

Variable	Description
UUID	The UUID of the notification object
EVENT	The name of the event which occurred.
OWNER	The username of the user who triggered the event.
URL	The URL to which this notification will be published.
ATTEMPTS	Maximum retry attempts that will be made for this notification.
RESPONSE_CODE	The last response code for a delivery attempt for this notification
LAST_UPDATED	The timestamp of the last time this notification was updated in ISO8601 format
ASSOCIATED_ID	The resource to whose events this notification is subscribed
CREATED	The timestamp when the notification was created in ISO8601 format
STATUS	The current status of this notification. eg. ACTIVE, INACTIVE, FAILED, COMPLETE.

Variable	Description
UUID	The UUID of the PostIt
EVENT	The event which occurred
OWNER	The username of the user who triggered the event.
NONCE	Nonce specified in the POSTIT url
CREATED	Time the PostIt was created in ISO8601 format
RENEWED	Last time the PostIt was renewed in ISO8601 format
EXPIRES	Time the PostIt expires in ISO8601 format
TARGET_URL	Remote URL which will be called when the PostIt is redeemed
TARGET_METHOD	HTTP method that will be called on the TARGET_URL
REMAINING_USES	Number of invocations remaining for this PostIt
POSTIT	Full PostIt URL

Variable	Description
UUID	The UUID of the profile
EVENT	The name of the event which occurred.
OWNER	The username of the user who triggered the event.
USERNAME	Username of the user

Variable	Description
UUID	The UUID of the system
EVENT	The name of the event which occurred.
OWNER	The username of the user who triggered the event.
SYSTEM_ID	ID of the system (ex. ssh.execute.example.com)
SYSTEM_STATUS	Current status of the system: UP, DOWN, UNKNOWN
SYSTEM_PUBLIC	True if the system is publicly available, false otherwise
SYSTEM_GLOBALDEFAULT	True if the system is one of the two default publicly available systems, false otherwise
SYSTEM_LASTUPDATED	The last time this system was updated in ISO8601 format
SYSTEM_STORAGE_PROTOCOL	The protocol used to move data to and from this system
SYSTEM_STORAGE_HOST	The storage host for this system
SYSTEM_STORAGE_PORT	The storage port for this system
SYSTEM_STORAGE_RESOURCE	The system resource for iRODS systems
SYSTEM_STORAGE_ZONE	The system zone for iRODS systems
SYSTEM_STORAGE_CONTAINER	The object store bucket in which the rootDir resides.
SYSTEM_STORAGE_ROOT_DIR	The virtual root directory exposed on this system
SYSTEM_STORAGE_HOME_DIR	The home directory on this system relative to the STORAGE_ROOT_DIR
SYSTEM_STORAGE_AUTH_TYPE	The storage authentication method for this system
SYSTEM_LOGIN_PROTOCOL	The protocol used to establish a session with this system (eg SSH, GSISSH, etc)
SYSTEM_LOGIN_HOST	The login host for this system
SYSTEM_LOGIN_PORT	The login port for this system
SYSTEM_LOGIN_AUTH_TYPE	The login authentication method for this system

Variable	Description
UUID	The UUID of the tag.
EVENT	The name of the event which occurred.
OWNER	The username of the user who triggered the event.
TAG_ID	The id of the tag.
TAG_NAME	The name of the tag.
TAG_OWNER	The username of the user who created the tag.
TAG_URL	The canonical URL to the tag.
TAG_ASSOCIATIONIDS	The serialized JSON array of AGAVE UUID representing resources associated with this tag.
PERMISSION_ID	The id of the tag permission. **Only present on tag permission events
PERMISSION_PERMISSION	The resulting permission after completion of the event. **Only present on tag permission events
PERMISSION_USERNAME	The user to whom the permission was applied. **Only present on tag permission events
PERMISSION_LASTUPDATED	The last time this permission was updated in ISO8601 format. **Only present on tag permission events
PERMISSION_JSON	The serialized JSON representation of the permission. This is identical to what is returned from a GET request for this permission.
TAG_CREATED	The creation timestamp of this tag in ISO8601 format.
RAW_JSON	The serialized JSON event payload. This is identical to what is sent in the body of a webhook POST callback.

Variable	Description
UUID	The UUID of the transfer
EVENT	The event which occurred
SOURCE	The source URL of this transfer
DESTINATION	The destination URL of this transfer
STATUS	The current status of this transfer
CREATED	The time the transfer was submitted to Agave in ISO8601 format
START_TIME	The time the transfer started in ISO8601 format
END_TIME	The time the transfer ended in ISO8601 format
TOTAL_SIZE	Total data size to be transferred
TOTAL_TRANSFER	Total bytes transferred
TRANSFER_RATE	Average transfer rate of all data moved in this transfer given in Gbps
ATTEMPTS	Number of attempts made to transfer the SOURCE data

The value of webhook template variables is that they allow you to build custom callbacks using the values of the resource variables at run time. Several commonly used webhook URLs are shown in the examples above.
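Agave resolves these macros on the server when the event fires, so you never need to do it yourself. Still, it can help to see what a resolved URL looks like. The Python sketch below imitates the substitution for the job status example above; whether Agave percent-encodes the substituted values and how it treats unknown macros are assumptions here, and resolve_macros is a hypothetical name.

```python
import re
from urllib.parse import quote

# Matches macros of the form ${NAME}, where NAME is upper case with underscores.
MACRO = re.compile(r"\$\{([A-Z_]+)\}")


def resolve_macros(template, values):
    """Substitute ${NAME} macros in a webhook URL template.

    Values are percent-encoded (an assumption); macros with no value
    are left in place untouched (also an assumption).
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            return match.group(0)
        return quote(str(values[name]), safe="")

    return MACRO.sub(substitute, template)


template = "http://example.com/job/${JOB_ID}?status=${JOB_STATUS}&lastUpdated=${JOB_START_TIME}"
event_values = {
    "JOB_ID": "0001414144065563-5056a550b8-0001-007",
    "JOB_STATUS": "RUNNING",
    "JOB_START_TIME": "2014-11-06T13:34:35.000-06:00",
}
print(resolve_macros(template, event_values))
```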

Creating

Create a new notification subscription

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST \
    -H "Content-Type: application/json" \
    --data-binary '{"associatedUuid": "7554973644402463206-242ac114-0001-007", "event": "FINISHED", "url": "http://requestbin.agaveplatform.org/zyiomxzy?path=${PATH}&system=${SYSTEM}&event=${EVENT}" }' \
    https://sandbox.agaveplatform.org/notifications/v2?pretty=true

notifications-addupdate -F notification.json

Which will result in output similar to this

{
  "id": "7612526206168863206-242ac114-0001-011",
  "owner": "nryan",
  "url": "http://requestbin.agaveplatform.org/zyiomxzy?path=${PATH}&system=${SYSTEM}&event=${EVENT}",
  "associatedUuid": "7554973644402463206-242ac114-0001-007",
  "event": "FINISHED",
  "responseCode": null,
  "attempts": 0,
  "lastSent": null,
  "success": false,
  "persistent": false,
  "status": "ACTIVE",
  "lastUpdated": "2016-08-24T10:07:03.000-05:00",
  "created": "2016-08-24T10:07:03.000-05:00",
  "policy": {
    "retryLimit": 5,
    "retryRate": 5,
    "retryDelay": 0,
    "saveOnFailure": true,
    "retryStrategy": "NONE"
  },
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/notifications/v2/7612526206168863206-242ac114-0001-011"
    },
    "history": {
      "href": "https://sandbox.agaveplatform.org/notifications/v2/7612526206168863206-242ac114-0001-011/history"
    },
    "attempts": {
      "href": "https://sandbox.agaveplatform.org/notifications/v2/7612526206168863206-242ac114-0001-011/attempts"
    },
    "owner": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
    },
    "job": {
      "href": "https://sandbox.agaveplatform.org/jobs/v2/7554973644402463206-242ac114-0001-007"
    }
  }
}

Subscribing to an event is done by posting a form or JSON object to the Notifications service. An example of doing this using curl as well as the CLI is given above.

Updating

The updated notification subscription object

{
    "associatedUuid": "7554973644402463206-242ac114-0001-007",
    "event": "*",
    "url": "http://requestbin.agaveplatform.org/zyiomxzy?path=${PATH}&system=${SYSTEM}&event=${EVENT}"
}

The JSON used to update the subscription is shown above

Updating a subscription is done identically to creation except that the form or JSON is POSTed to the existing subscription URL. An example of doing this using curl as well as the CLI is given below.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST \
    -H "Content-Type: application/json" \
    --data-binary @notification.json \
    https://sandbox.agaveplatform.org/notifications/v2/2699130208276770330-242ac114-0001-011

notifications-addupdate -F notification.json 2699130208276770330-242ac114-0001-011

Which will result in output similar to this

{
  "id": "7612526206168863206-242ac114-0001-011",
  "owner": "nryan",
  "url": "http://requestbin.agaveplatform.org/zyiomxzy?path=${PATH}&system=${SYSTEM}&event=${EVENT}",
  "associatedUuid": "7554973644402463206-242ac114-0001-007",
  "event": "*",
  "responseCode": null,
  "attempts": 0,
  "lastSent": null,
  "success": false,
  "persistent": false,
  "status": "ACTIVE",
  "lastUpdated": "2016-08-24T10:07:03.000-05:00",
  "created": "2016-08-24T10:07:03.000-05:00",
  "policy": {
    "retryLimit": 5,
    "retryRate": 5,
    "retryDelay": 0,
    "saveOnFailure": true,
    "retryStrategy": "NONE"
  },
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/notifications/v2/7612526206168863206-242ac114-0001-011"
    },
    "history": {
      "href": "https://sandbox.agaveplatform.org/notifications/v2/7612526206168863206-242ac114-0001-011/history"
    },
    "attempts": {
      "href": "https://sandbox.agaveplatform.org/notifications/v2/7612526206168863206-242ac114-0001-011/attempts"
    },
    "owner": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
    },
    "job": {
      "href": "https://sandbox.agaveplatform.org/jobs/v2/7554973644402463206-242ac114-0001-007"
    }
  }
}

Listing

Listing notification subscriptions

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    https://sandbox.agaveplatform.org/notifications/v2

notifications-list -V

Which will result in output similar to this

[
  {
    "id": "7612526206168863206-242ac114-0001-011",
    "url": "http://requestbin.agaveplatform.org/zyiomxzy?path=${PATH}&system=${SYSTEM}&event=${EVENT}",
    "associatedUuid": "7554973644402463206-242ac114-0001-007",
    "event": "*",
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/notifications/v2/7612526206168863206-242ac114-0001-011"
      },
      "profile": {
        "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
      },
      "job": {
        "href": "https://sandbox.agaveplatform.org/jobs/v2/7554973644402463206-242ac114-0001-007"
      }
    }
  },
  {
    "id": "7404907487080223206-242ac114-0001-011",
    "url": "nryan@rangers.texas.mlb.com",
    "associatedUuid": "6904887394479903206-242ac114-0001-007",
    "event": "FINISHED",
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/notifications/v2/7404907487080223206-242ac114-0001-011"
      },
      "profile": {
        "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
      },
      "job": {
        "href": "https://sandbox.agaveplatform.org/jobs/v2/6904887394479903206-242ac114-0001-007"
      }
    }
  },
  {
    "id": "3676815741209931290-242ac114-0001-011",
    "url": "nryan@rangers.texas.mlb.com",
    "associatedUuid": "3717016635100491290-242ac114-0001-007",
    "event": "FINISHED",
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/notifications/v2/3676815741209931290-242ac114-0001-011"
      },
      "profile": {
        "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
      },
      "job": {
        "href": "https://sandbox.agaveplatform.org/jobs/v2/3717016635100491290-242ac114-0001-007"
      }
    }
  }
]

You can get a list of your current notification subscriptions by performing a GET operation on the base /notifications collection. Adding the UUID of a notification to the URL path will return just that notification. You can also query for all notifications assigned to a specific UUID by adding associatedUuid=$uuid to the query string. An example of listing all notifications using curl as well as the CLI is given above.
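
The three listing forms described here can be sketched as URL construction in Python. The notifications_url helper is hypothetical and sends no request; only the associatedUuid parameter name and the sandbox base URL come from this guide.

```python
from urllib.parse import urlencode

NOTIFICATIONS_URL = "https://sandbox.agaveplatform.org/notifications/v2"


def notifications_url(notification_id=None, associated_uuid=None):
    """Build a listing URL (hypothetical helper, no request is sent)."""
    if notification_id is not None:
        # A single subscription is addressed by appending its UUID to the path
        return f"{NOTIFICATIONS_URL}/{notification_id}"
    if associated_uuid is not None:
        # Filter the collection by the resource the subscriptions are attached to
        query = urlencode({"associatedUuid": associated_uuid})
        return f"{NOTIFICATIONS_URL}?{query}"
    return NOTIFICATIONS_URL


job_uuid = "7554973644402463206-242ac114-0001-007"
print(notifications_url())
print(notifications_url(notification_id="7612526206168863206-242ac114-0001-011"))
print(notifications_url(associated_uuid=job_uuid))
```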

Unsubscribing

Unsubscribing from a notification subscription

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     -X DELETE \
     https://sandbox.agaveplatform.org/notifications/v2/2699130208276770330-242ac114-0001-011

notifications-delete -V 2699130208276770330-242ac114-0001-011

A standard Agave response with an empty result will be returned.

To unsubscribe from an event, perform a DELETE on the notification URL. Once deleted, a subscription cannot be restored. You can, however, create a new one. Keep in mind that if you do this, the UUID of the new notification will be different from that of the deleted one. An example of deleting a notification using curl as well as the CLI is given above.

Retry Policies

Sample notification subscription object with custom retry policy.

{
  "url" : "$REQUEST_BIN?path=${PATH}&system=${SYSTEM}&event=${EVENT}",
  "event" : "*",
  "persistent": true,
  "policy": {
      "retryStrategy": "IMMEDIATE",
      "retryLimit": 20,
      "retryRate": 5,
      "retryDelay": 0,
      "saveOnFailure": true
    }
}

In some situations, Agave may be unable to publish a specific notification. When this happens, Agave will immediately retry the notification 5 times in an attempt to deliver it successfully. When delivery fails for a 5th time, the notification is abandoned. If your application requires a more tenacious or methodical approach to retry delivery, you may provide a notification policy.

Name	Type	Description
retryStrategy	NONE, IMMEDIATE, DELAYED, EXPONENTIAL	The retry strategy to employ. Default is IMMEDIATE
retryRate	int; 0:86400	The frequency with which attempts should be made to deliver the message.
retryLimit	int; 0:1440	The maximum number of attempts that should be made to deliver the message.
retryDelay	int; 0:86400	The initial delay between the initial delivery attempt and the first retry.
saveOnFailure	boolean	Whether the failed message should be persisted if unable to be delivered within the retryLimit

Notification retry policies describe the strategy, frequency, delay, limit, and persistence to be applied when publishing an individual event for a given notification. The example above is our previous example with a notification policy included.
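
Before submitting a policy, it can help to check it against the ranges in the table above. The following Python sketch does that on the client side. The validate_policy helper and its error messages are hypothetical, and the service may apply additional rules that are not captured here.

```python
RETRY_STRATEGIES = {"NONE", "IMMEDIATE", "DELAYED", "EXPONENTIAL"}
# Inclusive integer ranges taken from the policy table
INT_RANGES = {
    "retryRate": (0, 86400),
    "retryLimit": (0, 1440),
    "retryDelay": (0, 86400),
}


def validate_policy(policy):
    """Return a list of problems found in a retry policy (hypothetical helper)."""
    errors = []
    strategy = policy.get("retryStrategy", "IMMEDIATE")
    if strategy not in RETRY_STRATEGIES:
        errors.append(f"retryStrategy must be one of {sorted(RETRY_STRATEGIES)}")
    for field, (low, high) in INT_RANGES.items():
        if field not in policy:
            continue
        value = policy[field]
        # bool is a subclass of int in Python, so reject it explicitly
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            errors.append(f"{field} must be an integer between {low} and {high}")
    if "saveOnFailure" in policy and not isinstance(policy["saveOnFailure"], bool):
        errors.append("saveOnFailure must be a boolean")
    return errors


sample_policy = {
    "retryStrategy": "IMMEDIATE",
    "retryLimit": 20,
    "retryRate": 5,
    "retryDelay": 0,
    "saveOnFailure": True,
}
print(validate_policy(sample_policy))
print(validate_policy({"retryStrategy": "SOMETIMES", "retryLimit": 5000}))
```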

Failed deliveries

Query failed attempts for a specific notification

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
     https://$API_BASE_URL/notifications/$API_VERSION/229681451607921126-8e1831906a8e-0001-042/attempts

notifications-list-failures 229681451607921126-8e1831906a8e-0001-042

A list of notification attempts will be returned.

[
  {
    "id" : "229681451607921126-8e1831906a8e-0001-042",
    "url" : "https://httpbin.org/status/500",
    "event" : "SENT",
    "associatedUuid" : "5833036796741676570-b0b0b0bb0b-0001-011",
    "startTime" : "2016-06-19T22:21:02.266-05:00",
    "endTime" : "2016-06-19T22:21:03.268-05:00",
    "response" : {
      "code" : 500,
      "message" : ""
    },
    "_links" : {
      "self" : {
        "href" : "https://$API_BASE_URL/notifications/$API_VERSION/229123105859441126-8e1831906a8e-0001-011/attempts/229681451607921126-8e1831906a8e-0001-042"
      },
      "notification" : {
        "href" : "https://$API_BASE_URL/notifications/$API_VERSION/5833036796741676570-b0b0b0bb0b-0001-011"
      },
      "profile" : {
        "href" : "https://$API_BASE_URL/profiles/$API_VERSION/ipcservices"
      }
    }
  }
]

By providing a retry policy where saveOnFailure is true, failed messages will be persisted and made available for querying at a later time. This is a great way to handle missed work due to a server failure, maintenance downtime, etc. To query for failed messages, make a GET request on the attempts collection of the notification, as shown in the example above.

Note: There is no way to save successful notification deliveries.

PostIts

 /$$$$$$$                       /$$    /$$$$$$ /$$
| $$__  $$                     | $$   |_  $$_/| $$
| $$  \ $$ /$$$$$$   /$$$$$$$ /$$$$$$   | $$ /$$$$$$
| $$$$$$$//$$__  $$ /$$_____/|_  $$_/   | $$|_  $$_/
| $$____/| $$  \ $$|  $$$$$$   | $$     | $$  | $$
| $$     | $$  | $$ \____  $$  | $$ /$$ | $$  | $$ /$$
| $$     |  $$$$$$/ /$$$$$$$/  |  $$$$//$$$$$$|  $$$$/
|__/      \______/ |_______/    \___/ |______/ \___/

The PostIts service is a URL shortening service similar to bit.ly, goo.gl, and t.co. It allows you to create pre-authenticated, disposable URLs to any resource in the Agave Platform. You have control over the lifetime and number of times the URL can be redeemed, and you can expire a PostIt at any time. As with all Science API resources, a full set of events is available for you to track usage and integrate the lifecycle of a PostIt into external applications as needed.

The most common use of PostIts is to create URLs to files and folders you can share with others without having to upload them to a third-party service. For example, using the PostIts service, you can share the output(s) of an experimental run, distribute materials for a class, submit data to a third-party service, and serve up assets for a static website like Agave ToGo.

Other use cases for the PostIts service include creating “drop” folders to which anyone with the link can upload data, allowing a job to be reproducibly rerun for peer review, publishing metadata for public consumption, and publishing a canonical reference to your user profile. The possibilities go on and on. Any time you need to share your science with your world, PostIts can help you.

Creating PostIts

Creating a PostIt

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
    -X POST \
    -d "lifetime=3600" \
    -d "maxUses=10" \
    -d "method=GET" \
    -d "url=https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/picksumipsum.txt" \
    'https://sandbox.agaveplatform.org/postits/v2/?pretty=true'

postits-create \
    -m 10 \
    -l 86400 \
    https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/picksumipsum.txt

Should result in something similar to the following:

{
  "creator":"nryan",
  "internalUsername":null,
  "authenticated":true,
  "created":"2016-09-30T21:51:31-05:00",
  "expires":"2016-10-01T00:14:51-05:00",
  "remainingUses":10,
  "postit":"f61256c53bf3744185de4ac6c0c839b4",
  "noauth":false,
  "url":"https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org//home/nryan/picksumipsum.txt",
  "method":"GET",
  "_links":{
    "self":{
      "href":"https://sandbox.agaveplatform.org/postits/v2/f61256c53bf3744185de4ac6c0c839b4"
    },
    "profile":{
      "href":"https://sandbox.agaveplatform.org/profiles/v2/nryan"
    },
    "file":{
      "href":"https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org//home/nryan/picksumipsum.txt"
    }
  }
}

To create a PostIt, send a POST request to the PostIts service with the target url you want to share. In this example, we are sharing a file we have in Agave’s cloud storage account.

In the response you see standard fields such as the created timestamp and the postit token. You also see several fields that lead into the discussion of another aspect of PostIts: the ability to restrict usage and expire them on demand.

Restricting PostIt usage

When creating a PostIt, you have the ability to limit the lifespan, number of uses, and HTTP method used to connect to the target resource. The following table shows the fields available for this purpose. Not specifying any of these fields results in a single-use PostIt that remains valid for 1 calendar month.

Attribute	Type	Description
maxUses	int	The maximum number of times the postit may be redeemed. Defaults to 1.
lifetime	int	The maximum lifetime in seconds over which the postit may be redeemed. Defaults to 1 month.
method	GET,POST,PUT,DELETE	The HTTP method to be used to request the target resource when redeeming a postit. Defaults to GET
noauth	boolean	Whether the request to the target resource should skip authentication. Defaults to false.
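
As a small illustration of how these restrictions travel with the request, the Python sketch below builds the form-encoded body used when creating a PostIt. The postit_form helper is hypothetical. It uses the field names from the curl example in this section (url, maxUses, lifetime, method) and omits any field you leave unset so that the service defaults apply.

```python
from urllib.parse import parse_qs, urlencode


def postit_form(target_url, max_uses=None, lifetime=None, method=None):
    """Build a form-encoded PostIt request body (hypothetical helper)."""
    fields = {"url": target_url}
    # Only send restrictions that were explicitly requested
    if max_uses is not None:
        fields["maxUses"] = max_uses
    if lifetime is not None:
        fields["lifetime"] = lifetime
    if method is not None:
        fields["method"] = method
    return urlencode(fields)


body = postit_form(
    "https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/picksumipsum.txt",
    max_uses=10,
    lifetime=3600,
    method="GET",
)
print(body)
# Decode the body again to confirm what would be sent
print(parse_qs(body))
```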

Listing Active PostIts

Listing active PostIts

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
    'https://sandbox.agaveplatform.org/postits/v2/?pretty=true'

postits-list -v

Should result in something similar to the following:

[
  {
    "creator":"nryan",
    "internalUsername":null,
    "authenticated":true,
    "created":"2016-09-30T21:51:31-05:00",
    "expires":"2016-10-01T00:14:51-05:00",
    "remainingUses":10,
    "postit":"f61256c53bf3744185de4ac6c0c839b4",
    "noauth":false,
    "url":"https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org//home/nryan/picksumipsum.txt",
    "method":"GET",
    "_links":{
      "self":{
        "href":"https://sandbox.agaveplatform.org/postits/v2/f61256c53bf3744185de4ac6c0c839b4"
      },
      "profile":{
        "href":"https://sandbox.agaveplatform.org/profiles/v2/nryan"
      },
      "file":{
        "href":"https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org//home/nryan/picksumipsum.txt"
      }
    }
  }
]

Redeeming PostIts

Redeeming a PostIt

curl -s -o picksumipsum.txt 'https://sandbox.agaveplatform.org/postits/v2/f61256c53bf3744185de4ac6c0c839b4'

Which would download the picksumipsum.txt file from your storage system.

You redeem a PostIt by making a non-authenticated HTTP request on the PostIt URL. In the above example, that would be https://sandbox.agaveplatform.org/postits/v2/f61256c53bf3744185de4ac6c0c839b4. Every time you make a GET request on the PostIt, the remainingUses field decrements by 1. This continues until the value hits 0 or the PostIt outlives its expires field.
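
The two stopping conditions can be modeled on the client side with a short Python sketch. The is_redeemable helper is hypothetical and only inspects the remainingUses and expires fields of a PostIt description you have already fetched. The service remains the authority on whether a redemption succeeds.

```python
from datetime import datetime


def is_redeemable(postit, now):
    """Guess whether a PostIt can still be redeemed (hypothetical helper)."""
    # expires is an ISO 8601 timestamp with a UTC offset
    expires = datetime.fromisoformat(postit["expires"])
    return postit["remainingUses"] > 0 and now < expires


postit = {"remainingUses": 10, "expires": "2016-10-01T00:14:51-05:00"}
before_expiry = datetime.fromisoformat("2016-09-30T22:00:00-05:00")
after_expiry = datetime.fromisoformat("2016-10-01T01:00:00-05:00")

print(is_redeemable(postit, before_expiry))
print(is_redeemable(postit, after_expiry))
print(is_redeemable({**postit, "remainingUses": 0}, before_expiry))
```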

Forcing PostIt browser downloads

If you are using PostIts in a browser environment, you can force a file download by adding force=true to the PostIt URL query. If the target URL is a file item, the name of the file item will be included in the Content-Disposition header so the downloaded file has the correct file name. You may also add the same query parameter to any target file item to force the Content-Disposition header from the Files API.

Expiring PostIts

Manually expiring a PostIt

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
    -X DELETE \
    'https://sandbox.agaveplatform.org/postits/v2/f61256c53bf3744185de4ac6c0c839b4?pretty=true'

postits-delete f61256c53bf3744185de4ac6c0c839b4

Which will result in an empty response from the server.

In addition to setting expiration parameters when you create a PostIt, you can manually expire a PostIt at any time by making an authenticated DELETE request on the PostIt URL. This will instantly expire the PostIt from further use and remove it from your listing results.

Metadata

 /$$      /$$             /$$
| $$$    /$$$            | $$
| $$$$  /$$$$  /$$$$$$  /$$$$$$    /$$$$$$
| $$ $$/$$ $$ /$$__  $$|_  $$_/   |____  $$
| $$  $$$| $$| $$$$$$$$  | $$      /$$$$$$$
| $$\  $ | $$| $$_____/  | $$ /$$ /$$__  $$
| $$ \/  | $$|  $$$$$$$  |  $$$$/|  $$$$$$$
|__/     |__/ \_______/   \___/   \_______/

The Agave Metadata service allows you to manage metadata and associate it with Agave entities via associated UUIDs. It supports JSON schema for structured JSON metadata; it also accepts any valid JSON-formatted metadata or plain text String when no schema is specified. As with other Agave services, a full access control layer is available, enabling you to keep your metadata private or share it with your colleagues.

Metadata Structure

Key-value metadata item

{
  "name": "some metadata",
  "value": "A model organism..."
}

Structured metadata item, metadata.json

{
  "name":"some metadata",
  "value":{
    "title":"Example Metadata",
    "properties":{
      "species":"arabidopsis",
      "description":"A model organism..."
    }
  }
}

Every metadata item has four fields shown in the following table.

Field name	Type	Description
name	string; 1-256	`required` A non-unique key you can use to reference and group your metadata.
value	json or string; 0-5M	The value of the metadata item: plain text or a JSON object or array.
associationIds	array;	A JSON array of zero or more UUIDs with which this metadata item should be associated.
schemaId	string;	The id of a valid Agave metadata schema object representing the JSON Schema definition used to validate this metadata item.

The name field is just that, a user-defined name you give to your metadata item. There is no uniqueness constraint on the name field, so it is up to your application to enforce whatever naming policy it sees fit.

Depending on your application needs, you may use the Metadata service as a key-value store, document store, or both. When using it as a key-value store, you provide text for the value field. When fetching data, you can search by exact value or by full-text search as needed.

When using the Metadata service as a document store, you provide a JSON object or array for the value field. In this use case you can leverage additional functionality such as structured queries, atomic updates, etc.

Either use case is acceptable and fully supported. Your application needs will determine the best approach for you to take.

Associations

Each metadata item also has an optional associationIds field. This field contains a JSON array of the Agave UUIDs to which this metadata item applies. This provides a convenient grouping mechanism by which to organize logically related resources. One common example is creating a metadata item to represent a “data collection” and associating files and folders that may be geographically distributed under that “data collection”. Another is creating a metadata item to represent a “project”, then sharing the “project” with other users involved in the “project”.

Metadata items can also be associated with other metadata items to create hierarchical relationships. Building on the “project” example, additional metadata items could be created for “links”, “videos”, and “experiments” to hold references for categorized groups of postits, video file items, and jobs respectively. Such a model translates well to a user interface layer and eliminates a large amount of boilerplate code in your application.

The associationIds field does not carry with it any special permissions or behavior. It is simply a link between a metadata item and the resources it represents.

Creating metadata

Create a new metadata item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST \
    -H 'Content-Type: application/json' \
    --data-binary '{"value": {"title": "Example Metadata", "properties": {"species": "arabidopsis", "description": "A model organism..."}}, "name": "some metadata"}' \
    https://sandbox.agaveplatform.org/meta/v2/data

metadata-addupdate -v -F - <<<'{"value": {"title": "Example Metadata", "properties": {"species": "arabidopsis", "description": "A model organism..."}}, "name": "some metadata"}'

The response will look something like the following:

{
  "uuid": "7341557475441971686-242ac11f-0001-012",
  "owner": "nryan",
  "schemaId": null,
  "internalUsername": null,
  "associationIds": [],
  "lastUpdated": "2016-08-29T04:49:34.532-05:00",
  "name": "some metadata",
  "value": {
    "title": "Example Metadata",
    "properties": {
      "species": "arabidopsis",
      "description": "A model organism..."
    }
  },
  "created": "2016-08-29T04:49:34.532-05:00",
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012"
    },
    "permissions": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012/pems"
    },
    "owner": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
    }
  }
}

New metadata items are created via a POST to the metadata collection URL. As mentioned before, there is no uniqueness constraint placed on metadata items. Thus, repeatedly POSTing the same metadata item to the service will create duplicate entries, each with its own unique UUID assigned by the service.

Updating metadata

Update a metadata item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST \
    -H 'Content-Type: application/json' \
    --data-binary '{"value": {"title": "Example Metadata", "properties": {"species": "arabidopsis", "description": "A model plant organism..."}}, "name": "some metadata", "associationIds":["179338873096442342-242ac113-0001-002","6608339759546166810-242ac114-0001-007"]}' \
    https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012

metadata-addupdate -v -F - 7341557475441971686-242ac11f-0001-012 <<<'{"value": {"title": "Example Metadata", "properties": {"species": "arabidopsis", "description": "A model plant organism..."}}, "name": "some metadata", "associationIds":["179338873096442342-242ac113-0001-002","6608339759546166810-242ac114-0001-007"]}'

The response will look something like the following:

{
  "uuid": "7341557475441971686-242ac11f-0001-012",
  "schemaId": null,
  "internalUsername": null,
  "associationIds": [
    "179338873096442342-242ac113-0001-002",
    "6608339759546166810-242ac114-0001-007"
  ],
  "lastUpdated": "2016-08-29T05:51:39.908-05:00",
  "name": "some metadata",
  "value": {
    "title": "Example Metadata",
    "properties": {
      "species": "arabidopsis",
      "description": "A model plant organism..."
    }
  },
  "created": "2016-08-29T05:43:18.618-05:00",
  "owner": "nryan",
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012"
    },
    "permissions": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012/pems"
    },
    "owner": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
    },
    "associationIds": [
      {
        "rel": "179338873096442342-242ac113-0001-002",
        "href": "https://sandbox.agaveplatform.org/files/v2/media/system/storage.example.com//",
        "title": "file"
      },
      {
        "rel": "6608339759546166810-242ac114-0001-007",
        "href": "https://sandbox.agaveplatform.org/jobs/v2/6608339759546166810-242ac114-0001-007",
        "title": "job"
      }
    ]
  }
}

Updating metadata is done by POSTing an updated metadata object to the existing resource. When updating, it is important to note that it is not possible to change the metadata uuid, owner, lastUpdated or created fields. Those fields are managed by the service.

Deleting metadata

Delete a metadata item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -X DELETE \
    https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012

metadata-delete 7341557475441971686-242ac11f-0001-012

An empty response will be returned from the service.

To delete a metadata item, simply make a DELETE request on the metadata resource.

Metadata details

Fetching a metadata item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012

metadata-list -v 7341557475441971686-242ac11f-0001-012

The response will look something like the following:

{
  "uuid":"7341557475441971686-242ac11f-0001-012",
  "schemaId":null,
  "internalUsername":null,
  "associationIds":[
    "179338873096442342-242ac113-0001-002",
    "6608339759546166810-242ac114-0001-007"
  ],
  "lastUpdated":"2016-08-29T05:51:39.908-05:00",
  "name":"some metadata",
  "value":{
    "title":"Example Metadata",
    "properties":{
      "species":"arabidopsis",
      "description":"A model plant organism..."
    }
  },
  "created":"2016-08-29T05:43:18.618-05:00",
  "owner":"nryan",
  "_links":{
    "self":{
      "href":"https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012"
    },
    "permissions":{
      "href":"https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012/pems"
    },
    "owner":{
      "href":"https://sandbox.agaveplatform.org/profiles/v2/nryan"
    },
    "associationIds":[
      {
        "rel":"179338873096442342-242ac113-0001-002",
        "href":"https://sandbox.agaveplatform.org/files/v2/media/system/storage.example.com//",
        "title":"file"
      },
      {
        "rel":"6608339759546166810-242ac114-0001-007",
        "href":"https://sandbox.agaveplatform.org/jobs/v2/6608339759546166810-242ac114-0001-007",
        "title":"job"
      }
    ]
  }
}

To fetch a detailed description of a metadata item, make a GET request on the resource URL. The response will be the full metadata item representation. There are two points of interest in the example response. The first is that the response does not have an id field. Instead, it has a uuid field which serves as its ID. This is the result of backward-compatibility support for legacy consumers and will be changed in the next major release.

The second point of interest in the response is the _links.associationIds array in the hypermedia response. This contains an expanded representation of the associationIds field in the body. The objects in this array are similar to the information you would receive by calling the UUID API to resolve each of the associationIds array values. By leveraging the information in the hypermedia response, you can save several round trips to resolve basic information about the resources the associationIds represent.
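
As a minimal sketch of using the hypermedia links this way, the Python snippet below reads _links.associationIds from an already-parsed metadata response and maps each associated UUID to its resource type and URL. The resolve_associations helper is hypothetical, and the sample item is a trimmed copy of the response shown above.

```python
def resolve_associations(item):
    """Map each associated UUID to (resource type, URL) using _links (hypothetical helper)."""
    links = item.get("_links", {}).get("associationIds", [])
    # rel holds the associated UUID, title the resource type, href its URL
    return {link["rel"]: (link["title"], link["href"]) for link in links}


item = {
    "uuid": "7341557475441971686-242ac11f-0001-012",
    "associationIds": [
        "179338873096442342-242ac113-0001-002",
        "6608339759546166810-242ac114-0001-007",
    ],
    "_links": {
        "associationIds": [
            {
                "rel": "179338873096442342-242ac113-0001-002",
                "href": "https://sandbox.agaveplatform.org/files/v2/media/system/storage.example.com//",
                "title": "file",
            },
            {
                "rel": "6608339759546166810-242ac114-0001-007",
                "href": "https://sandbox.agaveplatform.org/jobs/v2/6608339759546166810-242ac114-0001-007",
                "title": "job",
            },
        ]
    },
}

associations = resolve_associations(item)
for uuid, (kind, href) in associations.items():
    print(kind, uuid, href)
```

No extra request is needed to learn that the second association is a job and where to find it.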

Metadata browsing

Listing your metadata

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    'https://sandbox.agaveplatform.org/meta/v2/data?limit=1'

metadata-list -v -l 1

The response will look something like the following:

[
  {
    "uuid": "7341557475441971686-242ac11f-0001-012",
    "schemaId": null,
    "internalUsername": null,
    "associationIds": [
      "179338873096442342-242ac113-0001-002",
      "6608339759546166810-242ac114-0001-007"
    ],
    "lastUpdated": "2016-08-29T05:51:39.908-05:00",
    "name": "some metadata",
    "value": {
      "title": "Example Metadata",
      "properties": {
        "species": "arabidopsis",
        "description": "A model plant organism..."
      }
    },
    "created": "2016-08-29T05:43:18.618-05:00",
    "owner": "nryan",
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012"
      },
      "permissions": {
        "href": "https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012/pems"
      },
      "owner": {
        "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
      },
      "associationIds": [
        {
          "rel": "179338873096442342-242ac113-0001-002",
          "href": "https://sandbox.agaveplatform.org/files/v2/media/system/storage.example.com//",
          "title": "file"
        },
        {
          "rel": "6608339759546166810-242ac114-0001-007",
          "href": "https://sandbox.agaveplatform.org/jobs/v2/6608339759546166810-242ac114-0001-007",
          "title": "job"
        }
      ]
    }
  }
]

To browse your metadata, make a GET request against the /meta/v2/data collection. This will return all the metadata you created or to which you have been granted READ access. This includes any metadata items that have been shared with the public or world users. In practice, users will have many metadata items created and shared with them as part of normal use of the platform, so pagination and search become important aspects of interacting with the service.

For admins, who have implicit access to all metadata, the default listing response will be a paginated list of every metadata item in the tenant. To avoid such a scenario, admin users can append privileged=false to the request to bypass implicit permissions and return only the metadata items they own or to which they have been granted explicit access.

Metadata Validation

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST \
    -H 'Content-Type: application/json' \
    --data-binary '{"schemaId": "4736020169528054246-242ac11f-0001-013", "value": {"title": "Example Metadata", "properties": {"description": "A model organism..."}}, "name": "some metadata"}' \
    https://sandbox.agaveplatform.org/meta/v2/data

metadata-addupdate -v -F - <<<'{"schemaId": "4736020169528054246-242ac11f-0001-013", "value": {"title": "Example Metadata", "properties": {"description": "A model organism..."}}, "name": "some metadata"}'

The response will look something like the following:

{
  "status" : "error",
  "message" : "Metadata value does not conform to schema.",
  "version" : "2.1.8-r8bb7e86"
}

It is often necessary to validate metadata for format or simple quality control. The Metadata service is capable of validating the value of a metadata item against a predefined JSON Schema definition. To use this feature, you must first register your JSON Schema definition with the Metadata Schemata service, then reference the UUID of that metadata schema resource in the schemaId field.

Given our previous example metadata schema object, the request above would fail due to a missing “species” value in the metadata item value field.

Metadata Searching

Searching metadata for all items with name like “mustard plant”

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -G --data-urlencode 'q={"name": "mustard plant"}' \
    https://sandbox.agaveplatform.org/meta/v2/data

metadata-list -v -Q '{"name":"mustard+plant"}'

The response will look something like the following:

[
  {
    "uuid": "7341557475441971686-242ac11f-0001-012",
    "schemaId": null,
    "internalUsername": null,
    "associationIds": [
      "179338873096442342-242ac113-0001-002",
      "6608339759546166810-242ac114-0001-007"
    ],
    "lastUpdated": "2016-08-29T05:51:39.908-05:00",
    "name": "some metadata",
    "value": {
      "title": "Example Metadata",
      "properties": {
        "species": "arabidopsis",
        "description": "A model plant organism..."
      }
    },
    "created": "2016-08-29T05:43:18.618-05:00",
    "owner": "nryan",
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012"
      },
      "permissions": {
        "href": "https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012/pems"
      },
      "owner": {
        "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
      },
      "associationIds": [
        {
          "rel": "179338873096442342-242ac113-0001-002",
          "href": "https://sandbox.agaveplatform.org/files/v2/media/system/storage.example.com//",
          "title": "file"
        },
        {
          "rel": "6608339759546166810-242ac114-0001-007",
          "href": "https://sandbox.agaveplatform.org/jobs/v2/6608339759546166810-242ac114-0001-007",
          "title": "job"
        }
      ]
    }
  }
]

In addition to retrieving Metadata via its UUID, the Metadata service supports MongoDB query syntax. Just add q=<value> to the URL query portion of your GET request on the metadata collection. This differs from other APIs, but provides a richer syntax to query and filter responses.

If you wanted to look up Metadata corresponding to a specific value within its JSON Metadata value, you can specify this using a JSON object such as {"name": "mustard plant"}. Remember that, in order to send JSON in a URL query string, it must first be URL encoded. Luckily this is easily handled for us by curl and the Agave CLI.

The given query will return all metadata with the name “mustard plant” that you have permission to access.
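
If you are building the request yourself rather than relying on curl or the CLI, the encoding step can be sketched as follows. This is a minimal illustration using only the Python standard library; the helper name search_url is hypothetical, and the base URL is the sandbox address used in the examples above.

```python
import json
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Sandbox metadata collection used throughout these examples.
BASE_URL = "https://sandbox.agaveplatform.org/meta/v2/data"


def search_url(query):
    """Serialize a MongoDB-style query and place it in the q parameter."""
    # quote_via=quote encodes spaces as %20 instead of "+".
    encoded = urlencode({"q": json.dumps(query)}, quote_via=quote)
    return BASE_URL + "?" + encoded


url = search_url({"name": "mustard plant"})
print(url)

# Decoding the query string recovers the original JSON object.
decoded = json.loads(parse_qs(urlsplit(url).query)["q"][0])
print(decoded)
```

The resulting URL can be sent as a GET request with the usual Authorization header.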

Search Examples

metadata search by exact name

{"name": "mustard plant"}

metadata search by field in value

{"value.type": "a plant"}

metadata search for values with any field matching an item in the given array

{ "value.profile.status": { "$in": [ "active", "paused" ] } }

metadata search for items with a name matching a case-insensitive regex

{ "name": { "$regex": "^Cactus.*", "$options": "i"}}

metadata search for value by regex matched against each line of a value

{ "value.description": { "$regex": ".*monocots.*", "$options": "m"}}

metadata search for value by conditional queries

{
   "$or":[
      {
         "value.description":{
            "$regex":".*(prickly pear|agave|century).*",
            "$options":"i"
         }
      },
      {
         "value.title":{
            "$regex":".*Cactus$"
         },
         "value.order":{
            "$regex":"Agavoideae"
         }
      }
   ]
}

The examples above show some common search syntax. Consult the MongoDB Query Documentation for more examples and full syntax documentation.
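
To get a feel for what the "i" and "m" options change, the patterns can be tried locally before sending a query. The sketch below uses Python's re module as a stand-in. MongoDB evaluates $regex with its own regular expression engine, so treat this only as an approximation; the FLAGS mapping, the matches helper, and the sample strings are assumptions made for illustration.

```python
import re

# Assumed mapping from the "$options" letters used above to Python flags.
FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE}


def matches(pattern, options, text):
    """Return True if the pattern is found in text under the given options."""
    flags = 0
    for letter in options:
        flags |= FLAGS[letter]
    return re.search(pattern, text, flags) is not None


# "i" makes the match case-insensitive.
case_insensitive = matches("^Cactus.*", "i", "cactus garden")
case_sensitive = matches("^Cactus.*", "", "cactus garden")

# "m" lets ^ and $ anchor at the start and end of every line.
two_lines = "Order: Asparagales\nmonocots, mostly"
multi_line = matches("^monocots", "m", two_lines)
single_line = matches("^monocots", "", two_lines)

print(case_insensitive, case_sensitive, multi_line, single_line)
```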

Metadata Permissions

The Metadata service supports permissions for both Metadata and Schemata consistent with that of a number of other Agave services. If no permissions are explicitly set, only the owner of the Metadata and tenant administrators can access it.

The permissions available for Metadata and Metadata Schemata are listed in the following table. Please note that a user must have WRITE permissions to grant or revoke permissions on a metadata or schema item.

Name	Description
READ	User can view the resource
WRITE	User can edit, but not view the resource
READ_WRITE	User can manage the resource
ALL	User can manage the resource
NONE	User cannot view or edit the resource

Listing all permissions

List all permissions on a Metadata item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012/pems

metadata-pems-list 7341557475441971686-242ac11f-0001-012

The response will look something like the following:

[
  {
    "username": "nryan",
    "permission": {
      "read": true,
      "write": true
    },
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/meta/v2/7341557475441971686-242ac11f-0001-012/pems/nryan"
      },
      "parent": {
        "href": "https://sandbox.agaveplatform.org/meta/v2/7341557475441971686-242ac11f-0001-012"
      },
      "profile": {
        "href": "https://sandbox.agaveplatform.org/meta/v2/nryan"
      }
    }
  }
]

To list all permissions for a metadata item, make a GET request on the metadata item’s permission collection.

List permissions for a specific user

List the permissions on Metadata for a given user

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012/pems/nryan

metadata-pems-list -u nryan \
    7341557475441971686-242ac11f-0001-012

The response will look something like the following:

{
  "username":"nryan",
  "permission":{
    "read":true,
    "write":true
  },
  "_links":{
    "self":{
      "href":"https://sandbox.agaveplatform.org/meta/v2/7341557475441971686-242ac11f-0001-012/pems/nryan"
    },
    "parent":{
      "href":"https://sandbox.agaveplatform.org/meta/v2/7341557475441971686-242ac11f-0001-012"
    },
    "profile":{
      "href":"https://sandbox.agaveplatform.org/meta/v2/nryan"
    }
  }
}

Checking permissions for a single user is simply a matter of appending that user’s username to the URL of the metadata permission collection.

Grant permissions

Grant read access to a metadata item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST \
    -H "Content-Type: application/json" \
    --data '{"permission":"READ"}' \
    https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012/pems/rclemens

metadata-pems-addupdate -u rclemens \
    -p READ 7341557475441971686-242ac11f-0001-012

Grant read and write access to a metadata item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST \
    -H "Content-Type: application/json" \
    --data '{"permission":"READ_WRITE"}' \
    https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012/pems/rclemens

metadata-pems-addupdate -u rclemens \
    -p READ_WRITE 7341557475441971686-242ac11f-0001-012

The response will look something like the following:

{
  "username": "rclemens",
  "permission": {
    "read": true,
    "write": true
  },
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/7341557475441971686-242ac11f-0001-012/pems/rclemens"
    },
    "parent": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/7341557475441971686-242ac11f-0001-012"
    },
    "profile": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/jstubbs"
    }
  }
}

To grant another user read access to your metadata item, assign them READ permission. To enable another user to update a metadata item, grant them READ_WRITE or ALL access.
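
For scripted use, the grant request can be assembled as in the following sketch. It only builds the request object and never sends it, so it runs offline. The helper grant_request and the token value are hypothetical; the permission names come from the table above and the URL layout from the curl examples.

```python
import json
from urllib.request import Request

# Permission names listed in the table above.
PERMISSIONS = {"READ", "WRITE", "READ_WRITE", "ALL", "NONE"}
BASE_URL = "https://sandbox.agaveplatform.org/meta/v2/data"


def grant_request(uuid, username, permission, token):
    """Build (but do not send) a POST that sets one user's permission."""
    if permission not in PERMISSIONS:
        raise ValueError("unknown permission: " + permission)
    return Request(
        "{}/{}/pems/{}".format(BASE_URL, uuid, username),
        data=json.dumps({"permission": permission}).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = grant_request(
    "7341557475441971686-242ac11f-0001-012", "rclemens", "READ", "my-token"
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it; omitted so the sketch stays offline.
```

Rejecting unknown permission names before the request is built catches typos such as "READWRITE" without a round trip to the service.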

Delete single user permissions

Delete permission for single user on a Metadata item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -X DELETE \
    https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012/pems/rclemens

metadata-pems-delete -u rclemens 7341557475441971686-242ac11f-0001-012

An empty response will come back from the API.

Permissions may be deleted for a single user by making a DELETE request on the metadata user permission resource. This will immediately revoke all permissions to the metadata item for that user.

Deleting all permissions

Delete all permissions on a Metadata item

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -X DELETE \
    https://sandbox.agaveplatform.org/meta/v2/data/7341557475441971686-242ac11f-0001-012/pems

metadata-pems-delete 7341557475441971686-242ac11f-0001-012

An empty response will be returned from the service.

All permissions on a metadata item may be deleted at once by making a DELETE request on the metadata resource permission collection.

Metadata Schemata

  /$$$$$$          /$$
 /$$__  $$        | $$
| $$  \__/ /$$$$$$| $$$$$$$  /$$$$$$ /$$$$$$/$$$$  /$$$$$$
|  $$$$$$ /$$_____| $$__  $$/$$__  $| $$_  $$_  $$|____  $$
 \____  $| $$     | $$  \ $| $$$$$$$| $$ \ $$ \ $$ /$$$$$$$
 /$$  \ $| $$     | $$  | $| $$_____| $$ | $$ | $$/$$__  $$
|  $$$$$$|  $$$$$$| $$  | $|  $$$$$$| $$ | $$ | $|  $$$$$$$
 \______/ \_______|__/  |__/\_______|__/ |__/ |__/\_______/

Schemata can be provided in JSON Schema form. The service will validate that the schema is valid JSON and store it. To validate Metadata against it, the schema UUID should be given as a parameter, schemaId, when uploading Metadata. If no schemaId is provided, the Metadata service will accept any JSON Object or plain text string and store it accordingly. This approach gives Agave a high degree of flexibility in handling structured and unstructured metadata alike.

For more on JSON Schema please see http://json-schema.org/

To add a metadata schema to the repository:

Creating schemata

Example JSON Schema document, schema.json

{
  "title": "Example Schema",
  "type": "object",
  "properties": {
    "species": {
      "type": "string"
    }
  },
  "required": [
    "species"
  ]
}

Creating a new metadata schema

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -X POST -H "Content-Type: application/json" \
    --data-binary '{ "title": "Example Schema", "type": "object", "properties": { "species": { "type": "string" } },"required": ["species"] }' \
    https://sandbox.agaveplatform.org/meta/v2/schemas/

metadata-schema-addupdate -v -F schema.json

The response will look something like the following:

{
  "uuid": "4736020169528054246-242ac11f-0001-013",
  "internalUsername": null,
  "lastUpdated": "2016-08-29T04:52:11.474-05:00",
  "schema": {
    "title": "Example Schema",
    "type": "object",
    "properties": {
      "species": {
        "type": "string"
      }
    },
    "required": [
      "species"
    ]
  },
  "created": "2016-08-29T04:52:11.474-05:00",
  "owner": "nryan",
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/schemas/4736020169528054246-242ac11f-0001-013"
    },
    "permissions": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/schemas/4736020169528054246-242ac11f-0001-013/pems"
    },
    "owner": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
    }
  }
}

To create a new metadata schema that can be used to validate metadata items upon addition or updating, POST a JSON Schema document to the service.

More JSON Schema examples can be found in the Agave Samples project.

Updating schema

Update a metadata schema

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST \
    -H 'Content-Type: application/json' \
    --data-binary '{ "title": "Example Schema", "type": "object", "properties": { "species": { "type": "string" }, "description": {"type":"string"} },"required": ["species"] }' \
    https://sandbox.agaveplatform.org/meta/v2/schemas/4736020169528054246-242ac11f-0001-013

metadata-schema-addupdate -v -F - 4736020169528054246-242ac11f-0001-013 <<< '{ "title": "Example Schema", "type": "object", "properties": { "species": { "type": "string" }, "description": {"type":"string"} },"required": ["species"] }'

The response will look something like the following:

{
  "uuid": "4736020169528054246-242ac11f-0001-013",
  "internalUsername": null,
  "lastUpdated": "2016-08-29T04:52:11.474-05:00",
  "schema": {
    "title": "Example Schema",
    "type": "object",
    "properties": {
      "species": {
        "type": "string"
      }
    },
    "required": [
      "species"
    ]
  },
  "created": "2016-08-29T04:52:11.474-05:00",
  "owner": "nryan",
  "_links": {
    "self": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/schemas/4736020169528054246-242ac11f-0001-013"
    },
    "permissions": {
      "href": "https://sandbox.agaveplatform.org/meta/v2/schemas/4736020169528054246-242ac11f-0001-013/pems"
    },
    "owner": {
      "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
    }
  }
}

Updating a metadata schema is done by POSTing an updated schema document to the existing schema resource. When updating, note that it is not possible to change the schema uuid, owner, lastUpdated, or created fields. Those fields are managed by the service.

Deleting schema

Delete a metadata schema

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -X DELETE \
    https://sandbox.agaveplatform.org/meta/v2/schemas/4736020169528054246-242ac11f-0001-013

metadata-schema-delete 4736020169528054246-242ac11f-0001-013

An empty response will be returned from the service.

To delete a metadata schema, simply make a DELETE request on the metadata schema resource.

Specifying schemata as $ref

When building new JSON Schema definitions, it is often helpful to break each object out into its own definition and use $ref fields to reference them. The metadata service supports such references between metadata schema resources. Simply provide the fully qualified URL of another valid metadata schema resource as the value of a $ref field. Agave will resolve the reference internally, applying the requesting user’s authentication and authorization when it fetches the referenced resource.
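
A schema that refers to another registered schema might be assembled as in the sketch below. The outer schema, its "organism" property, and the ref_to helper are hypothetical; the referenced UUID is the example schema registered earlier, and the URL follows the schema resource links shown in the responses above.

```python
import json

SCHEMAS_URL = "https://sandbox.agaveplatform.org/meta/v2/schemas"


def ref_to(schema_uuid):
    """Return a $ref object pointing at a registered metadata schema."""
    return {"$ref": "{}/{}".format(SCHEMAS_URL, schema_uuid)}


# Hypothetical outer schema whose "organism" property must conform
# to the example schema registered earlier.
sample_schema = {
    "title": "Sample Schema",
    "type": "object",
    "properties": {
        "organism": ref_to("4736020169528054246-242ac11f-0001-013"),
    },
    "required": ["organism"],
}

document = json.dumps(sample_schema, indent=2)
print(document)
```

The printed document can be saved to a file and registered in the same way as schema.json.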

Monitors

 /$$      /$$                  /$$  /$$
| $$$    /$$$                 |__/ | $$
| $$$$  /$$$$ /$$$$$$ /$$$$$$$ /$$/$$$$$$   /$$$$$$  /$$$$$$
| $$ $$/$$ $$/$$__  $| $$__  $| $|_  $$_/  /$$__  $$/$$__  $$
| $$  $$$| $| $$  \ $| $$  \ $| $$ | $$   | $$  \ $| $$  \__/
| $$\  $ | $| $$  | $| $$  | $| $$ | $$ /$| $$  | $| $$
| $$ \/  | $|  $$$$$$| $$  | $| $$ |  $$$$|  $$$$$$| $$
|__/     |__/\______/|__/  |__|__/  \___/  \______/|__/

The Agave Monitors API provides a familiar paradigm for monitoring the usability and accessibility of storage and execution systems you have registered with Agave. Similar to services like Pingdom, Pagerduty, and WebCron, the Monitors API allows you to create regular health checks on a registered system. Unlike standard uptime services, Agave will check that your system is responsive and accessible by performing proactive tests on availability (ping), accessibility (authentication), and functionality (listing or echo). Each check result is persisted, and the check history of a given monitor is queryable through the API. As with all resources in the Agave Platform, a full event model is available so you can subscribe to the events you care about, such as failed checks, restored system availability, and system disablement.

Creating Monitors

Create a new default monitor

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
     -H "Content-Type: application/json" \
     -X POST --data-binary '{"target": "storage.example.com"}' \
     https://sandbox.agaveplatform.org/monitors/v2/

monitors-addupdate -S storage.example.com

The response will look something like the following:

{
    "active": true,
    "created": "2016-06-03T17:22:59.000-05:00",
    "frequency": 60,
    "id": "5024717285821443610-242ac11f-0001-014",
    "internalUsername": null,
    "lastCheck": null,
    "lastSuccess": null,
    "lastUpdated": "2016-06-03T17:22:59.000-05:00",
    "nextUpdate": "2016-06-03T18:22:59.000-05:00",
    "owner": "nryan",
    "target": "storage.example.com",
    "updateSystemStatus": false,
    "_links": {
        "checks": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014/checks"
        },
        "notifications": {
            "href": "https://sandbox.agaveplatform.org/notifications/v2/?associatedUuid=5024717285821443610-242ac11f-0001-014"
        },
        "owner": {
            "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014"
        },
        "system": {
            "href": "https://sandbox.agaveplatform.org/systems/v2/storage.example.com"
        }
    }
}

The only piece of information needed to monitor a system is the system ID. Sending a POST request to the Monitors API with a monitor definition containing just the target field, set to a valid system ID or UUID, will create a monitor that runs hourly health checks starting an hour from when you sent the request.

Custom frequency and start time

Create a monitor with a custom frequency

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
     -H "Content-Type: application/json" \
     -X POST --data-binary '{"target": "storage.example.com","frequency":15}' \
     https://sandbox.agaveplatform.org/monitors/v2/

monitors-addupdate -S storage.example.com -I 15

The response will look something like the following:

{
    "_links": {
        "checks": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014/checks"
        },
        "notifications": {
            "href": "https://sandbox.agaveplatform.org/notifications/v2/?associatedUuid=5024717285821443610-242ac11f-0001-014"
        },
        "owner": {
            "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014"
        },
        "system": {
            "href": "https://sandbox.agaveplatform.org/systems/v2/storage.example.com"
        }
    },
    "active": true,
    "created": "2016-06-03T17:22:59.000-05:00",
    "frequency": 15,
    "id": "5024717285821443610-242ac11f-0001-014",
    "internalUsername": null,
    "lastCheck": null,
    "lastSuccess": null,
    "lastUpdated": "2016-06-03T17:22:59.000-05:00",
    "nextUpdate": "2016-06-03T17:37:59.000-05:00",
    "owner": "nryan",
    "target": "storage.example.com",
    "updateSystemStatus": false
}

If you need the monitor to run more or less often, you can customize the frequency and start time of a monitor by including the frequency and startTime fields in your monitor definition. The frequency field sets the number of minutes between checks, as in the example above. The maximum interval you can set for a monitor is one month. The minimum interval varies from tenant to tenant, but is generally no less than 5 minutes.

The startTime field allows you to schedule when you would like Agave to start the monitor on your system. Any date or time expression representing a moment between the current time and one month from then is acceptable. If you do not specify a value for startTime, Agave will add the value of frequency to the current time and use that as the startTime. Setting stop times or “off hours” is not currently supported.
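
The default scheduling rule can be checked against the sample responses with a few lines of code. This sketch assumes the frequency value is a number of minutes, which is what the created and nextUpdate timestamps in the examples suggest; the helper first_check_time is hypothetical.

```python
from datetime import datetime, timedelta


def first_check_time(created, frequency_minutes):
    """Add the monitor frequency (assumed to be minutes) to the creation time."""
    start = datetime.fromisoformat(created)
    first = start + timedelta(minutes=frequency_minutes)
    # Keep the millisecond precision used in the API responses.
    return first.isoformat(timespec="milliseconds")


created = "2016-06-03T17:22:59.000-05:00"
default_monitor = first_check_time(created, 60)
custom_monitor = first_check_time(created, 15)
print(default_monitor)
print(custom_monitor)
```

Both results agree with the nextUpdate values in the sample responses for the default and the 15 minute monitors.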

Automating system status updates

Create a monitor that updates system status on change

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
     -H "Content-Type: application/json" \
     -X POST \
     --data-binary '{"target": "storage.example.com","frequency":15,"updateSystemStatus":true}' \
     https://sandbox.agaveplatform.org/monitors/v2/

monitors-addupdate -S storage.example.com -I 15 -U true

The response will look something like the following:

{
    "active": true,
    "created": "2016-06-03T17:22:59.000-05:00",
    "frequency": 15,
    "id": "5024717285821443610-242ac11f-0001-014",
    "internalUsername": null,
    "lastCheck": null,
    "lastSuccess": null,
    "lastUpdated": "2016-06-03T17:22:59.000-05:00",
    "nextUpdate": "2016-06-03T17:37:59.000-05:00",
    "owner": "nryan",
    "target": "storage.example.com",
    "updateSystemStatus": true,
    "_links": {
        "checks": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014/checks"
        },
        "notifications": {
            "href": "https://sandbox.agaveplatform.org/notifications/v2/?associatedUuid=5024717285821443610-242ac11f-0001-014"
        },
        "owner": {
            "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014"
        },
        "system": {
            "href": "https://sandbox.agaveplatform.org/systems/v2/storage.example.com"
        }
    }
}

In the section on Events and notifications, we cover the ways in which you can get alerted about events pertaining to a monitor. Here we will simply point out that a convenience field, updateSystemStatus, is built into all monitors. Setting this field to true will authorize Agave to update the status of the monitored system based on the result of the monitor checks. This is a convenient way to ensure that the status value in your system description matches the actual operational status of the system.

To automatically update your system status when a monitor changes status, set updateSystemStatus to true in your monitor definition.
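
Hand-written request bodies are easy to get wrong, for example by typing "=" where JSON needs ":". One way to avoid that is to generate the monitor definition from a dictionary, as in this sketch. The helper monitor_definition is hypothetical; the field names target, frequency, and updateSystemStatus are the ones that appear in the request and response examples.

```python
import json


def monitor_definition(target, frequency=None, update_system_status=None):
    """Serialize a monitor definition, omitting fields left at their defaults."""
    definition = {"target": target}
    if frequency is not None:
        definition["frequency"] = frequency
    if update_system_status is not None:
        # json.dumps renders Python booleans as JSON true/false.
        definition["updateSystemStatus"] = bool(update_system_status)
    return json.dumps(definition)


body = monitor_definition(
    "storage.example.com", frequency=15, update_system_status=True
)
minimal = monitor_definition("storage.example.com")
print(body)
print(minimal)
```

The resulting string can be passed to curl with --data-binary or used as the body of any HTTP client request.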

Updating an existing monitor

Update an existing monitor

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
     -H "Content-Type: application/json" \
     -X POST \
     --data-binary '{"target": "storage.example.com","frequency":5,"updateSystemStatus":false}' \
     https://sandbox.agaveplatform.org/monitors/v2/5024717285821443610-242ac11f-0001-014

monitors-addupdate -S storage.example.com -I 5 -U false 5024717285821443610-242ac11f-0001-014

The response will look something like the following:

{
    "active": true,
    "created": "2016-06-03T17:22:59.000-05:00",
    "frequency": 5,
    "id": "5024717285821443610-242ac11f-0001-014",
    "internalUsername": null,
    "lastCheck": null,
    "lastSuccess": null,
    "lastUpdated": "2016-06-03T17:24:59.000-05:00",
    "nextUpdate": "2016-06-03T17:29:59.000-05:00",
    "owner": "nryan",
    "target": "storage.example.com",
    "updateSystemStatus": false,
    "_links": {
        "checks": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014/checks"
        },
        "notifications": {
            "href": "https://sandbox.agaveplatform.org/notifications/v2/?associatedUuid=5024717285821443610-242ac11f-0001-014"
        },
        "owner": {
            "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014"
        },
        "system": {
            "href": "https://sandbox.agaveplatform.org/systems/v2/storage.example.com"
        }
    }
}

Monitors can be managed by making traditional GET, POST, and DELETE operations. When updating a monitor, pay attention to the response because the time of the next check will change. In fact, any change to a monitor will recalculate the time when the next health check will run.

Disabling an existing monitor

Disable an existing monitor

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
    -H "Content-Type: application/json" \
    -X PUT --data-binary '{"action": "disable"}' \
    https://sandbox.agaveplatform.org/monitors/v2/5024717285821443610-242ac11f-0001-014

monitors-disable 5024717285821443610-242ac11f-0001-014

The response will look something like the following:

{
    "active": false,
    "created": "2016-06-03T17:22:59.000-05:00",
    "frequency": 5,
    "id": "5024717285821443610-242ac11f-0001-014",
    "internalUsername": null,
    "lastCheck": null,
    "lastSuccess": null,
    "lastUpdated": "2016-06-03T17:24:59.000-05:00",
    "nextUpdate": "2016-06-03T17:29:59.000-05:00",
    "owner": "nryan",
    "target": "storage.example.com",
    "updateSystemStatus": false,
    "_links": {
        "checks": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014/checks"
        },
        "notifications": {
            "href": "https://sandbox.agaveplatform.org/notifications/v2/?associatedUuid=5024717285821443610-242ac11f-0001-014"
        },
        "owner": {
            "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014"
        },
        "system": {
            "href": "https://sandbox.agaveplatform.org/systems/v2/storage.example.com"
        }
    }
}

There may be times when you need to pause a monitor. If your system has scheduled maintenance periods, you may want to disable the monitor until the maintenance period ends. You can do this by making a PUT request on a monitor with a field named action set to “disable”. While disabled, all health checks will be skipped.
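
The disable and enable calls differ only in the action value, so a script can share one helper for both. The sketch below builds the PUT request without sending it. The helper action_request and the token value are hypothetical; the action values are those used in the curl examples.

```python
import json
from urllib.request import Request

MONITORS_URL = "https://sandbox.agaveplatform.org/monitors/v2"


def action_request(monitor_id, action, token):
    """Build (but do not send) the PUT that enables or disables a monitor."""
    if action not in ("enable", "disable"):
        raise ValueError("action must be 'enable' or 'disable'")
    return Request(
        MONITORS_URL + "/" + monitor_id,
        data=json.dumps({"action": action}).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="PUT",
    )


pause = action_request("5024717285821443610-242ac11f-0001-014", "disable", "my-token")
print(pause.get_method(), pause.full_url, pause.data.decode("utf-8"))
```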

Enabling an existing monitor

Enable an existing monitor

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
    -H "Content-Type: application/json" \
    -X PUT --data-binary '{"action": "enable"}' \
    https://sandbox.agaveplatform.org/monitors/v2/5024717285821443610-242ac11f-0001-014

monitors-enable 5024717285821443610-242ac11f-0001-014

The response will look something like the following:

{
    "active": true,
    "created": "2016-06-03T17:22:59.000-05:00",
    "frequency": 5,
    "id": "5024717285821443610-242ac11f-0001-014",
    "internalUsername": null,
    "lastCheck": null,
    "lastSuccess": null,
    "lastUpdated": "2016-06-03T17:24:59.000-05:00",
    "nextUpdate": "2016-06-03T17:29:59.000-05:00",
    "owner": "nryan",
    "target": "storage.example.com",
    "updateSystemStatus": false,
    "_links": {
        "checks": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014/checks"
        },
        "notifications": {
            "href": "https://sandbox.agaveplatform.org/notifications/v2/?associatedUuid=5024717285821443610-242ac11f-0001-014"
        },
        "owner": {
            "href": "https://sandbox.agaveplatform.org/profiles/v2/nryan"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014"
        },
        "system": {
            "href": "https://sandbox.agaveplatform.org/systems/v2/storage.example.com"
        }
    }
}

To re-enable a disabled monitor, make a PUT request on the monitor with a field named action set to “enable”.

Deleting a monitor

Deleting an existing monitor

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
    -H "Content-Type: application/json" \
    -X DELETE \
    https://sandbox.agaveplatform.org/monitors/v2/5024717285821443610-242ac11f-0001-014

monitors-delete 5024717285821443610-242ac11f-0001-014

An empty response will be returned.

To delete a monitor, simply make a DELETE request on the monitor.

Monitor Checks

Listing past monitor checks

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
    'https://sandbox.agaveplatform.org/monitors/v2/5024717285821443610-242ac11f-0001-014/checks?limit=1'

monitors-checks-list -v -l 1 \
    -M 5024717285821443610-242ac11f-0001-014

The response will look something like the following:

[
    {
        "created": "2016-06-03T17:29:59.000-05:00",
        "id": "4035070921477123610-242ac11f-0001-015",
        "message": null,
        "result": "PASSED",
        "type": "STORAGE",
        "_links": {
            "monitor": {
                "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014"
            },
            "self": {
                "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014/checks/4035070921477123610-242ac11f-0001-015"
            },
            "system": {
                "href": "https://sandbox.agaveplatform.org/systems/v2/storage.example.com"
            }
        }
    }
]

Each instance of a monitor testing a system is called a Check. Monitor Checks are persisted over time and are queryable as a collection of a monitor resource. Monitor checks can be queried by result, timeframe, and type. By default, the last check is injected into a monitor description as the lastCheck field.

Each monitor check has a unique ID and represents a formal, addressable resource in the API. Here we see a typical successful monitor check. Checks will have one of two results: PASSED or FAILED. Successful checks have a result of PASSED and no message. Unsuccessful checks have a result of FAILED and a message describing why they failed.
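
A script that consumes the checks collection might tally results and collect failure messages as sketched below. The summarize helper is hypothetical, and the second record in the sample data (its ID, timestamp, and message) is invented to show the FAILED case; only the first record is taken from the response above.

```python
import json


def summarize(checks_json):
    """Count checks by result and collect (created, message) for failures."""
    counts = {"PASSED": 0, "FAILED": 0}
    failures = []
    for check in json.loads(checks_json):
        result = check["result"]
        counts[result] = counts.get(result, 0) + 1
        if result == "FAILED":
            failures.append((check["created"], check["message"]))
    return counts, failures


sample = json.dumps([
    {
        "id": "4035070921477123610-242ac11f-0001-015",
        "created": "2016-06-03T17:29:59.000-05:00",
        "message": None,
        "result": "PASSED",
    },
    {
        # Hypothetical failed check, invented for illustration.
        "id": "hypothetical-failed-check",
        "created": "2016-06-03T17:34:59.000-05:00",
        "message": "hypothetical failure message",
        "result": "FAILED",
    },
])

counts, failures = summarize(sample)
print(counts)
print(failures)
```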

Searching check history

Searching check history for a monitor

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
    'https://sandbox.agaveplatform.org/monitors/v2/5024717285821443610-242ac11f-0001-014/checks?limit=1&result.eq=PASSED'

monitors-checks-search -v -l 1 \
    -M 5024717285821443610-242ac11f-0001-014 \
    result.eq=PASSED

The response will look something like the following:

[
    {
        "created": "2016-06-03T17:29:59.000-05:00",
        "id": "4035070921477123610-242ac11f-0001-015",
        "message": null,
        "result": "PASSED",
        "type": "STORAGE",
        "_links": {
            "monitor": {
                "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014"
            },
            "self": {
                "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014/checks/4035070921477123610-242ac11f-0001-015"
            },
            "system": {
                "href": "https://sandbox.agaveplatform.org/systems/v2/storage.example.com"
            }
        }
    }
]

Long-running monitors can build up a large check history that becomes prohibitive to page through. When generating graphs and looking for specific incidents, you can search for specific checks based on result, startTime, endTime, type, and id. The standard JSON SQL search syntax used across the rest of the Science APIs is supported for monitor checks as well.
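
When the search terms come from a program rather than the command line, the query string can be generated instead of typed by hand. The following sketch reproduces the URL from the curl example; the helper checks_search_url is hypothetical, and only the result.eq term shown above is used.

```python
from urllib.parse import urlencode

MONITORS_URL = "https://sandbox.agaveplatform.org/monitors/v2"


def checks_search_url(monitor_id, criteria, limit=None):
    """Build the checks search URL from a dict of search terms."""
    params = dict(criteria)
    if limit is not None:
        params["limit"] = limit
    return "{}/{}/checks?{}".format(MONITORS_URL, monitor_id, urlencode(params))


url = checks_search_url(
    "5024717285821443610-242ac11f-0001-014", {"result.eq": "PASSED"}, limit=1
)
print(url)
```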

Manually running a check

Forcing a monitor check to run

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
     -H "Content-Type: application/json" \
     -X POST --data-binary '{}' \
    https://sandbox.agaveplatform.org/monitors/v2/5024717285821443610-242ac11f-0001-014/checks

monitors-fire -v 5024717285821443610-242ac11f-0001-014

The response will look something like the following:

{
    "created": "2016-06-10T11:30:58.920-05:00",
    "id": "5314048891498786330-242ac11f-0001-015",
    "message": null,
    "result": "PASSED",
    "type": "STORAGE",
    "_links": {
        "monitor": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014"
        },
        "self": {
            "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014/checks/5314048891498786330-242ac11f-0001-015"
        },
        "system": {
            "href": "https://sandbox.agaveplatform.org/systems/v2/storage.example.com"
        }
    }
}

If you need to verify the accessibility of your system, or behavior of your monitor, you can force an existing monitor to run on demand by sending a POST request to the monitor checks collection. When doing this, you are still subject to the same minimum check interval configured for your tenant.

Permissions

At this time, monitors do not have permissions associated with them.

History

List the change history of a monitor

curl -sk -H "Authorization: Bearer $AUTH_TOKEN" \
    https://sandbox.agaveplatform.org/monitors/v2/5024717285821443610-242ac11f-0001-014/history

monitors-history -v 5024717285821443610-242ac11f-0001-014

The response will look something like the following:

[
  {
    "createdBy": "nryan",
    "created": "2016-06-12T19:10:22Z",
    "status": "CREATED",
    "description": "This monitor was created by nryan",
    "id": "5705275956568068582-242ac11f-0001-035",
    "_links": {
      "self": {
        "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014/history/5705275956568068582-242ac11f-0001-035"
      },
      "monitor_event": {
        "href": "https://sandbox.agaveplatform.org/monitor/v2/5024717285821443610-242ac11f-0001-014"
      }
    }
  }
]

A full history of the lifecycle of a monitor is available via the monitor history collection. Here you can list events that have occurred during the life of the monitor.

Events

The following events will be thrown by the Monitors API.

Event	Description
CREATED	The monitor was created
UPDATED	The monitor was updated
DELETED	The monitor was deleted
ENABLED	The monitor was enabled
DISABLED	The monitor was disabled
PERMISSION_GRANT	A new user permission was granted on this monitor
PERMISSION_REVOKE	A user permission was revoked on this monitor
FORCED_CHECK_REQUESTED	A status check was requested by the user outside of the existing monitor schedule.
CHECK_PASSED	The status check passed
CHECK_FAILED	The status check failed
CHECK_UNKNOWN	The status check finished in an unknown state
STATUS_CHANGE	The status condition of the monitored resource changed since the last check
RESULT_CHANGE	The cumulative result of all checks performed on the monitored resource changed since the last suite of checks

User Profiles

 /$$$$$$$                   /$$$$$$ /$$/$$
| $$__  $$                 /$$__  $|__| $$
| $$  \ $$/$$$$$$  /$$$$$$| $$  \__//$| $$ /$$$$$$  /$$$$$$$
| $$$$$$$/$$__  $$/$$__  $| $$$$   | $| $$/$$__  $$/$$_____/
| $$____| $$  \__| $$  \ $| $$_/   | $| $| $$$$$$$|  $$$$$$
| $$    | $$     | $$  | $| $$     | $| $| $$_____/\____  $$
| $$    | $$     |  $$$$$$| $$     | $| $|  $$$$$$$/$$$$$$$/
|__/    |__/      \______/|__/     |__|__/\_______|_______/

The Agave hosted identity service (profiles service) is a RESTful web service that gives organizations a way to create and manage the user accounts within their Agave tenant. The service is backed by a redundant LDAP instance hosted in multiple datacenters making it highly available. Additionally, passwords are stored using the openldap md5crypt algorithm.

Tenant administrators can manage only a basic set of fields on each user account within LDAP itself. For more complex profiles, we recommend combining the profiles service with the metadata service. See the Extending with Metadata section below.

The service uses OAuth2 for authentication, and users must have special privileges to create and update user accounts within the tenant. Please work with the Agave development team to make sure your admins have the user-account-manager role.

In addition to the web service, there is also a basic front-end web application providing user sign up. The web application will suffice for basic user profiles and can be used as a starting point for more advanced use cases.

Creating

Create a user account by sending a POST request to the profiles service, providing an access token of a user with the user-account-manager role. The fields username, password and email are required to create a new user.

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -X POST \
    -d "username=testuser" \
    -d "password=abcd123" \
    -d "email=testuser@test.com" \
    https://sandbox.agaveplatform.org/profiles/v2

profiles-create -u testuser -p abcd123 -e testuser@test.com

The response to this call for our example user looks like this:

{
  "message":"User created successfully.",
  "result":{
    "email":"testuser@test.com",
    "first_name":"",
    "full_name":"testuser",
    "last_name":"testuser",
    "mobile_phone":"",
    "phone":"",
    "status":"Active",
    "uid":null,
    "username":"testuser"
  },
  "status":"success",
  "version":"2.0.0-SNAPSHOT-rc3fad"
}

The complete list of available fields and their descriptions is provided in the table below.

Field Name	Description	Required?
username	The username for the user; must be unique across the tenant	Yes
email	The email address for the user.	Yes
password	The password for the user.	Yes
first_name	First name of the user	No
last_name	Last name of the user	No
phone	User’s phone number	No
mobile_phone	User’s mobile phone number.	No

Note that the service does not do any password strength enforcement or other password management policies. We leave it to each organization to implement the policies best suited for their use case.
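
Because the service leaves password policy to each organization, one option is to validate input on the client before sending the POST. The sketch below is only an illustration: the policy in password_ok (minimum length, at least one letter and one digit) and both function names are hypothetical, and the request body is built but not sent.

```python
import urllib.parse

# Fields the profiles service requires when creating a user.
REQUIRED_FIELDS = ("username", "password", "email")


def password_ok(password, min_length=12):
    """Hypothetical local policy: minimum length plus a letter and a digit."""
    return (len(password) >= min_length
            and any(c.isalpha() for c in password)
            and any(c.isdigit() for c in password))


def build_create_body(fields):
    """Validate the fields and return the URL-encoded POST body."""
    missing = [f for f in REQUIRED_FIELDS if not fields.get(f)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    if not password_ok(fields["password"]):
        raise ValueError("password does not meet the local policy")
    return urllib.parse.urlencode(fields)


body = build_create_body({
    "username": "testuser",
    "password": "correct-horse-42",
    "email": "testuser@test.com",
})
print(body)
```

Under this hypothetical policy, the password abcd123 used in the example request would be rejected before any call is made.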

Extending with Metadata

Sample metadata object extending a user profile

{
  "name":"user_profile",
  "value":{
    "firstName":"Test",
    "lastName":"User",
    "email":"testuser@test.com",
    "city":"Springfield",
    "state":"IL",
    "country":"USA",
    "phone":"636-555-3226",
    "gravatar":"http://www.gravatar.com/avatar/ed53e691ee322e24d8cc843fff68ebc6"
  }
}

Save the extended profile document to the metadata service

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -X POST \
    -F "fileToUpload=@profile_ex" \
    https://sandbox.agaveplatform.org/meta/v2/data/?pretty=true

metadata-addupdate -v -F profile_ex

The response would resemble something like the following:

{
  "status" : "success",
  "message" : null,
  "version" : "2.1.0-rc0c5a",
  "result" : {
    "uuid" : "0001429724043699-5056a550b8-0001-012",
    "owner" : "jstubbs",
    "schemaId" : null,
    "internalUsername" : null,
    "associationIds" : [ ],
    "lastUpdated" : "2015-04-22T12:34:03.698-05:00",
    "name" : "user_profile",
    "value" : {
      "firstName" : "Test",
      "lastName" : "User",
      "email" : "testuser@test.com",
      "city" : "Springfield",
      "state" : "IL",
      "country" : "USA",
      "phone" : "636-555-3226",
      "gravatar" : "http://www.gravatar.com/avatar/ed53e691ee322e24d8cc843fff68ebc6"
    },
    "created" : "2015-04-22T12:34:03.698-05:00",
    "_links" : {
      "self" : {
        "href" : "https://sandbox.agaveplatform.org/meta/v2/data/0001429724043699-5056a550b8-0001-012"
      }
    }
  }
}

We do not expect the fields above to provide full support for anything but the most basic profiles. The recommended strategy is to use the profiles service in combination with the metadata service (see the Metadata Guide for more details) to store additional information. The metadata service allows you to create custom types using JSON schema, making it more flexible than standard LDAP from within a self-service model. Additionally, the metadata service includes a rich query interface for retrieving users based on arbitrary JSON queries.

The general approach used by existing tenants has been to create a single entry per user, where the entry contains all additional profile data for the user. Every metadata item representing a user profile can be identified using a fixed string for the “name” attribute (e.g., “user_profile”). The value of the metadata item contains a unique identifier for the user (e.g., username or email address) along with all the additional fields you wish to track on the profile. One benefit of this approach is that it cleanly delineates multiple classes of profiles, for example “admin_profile”, “developer_profile”, “mathematician_profile”, etc. When consuming this information in a web interface, such user-type grouping makes presentation significantly easier.

Another issue to consider when extending user profile information through the Metadata service is ownership. If you create the user’s account, then prompt them to log in before entering their extended data, it is possible to create the user’s metadata record under their account. This has the advantage of giving the user full ownership over the information; however, it also opens up the possibility that the user, or a third-party application, could modify or delete the record.

A better approach is to use a service account to create all extended profile metadata records and grant the user READ access on the record. This still allows third-party applications to access the user’s information at their request, but prevents the user or those applications from modifying or deleting the record.

The example above represents a possible JSON document that could be used to store a metadata record representing a profile.
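
The convention described above (a fixed name per profile class, with a unique user identifier stored inside the value) can be sketched in Python as follows. The helper build_profile_record and the username key inside the value are illustrative assumptions, not part of the service; the resulting document could be saved to a file and submitted as in the curl example above.

```python
import json


def build_profile_record(username, extra_fields, profile_type="user_profile"):
    """Build a metadata item whose name marks it as a profile record."""
    # Assumed convention: the unique user identifier lives in the value.
    value = {"username": username}
    value.update(extra_fields)
    return {"name": profile_type, "value": value}


record = build_profile_record("testuser", {"city": "Springfield", "state": "IL"})
print(json.dumps(record, indent=2))
```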

Updating

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    -X PUT \
    -d "password=abcd123&email=testuser@test.com&first_name=Test&last_name=User" \
    https://sandbox.agaveplatform.org/profiles/v2/testuser

profiles-addupdate -v -p abcd123 -e "testuser@test.com" -f Test -l User testuser

The response to this call looks like this:

{
  "message":"User updated successfully.",
  "result":{
    "create_time":"20150421153504Z",
    "email":"testuser@test.com",
    "first_name":"Test",
    "full_name":"Test User",
    "last_name":"User",
    "mobile_phone":"",
    "phone":"",
    "status":"Active",
    "uid":0,
    "username":"testuser"
  },
  "status":"success",
  "version":"2.0.0-SNAPSHOT-rc3fad"
}

Updates to existing users can be made by sending a PUT request to the user’s profile resource, https://sandbox.agaveplatform.org/profiles/v2/testuser in this example, and passing the fields to update. For example, we can set the first and last name on the account we created above.
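
For callers not using curl, the same update could be prepared with Python's standard library. This is a sketch only: build_update_request is a hypothetical helper, the token value is a placeholder, and the request is constructed but never sent.

```python
import urllib.parse
import urllib.request

PROFILES_URL = "https://sandbox.agaveplatform.org/profiles/v2/"


def build_update_request(token, username, **fields):
    """Build (but do not send) a PUT request updating the given fields."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        PROFILES_URL + urllib.parse.quote(username, safe=""),
        data=body,
        headers={"Authorization": "Bearer " + token},
        method="PUT",
    )


request = build_update_request("ACCESS_TOKEN", "testuser",
                               first_name="Test", last_name="User")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would perform the call.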

Deleting

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X DELETE https://sandbox.agaveplatform.org/profiles/v2/testuser

profiles-delete -v testuser

The response to this call looks like this:

{
  "message": "User deleted successfully.",
  "result": {},
  "status": "success",
  "version": "2.0.0-SNAPSHOT-rc3fad"
}

To delete an existing user, make a DELETE request on their profile resource.

Registration Web Application

The account creation web app provides a simple form to enable user self-sign-up. Here is a screenshot of the sign-up form:

Account creation web app form

The web application also provides an email loop for verification of new accounts. The code is open source and freely available from bitbucket: Account Creation Web Application

Most likely you will want to customize the branding and other aspects of the application, but for simple use cases, the Agave team can deploy a stock instance of the application in your tenant. Work with the Agave developer team if this is of interest to your organization.

UUID

 /$$   /$$ /$$   /$$ /$$$$$$ /$$$$$$$
| $$  | $$| $$  | $$|_  $$_/| $$__  $$
| $$  | $$| $$  | $$  | $$  | $$  \ $$
| $$  | $$| $$  | $$  | $$  | $$  | $$
| $$  | $$| $$  | $$  | $$  | $$  | $$
| $$  | $$| $$  | $$  | $$  | $$  | $$
|  $$$$$$/|  $$$$$$/ /$$$$$$| $$$$$$$/
 \______/  \______/ |______/|_______/

The Agave UUID service resolves the type and representation of one or more Agave UUIDs. This is helpful, for instance, when you need to expand the hypermedia response of another resource, get the URL corresponding to a UUID, or fetch the representations of multiple resources in a single request.

Resolving a single UUID

Resolving a uuid

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    https://sandbox.agaveplatform.org/uuid/v2/0001409758089943-5056a550b8-0001-002

uuid-lookup -v 0001409758089943-5056a550b8-0001-002

The response will look something like this:

{
  "uuid":"0001409758089943-5056a550b8-0001-002",
  "type":"FILE",
  "_links":{
    "file":{
      "href":"https://sandbox.agaveplatform.org/files/v2/history/system/data.agaveplatform.org/nryan/picksumipsum.txt"
    }
  }
}

A single UUID can be resolved by making a GET request on the UUID resource. The response will include the UUID and the type of the resource with which it is associated. The canonical resource URL is available in the hypermedia response. All calls to the UUID API are authenticated; however, no permission checks are made during basic resolution.
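
A short Python sketch of reading the resolved type and canonical URL out of such a response follows. The helper name canonical_url is hypothetical, and the sample mirrors the example response above. Because the link key in the sample is named after the resource kind, the sketch takes the first link present rather than assuming a fixed key.

```python
import json


def canonical_url(resolved):
    """Return (type, url) from a resolved UUID response."""
    links = resolved.get("_links", {})
    # Take the first link present instead of hard-coding a key such as "file".
    for link in links.values():
        return resolved["type"], link["href"]
    return resolved["type"], None


# Sample copied from the example response above.
sample = json.loads("""{
  "uuid": "0001409758089943-5056a550b8-0001-002",
  "type": "FILE",
  "_links": {
    "file": {
      "href": "https://sandbox.agaveplatform.org/files/v2/history/system/data.agaveplatform.org/nryan/picksumipsum.txt"
    }
  }
}""")

print(canonical_url(sample))
```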

Expanding a UUID query

Resolving a uuid to a full resource representation

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    'https://sandbox.agaveplatform.org/uuid/v2/0001409758089943-5056a550b8-0001-002?expand=true&pretty=true'

uuid-lookup -v -e 0001409758089943-5056a550b8-0001-002

The response will include the entire representation of the resource just as if you queried the Files API.

{
  "internalUsername":null,
  "lastModified":"2014-09-03T10:28:09.943-05:00",
  "name":"picksumipsum.txt",
  "nativeFormat":"raw",
  "owner":"nryan",
  "path":"/home/nryan/picksumipsum.txt",
  "source":"http://127.0.0.1/picksumipsum.txt",
  "status":"STAGING_QUEUED",
  "systemId":"data.agaveplatform.org",
  "uuid":"0001409758089943-5056a550b8-0001-002",
  "_links":{
    "history":{
      "href":"https://sandbox.agaveplatform.org/files/v2/history/system/data.agaveplatform.org/nryan/picksumipsum.txt"
    },
    "self":{
      "href":"https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/picksumipsum.txt"
    },
    "system":{
      "href":"https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
    }
  }
}

Often you need more information about the resource associated with the UUID. You can save yourself an API request by adding expand=true to the URL query. The resulting response, if successful, will include the full representation of the resource associated with the UUID, just as if you had called its URL directly. Filtering is also supported, so you can specify just the fields you want returned in the response.

Resolving multiple UUIDs

Resolving multiple UUIDs

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    'https://sandbox.agaveplatform.org/uuid/v2/?uuids.eq=0001409758089943-5056a550b8-0001-002,0001414144065563-5056a550b8-0001-007&pretty=true'

uuid-lookup -v -E 0001409758089943-5056a550b8-0001-002 0001414144065563-5056a550b8-0001-007

The response will be similar to the following.

[
  {
    "uuid":"0001409758089943-5056a550b8-0001-002",
    "type":"FILE",
    "url":"https://sandbox.agaveplatform.org/files/v2/history/system/data.agaveplatform.org/nryan/picksumipsum.txt",
    "_links":{
      "file":{
        "href":"https://sandbox.agaveplatform.org/files/v2/history/system/data.agaveplatform.org/nryan/picksumipsum.txt"
      }
    }
  },
  {
    "uuid":"0001414144065563-5056a550b8-0001-007",
    "type":"JOB",
    "url":"https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007",
    "_links":{
      "file":{
        "href":"https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007"
      }
    }
  }
]

To resolve multiple UUIDs, make a GET request on the uuids collection and pass the UUIDs in as a comma-separated list to the uuids query parameter. The response will contain a list of resolved resources in the same order that you requested them.
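
The bulk request URL can be assembled programmatically. The sketch below is one way to do so in Python, using the uuids.eq parameter shown in the curl example; bulk_lookup_url and pair_results are hypothetical helper names, and pair_results relies on the ordering guarantee described above.

```python
from urllib.parse import urlencode

BASE_URL = "https://sandbox.agaveplatform.org/uuid/v2/"


def bulk_lookup_url(uuids, expand=False):
    """Build the GET URL that resolves several UUIDs in one request."""
    params = [("uuids.eq", ",".join(uuids))]
    if expand:
        params.append(("expand", "true"))
    # safe="," keeps the comma-separated list literal in the query string.
    return BASE_URL + "?" + urlencode(params, safe=",")


def pair_results(uuids, results):
    """Pair each requested UUID with its result, relying on response order."""
    return list(zip(uuids, results))


uuids = ["0001409758089943-5056a550b8-0001-002",
         "0001414144065563-5056a550b8-0001-007"]
print(bulk_lookup_url(uuids, expand=True))
```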

Expanding multiple UUIDs

Resolving multiple UUIDs to their resource representations

curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" \
    'https://sandbox.agaveplatform.org/uuid/v2/?uuids.eq=0001409758089943-5056a550b8-0001-002,0001414144065563-5056a550b8-0001-007&expand=true&pretty=true'

uuid-lookup -v -e 0001409758089943-5056a550b8-0001-002 0001414144065563-5056a550b8-0001-007

The response will include an array of the expanded representations in the order they were requested in the URL query.

[
  {
    "id":"$JOB_ID",
    "name":"demo-pyplot-demo-advanced test-1414139896",
    "owner":"$API_USERNAME",
    "appId":"demo-pyplot-demo-advanced-0.1.0",
    "executionSystem":"$PUBLIC_EXECUTION_SYSTEM",
    "batchQueue":"debug",
    "nodeCount":1,
    "processorsPerNode":1,
    "memoryPerNode":1.0,
    "maxRunTime":"01:00:00",
    "archive":false,
    "retries":0,
    "localId":"10321",
    "outputPath":null,
    "status":"STOPPED",
    "submitTime":"2014-10-24T04:48:11.000-05:00",
    "startTime":"2014-10-24T04:48:08.000-05:00",
    "endTime":null,
    "inputs":{
      "dataset":"agave://$PUBLIC_STORAGE_SYSTEM/$API_USERNAME/inputs/pyplot/testdata.csv"
    },
    "parameters":{
      "chartType":"bar",
      "height":"512",
      "showLegend":"false",
      "xlabel":"Time",
      "background":"#FFF",
      "width":"1024",
      "showXLabel":"true",
      "separateCharts":"false",
      "unpackInputs":"false",
      "ylabel":"Magnitude",
      "showYLabel":"true"
    },
    "_links":{
      "self":{
        "href":"https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007"
      },
      "app":{
        "href":"https://sandbox.agaveplatform.org/apps/v2/demo-pyplot-demo-advanced-0.1.0"
      },
      "executionSystem":{
        "href":"https://sandbox.agaveplatform.org/systems/v2/$PUBLIC_EXECUTION_SYSTEM"
      },
      "archiveData":{
        "href":"https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/outputs/listings"
      },
      "owner":{
        "href":"https://sandbox.agaveplatform.org/profiles/v2/$API_USERNAME"
      },
      "permissions":{
        "href":"https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/pems"
      },
      "history":{
        "href":"https://sandbox.agaveplatform.org/jobs/v2/0001414144065563-5056a550b8-0001-007/history"
      },
      "metadata":{
        "href":"https://sandbox.agaveplatform.org/meta/v2/data/?q=%7b%22associationIds%22%3a%220001414144065563-5056a550b8-0001-007%22%7d"
      },
      "notifications":{
        "href":"https://sandbox.agaveplatform.org/notifications/v2/?associatedUuid=0001414144065563-5056a550b8-0001-007"
      }
    }
  },
  {
    "internalUsername":null,
    "lastModified":"2014-09-03T10:28:09.943-05:00",
    "name":"picksumipsum.txt",
    "nativeFormat":"raw",
    "owner":"nryan",
    "path":"/home/nryan/picksumipsum.txt",
    "source":"http://127.0.0.1/picksumipsum.txt",
    "status":"STAGING_QUEUED",
    "systemId":"data.agaveplatform.org",
    "uuid":"0001409758089943-5056a550b8-0001-002",
    "_links":{
      "history":{
        "href":"https://sandbox.agaveplatform.org/files/v2/history/system/data.agaveplatform.org/nryan/picksumipsum.txt"
      },
      "self":{
        "href":"https://sandbox.agaveplatform.org/files/v2/media/system/data.agaveplatform.org/nryan/picksumipsum.txt"
      },
      "system":{
        "href":"https://sandbox.agaveplatform.org/systems/v2/data.agaveplatform.org"
      }
    }
  }
]

Expansion also works when querying UUIDs in bulk. Simply add expand=true to the URL query in your request and the full resource representation of each UUID will be returned in an array with the original UUID request order maintained. If any of the resolutions fail due to a permission violation or server error, the error response object will be provided in place of the resource representation.

Events

 /$$$$$$$$                          /$$            
| $$_____/                         | $$            
| $$   /$$    /$$/$$$$$$ /$$$$$$$ /$$$$$$  /$$$$$$$
| $$$$|  $$  /$$/$$__  $| $$__  $|_  $$_/ /$$_____/
| $$__/\  $$/$$| $$$$$$$| $$  \ $$ | $$  |  $$$$$$ 
| $$    \  $$$/| $$_____| $$  | $$ | $$ /$\____  $$
| $$$$$$$\  $/ |  $$$$$$| $$  | $$ |  $$$$/$$$$$$$/
|________/\_/   \_______|__/  |__/  \___/|_______/

Events underpin everything in the Agave Platform. This section covers the events available to each resource.

Event Reference

Apps

Event	Description
UPDATED	The app was updated
DELETED	The app was deleted
PUBLISHED	The app was made available for public use.
CLONED	The app was cloned as another app
PERMISSION_GRANT	A user permission was updated
PERMISSION_REVOKE	A user permission was deleted
RESTORED	App was restored from disabled status
UNPUBLISHED	App was unpublished. It will no longer be available for public use
PUBLISHING_FAILED	The app failed to complete publishing. The id given in the original request is not valid and the app will not be publicly available.
DISABLED	App was disabled and is not currently available for use.
CLONING_FAILED	The app failed to complete cloning. The id given in the original request is not valid and the app will not be available for use.
REGISTERED	A new app was registered

Files

Event	Description
CREATED	File or directory was created
DELETED	The file was deleted
INDEX_START	Indexing of file/folder started
INDEX_COMPLETE	Indexing of file/folder completed
INDEX_FAILED	Indexing of file/folder failed
RENAME	The file was renamed
MOVED	The file was moved to another path
OVERWRITTEN	The file was overwritten
PERMISSION_GRANT	A user permission was added
PERMISSION_REVOKE	A user permission was deleted
STAGING_QUEUED	File/folder queued for staging
STAGING	File or directory is currently in flight
STAGING_FAILED	Staging failed
STAGING_COMPLETED	Staging completed successfully
PREPROCESSING	Preparing file for processing
TRANSFORMING_QUEUED	File/folder queued for transform
TRANSFORMING	Transforming file/folder
TRANSFORMING_FAILED	Transform failed
TRANSFORMING_COMPLETED	Transform completed successfully
UPLOAD	New content was uploaded to the file.
CONTENT_CHANGED	Content changed within this file/folder. If a folder, this event will be thrown whenever content changes in any file within this folder at most one level deep.
DOWNLOAD	The file item was downloaded.

Internal Users

Event	Description
CREATED	The internal user was created
DELETED	The internal user was deleted
UPDATED	The internal user was updated

Jobs

Event	Description
CREATED	The job was created
UPDATED	The job was updated
DELETED	The job was deleted
PERMISSION_GRANT	User permission was granted
PERMISSION_REVOKE	Permission was removed for a user on this job
PENDING	Job accepted and queued for submission.
STAGING_INPUTS	Transferring job input data to execution system
CLEANING_UP	Job completed execution
ARCHIVING	Transferring job output to archive system
STAGING_JOB	Job inputs staged to execution system
FINISHED	Job complete
KILLED	Job execution killed at user request
FAILED	Job failed
STOPPED	Job execution intentionally stopped
RUNNING	Job started running
PAUSED	Job execution paused by user
QUEUED	Job successfully placed into queue
SUBMITTING	Preparing job for execution and staging binaries to execution system
STAGED	Job inputs staged to execution system
PROCESSING_INPUTS	Identifying input files for staging
ARCHIVING_FINISHED	Job archiving complete
ARCHIVING_FAILED	Job archiving failed
HEARTBEAT	Job heartbeat received
JOB_RUNTIME_CALLBACK_EVENT	This is the default event thrown when a job pushes out runtime information using the AGAVE_JOB_CALLBACK_NOTIFICATION macro.
EMPTY_STATUS_RESPONSE	An empty response was received from the remote execution system when querying for job status
REMOTE_STATUS_CHANGE	The status of the job on the remote system was changed by an external process. The change does not reflect a change in Agave’s understanding of the job’s status.
UNKNOWN_TERMINATION	The job experienced an unknown termination event and is no longer running on the remote system. The job will be failed by Agave momentarily.

Metadata

Event	Description
CREATED	The metadata was created
UPDATED	The metadata was updated
DELETED	The metadata was deleted
PERMISSION_GRANT	User permission was granted
PERMISSION_REVOKE	User permission was revoked

Metadata Schema

Event	Description
CREATED	The schema was created
UPDATED	The schema was updated
DELETED	The schema was deleted
PERMISSION_GRANT	User permission was granted
PERMISSION_REVOKE	User permission was revoked

Monitors

Event	Description
CREATED	The monitor was created
UPDATED	The monitor was updated
DELETED	The monitor was deleted
ENABLED	The monitor was enabled
DISABLED	The monitor was disabled
PERMISSION_GRANT	A new user permission was granted on this monitor
PERMISSION_REVOKE	A user permission was revoked on this monitor
FORCED_CHECK_REQUESTED	A status check was requested by the user outside of the existing monitor schedule.
CHECK_PASSED	The status check passed
CHECK_FAILED	The status check failed
CHECK_UNKNOWN	The status check finished in an unknown state
STATUS_CHANGE	The status condition of the monitored resource changed since the last check
RESULT_CHANGE	The cumulative result of all checks performed on the monitored resource changed since the last suite of checks

Notifications

Event	Description
CREATED	Notification was created
UPDATED	Notification was updated
DELETED	Notification was deleted
DISABLED	Notification was disabled
ENABLED	Notification was enabled
FAILURE	Notification delivery failed
SUCCESS	Notification was successfully delivered
SEND_ERROR	Notification attempt was unsuccessful
RETRY_ERROR	Notification retry attempt was unsuccessful
PERMISSION_REVOKE	One or more user permissions were revoked on this notification
PERMISSION_GRANT	One or more user permissions were granted on this notification
FORCED_ATTEMPT	Notification attempt was forced by user

PostIts

Event	Description
CREATED	The PostIt was created
UPDATED	The PostIt was updated
REFRESHED	PostIt was refreshed back to its original quotas or extended for another day
DELETED	The PostIt was deleted
REDEEMED	The PostIt was redeemed

Profiles

Event	Description
CREATED	A new user account was created.
DELETED	The user account was deleted.
UPDATED	The user account was updated.
ACCOUNT_ACTIVATED	The user’s account was activated.
ACCOUNT_DEACTIVATED	The user’s account was deactivated.
ROLE_GRANTED	The user had a role added.
ROLE_REVOKED	The user had a role revoked.
QUOTA_EXCEEDED	The user has exceeded one or more quotas.

Systems

Event	Description
CREATED	The system was created
UPDATED	The system was updated
DELETED	The system was deleted
ROLES_GRANT	User role was granted on the system
ROLES_REVOKE	User role was removed from the system
STATUS_CHANGE	The system status changed

Transfers

Event	Description
CREATED	A new transfer was created
CANCELLED	The transfer was cancelled
QUEUED	Transfer queued and waiting to start
COMPLETED	Transfer completed successfully
FAILED	Transfer failed while transferring
PAUSED	Transfer paused
RETRYING	Transfer failed, beginning to retry
TRANSFERRING	Transfer has started

Search

  /$$$$$$                                     /$$
 /$$__  $$                                   | $$
| $$  \__/ /$$$$$$  /$$$$$$  /$$$$$$  /$$$$$$| $$$$$$$
|  $$$$$$ /$$__  $$|____  $$/$$__  $$/$$_____| $$__  $$
 \____  $| $$$$$$$$ /$$$$$$| $$  \__| $$     | $$  \ $$
 /$$  \ $| $$_____//$$__  $| $$     | $$     | $$  | $$
|  $$$$$$|  $$$$$$|  $$$$$$| $$     |  $$$$$$| $$  | $$
 \______/ \_______/\_______|__/      \_______|__/  |__/

Search is a fundamental feature of the Agave Platform. Most of the core science APIs support a mature, URL-based query mechanism that lets you search using a SQL-inspired JSON syntax. The two exceptions are the Files and Metadata APIs. The Files service does not index the directory or file contents of registered systems, so it has no efficient way to search the file system. The Metadata service supports MongoDB query syntax, which allows more flexible, and slightly more complex, queries.

Query syntax

http://sandbox.agaveplatform.org/jobs/v2?name=test%20job

You can include multiple search expressions to build a more restrictive query.

http://sandbox.agaveplatform.org/jobs/v2?name=test%20job&executionSystem=aws-demo&status=FAILED

By default, search is enabled on each collection endpoint allowing you to trim the response down to the results you care about most. The list of available search terms is identical to the attributes included in the JSON returned when requesting the full resource description.

To search for a specific attribute, append a search expression to the URL query of your request, as in the examples above.

Search operators

# systems with cloud in their name  
systems/v2?name.like=*cloud*

# apps modified between October 1 and October 30 of this year  
apps/v2?lastModified.between=10/1,10/30

# jobs with status equal to PENDING or ARCHIVING  
jobs/v2?status.in=PENDING,ARCHIVING

# systems with cloud in their name  
systems-search 'name.like=*cloud*'

# apps modified between October 1 and October 30 of this year  
apps-search 'lastModified.between=10/1,10/30'

# jobs with status equal to PENDING or ARCHIVING  
jobs-search 'status.in=PENDING,ARCHIVING'

By default, all search expressions are evaluated for equality. To perform more complex queries, you may append a search operator to the attribute in your search expression, as the examples above illustrate.

For resources with nested collections, you may use JSON dot notation to query the subresources in the collection.

# systems using Amazon S3 as the storage protocol  
systems/v2?storage.protocol.eq="S3"

# systems with a batch queue allowing more than 10 concurrent user jobs  
systems/v2?queues.maxUserJobs.gt=10

# systems using Amazon S3 as the storage protocol  
systems-search 'storage.protocol.eq=S3'

# systems with a batch queue allowing more than 10 concurrent user jobs  
systems-search 'queues.maxUserJobs.gt=10'

Multiple operators

# jobs whose app has hadoop in the name, ran on a system with id aws-demo, and started
# any time during the last business week
jobs/v2?appId.like=*hadoop*&executionSystem.eq=aws-demo&startTime.between=last%20monday,last%20friday

# users whose profile has a last name ending in ross and an email address ending in texas.edu
profiles/v2?lastname.like=*ross&email.like=*texas.edu

# failed login monitor checks on systems with "ec2" in the name
monitors/v2/?target.like=*ec2*&result.eq=FAILED&type=LOGIN

# jobs whose app has hadoop in the name, ran on a system with id aws-demo, and started
# any time during the last business week
jobs-search 'appId.like=*hadoop*' \
            'executionSystem.eq=aws-demo' \
            'startTime.between=last%20monday,last%20friday'

# users whose profile has a last name ending in ross and an email address ending in texas.edu
profiles-search 'lastname.like=*ross' 'email.like=*texas.edu'

# failed login monitor checks on systems with "ec2" in the name
monitors-checks-search -M target.like=*ec2* \
                      'result.eq=FAILED' \
                      'type=LOGIN'

As before, you can include multiple search expressions to narrow your results.

The full list of search operators is given in the following table.

Operator	Values	Description
eq	mixed	Matches values equal to the given search value. All comparisons are case sensitive. This cannot be used for complex object comparison.
on	datestring	Matches dates falling on the given datestring. Regardless of the precision given in the datestring, the search will look for matches from midnight to midnight on the resolved date.
neq	mixed	Matches values not equal to the given search value. All comparisons are case sensitive. This cannot be used for complex object comparison.
lt	mixed	Matches values less than the given search value.
before	datestring	Matches dates falling before the given datestring. Single second precision is supported.
lte	mixed	Matches values less than or equal to the given search value.
gt	mixed	Matches values greater than the given search value.
after	datestring	Matches values after the given datestring.
gte	mixed	Matches values greater than or equal to the given search value.
in	comma-separated list	Matches values in the given comma-separated list. This is equivalent to applying the like operator to each comma-separated value.
nin	comma-separated list	Matches values not in the given comma-separated list. This is equivalent to applying the nlike operator to each comma-separated value.
like	string	Matches values similar to the given search term. Wildcards (*) may be used to perform partial matches.
nlike	string	Matches values different from the given search term. Wildcards (*) may be used to perform partial matches.
between	comma-separated datestring	Matches dates falling within the given range. Single second precision is supported at either end of the range.
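
To show how the operators in the table compose into a request, here is a small Python sketch that builds a query string from (attribute, operator, value) triples. The function name search_query is hypothetical. The operator set is copied from the table above, and the encoding choices (literal wildcards, commas, and slashes, with %20 for spaces) follow the examples in this section rather than any documented requirement.

```python
from urllib.parse import quote, urlencode

# Operators listed in the table above.
OPERATORS = {"eq", "on", "neq", "lt", "before", "lte", "gt", "after",
             "gte", "in", "nin", "like", "nlike", "between"}


def search_query(*terms):
    """Build a URL query string from (attribute, operator, value) triples."""
    params = []
    for attribute, operator, value in terms:
        if operator not in OPERATORS:
            raise ValueError("unknown search operator: " + operator)
        if isinstance(value, (list, tuple)):
            # List-valued operators take a comma-separated list.
            value = ",".join(str(v) for v in value)
        params.append((attribute + "." + operator, value))
    # Keep wildcards, list separators, and date slashes literal; spaces
    # become %20 as in the examples above.
    return urlencode(params, safe="*,/", quote_via=quote)


print("jobs/v2?" + search_query(
    ("appId", "like", "*hadoop*"),
    ("executionSystem", "eq", "aws-demo"),
    ("status", "in", ["PENDING", "ARCHIVING"]),
))
```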

Date support

Dates returned from the Agave core science API are always formatted as ISO8601 dates. When searching, however, a much more flexible date syntax is supported. The following table lists supported expressions by example.

Expression	Equivalent Expression
	08:00:00.000
4pm or 04:00pm or 16:00	16:00:00.000
430pm or 04:30pm or 16:30	16:30:00.000
5pm	17:00:00.000
+1 second|minute|hour|day|week|month|year	now +1 second|minute|hour|day|week|month|year
-1 second|minute|hour|day|week|month|year	now -1 second|minute|hour|day|week|month|year
next Tuesday
last Tuesday
now	new Date()
today	00:00:00.000
midnight	00:00:00.000 +24 hours
morning or this morning	07:00:00.000
noon	12:00:00.000
afternoon or this afternoon	13:00:00.000
evening or this evening	17:00:00.000
tonight	20:00:00.000
tomorrow	now +24 hours
tomorrow morning	morning +24 hours
noon tomorrow or tomorrow noon	noon +24 hours
tomorrow afternoon	afternoon +24 hours
yesterday	now -24 hours
all the permutations of yesterday and morning, noon, afternoon, and evening
2004
October or Oct	10/1
Tuesday or Tue	Calendar date of the next Tuesday
October 26, 1981 or Oct 26, 1981	10/26/1981
October 26 or Oct 26	10/26
26 October 1981	10/26/1981
26 Oct 1981	10/26/1981
26 Oct 81	10/26/1981
10/26/1981 or 10-26-1981
10/26/81 or 10-26-81
1981/10/26 or 1981-10-26	10/26/1981
10/26 or 10-26

Custom search result

Search with multiple operators and return a custom response

# up to three jobs whose app has cloud in the name, ran on a system with docker in the id,
# and started after January 1, 2016
jobs/v2?appId.like=*cloud*&executionSystem.like=*docker*&startTime.after=2016-01-01&filter=id,appId,executionSystem,status,created&naked=true&limit=3

# up to three jobs whose app has cloud in the name, ran on a system with docker in the id,
# and started after January 1, 2016
jobs-search -v --limit=3 \
            --filter=id,appId,executionSystem,status,created \
            'appId.like=*cloud*' \
            'executionSystem.like=*docker*' \
            'startTime.after=2016-01-01' \
            'naked=true'

The response will be a JSON array of custom objects comprising only the fields you specified in the filter query parameter.

[
  {
    "id":"2974032102330798566-242ac115-0001-007",
    "appId":"cloud-runner-0.1.0u1",
    "executionSystem":"docker.tacc.utexas.edu",
    "status":"FINISHED",
    "created":"2016-11-03T16:04:53.000-05:00"
  },
  {
    "id":"8643408718823550490-242ac115-0001-007",
    "appId":"cloud-runner-0.1.0u1",
    "executionSystem":"docker.tacc.utexas.edu",
    "status":"FINISHED",
    "created":"2016-11-03T15:17:24.000-05:00"
  },
  {
    "id":"9049010248689521126-242ac115-0001-007",
    "appId":"cloud-runner-0.1.0u1",
    "executionSystem":"docker.tacc.utexas.edu",
    "status":"FINISHED",
    "created":"2016-11-03T15:17:07.000-05:00"
  }
]

By combining the search, filtering, and naked query parameters, you can query the API and return just the information you care about. The example search will return a JSON array of job objects with just the id, appId, executionSystem, status, and created fields from the full job object in the response. This combination of search, filtering, and pagination provides a powerful mechanism for generating custom views of the data.
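
The combination described above can be sketched as a small Python helper that assembles the query string from search terms, a field filter, and a limit. The function name build_search_url is hypothetical, and the base URL is a placeholder built from the example ${API_HOST} value in the conventions table; substitute the values for your own tenant.

```python
from urllib.parse import urlencode


def build_search_url(base_url, search, fields, limit=None, naked=True):
    """Assemble a search URL with filter, naked, and limit parameters.

    `search` maps '<field>.<operator>' to a search term, for example
    {'appId.like': '*cloud*'}.
    """
    params = dict(search)
    params["filter"] = ",".join(fields)
    params["naked"] = "true" if naked else "false"
    if limit is not None:
        params["limit"] = str(limit)
    # Keep '*' and ',' unescaped so the URL stays readable.
    return f"{base_url}?{urlencode(params, safe='*,')}"


url = build_search_url(
    "https://sandbox.agaveplatform.org/jobs/v2",  # placeholder host
    {
        "appId.like": "*cloud*",
        "executionSystem.like": "*docker*",
        "startTime.after": "2016-01-01",
    },
    fields=["id", "appId", "executionSystem", "status", "created"],
    limit=3,
)
```

The resulting URL still has to be sent with a valid access token in the request; that step is omitted here.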

Tooling

Sometimes the hardest part of a new project is taking the first step. Agave Tooling helps make taking that first step a little easier through reference web applications, boilerplate integration scripts, and integrations with popular CMSs and frameworks through native plugins and modules.

CLI

Check out the source code

git clone https://github.com/agaveplatform/agave-cli

The Agave command-line interface (CLI) is a complete interface to the Agave REST API. The scripts include support for creating persistent authentication sessions, creating/renaming apps, registering and sharing systems, uploading and managing data, creating PostIts, etc. Whether you have an existing project looking to leverage Agave for back-end processing, want to integrate Agave into existing scripted solutions, or are new to Agave and just want to kick the tires, the Agave CLI is a powerful tool. The Agave CLI can be checked out from the Agave git repository.

For more information on using the Agave CLI in common tasks, please consult the Guides which reference it in all their examples, or check out the Agave Samples project for sample data and examples of how to use it to populate and interact with your tenant.

Agave ToGo

Get a head start on your next development sprint by leveraging the open source Agave ToGo project. This AngularJS webapp can be reused in your existing project or used as-is for a clean, responsive, client-side web application that brings the full power of Agave to your browser.

Microsites

Agave Microsites are reference single-purpose web applications focused on delivering a specific solution to a target audience. Current microsite implementations focus on providing execution and management of a single app to a group of users. Upcoming microsites will focus on data management, automation, and data collection. All the Agave Microsites are white labeled and completely open source. You can view the latest Microsite Demo in our GitHub repository (https://github.com/agaveplatform/microsites).

Jupyter Hub

Jupyter notebooks (formerly IPython notebooks) provide users with interactive computing documents that contain both computer code and a mix of rich text elements such as data visualizations, text paragraphs, hyperlinks, formatted equations, etc. The code cells in notebooks can be executed interactively, cell by cell, and the results of the executions are displayed in subsequent cells in the notebook. The notebooks can also be exported to a serialized JSON formatted file and executed like a traditional program.

JupyterHub is an open source project to provide multi-user hosted notebook servers as a service. When a user signs in to JupyterHub, a notebook server with pre-configured software is automatically launched for them. The Agave team integrated JupyterHub into its identity and access management stack and added enhancements and customizations that let users work with Agave’s language SDKs (such as agavepy), the CLI, persistent storage, and multiple kernels directly from their notebooks with minimal setup. Agave’s deployment of JupyterHub, which runs each user’s notebook server in a Docker container to further enhance reproducibility, is freely available for use in Agave’s Public Tenant.

You can get started with JupyterHub today at https://github.com/agaveplatform/jupyter-notebook.

Integrations

Several integrations exist out of the box to help you integrate Agave functionality into your favorite framework. If you’d like to see an integration into a framework not included here, let us know.

AngularJS

oauth-ng: A custom multitenant fork of the popular oauth-ng module preconfigured to authenticate against the Agave Platform.
Agave Filemanager: A fork of the angular-filemanager project customized to interact with the Agave Platform. Available as a standalone app, modal, and directive.

Elm

Elm auth: A sample Elm application demonstrating native OAuth implicit flow authentication against the Agave Platform.

WordPress

wp-oauth: A custom fork of the WP-OAuth plugin configured with multitenant authentication against the Agave Platform. Account federation and user mapping are fully integrated to allow for seamless integration with existing installations.

Field name	Type	Description
name	string; 1-256	`required` An alphanumeric key unique within the set of tags for a given user, which can be used in lieu of the id.
associationIds	array	A JSON array of zero or more UUIDs with which this tag should be associated.
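
As an illustration of the two fields above, the following Python sketch assembles a tag description and checks the name constraint before serializing it. The helper name build_tag is hypothetical, and "alphanumeric" is interpreted strictly as ASCII letters and digits, which is an assumption; the service may accept a wider character set.

```python
import json
import re

# Assumption: "alphanumeric; 1-256" means 1 to 256 ASCII letters or digits.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9]{1,256}")


def build_tag(name, association_ids=()):
    """Return a dict with the tag fields described in the table above."""
    if not _NAME_PATTERN.fullmatch(name):
        raise ValueError("name must be 1-256 alphanumeric characters")
    return {"name": name, "associationIds": list(association_ids)}


# Associate the tag with a job UUID taken from the search example earlier.
payload = json.dumps(build_tag("demo", ["2974032102330798566-242ac115-0001-007"]))
```

An empty associationIds array is valid, since the field holds zero or more UUIDs.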

Event	Description
CREATED	Tag was registered
UPDATED	Tag was updated
DELETED	Tag was deleted from active use
RESOURCE_ADDED	Tag was restored from deleted status
RESOURCE_REMOVED	Tag was disabled
PUBLISHED	Tag was published for public use
UNPUBLISHED	Tag was unpublished. It will no longer be available for public use
PERMISSION_REVOKE	One or more user permissions were revoked on this tag
PERMISSION_GRANT	One or more user permissions were granted on this tag