Seven Bridges platforms provide a few different methods for data import, one of which is the API via the sevenbridges2 package. In this chapter we will explain how you can use the sevenbridges2 API library to upload your files to the Platform.
Although it might seem more intuitive to have these operations available on the File object, they are stored directly on the authentication object Auth, because they form a separate group of endpoints.
You can upload files from your local computer to the Platform using the upload() method on your Auth object. The method currently supports uploading a single file at a time.
To upload a file, provide its full path on your local computer as the path parameter.
To specify the upload destination for your file you can use either the project or the parent parameter. These two parameters should not be used together:
- project: a Project object or project ID.
- parent: a File object (of type Folder, i.e. a folder) or its ID.
By calling the upload() method you create an upload job that by default starts running immediately. If you don't want to start the job immediately, set the init parameter to TRUE in order to only initialize the object.
This upload job is wrapped into an object of the class
Upload
where you can see its details and call other actions
on it.
Let’s initialize an upload job that will upload a file into a project:
# Authenticate
a <- Auth$new(platform = "aws-us", token = "<your-token>")
# Get the desired project to upload to
destination_project <- a$projects$get(project = "<project_id>")
# Create upload job and set destination project
upload_job <- a$upload(
path = "/path/to/your/file.txt",
project = destination_project,
overwrite = TRUE,
init = TRUE
)
If you would like to upload your file into a folder, you need to set
the parent
parameter:
# Get destination folder object
destination_folder <- a$files$get(id = "<folder_id>")
up <- a$upload(
path = "/path/to/your/file.txt",
parent = destination_folder,
overwrite = TRUE,
init = TRUE
)
Since we have initialized the upload job, let's see which actions we can run. First, let's print the Upload object to see what the API returned as the response.
up$print()
── Upload ─────────────────────────────────────────────────────────────────────
• initialized: TRUE
• part_length: 1
• part_size: 33554432
• file_size: 232
• overwrite: FALSE
• filename: file.txt
• project: <username_or_division>/api-testing
• path: /path/to/your/file.txt
• upload_id: 4OvRx8Z9vghNoAUqsgYtNuM2IsiIM8kghhjgi7igu79HX9QKZpDEh5TZDrmhPxF
In the previous example we can see that the API returned the upload ID and some information about sizes. First we see the file_size in bytes (232), which is the real size of the file. File upload actually splits files into parts in the background; the parts are then uploaded one by one or in parallel and merged again at the destination. Each part can be at most 5 GB, while the recommended default part_size is 32 MB (33554432 bytes in our example).
Lastly, the number of parts, the part_length field, is also an important measure. The maximum number of parts is 10,000.
Since users can control the part size through the part_size parameter of the upload() function, they should be careful not to set a size that is too small for very large files, so that the total number of parts doesn't exceed the limit of 10,000.
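For example, here is a short sketch of initializing an upload job with a larger part size (the value is given in bytes, reusing the destination project from above):
# Initialize an upload job with a custom 64 MB part size (value in bytes)
up_custom <- a$upload(
path = "/path/to/your/file.txt",
project = destination_project,
part_size = 64 * 1024 * 1024,
init = TRUE
)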
Call the start() method on the upload job object to start the upload process.
# Start upload
up$start()
If you want to skip the step where you need to call the
start()
method to start the actual upload process, just set
the init
parameter back to FALSE
when creating
the upload job and the upload process will start right away.
# Create upload job and start it immediately
up <- a$upload(
path = "/path/to/your/file.txt",
project = destination_project,
overwrite = TRUE,
init = FALSE
)
In order to track the progress of the job, you can call the
info()
method on the upload object.
# Get upload progress info
up$info()
Apart from basic information, the result will also provide the info on the number of uploaded parts up to that moment.
Going back to the authentication object, there are two more
operations for uploads manipulation. One is the method
list_ongoing_uploads()
that allows you to see the list of
all ongoing upload processes.
# List ongoing uploads
a$list_ongoing_uploads()
The other one is abort_upload(). You can abort any upload process by providing the ID of the process via the upload_id parameter.
# Abort upload
a$abort_upload(upload_id = "<id_of_the_upload_process>")
Note that in practice, if you start a big upload job, your R session will be blocked until the process finishes. Work is in progress to avoid blocking your main session while the upload is running. For now, you can open another R session yourself and track the progress of the upload job there.
Cloud storage providers come with their own interfaces, features, and terminology. At a certain level, though, they all view resources as data objects organized in repositories. Authentication and operations are commonly defined on those objects and repositories, and while each cloud provider might call these things different names and apply different parameters to them, their basic behavior is the same.
Seven Bridges environments mediate access to these repositories using volumes. A volume is associated with a particular cloud storage repository that you have enabled Seven Bridges to read from (and, optionally, to write to). Currently, volumes may be created using two types of cloud storage repositories: Amazon Web Services’ (AWS) S3 buckets and Google Cloud Storage (GCS) buckets.
A volume enables you to treat the cloud repository associated with it as external storage. You can ‘import’ files from the volume to your Seven Bridges environment to use them as inputs for computation. Similarly, you can write files from the Seven Bridges environment to your cloud storage by ‘exporting’ them to your volume.
Learn more about volumes on the Seven Bridges Platform, CGC, BDC and CAVATICA.
All volume-related operations for querying volumes, fetching a single volume, and creating volumes are grouped under the volumes path (Volumes resource class) on the authentication object.
A single volume is represented as an object of the Volume class, which stores all volume information returned from the API and provides additional methods you can call directly on the volume, such as updating the volume, deactivating it, listing its contents, managing volume members, etc.
Note that all volume operations require the advance_access parameter to be set to TRUE. In most volume operations it is pre-set to TRUE by default.
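As a minimal sketch using the query method described in the next section (most volume calls in this chapter already send the flag for you), the parameter can also be passed explicitly:
# Explicitly enable Advance Access when querying volumes
a$volumes$query(advance_access = TRUE)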
You can list all volumes you've registered by calling the volumes$query() method from the authentication object. The method has no query parameters for searching volumes by specific criteria, apart from limit and offset, which control the number of results returned.
# Query volumes
a$volumes$query()
The result returned is the Collection
object with
pagination ability.
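For example, here is a short sketch of paging through your registered volumes using the same pagination helpers that appear on other Collection objects in this chapter:
# Query volumes in smaller pages and paginate through the results
volumes_page <- a$volumes$query(limit = 5)
# Fetch the next page of results
volumes_page$next_page()
# Or load all remaining results at once
volumes_page$all()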
In order to retrieve information about a single volume of interest,
you can get it using the volumes$get()
method using its id
as parameter. Volume ID is usually presented in the
<division_name>/<volume_name>
form for
Enterprise users, while for public program users it can be in the
<volume_owner>/<volume_name>
form.
# Get volume
a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")
For creating volumes we have exposed several functions for different cloud providers and authentication types:
- create_s3_using_iam_user: creates an S3 volume using the IAM User authentication type
- create_s3_using_iam_role: creates an S3 volume using the IAM Role authentication type
- create_google_using_iam_user: creates a Google Cloud volume using the IAM User authentication type
- create_google_using_iam_role: creates a Google Cloud volume using the IAM Role authentication type
- create_azure: creates an Azure volume (only RO privileges allowed)
- create_ali_oss: creates an AliCloud volume (only RO privileges allowed)
For each of these functions it is possible to provide the parameters via a path (from_path) to a JSON file where all required fields are listed.
Examples of use are shown below:
# Create AWS volume using IAM User authentication type
aws_iam_user_volume <- a$volumes$create_s3_using_iam_user(
name = "my_new_aws_user_volume",
bucket = "<bucket-name>",
description = "AWS IAM User volume",
access_key_id = "<access-key>",
secret_access_key = "<secret-access-key>"
)
aws_iam_user_volume_from_path <- a$volumes$create_s3_using_iam_user(
from_path = "path/to/my/json/file.json"
)
# Create AWS volume using IAM Role authentication type
aws_iam_role_volume <- a$volumes$create_s3_using_iam_role(
name = "my_new_aws_role_volume",
bucket = "<bucket-name>",
description = "AWS IAM Role volume",
role_arn = "<role-arn-key>",
external_id = "<external-id>"
)
aws_iam_role_volume_from_path <- a$volumes$create_s3_using_iam_role(
from_path = "path/to/my/json/file.json"
)
# Create Google Cloud volume using IAM User authentication type
gc_iam_user_volume <- a$volumes$create_google_using_iam_user(
name = "my_new_gc_user_volume",
access_mode = "RW",
bucket = "<bucket-name>",
description = "GC IAM User volume",
client_email = "<client_email>",
private_key = "<private_key-string>"
)
gc_iam_user_volume_from_path <- a$volumes$create_google_using_iam_user(
from_path = "path/to/my/json/file.json"
)
# Create Google Cloud volume using IAM Role authentication type
# by passing configuration parameter as named list
gc_iam_role_volume <- a$volumes$create_google_using_iam_role(
name = "my_new_gc_role_volume",
access_mode = "RO",
bucket = "<bucket-name>",
description = "GC IAM Role volume",
configuration = list(
type = "<type-name>",
audience = "<audience-link>",
subject_token_type = "<subject_token_type>",
service_account_impersonation_url = "<service_account_impersonation_url>",
token_url = "<token_url>",
credential_source = list(
environment_id = "<environment_id>",
region_url = "<region_url>",
url = "<url>",
regional_cred_verification_url = "<regional_cred_verification_url>"
)
)
)
# Create Google Cloud volume using IAM Role authentication type
# by passing configuration parameter as string path to configuration file
gc_iam_role_volume_config_file <- a$volumes$create_google_using_iam_role(
name = "my_new_gc_role_volume_cnf_file",
access_mode = "RO",
bucket = "<bucket-name>",
description = "GC IAM Role volume - using config file",
configuration = "path/to/config/file.json"
)
# Create Google Cloud volume using IAM Role authentication type
# using from_path parameter
gc_iam_role_volume_from_path <- a$volumes$create_google_using_iam_role(
from_path = "path/to/full/config/file.json"
)
# Create Azure volume
azure_volume <- a$volumes$create_azure(
name = "my_new_azure_volume",
description = "Azure volume",
endpoint = "<endpoint>",
container = "<bucket-name",
storage_account = "<storage_account-name>",
tenant_id = "<tenant_id>",
client_id = "<client_id>",
client_secret = "<client_secret>",
resource_id = "<resource_id>"
)
azure_volume_from_path <- a$volumes$create_azure(
from_path = "path/to/my/json/file.json"
)
# Create Ali Cloud volume
ali_volume <- a$volumes$create_ali_oss(
name = "my_new_azure_volume",
description = "Ali volume",
endpoint = "<endpoint>",
bucket = "<bucket-name",
access_key_id = "<access_key_id>",
secret_access_key = "<secret_access_key>"
)
ali_volume_from_path <- a$volumes$create_ali_oss(
from_path = "path/to/my/json/file.json"
)
When you create a new volume, you will notice that it is represented as an object of the Volume class. To preview all volume information, use the print() method:
# Print volume info
print(aws_iam_user_volume)
Within this volume you have the following operations available to execute:
- update: update volume information
- list_contents: list volume content
- get_file: get single volume file info
- deactivate: deactivate the volume
- reactivate: reactivate a previously deactivated volume
- list_members: list all volume members
- add_member: add a new volume member
- remove_member: remove a volume member
- get_member: get information about a volume member
- modify_member_permissions: modify a member's permissions on the volume
- delete: delete a previously deactivated volume
- reload: reload the volume object to sync its information
- list_imports: list all imports from the specified volume
- list_exports: list all exports to the specified volume
You can update a volume's description, access_mode and service information. Please consult our API documentation on how to use the service parameter.
# If the volume is created with RO access mode and RO credential parameters,
# and now we want to change it to RW, we should also set proper credential
# parameters that are connected to the RW user on the bucket.
# If it's created with RW credentials, but access mode is set to RO, then no
# change is needed in the credentials parameters.
aws_iam_user_volume$update(
description = "Updated to RW",
access_mode = "RW",
service = list(
credentials = list(
access_key_id = "<access_key_id_for_rw>",
secret_access_key = "<secret_access_key_for_rw>"
)
)
)
To keep your local Volume
object up to date with the
volume on the platform, you can always call the reload()
function:
# Reload volume object
aws_iam_user_volume$reload()
This operation lists all volume files in the root directory of the bucket, unless the prefix parameter is specified. In that case, it lists the content of that directory on the bucket. The output is a VolumeContentCollection object that contains two fields:
- items: a list of VolumeFile objects (files on the volume), and
- prefixes: a list of VolumePrefix objects (folders on the volume).
You can also specify the limit parameter to control the number of results returned.
As with Collection objects, pagination functions are available here to return either the next page of results or all results. However, backward pagination is not available for volume contents.
Users can also navigate through pages of results by using the continuation_token parameter or the link parameter to fetch the next chunk of results. If set, the link parameter overrides all other parameters, since it already contains the limit and continuation_token information.
# List all files in root bucket directory
content_collection <- aws_iam_user_volume$list_contents(limit = 20)
# Print collection
content_collection
# List all files from a specific directory on the bucket
folder_files_collection <- aws_iam_user_volume$list_contents(
prefix = "<directory_name>"
)
# Get the next group of results by setting the continuation token
content_collection <- aws_iam_user_volume$list_contents(
limit = 20,
continuation_token = "<continuation_token>"
)
# Preview volume files
content_collection$items
# Preview volume prefixes/folders
content_collection$prefixes
# Preview links to the next chunk of results (stored on the content collection)
content_collection$links
# Get the next group of results by setting the link parameter
aws_iam_user_volume$list_contents(link = "<link_to_next_results>")
# Or use VolumeContentCollection object's next_page() method for this:
content_collection$next_page()
# You can also fetch all results with the all() method
content_collection$all()
Volume files and prefixes are also treated as objects and they contain some operations that can be called on them.
This operation returns information about a single volume file. The input parameter can be the file's ID, which is represented as its location on the bucket (location), or a link to that file resource. The link is the href field of the desired file, received in the response when listing volume contents with list_contents(). You must set exactly one of these parameters: leaving both empty or setting both at the same time is not allowed.
# Get single volume file info - by setting file_location
vol_file1 <- aws_iam_user_volume$get_file(
location = "<file_location_on_bucket>"
)
# Get single volume file info - by setting link
vol_file1 <- aws_iam_user_volume$get_file(link = "full/request/link/to/file")
To keep your local VolumeFile object up to date with the volume file
on the platform, you can always call the reload()
function:
vol_file1$reload()
Unfortunately, there is no separate operation to fetch only the prefixes on the volume; you can get them only by using the list_contents() operation and looking at the prefixes field of the returned VolumeContentCollection object.
# List content
volume_content <- aws_iam_user_volume$list_contents()
# Extract prefixes
volume_prefixes <- volume_content$prefixes
# Select one of the volume folders to list its content
volume_folder <- volume_prefixes[[1]]
# Print volume prefix information
volume_folder$print()
You can also list the content of a volume prefix/folder on the
volume, by calling list_contents()
directly on the
VolumePrefix
object.
## Select one of the volume folders to list its content
volume_folder <- volume_prefixes[[1]]
# List content
volume_folder_content <- volume_folder$list_contents()
In order to fetch members of one volume or a specific member by its
username, you can use list_members()
and
get_member()
operations:
# List volume members
aws_iam_user_volume$list_members() # limit = 2
# Get single member
aws_iam_user_volume$get_member(user = "<member-username>")
Volume admins can remove volume members by providing their username or an object of the Member class to the remove_member() function:
# Remove member
aws_iam_user_volume$remove_member("<member-username>")
# Remove member using the Member object
members <- aws_iam_user_volume$list_members()
aws_iam_user_volume$remove_member(members$items[[3]])
The function for adding new members to the volume accepts a Member object (for example, one used in a project) or a username.
# Add member via username
aws_iam_user_volume$add_member(user = "<member-username>", permissions = list(
read = TRUE, copy = TRUE, write = FALSE, admin = FALSE
))
# Add member via Member object
aws_iam_user_volume$add_member(
user = Member$new(
username = "<member-username>",
id = "<member-username>"
),
permissions = list(
read = TRUE, copy = TRUE, write = FALSE,
admin = FALSE
)
)
Users can modify specific member’s permissions on the volume by providing the privileges they want to change:
# Modify member permissions
aws_iam_user_volume$modify_member_permissions(
user = "<member-username>",
permissions = list(write = TRUE)
)
Once deactivated, you cannot import from, export to, or browse within
a volume. As such, the content of the files imported from this volume
will no longer be accessible on the platform. However, you can update
the volume and manage members. Note that you cannot deactivate the
volume if you have running imports or exports unless you force the
operation using the query parameter force = TRUE
.
Note that to delete a volume, first you must deactivate it and delete all files which have been imported from the volume to the platform.
To reactivate the volume, just use the reactivate()
function.
# Deactivate volume
aws_iam_user_volume$deactivate()
# Reactivate volume
aws_iam_user_volume$reactivate()
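If running imports or exports are preventing deactivation, the force flag mentioned above can be supplied. This is only a sketch and assumes deactivate() forwards it as a query parameter to the API:
# Force deactivation even if imports/exports are still running
aws_iam_user_volume$deactivate(force = TRUE)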
To be able to delete a volume, you first need to deactivate it and then delete all files on the Platform that were previously imported from the volume.
# Deactivate volume
aws_iam_user_volume$deactivate()
# Delete volume
aws_iam_user_volume$delete()
Creating and connecting volumes to the Platform allows you to import your files/folders from a cloud bucket to the Platform. Import operations are related to volumes, but in the API they are separated under the /imports endpoints, so in our library they are also grouped under the imports path on the authentication object (Imports resource class).
A single import job is represented as an Import
class
object containing information about which file/folder has been or is
being imported, from which volume, to which project/folder on the
platform, import start and finish time, status of the job, logs etc.
To preview and query all import jobs you’ve created use the
query()
function on Auth$imports
path:
# List imports
all_imports <- a$imports$query()
# Limit results to 5
imp_limit5 <- a$imports$query(limit = 5)
# Load next page of 5 results
imp_limit5$next_page(advance_access = TRUE)
# Load all results at once until last page
imp_limit5$all(advance_access = TRUE)
It is possible to filter the results using query parameters such as volume, project, state, etc.:
# List imports with state being RUNNING or FAILED
imp_states <- a$imports$query(state = c("RUNNING", "FAILED"))
# List imports to the specific project
imp_project <- a$imports$query(project = "<project_id>")
Listing imports is also available within Project
and
Volume
objects, where resulting imports are related to the
specific project or volume where they’re called from.
## Get the volume from which you want to list all imports
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")
vol1$list_imports()
## Get the project object for which you want to list imports
test_proj <- a$projects$get("<project_id>")
test_proj$list_imports()
Similar to other resource classes, the get()
method will
return a single import job object when provided with a job id.
# Get single import
imp_obj <- a$imports$get(id = "<import_job_id>")
Users are able to fetch details for multiple import jobs by calling
one bulk action - the bulk_get()
method. The accepted input
can be a list of import job IDs or a list of import job objects (of
class Import). The result will be a Collection object containing a list
of (updated) import jobs.
# Get details of multiple import jobs
import_jobs <- a$imports$bulk_get(
imports = list("<import_job_id-1>", "<import_job_id-2>")
)
In order to import volume files into a project, users can use the
submit_import()
method from the Auth$imports
path, or directly on the selected VolumeFile
object (file
they want to import) where this function is also available.
## First, get the volume you want to import files from
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")
## Then, get the project object/id where you want to import files
test_proj <- a$projects$get("<project_id>")
## List all volume files on the volume
vol1_content <- vol1$list_contents()
## Select one of the volume files
volume_file_import <- vol1_content$items[[3]]
## Perform a file import
imp_job1 <- a$imports$submit_import(
source_location = volume_file_import$location,
destination_project = test_proj,
autorename = TRUE
)
# Alternatively you can also call import() directly on the VolumeFile object
imp_job1 <- volume_file_import$import(
destination_project = test_proj,
autorename = TRUE
)
Preview import job details with the print()
method:
# Print Import object
print(imp_job1)
You can also import folders from the volume into the project, with the option to preserve or not to preserve folder structure:
# Select one of the volume folders to import
volume_folder_import <- vol1_content$prefixes[[1]]
# Perform a folder import
imp_job2 <- a$imports$submit_import(
source_location = volume_folder_import$prefix,
destination_project = test_proj,
overwrite = TRUE,
preserve_folder_structure = TRUE
)
# Alternatively you can also call import() directly on the VolumePrefix object
imp_job2 <- volume_folder_import$import(
destination_project = test_proj,
overwrite = TRUE,
preserve_folder_structure = TRUE
)
# Print Import object
print(imp_job2)
In order to refresh the import job object and get the up to date info
about its state, you can always call the reload()
function:
# Reload import object
imp_job1$reload()
Users are able to perform the bulk action on multiple volume files or
folders and import them into a project using a single call of the
bulk_submit_import()
method from the
Auth$imports
path.
The required input should be a nested list with an element for each file/folder you want to import, containing the specific fields: source_volume, source_location, destination_project, destination_parent, name, autorename and preserve_folder_structure.
## First, get the volume you want to import files from
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")
## Then, get the project object or the ID of the project into which you want
# to import files
test_proj <- a$projects$get("<project_id>")
## List all volume files
vol1_content <- vol1$list_contents()
## Preview the content and select one VolumeFile object and one VolumePrefix
## object (folder) for the purpose of this example
volume_file_import <- vol1_content$items[[1]]
volume_file_import
volume_folder_import <- vol1_content$prefixes[[1]]
volume_folder_import
## Construct the inputs list by filling the necessary information for each
# file/folder to import
to_import <- list(
list(
source_volume = "rfranklin/my-volume",
source_location = "chimeras.html.gz",
destination_project = "rfranklin/my-project"
),
list(
source_volume = vol1,
source_location = "my-folder/",
destination_project = test_proj,
autorename = TRUE,
preserve_folder_structure = TRUE
),
list(
source_volume = "rfranklin/my-volume",
source_location = "my-volume-folder/",
destination_parent = "parent-id",
name = "new-folder-name",
autorename = TRUE,
preserve_folder_structure = FALSE
)
)
bulk_import_jobs <- a$imports$bulk_submit_import(items = to_import)
# Preview the results
bulk_import_jobs
# Get updated status by fetching details with bulk_get() and by passing the
# list of import jobs created in the previous step
a$imports$bulk_get(imports = bulk_import_jobs$items)
As you can see from the example above, users are able to import folders from the volume into the project or a project directory, with the option to preserve the folder structure or not.
Moreover, when constructing the inputs list you can pass objects of the Volume, Project or File (with type = 'folder') classes for the source_volume, destination_project and destination_parent fields, in addition to using their string IDs.
In the example above, the list of items for the bulk import job was
manually created. Alternatively, you can use the
prepare_items_for_bulk_import()
utility function to
generate the items list.
This function allows you to prepare the list of bulk import items
based on the provided VolumeFile
or
VolumePrefix
objects, filling the following fields for each
item: source_volume
, source_location
,
destination_project
or destination_parent
,
autorename
, preserve_folder_structure
.
Note that the same
destination_project
/destination_parent
and
autorename
values will be applied uniformly across all
items in the resulting list. The preserve_folder_structure
parameter, if provided, applies exclusively to VolumePrefix
items.
## First, get the volume you want to import files from
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")
## Then, get the project object or the ID of the project into which you want
# to import files
test_proj <- a$projects$get("<project_id>")
## List all volume files
vol1_content <- vol1$list_contents()
## Select two VolumeFile objects
volume_file_1_import <- vol1_content$items[[1]]
volume_file_2_import <- vol1_content$items[[2]]
volume_files_to_import <- list(volume_file_1_import, volume_file_2_import)
## Construct the inputs list using the prepare_items_for_bulk_import() utility
# function
to_import <- prepare_items_for_bulk_import(
volume_items = volume_files_to_import,
destination_project = test_proj
)
bulk_import_jobs <- a$imports$bulk_submit_import(items = to_import)
# Preview the results
bulk_import_jobs
# Get updated status by fetching details with bulk_get() and by passing the
# list of import jobs created in the previous step
a$imports$bulk_get(imports = bulk_import_jobs$items)
Keep in mind that prepare_items_for_bulk_import()
is
designed solely to assist in constructing the list of items for
submitting a bulk import job. It operates under certain constraints;
refer to the function’s documentation for further details. After
obtaining the function’s output, you can manually adjust individual
items as needed.
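For instance, here is a short sketch of overriding a single field on one prepared item before submitting, assuming the prepared items are plain named lists like the manually built ones above:
# Override a single field on one prepared item before submitting
to_import[[1]]$autorename <- FALSE
bulk_import_jobs <- a$imports$bulk_submit_import(items = to_import)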
Exports are the actions of exporting your files from the Platform
into a cloud bucket represented as a volume. Export operations are also
related to volumes, but in the API they are separated under
/exports
endpoints, so in our library they are also grouped
under the exports
path on the authentication object
(Exports
resource class).
A single export job is represented as an Export class object containing information about which file has been or is being exported, from which project/folder on the Platform, to which volume, export start and finish time, status of the job, logs, etc.
Users can preview and query all export jobs they’ve created for the
purpose of exporting their files from the Platform into a cloud bucket
using volumes. The output is a Collection
object storing a
list of exports in its items
field and providing pagination
options.
# List exports
all_exports <- a$exports$query()
# Limit results to 5
exp_limit5 <- a$exports$query(limit = 5)
# Load next page of 5 results
exp_limit5$next_page(advance_access = TRUE)
# List all results until last page
exp_limit5$all()
It is possible to filter the results using query parameters such as volume, state, etc.:
# List exports with status RUNNING or FAILED
exp_states <- a$exports$query(state = c("RUNNING", "FAILED"))
# List exports into a specific volume
exp_volume <- a$exports$query(
volume = "<volume_owner_or_division>/<volume_name>" # volume object or id
)
Listing exports is also available within Volume
objects,
where results contain all files exported to the specific volume they’re
being called from.
# Get the volume for which you want to list all exports
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")
# List exports
vol1$list_exports()
Similar to other resource classes, the get() method will return a single export job object when provided with a job ID.
# Get a single export
exp_obj <- a$exports$get(id = "<export_job_id>")
Users are able to fetch details for multiple export jobs by calling
one bulk action - the bulk_get()
method. The accepted input
can be a list of export job IDs or a list of export job objects (of
class Export). The result will be a Collection object containing a list
of (updated) export jobs.
# Get details of multiple export jobs
export_jobs <- a$exports$bulk_get(
exports = list("<export_job_id-1>", "<export_job_id-2>")
)
In order to export Platform files into volumes, users can use the submit_export() method from the Auth$exports path, or call it directly on the selected File object (the file they want to export), where this function is also available.
# First, get the volume you want to export files to
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")
# Get the File object/id you want to export from the platform
test_file <- a$files$get("<file_id>")
# Perform a file export
exp_job1 <- a$exports$submit_export(
source_file = test_file,
destination_volume = vol1,
destination_location = "new_volume_file.txt" # new name
)
Preview export job details with the print()
method:
# Print export job info
print(exp_job1)
Bear in mind that exporting folders from the Platform to volumes is not possible with this function. For exporting multiple files, it is better to use the bulk actions described below.
Users can also export files into specific volume directories by providing a folder-name prefix within the destination_location parameter; the folder will then be virtually created on the volume:
# Export file into the folder 'test_folder'
exp_job2 <- a$exports$submit_export(
source_file = test_file,
destination_volume = vol1,
destination_location = "test_folder/new_volume_file.txt" # new name
)
# Print export job info
print(exp_job2)
Important: to be able to export files to a volume, the volume must allow write access, i.e. the access_mode parameter must be set to RW when creating or modifying the volume.
In order to refresh the export job object and get up-to-date info about its state, you can always call the reload() function:
# Reload export object
exp_job1$reload()
Users are able to perform the bulk action on multiple project files
and export them into a volume using a single call of the
bulk_submit_export()
method from the
Auth$exports
path.
The required input should be a nested list with an element for each file you want to export, containing the specific fields: source_file, destination_volume, destination_location, overwrite, and properties, which accepts a list with the fields sse_algorithm, sse_aws_kms_key_id and/or aws_canned_acl.
## First, get the project and files you want to export
test_proj <- a$projects$get("<project_id>")
proj_files <- test_proj$list_files()
## Choose the first 3 files to export
files_to_export <- proj_files$items[1:3]
## Then, get the volume you want to export files into
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")
## Construct the inputs list by filling the necessary information for each
# file to export
to_export <- list(
list(
source_file = files_to_export[[1]],
destination_volume = vol1,
destination_location = files_to_export[[1]]$name
),
list(
source_file = "second-file-id",
destination_volume = vol1,
destination_location = "my-folder/exported_second_file.txt",
overwrite = TRUE
),
list(
source_file = files_to_export[[3]],
destination_volume = vol1,
destination_location = files_to_export[[3]]$name,
overwrite = FALSE,
properties = list(
sse_algorithm = "AES256"
)
)
)
bulk_export_jobs <- a$exports$bulk_submit_export(items = to_export, copy_only = FALSE)
# Preview the results
bulk_export_jobs
# Get updated status by fetching details with bulk_get() and by passing the
# list of export jobs created in the previous step
a$exports$bulk_get(exports = bulk_export_jobs$items)
As you can see from the example above, users are able to export files into a folder on a volume.
Moreover, when constructing the inputs list you can pass objects of the Volume or File classes (with type = 'file' only, since folders can't be exported) in addition to using their string IDs for the source_file and destination_volume fields. However, destination_location on the volume must be set as a character field.
Lastly, the copy_only setting applies to all files being exported: if it is set to TRUE, each file is copied to the volume while the source file remains on the Platform.
In the example above, the list of items for the bulk export job was
manually created. Alternatively, you can use the
prepare_items_for_bulk_export()
utility function to
generate the items list.
This function allows you to prepare the list of bulk export items
based on the provided File
objects, filling the following
fields for each item: source_file
,
destination_volume
, destination_location
,
overwrite
, properties
.
Check the function’s documentation and API documentation for more details.
## First, get the project and files you want to export
test_proj <- a$projects$get("<project_id>")
proj_files <- test_proj$list_files()
## Then, get the volume you want to export files into
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")
## Select two File objects
file_1_export <- proj_files$items[[1]]
file_2_export <- proj_files$items[[2]]
files_to_export <- list(file_1_export, file_2_export)
## Construct the inputs list using the prepare_items_for_bulk_export() utility
# function
to_export <- prepare_items_for_bulk_export(
files = files_to_export,
destination_volume = vol1,
destination_location_prefix = "my-folder/"
)
bulk_export_jobs <- a$exports$bulk_submit_export(items = to_export)
# Preview the results
bulk_export_jobs
# Get updated status by fetching details with bulk_get() and by passing the
# list of export jobs created in the previous step
a$exports$bulk_get(exports = bulk_export_jobs$items)
Keep in mind that prepare_items_for_bulk_export()
is
designed solely to assist in constructing the list of items for
submitting a bulk export job. It operates under certain constraints;
refer to the function’s documentation for further details. After
obtaining the function’s output, you can manually adjust individual
items as needed.
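As with imports, here is a short sketch of adjusting a single prepared export item before submitting, assuming the prepared items are plain named lists:
# Enable overwriting for the first prepared export item only
to_export[[1]]$overwrite <- TRUE
bulk_export_jobs <- a$exports$bulk_submit_export(items = to_export)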