Blob Storage

The BlobStorage class provides a high-level interface to the Datatailr blob storage service. It supports uploading, downloading, listing, deleting, and tagging objects (blobs) in storage buckets, as well as managing their access control.

Overview

  • Typical Usage:

    from dt.cloud.blob_storage import BlobStorage
    
    blob_storage = BlobStorage()
    blob_storage.put_object(b'Hello, world!', 'test-object')
    data = blob_storage.get_object('test-object')
    blob_storage.delete_object('test-object')
    

Class: BlobStorage

Constructor

BlobStorage(region=None, use_cpp_api=False)
  • region (str, optional): The region for the storage service. Defaults to the current host region.
  • use_cpp_api (bool, optional): Whether to use the C++ API implementation (for AWS). Defaults to False.

Methods

get_object(name, bucket=None)

Retrieve the contents of a blob as bytes.

  • name (str): The blob's name.
  • bucket (str, optional): The bucket name. Defaults to the user's bucket.
  • Returns: The blob's contents as bytes (decoded from base64).
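
Because get_object returns raw bytes, text or JSON payloads have to be decoded by the caller. The following is a minimal sketch, assuming the blob holds UTF-8 encoded JSON; get_json is a hypothetical helper, not part of the BlobStorage API, and it accepts any object exposing the documented get_object(name, bucket=None) method.

```python
import json


def get_json(storage, name, bucket=None):
    """Fetch a blob and parse it as UTF-8 encoded JSON.

    `storage` is any object with the documented get_object method.
    This helper is illustrative and not part of the library.
    """
    raw = storage.get_object(name, bucket=bucket)  # bytes, per the reference above
    return json.loads(raw.decode("utf-8"))


# Real use (requires the Datatailr SDK; 'config.json' is a hypothetical blob name):
#   from dt.cloud.blob_storage import BlobStorage
#   config = get_json(BlobStorage(), "config.json")
```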

put_object(bytes_buffer, name, acl=None, bucket=None)

Upload a blob to storage.

  • bytes_buffer (bytes): The data to upload.
  • name (str): The blob's name.
  • acl (ACL, optional): Access control list for the blob.
  • bucket (str, optional): The bucket name. Defaults to the user's bucket.
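
Note the argument order: the data comes first and the blob name second. Since bytes_buffer is documented as bytes, strings need to be encoded before upload. A minimal sketch of that step follows; put_text is a hypothetical helper, and UTF-8 is an assumed default encoding rather than anything the service requires.

```python
def put_text(storage, text, name, bucket=None, encoding="utf-8"):
    """Encode a str and upload it with put_object; return the number of bytes sent.

    Illustrative helper, not part of the library. `storage` is any object
    with the documented put_object(bytes_buffer, name, acl=None, bucket=None).
    """
    if not isinstance(text, str):
        raise TypeError("text must be a str; pass bytes to put_object directly")
    data = text.encode(encoding)
    storage.put_object(data, name, bucket=bucket)  # data first, then the blob name
    return len(data)


# Real use (requires the Datatailr SDK; 'notes/readme.txt' is a hypothetical name):
#   from dt.cloud.blob_storage import BlobStorage
#   put_text(BlobStorage(), "Hello, world!", "notes/readme.txt")
```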

delete_object(name, bucket=None)

Delete a single blob from storage.

  • name (str): The blob's name.
  • bucket (str, optional): The bucket name. Defaults to the user's bucket.

delete_objects(names, bucket=None)

Delete multiple blobs from storage.

  • names (list[str]): List of blob names.
  • bucket (str, optional): The bucket name. Defaults to the user's bucket.
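
A common use for delete_objects is cleaning up stale blobs selected by some criterion. The sketch below combines it with list_objects (documented in this reference) and the last_modified field of FileObject. delete_older_than is a hypothetical helper; it assumes that entries without a last_modified attribute are directory entries, and that cutoff and last_modified are either both timezone-aware or both naive, since Python refuses to compare the two kinds.

```python
from datetime import datetime


def delete_older_than(storage, cutoff: datetime, prefix="", bucket=None):
    """Delete blobs under `prefix` last modified before `cutoff`; return their names.

    Illustrative helper, not part of the library.
    """
    stale = []
    for obj in storage.list_objects(prefix=prefix, bucket=bucket):
        modified = getattr(obj, "last_modified", None)  # assumed absent on directories
        if modified is not None and modified < cutoff:
            stale.append(obj.name)
    if stale:
        # A single delete_objects call covers the whole batch.
        storage.delete_objects(stale, bucket=bucket)
    return stale
```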

delete(prefix, bucket=None)

Delete all blobs with a given prefix.

  • prefix (str): The prefix to match.
  • bucket (str, optional): The bucket name. Defaults to the user's bucket.
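
Prefix deletion can remove many blobs at once, so it can be worth previewing the matches first. The following is a minimal sketch of a dry-run wrapper, assuming list_objects (documented in this reference) matches the same prefix that delete would act on. delete_prefix_with_preview and its dry_run flag are hypothetical, and the guard against an empty prefix reflects the assumption that an empty prefix would match every blob in the bucket.

```python
def delete_prefix_with_preview(storage, prefix, bucket=None, dry_run=True):
    """Return the names found under `prefix`; delete them only if dry_run is False.

    Illustrative helper, not part of the library.
    """
    if not prefix:
        raise ValueError("refusing to delete with an empty prefix")
    names = [obj.name for obj in storage.list_objects(prefix=prefix, bucket=bucket)]
    if not dry_run:
        storage.delete(prefix, bucket=bucket)
    return names


# Real use (requires the Datatailr SDK; 'tmp/' is a hypothetical prefix):
#   from dt.cloud.blob_storage import BlobStorage
#   storage = BlobStorage()
#   print(delete_prefix_with_preview(storage, "tmp/"))        # preview only
#   delete_prefix_with_preview(storage, "tmp/", dry_run=False)
```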

list_objects(prefix='', bucket=None, recursive=True, max_keys=0)

List all objects in a bucket, optionally filtered by prefix.

  • prefix (str, optional): Prefix to filter objects.
  • bucket (str, optional): The bucket name. Defaults to the user's bucket.
  • recursive (bool, optional): List recursively. Defaults to True.
  • max_keys (int, optional): Maximum number of objects to return. 0 means no limit.
  • Yields: FileObject or DirectoryObject instances.
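
Since list_objects yields a mix of FileObject and DirectoryObject instances, callers usually need to tell them apart. The sketch below does so by checking for the size attribute, which only FileObject is documented to have. summarize_prefix is a hypothetical helper, and the expectation that directory entries appear when recursive=False is an assumption about the service's behavior.

```python
def summarize_prefix(storage, prefix="", bucket=None):
    """Return (file_count, total_bytes, directory_names) for one listing level.

    Illustrative helper, not part of the library.
    """
    file_count = 0
    total_bytes = 0
    directories = []
    for obj in storage.list_objects(prefix=prefix, bucket=bucket, recursive=False):
        size = getattr(obj, "size", None)
        if size is None:
            # No size attribute: treat the entry as a DirectoryObject.
            directories.append(obj.name)
        else:
            file_count += 1
            total_bytes += size
    return file_count, total_bytes, directories
```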

download_file(name, dest_dir, bucket=None, callback=None)

Download a blob to a local file.

  • name (str): The blob's name.
  • dest_dir (str): Local directory to save the file.
  • bucket (str, optional): The bucket name. Defaults to the user's bucket.
  • callback (Callable, optional): Callback function for reporting download progress.
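
download_file handles one blob at a time, so fetching everything under a prefix means looping over a listing. The following is a minimal sketch using list_objects (documented in this reference); download_prefix is a hypothetical helper. The callback is left unset because its signature is not described here, and since this reference does not say how the local file name is derived from the blob name, blobs that share a base name may overwrite each other in dest_dir.

```python
import os


def download_prefix(storage, prefix, dest_dir, bucket=None):
    """Download every file under `prefix` into `dest_dir`; return the blob names.

    Illustrative helper, not part of the library.
    """
    os.makedirs(dest_dir, exist_ok=True)  # make sure the target directory exists
    downloaded = []
    for obj in storage.list_objects(prefix=prefix, bucket=bucket):
        if getattr(obj, "size", None) is None:
            continue  # assumed to be a directory entry, nothing to download
        storage.download_file(obj.name, dest_dir, bucket=bucket)
        downloaded.append(obj.name)
    return downloaded
```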

tags(name, bucket=None)

Get the tags associated with a blob.

  • name (str): The blob's name.
  • bucket (str, optional): The bucket name. Defaults to the user's bucket.
  • Returns: list[str], the tags currently set on the blob.

set_tags(name, tags, bucket=None)

Set tags for a blob.

  • name (str): The blob's name.
  • tags (list[str]): List of tags to set.
  • bucket (str, optional): The bucket name. Defaults to the user's bucket.
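
This reference does not state whether set_tags merges with or replaces the existing tags. The sketch below assumes replacement and therefore reads the current tags first, appends only the new ones, and writes the combined list back. add_tags is a hypothetical helper, not part of the BlobStorage API.

```python
def add_tags(storage, name, new_tags, bucket=None):
    """Add tags to a blob while keeping its existing ones; return the final list.

    Illustrative helper, not part of the library. Assumes set_tags
    overwrites the blob's tag list.
    """
    current = list(storage.tags(name, bucket=bucket) or [])
    # dict.fromkeys removes duplicates from new_tags while preserving order.
    additions = [tag for tag in dict.fromkeys(new_tags) if tag not in current]
    merged = current + additions
    if additions:
        storage.set_tags(name, merged, bucket=bucket)
    return merged
```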

acl(name, bucket=None)

Get the Access Control List (ACL) for a blob.

  • name (str): The blob's name.
  • bucket (str, optional): The bucket name. Defaults to the user's bucket.
  • Returns: ACL object.

set_acl(name, acl, bucket=None)

Set the Access Control List (ACL) for a blob.

  • name (str): The blob's name.
  • acl (ACL): The ACL to set.
  • bucket (str, optional): The bucket name. Defaults to the user's bucket.
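
Because acl returns an ACL object and set_acl accepts one, the two can be chained to apply one blob's permissions to another. A minimal sketch follows; copy_acl is a hypothetical helper, and it assumes the object returned by acl can be passed to set_acl unchanged.

```python
def copy_acl(storage, source_name, target_name, bucket=None):
    """Apply the ACL of `source_name` to `target_name`; return the ACL used.

    Illustrative helper, not part of the library.
    """
    acl = storage.acl(source_name, bucket=bucket)
    storage.set_acl(target_name, acl, bucket=bucket)
    return acl


# Real use (requires the Datatailr SDK; blob names are hypothetical):
#   from dt.cloud.blob_storage import BlobStorage
#   copy_acl(BlobStorage(), "reports/template", "reports/2024-q1")
```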

Data Classes

FileObject

  • name (str): File name.
  • last_modified (datetime.datetime): Last modification time.
  • size (int): File size in bytes.
  • acl (ACL, optional): Access control list.

DirectoryObject

  • name (str): Directory name.
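
For tests or type-checking experiments it can help to have local stand-ins with the same fields. The sketch below mirrors the fields listed above as Python dataclasses. The names FileObjectSketch, DirectoryObjectSketch, and is_file are hypothetical, and the library's actual definitions may differ, for example in defaults or extra methods.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass
class FileObjectSketch:
    """Stand-in mirroring the documented FileObject fields."""
    name: str
    last_modified: datetime
    size: int                  # file size in bytes
    acl: Optional[Any] = None  # an ACL instance in the real library


@dataclass
class DirectoryObjectSketch:
    """Stand-in mirroring the documented DirectoryObject fields."""
    name: str


def is_file(entry):
    """Treat any entry that carries a size attribute as a file."""
    return hasattr(entry, "size")
```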

Example Usage

from dt.cloud.blob_storage import BlobStorage
from dt.cloud.acl import ACL, Permissions

blob_storage = BlobStorage()

# Upload bytes as an object in the default (user) bucket
blob_storage.put_object(b'Hello, world!', 'test-object')

# Retrieve the object's contents as bytes
data = blob_storage.get_object('test-object')

# List all objects in the default bucket
for obj in blob_storage.list_objects():
    print(obj)

# Set tags on the object
blob_storage.set_tags('test-object', ['tag1', 'tag2'])

# Get the object's tags
tags = blob_storage.tags('test-object')

# Set the ACL: give user1 the 'rw' permissions
acl = ACL(users_and_permissions={'user1': Permissions('rw')})
blob_storage.set_acl('test-object', acl)

# Get the ACL back
acl = blob_storage.acl('test-object')

# Delete the object
blob_storage.delete_object('test-object')

Notes

  • The default bucket is determined by the current user's host prefix and is typically named <host_prefix>datatailr-user-data.
  • Access control is managed via the ACL and Permissions classes.
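
The naming pattern in the first note can be written out as a one-line function. This is only an illustration of the pattern: default_bucket_name is a hypothetical helper, the library resolves the default bucket itself whenever bucket is left as None, and the note says "typically", so actual bucket names may differ.

```python
def default_bucket_name(host_prefix):
    """Build the typical default bucket name from a host prefix.

    Illustrative only; BlobStorage determines the real default bucket itself.
    """
    return f"{host_prefix}datatailr-user-data"


# With a hypothetical host prefix 'dev-', this yields 'dev-datatailr-user-data'.
name = default_bucket_name("dev-")
```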