
Azure Blob

Retrieve (or list) Microsoft Azure Storage Blobs (Block Storage).

| Field | Type | Required | Description |
|---|---|---|---|
| trigger | trigger | | How often to run the command. |
| mode | Mode | | The operating mode for this input. |
| container-name | string | | The storage service container for created blobs. |
| blob-names | string | | The name for the blobs. |
| timestamp-mode | Timestamp Mode | | Derive a timestamp for this blob, for filtering purposes, based on the selected strategy. |
| maximum-age | string | | Remove any blobs older than this many seconds from the candidate list. |
| fingerprinting | boolean (bool) | | Enable object fingerprinting, so that an object is downloaded only once. |
| maximum-fingerprint-age | duration (string) | | Remove any object fingerprints older than this from the tracker. |
| preprocessors | Preprocessors | | Preprocessors process downloaded data before making it available to the job. They run in the order they are specified. |
| ignore-linebreaks | boolean (bool) | | Treat the object as one event. |
| include-regex | regex (string) | | Include blobs matching the specified regular expressions. |
| exclude-regex | regex (string) | | Exclude blobs matching the specified regular expressions. |
| retry | Retry | | How to retry after failure. |

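The include-regex, exclude-regex, fingerprinting, and maximum-fingerprint-age settings together decide which listed blobs are actually downloaded. The following is a minimal Python sketch of that kind of selection, not the input's actual implementation. The names `select_blobs` and `FingerprintTracker` are hypothetical, and it assumes that patterns match anywhere in the blob name, that an empty include list means "include everything", and that a fingerprint is any hashable value such as a (name, ETag) pair.

```python
import re
import time


def select_blobs(names, include_regex=(), exclude_regex=()):
    """Keep names matching any include pattern and no exclude pattern.

    Assumption: with no include patterns, every name is a candidate.
    """
    includes = [re.compile(p) for p in include_regex]
    excludes = [re.compile(p) for p in exclude_regex]
    selected = []
    for name in names:
        if includes and not any(p.search(name) for p in includes):
            continue
        if any(p.search(name) for p in excludes):
            continue
        selected.append(name)
    return selected


class FingerprintTracker:
    """Hypothetical tracker that remembers already-downloaded objects."""

    def __init__(self, maximum_age_seconds):
        self.maximum_age_seconds = maximum_age_seconds
        self._seen = {}  # fingerprint -> time it was recorded

    def prune(self, now):
        """Forget fingerprints older than the maximum age."""
        cutoff = now - self.maximum_age_seconds
        self._seen = {fp: t for fp, t in self._seen.items() if t >= cutoff}

    def should_download(self, fingerprint, now=None):
        """Return True the first time a fingerprint is seen, then False."""
        now = time.time() if now is None else now
        self.prune(now)
        if fingerprint in self._seen:
            return False
        self._seen[fingerprint] = now
        return True
```

In this sketch, once a fingerprint ages out of the tracker the same object would be downloaded again, which is why maximum-age and maximum-fingerprint-age are usually chosen together.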
Authentication

| Field | Type | Required | Description |
|---|---|---|---|
| storage-account | string | | The Storage Account Name to be used (credential). |
| storage-master-key | string | | The Storage Master Key to be used (credential). |

Object Properties

| Field | Type | Required | Description |
|---|---|---|---|
| blob-name-field | event-field (string) | | The field that a blob name from an operation should be stored in. |
| creation-time-field | event-field (string) | | The field that the blob creation time should be stored in. |
| last-modified-field | event-field (string) | | The field that the blob last modified time should be stored in. |
| content-length-field | event-field (string) | | The field that the blob content length information should be stored in. |
| content-type-field | event-field (string) | | The field that the blob content type information should be stored in. |
| content-md5-field | event-field (string) | | The field that the blob content MD5 should be stored in. |
| etag-field | event-field (string) | | The field that the object ETag should be stored in. |
| data-field | event-field (string) | | A field that the blob data should be nested in. |

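As an illustration of how these options could shape an output event, here is a small Python sketch. It is an assumption-laden example rather than the input's real behavior: `annotate_event` and the property keys (`name`, `etag`, and so on) are hypothetical, and it assumes an option that is left unset simply adds no field.

```python
# Hypothetical mapping from option name to a blob property key.
PROPERTY_OPTIONS = {
    "blob-name-field": "name",
    "creation-time-field": "creation_time",
    "last-modified-field": "last_modified",
    "content-length-field": "content_length",
    "content-type-field": "content_type",
    "content-md5-field": "content_md5",
    "etag-field": "etag",
}


def annotate_event(event, blob_properties, config):
    """Nest the event under data-field (if set) and add configured properties."""
    data_field = config.get("data-field")
    result = {data_field: dict(event)} if data_field else dict(event)
    for option, prop in PROPERTY_OPTIONS.items():
        target = config.get(option)
        # Assumption: unset options and missing properties are skipped.
        if target and prop in blob_properties:
            result[target] = blob_properties[prop]
    return result
```

For example, with `data-field` set to `payload` and `blob-name-field` set to `source`, an event `{"msg": "hi"}` from blob `logs/a.json` would become `{"payload": {"msg": "hi"}, "source": "logs/a.json"}` in this sketch.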
Retry

| Field | Type | Required | Description |
|---|---|---|---|
| count | integer | | How many times to retry: either forever or a limited number of times. |
| pause | string | | How long to pause before retrying. |

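The count and pause settings describe a conventional retry loop. A minimal Python sketch of that pattern follows. The function `run_with_retry` is hypothetical, the pause is expressed in seconds rather than as a duration string, and it assumes that `count=None` stands for "retry forever".

```python
import time


def run_with_retry(operation, count=None, pause_seconds=1.0, sleep=time.sleep):
    """Call operation until it succeeds.

    count=None retries forever (assumption); otherwise at most `count`
    retries are made after the first failure, pausing before each one.
    """
    retries = 0
    while True:
        try:
            return operation()
        except Exception:
            if count is not None and retries >= count:
                raise  # retries exhausted: surface the last error
            retries += 1
            sleep(pause_seconds)
```

The `sleep` parameter exists only so the pause can be replaced in tests; in normal use the default `time.sleep` applies.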
Mode

| Value | Name | Description |
|---|---|---|
| list-and-download-objects | list-and-download-objects | List Objects and Download |
| list-objects | list-objects | List Objects |
| download-objects | download-objects | Download Given Objects |

Timestamp Mode

| Value | Name | Description |
|---|---|---|
| none | none | The default mode; do not filter based on timestamps. |
| last-modified | last-modified | Filter objects on the last-modified timestamp reported by the service. |
| blob-name-pattern | blob-name-pattern | Filter blobs on the timestamp derived from the object name, for example: `relevant-name-pattern: =(?P<Y>[\\d]{4,4})-(?P<m>[\\d]{2,2})-(?P<d>[\\d]{2,2})/` |

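To make the blob-name-pattern strategy concrete, the sketch below applies the example pattern in Python and uses the derived date for a maximum-age check. This is an illustration only: the helper names are hypothetical, the doubled backslashes in the example are assumed to be configuration-file escaping (so the regex itself uses `\d`), and the groups `Y`, `m`, and `d` are assumed to mean year, month, and day in UTC.

```python
import re
from datetime import datetime, timedelta, timezone

# The example pattern, written as a Python raw string.
PATTERN = re.compile(r"(?P<Y>[\d]{4,4})-(?P<m>[\d]{2,2})-(?P<d>[\d]{2,2})/")


def timestamp_from_name(blob_name):
    """Derive a UTC date from the blob name, or None if none can be derived."""
    match = PATTERN.search(blob_name)
    if match is None:
        return None
    try:
        return datetime(int(match["Y"]), int(match["m"]), int(match["d"]),
                        tzinfo=timezone.utc)
    except ValueError:
        return None  # digits matched but do not form a valid date


def is_recent(blob_name, maximum_age_seconds, now):
    """Return True if the derived timestamp is within the maximum age."""
    derived = timestamp_from_name(blob_name)
    if derived is None:
        return False  # assumption: blobs without a timestamp are skipped
    return now - derived <= timedelta(seconds=maximum_age_seconds)
```

With this sketch, a blob named `logs/2024-03-05/app.json` yields the date 2024-03-05, while `logs/app.json` yields no timestamp at all.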
Preprocessors

| Value | Name | Description |
|---|---|---|
| extension | extension | Preprocess the object or blob based on the extension of the object or blob name (.gz, .parquet). |
| gzip | gzip | Decompress (gunzip) the received data. |
| parquet | parquet | Extract the received data as JSON rows from a parquet file. |
| base64 | base64 | Encode the binary data as base64. |
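Because preprocessors run in the order they are specified, they behave like a small pipeline over the downloaded bytes. The Python sketch below shows one way such a chain could work, using only the standard library. The function names are hypothetical, and the parquet step is deliberately left out because it would need a third-party library.

```python
import base64
import gzip


def gunzip(data: bytes) -> bytes:
    """Decompress gzip-compressed bytes."""
    return gzip.decompress(data)


def to_base64(data: bytes) -> bytes:
    """Encode binary data as base64."""
    return base64.b64encode(data)


# Name-based steps covered by this sketch (parquet intentionally omitted).
PREPROCESSORS = {"gzip": gunzip, "base64": to_base64}


def by_extension(name: str, data: bytes) -> bytes:
    """Pick a step from the blob name; only .gz is handled here."""
    if name.endswith(".gz"):
        return gunzip(data)
    return data  # assumption: unknown extensions pass through unchanged


def preprocess(name: str, data: bytes, steps) -> bytes:
    """Apply each configured step in order."""
    for step in steps:
        if step == "extension":
            data = by_extension(name, data)
        else:
            data = PREPROCESSORS[step](data)
    return data
```

Order matters in this sketch: `["gzip", "base64"]` decompresses and then encodes the plain data, whereas the reverse order would try to decompress base64 text and fail.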