Google Cloud Storage

Read objects from Google Cloud Storage.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| trigger | trigger | | How often to run the command. |
| bucket-name | string | | The storage service container (bucket) that holds the objects. |
| object-names | string | | Names of the objects. When listing, these are treated as prefixes. |
| mode | Mode | | Whether to list and download, list only, or download only. |
| ignore-linebreaks | boolean (bool) | | Treat each object as a single event. |
| credentials | gcs_input:credentials | | Credentials for accessing the object. |
| timestamp-mode | Timestamp Mode | | Strategy used to derive a timestamp for each blob for filtering purposes. |
| maximum-age | string | | Remove any blobs older than this many seconds from the candidate list. |
| fingerprinting | boolean (bool) | | Enable object fingerprinting, so that each object is downloaded only once. |
| maximum-fingerprint-age | duration (string) | | Remove any object fingerprints older than this from the tracker. |
| preprocessors | Preprocessors | | Preprocessors that process downloaded data before it is made available to the job. They run in the order in which they are specified. |
| include-regex | string | | Include objects matching the specified regular expressions. |
| exclude-regex | string | | Exclude objects matching the specified regular expressions. |
| retry | Retry | | How to retry failed operations. |
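
The candidate-list filtering described by include-regex, exclude-regex and maximum-age can be pictured with a short Python sketch. This is an illustration only, not the input's actual implementation: the function name `filter_candidates`, the use of unanchored regex search, and the rule that an exclude match wins over an include match are all assumptions.

```python
import re
from datetime import datetime, timedelta, timezone


def filter_candidates(objects, include_regex=None, exclude_regex=None,
                      maximum_age_seconds=None, now=None):
    """Return the names of objects that survive the filters.

    `objects` is an iterable of (name, timestamp) pairs. All names here are
    hypothetical; the real input's matching rules may differ.
    """
    now = now or datetime.now(timezone.utc)
    include = re.compile(include_regex) if include_regex else None
    exclude = re.compile(exclude_regex) if exclude_regex else None

    kept = []
    for name, timestamp in objects:
        # Assumption: a name must match include-regex (if one is set) ...
        if include and not include.search(name):
            continue
        # ... and must not match exclude-regex (if one is set).
        if exclude and exclude.search(name):
            continue
        # Drop anything older than maximum-age seconds.
        if (maximum_age_seconds is not None
                and now - timestamp > timedelta(seconds=maximum_age_seconds)):
            continue
        kept.append(name)
    return kept
```

Passing `now` explicitly keeps the function deterministic, which makes the age cut-off easy to check in isolation.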
Object Properties

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| object-name-field | event-field (string) | | The field in which the object name from an operation should be stored. |
| creation-time-field | event-field (string) | | The field in which the object creation time should be stored. |
| last-modified-field | event-field (string) | | The field in which the object last-modified time should be stored. |
| content-length-field | event-field (string) | | The field in which the object content length should be stored. |
| content-type-field | event-field (string) | | The field in which the object content type should be stored. |
| etag-field | event-field (string) | | The field in which the object ETag should be stored. |
| data-field | event-field (string) | | The field to receive the object data (the default is to merge fields if possible). |

Retry

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| count | integer | | How many times to retry: either forever or a limited number of times. |
| pause | string | | How long to pause before retrying. |
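
The interplay of count and pause can be sketched as a simple retry loop. This is a minimal Python sketch under stated assumptions: `call_with_retry` is a hypothetical helper, `count=None` stands in for "forever", and the pause is taken as a number of seconds, whereas the real option is a string whose format is not described here.

```python
import time


def call_with_retry(operation, count=None, pause_seconds=1.0, sleep=time.sleep):
    """Call `operation` until it succeeds.

    count=None retries forever (assumption); otherwise at most `count`
    retries are made after the first failure, then the error is re-raised.
    """
    retries = 0
    while True:
        try:
            return operation()
        except Exception:
            if count is not None and retries >= count:
                raise
            retries += 1
            # Wait before the next attempt.
            sleep(pause_seconds)
```

The `sleep` parameter is injectable so the loop can be exercised without actually waiting.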

Mode

| Value | Name | Description |
| --- | --- | --- |
| list-and-download-objects | list-and-download-objects | List Objects and Download |
| list-objects | list-objects | List Objects |
| download-objects | download-objects | Download Given Objects |

Timestamp Mode

| Value | Name | Description |
| --- | --- | --- |
| none | none | The default mode; do not filter based on timestamps. |
| last-modified | last-modified | Filter objects on the last-modified timestamp reported by the service. |
| blob-name-pattern | blob-name-pattern | Filter blobs on the timestamp derived from the object name, for example: relevant-name-pattern: =(?P<Y>[\\d]{4,4})-(?P<m>[\\d]{2,2})-(?P<d>[\\d]{2,2})/ |
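
To show how a pattern like the one in the blob-name-pattern example can yield a timestamp, here is a small Python sketch. It assumes the doubled backslashes in the example are configuration-file escaping (so a single backslash is used in the raw string below), that the named groups Y, m and d mean year, month and day, and that the result is interpreted as UTC midnight. The function name `timestamp_from_name` and the sample object names are hypothetical.

```python
import re
from datetime import datetime, timezone

# Same pattern as the documented example, written as a Python raw string.
NAME_PATTERN = re.compile(r"=(?P<Y>[\d]{4,4})-(?P<m>[\d]{2,2})-(?P<d>[\d]{2,2})/")


def timestamp_from_name(name):
    """Derive a UTC date from an object name, or None if no valid date is found."""
    match = NAME_PATTERN.search(name)
    if match is None:
        return None
    try:
        return datetime(int(match["Y"]), int(match["m"]), int(match["d"]),
                        tzinfo=timezone.utc)
    except ValueError:
        # Digits matched but do not form a real calendar date.
        return None
```

For example, a name such as `logs/date=2024-03-15/part-0.json` would map to 15 March 2024, while a name with no `=YYYY-MM-DD/` segment would yield no timestamp.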

Preprocessors

| Value | Name | Description |
| --- | --- | --- |
| extension | extension | Preprocess the object or blob based on the extension of its name (.gz, .parquet). |
| gzip | gzip | Decompress (gunzip) the received data. |
| parquet | parquet | Extract the received data as JSON rows from a Parquet file. |
| base64 | base64 | Encode the binary data as base64. |
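
As a rough illustration of what these preprocessors do, the following Python sketch applies them in the order given, using only the standard library. The helper names `preprocess` and `run_preprocessors` are hypothetical, the extension dispatch is an assumption based on the description above, and Parquet decoding is left out because it needs a third-party library.

```python
import base64
import gzip


def preprocess(name, data, preprocessor):
    """Apply one preprocessor to the raw bytes of an object."""
    if preprocessor == "extension":
        # Assumption: pick the preprocessor from the object name's extension.
        if name.endswith(".gz"):
            preprocessor = "gzip"
        elif name.endswith(".parquet"):
            preprocessor = "parquet"
        else:
            return data  # unknown extension: pass the data through unchanged
    if preprocessor == "gzip":
        return gzip.decompress(data)
    if preprocessor == "base64":
        return base64.b64encode(data)
    if preprocessor == "parquet":
        raise NotImplementedError("Parquet decoding is outside this sketch")
    raise ValueError(f"unknown preprocessor: {preprocessor}")


def run_preprocessors(name, data, preprocessors):
    """Run the preprocessors in the order in which they are specified."""
    for preprocessor in preprocessors:
        data = preprocess(name, data, preprocessor)
    return data
```

Because the steps run in order, listing gzip before base64 decompresses first and then encodes the decompressed bytes.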