
Job inputs

At a high level, these are the ways to get data into the system:

  • files - read files from a directory, as they appear. Useful for consuming logs and the output of other programs. [S]
  • echo - specify events for a job directly. [S]
  • exec - execute a command in the current shell and capture its output. [S]
  • object stores: s3, azure, gcp, filesystem. The unifying idea is that remote objects are contained in a bucket and can be matched by name. Compressed objects can be decompressed, and Parquet files can be streamed as JSON events. [S]
  • http-poll - perform an HTTP request. Specify headers, query parameters and body. [S]
  • http-server - start listening on a port and accept incoming HTTP requests.
  • internal-messages - listen to Edge IQ messages.
  • window-event-log - (Windows Only) listen to system events.
  • worker-channel - listen to events generated by other jobs.

Inputs marked with [S] are scheduled: they are triggered either by messages or at a regular time interval. They all have a Trigger option.

The other inputs wait for data as it becomes available; for instance, files waits for new files to be created.

Most inputs share two options: Ignore Linebreaks and JSON.

Set JSON if the data is known to be JSON. Otherwise the data is treated as arbitrary text, which is wrapped as a quoted string: {"_raw": "arbitrary text"}

Normally, events are processed line by line, but setting Ignore Linebreaks captures all of the output as a single event. For instance, some HTTP APIs return JSON documents that contain linefeeds, so both Ignore Linebreaks and JSON need to be enabled.
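The interaction of the two options can be modelled with a short Python sketch. This is only an illustration of the behaviour described above, not the product's implementation: the function name to_events and its parameters are hypothetical, and skipping blank lines is an assumption.

```python
import json


def to_events(text, is_json=False, ignore_linebreaks=False):
    """Hypothetical model of how raw input text becomes events."""
    # Ignore Linebreaks: treat the whole text as one chunk.
    # Otherwise: one chunk per line.
    chunks = [text] if ignore_linebreaks else text.splitlines()

    events = []
    for chunk in chunks:
        if not chunk.strip():
            continue  # assumption: blank chunks produce no event
        if is_json:
            # JSON: the chunk is parsed as-is.
            events.append(json.loads(chunk))
        else:
            # Arbitrary text: wrap it as a quoted string under "_raw".
            events.append({"_raw": chunk})
    return events


# Plain text, line by line: one event per line.
print(to_events("first\nsecond"))

# A JSON document spread over several lines needs both options.
print(to_events('{\n  "status": "ok"\n}', is_json=True, ignore_linebreaks=True))
```

In this model, parsing the multi-line JSON document with only JSON enabled fails, because the first line on its own ("{") is not valid JSON. That is why both options are needed for such APIs.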

The set of events that arrive together in one 'chunk' (as the result of a single run of an executable or a single HTTP request) is called a document. You can ask for the events of a document to be batched together; see [[Output Batching]].

exec and http-poll also have a Batch option, which serves a different purpose from the output Batch. It lets you label the events of a document. If enabled, by default it adds line-num and line-count to each event.
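As a rough illustration of this labelling, the sketch below adds the two fields to every event of a document. The function name label_document is hypothetical, and numbering from 1 is an assumption; the text above only states which fields are added.

```python
def label_document(events):
    """Hypothetical model of the input Batch option: label each event
    with its position in the document and the document's size."""
    count = len(events)
    return [
        # assumption: line-num counts from 1
        {**event, "line-num": num, "line-count": count}
        for num, event in enumerate(events, start=1)
    ]


document = [{"_raw": "alpha"}, {"_raw": "beta"}, {"_raw": "gamma"}]
for event in label_document(document):
    print(event)
```

With these labels a downstream step can tell when it has seen the last event of a document, since line-num then equals line-count.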

Many operations (whether inputs, outputs or actions) can fail intermittently but succeed when retried. The best example is an HTTP request. Retrying involves specifying two things: (a) the number of times to retry and (b) an optional pause between retries.

After each retry, the pause is doubled, up to a maximum of 15 seconds.
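The retry behaviour can be sketched in Python as follows. This is a minimal model of the rule stated above (the pause doubles after each retry and is capped at 15 seconds), not the product's code. The names pause_schedule and call_with_retries are hypothetical, and treating any exception as retryable is an assumption.

```python
import time

MAX_PAUSE = 15.0  # seconds; the cap stated above


def pause_schedule(initial_pause, retries, max_pause=MAX_PAUSE):
    """Return the pause before each retry: doubled each time, capped."""
    pauses = []
    pause = initial_pause
    for _ in range(retries):
        pauses.append(min(pause, max_pause))
        pause *= 2
    return pauses


def call_with_retries(operation, retries, initial_pause=0.0, sleep=time.sleep):
    """Run operation(); on failure retry up to `retries` times."""
    pauses = pause_schedule(initial_pause, retries)
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception:  # assumption: every failure is retryable
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(pauses[attempt])


# With a 1 second initial pause and 6 retries the pauses are
# 1, 2, 4, 8, 15, 15 seconds.
print(pause_schedule(1, 6))
```

The sleep parameter is there only so the pauses can be observed or skipped in a test. With the default initial pause of 0 there is no waiting at all, which corresponds to the pause being optional.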