Data Fingerprinter

The Data Fingerprinter node generates a fingerprint, a content-derived identifier, for any incoming data object. This fingerprint can be used to detect duplicates, verify data integrity, or track data lineage across a pipeline.

Input

  • Data – Accepts any structured or unstructured data input, such as text, files, records, or serialized objects. The fingerprint is generated from the content of the input.

Output

  • Data – Forwards the original data downstream, enriched with a fingerprint tag or identifier. Adding the fingerprint does not alter the content; it only appends metadata.

How to Use

  • Connect any data-producing node to the Data input (for example, a parser, file loader, or database reader)
  • The node computes a consistent fingerprint for the input data: the same content always yields the same fingerprint
  • Connect the Data output to downstream nodes that require identity checking, deduplication, or integrity validation
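
The steps above can be pictured with a minimal Python sketch. It is an illustration only, not the node's actual implementation: the function names fingerprint and attach_fingerprint, the "data"/"metadata" wrapper layout, and the choice of JSON with sorted keys as the canonical serialization are all assumptions made for this example. SHA-256 is used because the Notes mention it as a typical choice.

```python
import hashlib
import json


def fingerprint(payload):
    """Return a SHA-256 hex digest computed from the payload's content.

    Hypothetical helper: the real node may serialize data differently.
    """
    if isinstance(payload, bytes):
        raw = payload
    elif isinstance(payload, str):
        raw = payload.encode("utf-8")
    else:
        # Serialize with sorted keys so that records with the same
        # content hash identically regardless of key order.
        raw = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def attach_fingerprint(payload):
    """Wrap the payload with a metadata tag; the payload itself is untouched."""
    return {"data": payload, "metadata": {"fingerprint": fingerprint(payload)}}


record = {"id": 7, "name": "sensor-a"}
tagged = attach_fingerprint(record)
print(tagged["metadata"]["fingerprint"])
```

Sorting the keys before hashing is one way to obtain a consistent fingerprint for structured records, since two records with the same fields in a different order then produce the same digest.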

Notes

  • Fingerprints are typically generated using cryptographic hashes (for example, SHA-256)
  • This node does not modify the data payload; it appends a non-intrusive metadata tag
  • Useful for caching, change tracking, and deduplication strategies
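
As a rough illustration of the deduplication use case, a downstream step could keep a set of fingerprints it has already seen and drop any item whose fingerprint repeats. The sketch below assumes text items and SHA-256; the dedupe function is hypothetical and not part of the node.

```python
import hashlib


def dedupe(items):
    """Return the items in their original order, skipping repeated content.

    Hypothetical downstream helper: each item is a string, and two items
    count as duplicates when their SHA-256 digests match.
    """
    seen = set()
    unique = []
    for item in items:
        digest = hashlib.sha256(item.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(item)
    return unique


rows = ["alpha", "beta", "alpha", "gamma", "beta"]
print(dedupe(rows))
```

Storing only the fixed-size digests, rather than the items themselves, keeps the memory needed for the seen set independent of how large each item is.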