An exhaustive description of the Table_comparison transform is given in the following Wiki pages:
This page summarizes the settings of the Comparison method, a parameter with a major effect on the efficiency of table comparison. Select one of three options:
- Row-by-row select: Select this option to have the transform look up the target table using SQL every time it receives an input row. This option is best if the target table is large compared to the number of rows the transform will receive as input. Make sure the appropriate indexes exist on the lookup columns in the target table for optimal performance.
- Cached comparison table: Select this option to load the comparison table into memory. In this case, queries to the comparison table access memory rather than the actual table. However, the table must fit in the available memory. This option is best when the table fits into memory and you are comparing the entire target table.
- Sorted input: Select this option when the incoming data are guaranteed to be sorted in exactly the same order as the primary key of the comparison table. It is often the most efficient solution for large data sources, because Data Services (DS) reads the comparison table only once. In most cases the incoming data must be pre-sorted to take advantage of this functionality, e.g. using a Query transform with an Order-by (which may be pushed down to the underlying database).
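
The difference between the three methods can be illustrated with a small Python sketch. This is not how Data Services implements the transform; it is only a conceptual model under stated assumptions. An in-memory SQLite table named `target` with columns `id` (primary key) and `name` stands in for the comparison table, and the helper names (`make_target`, `classify`, `row_by_row`, `cached`, `sorted_input`) are hypothetical. Only INSERT and UPDATE flags are modelled.

```python
import sqlite3


def make_target(rows):
    """Build an in-memory SQLite table standing in for the comparison table (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO target VALUES (?, ?)", rows)
    return conn


def classify(incoming, existing):
    """Flag an input row as INSERT, UPDATE, or None (unchanged)."""
    if existing is None:
        return "INSERT"
    return "UPDATE" if existing != incoming else None


def row_by_row(conn, input_rows):
    """Row-by-row select: one indexed SQL lookup per input row."""
    ops = []
    for row in input_rows:
        found = conn.execute(
            "SELECT id, name FROM target WHERE id = ?", (row[0],)
        ).fetchone()
        op = classify(row, found)
        if op:
            ops.append((op, row))
    return ops


def cached(conn, input_rows):
    """Cached comparison table: load the whole table into memory once."""
    cache = {r[0]: r for r in conn.execute("SELECT id, name FROM target")}
    ops = []
    for row in input_rows:
        op = classify(row, cache.get(row[0]))
        if op:
            ops.append((op, row))
    return ops


def sorted_input(conn, input_rows):
    """Sorted input: a single ordered pass over the table, merged with the
    input. The input rows must already be sorted by primary key."""
    cursor = conn.execute("SELECT id, name FROM target ORDER BY id")
    current = cursor.fetchone()
    ops = []
    for row in input_rows:
        # Advance the table cursor until it reaches or passes the input key.
        while current is not None and current[0] < row[0]:
            current = cursor.fetchone()
        match = current if current is not None and current[0] == row[0] else None
        op = classify(row, match)
        if op:
            ops.append((op, row))
    return ops


conn = make_target([(1, "a"), (2, "b"), (4, "d")])
incoming = [(1, "a"), (2, "B"), (3, "c")]  # sorted by id
results = [f(conn, incoming) for f in (row_by_row, cached, sorted_input)]
```

All three functions flag the same rows; they differ in how often the comparison table is queried and how much of it is held in memory, which is the trade-off the Comparison method setting controls.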