Handling Late-Arriving Members in SSIS

I was chatting with some other SQL Community members recently and the topic of handling late-arriving members for a warehouse load came up. I figured it was worth taking a couple of minutes to share the general concept I’ve been using to handle late-arriving members in SSIS. The steps are relatively straightforward if you’ve done much SSIS, but sometimes the little things can trip you up. The example below is based loosely on the WideWorldImporters database samples from MS.

The general steps are

  1. Set up your source query.
  2. Pass the data through a Lookup for your Dimension with the missing results routed to a “No Match” output.
  3. Insert those “No Match” rows into your Dimension using a SQL task – checking to make sure that this particular row hasn’t already been inserted (this is important).
  4. Do another lookup using a “Partial Cache” to catch these newly-inserted members.
  5. Use a UNION ALL transform to bring the existing and late-arriving members together.

Inferred Member Data Flow

Sample code for the Inferred Insert:

DECLARE @CustomerID INT

SELECT @CustomerID = ?

IF NOT EXISTS (SELECT *

               FROM   WideWorldImportsDW.Dimension.Customer

               WHERE  [WWI Customer ID] = @CustomerID)

  INSERT WideWorldImportersDW.Dimension.Customer

         ([WWI Customer ID],

          Customer,

          [Bill To Customer],

          Category,

          [Buying Group],

          [Primary Contact],

          [Postal Code],

          [Valid From],

          [Valid To],

          [Lineage Key])

  VALUES ( @CustomerID,— WWI Customer ID – int

           N’Unknown:’

           + Cast(@CustomerID AS NVARCHAR(10)),— Customer – nvarchar(100)

           N”,— Bill To Customer – nvarchar(100)

           N”,— Category – nvarchar(50)

           N”,— Buying Group – nvarchar(50)

           N”,— Primary Contact – nvarchar(50)

           N”,— Postal Code – nvarchar(10)

           Sysdatetime(),— Valid From – datetime2(7)

           ‘9999-12-31 23:59:59.9999999’,— Valid To – datetime2(7)

           2 — Lineage Key – int

  ) 

Advertisements

3 thoughts on “Handling Late-Arriving Members in SSIS

  1. Pingback: Handling Late Arrivals In SSIS – Curated SQL

  2. Will there be any performance issue since here we are handling row by row basis using OLEDB Command component for inserting missing member. Your thought on this

    Like

    • There are obviously going to be some performance hits if you have a lot of late-arriving members. The idea is that the first lookup is fully cached. The lookup for the newly-inserted members is partially cached. You’ll still want a decent index to avoid inserting duplicate rows in the “execute SQL” task, but ideally your main dimension load catches most, if not all, members. This is a way to ensure that you can catch anything that was created between the dimension load and the fact load.

      It’s probably worth noting that you need some way to handle updating those inferred members later. I have usually added a step in the Dimension load for that. If you find that you get a _lot_ of inferred members that are never updating, you’ll probably want to clear those out and have a talk with your developers about why there are so many unmatched members.

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s