When performing a migration or integration of data, one consideration that should be high on your list is translation performance. Perhaps you’re looking to translate just a small amount of data (hundreds, as opposed to thousands of records) and the impact on the systems involved will be minimal. In this case, typically little attention is given to the stress on the systems and the length of time needed to translate the entire record set.

However, when the translation involves thousands or hundreds of thousands of records, serious consideration should be given to when the translation should occur, the volume of records being translated, the medium and complexity of the source record set, and the connection method and complexity of the target system. Typically the target system is the bottleneck during translation. Many factors contribute to this, including indexes on the target tables, look-ups on non-indexed fields (typically to find matching records and prevent duplicate data), the medium the translation passes through (such as an application API), and any workflows that fire on the records being translated. This is just a short list of things that can significantly slow the translation process.

What I want to talk about today is not necessarily the myriad of ways the target can be configured that may help or hurt the translation, but rather the processing of multiple records simultaneously as opposed to a single-threaded process. Because of my strong background with Scribe Insight, I’ll speak to this using the Scribe process as an example.

Let’s say you’re translating a .csv file as your source data to Microsoft Dynamics CRM as your target. The translation is to insert new accounts or update existing accounts. The Scribe mapping document would then use an ODBC connection to reach the .csv file and map it as the source data. Typically you would use the Scribe adapter for Dynamics™ CRM to connect to Dynamics™ CRM as the target. The adapter works through the Dynamics CRM API and will connect to any of the deployments available with Dynamics CRM (on-premises, IFD, online). In this configuration, when the Scribe mapping file is run, it will query and cache the entire source record set, and then translate one record at a time to the target through the Scribe Dynamics CRM adapter. Depending on how you’ve built the mapping and what types of look-ups you are performing (and how many), this method can be somewhat laborious.
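
To make the single-threaded flow concrete, here is a minimal Python sketch of the pattern described above: cache the whole source, then look up and insert or update one record at a time. It is not Scribe's implementation or the Dynamics CRM API. The FakeTarget class, the account_number key and the sample data are all hypothetical stand-ins chosen for illustration.

```python
import csv
import io


class FakeTarget:
    """Hypothetical in-memory stand-in for the target system (not the CRM API)."""

    def __init__(self):
        self.accounts = {}  # keyed by the assumed matching field, account_number

    def find(self, account_number):
        return self.accounts.get(account_number)

    def insert(self, record):
        self.accounts[record["account_number"]] = dict(record)

    def update(self, record):
        self.accounts[record["account_number"]].update(record)


def translate_sequentially(csv_text, target):
    """Cache the full source record set, then upsert one record at a time."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))  # whole source is read first
    inserted = updated = 0
    for row in rows:
        # One look-up per record, followed by an insert or an update.
        if target.find(row["account_number"]) is None:
            target.insert(row)
            inserted += 1
        else:
            target.update(row)
            updated += 1
    return inserted, updated


# Assumed sample data: the third row matches the first, so it becomes an update.
SAMPLE_CSV = "account_number,name\nA1,Contoso\nA2,Fabrikam\nA1,Contoso Ltd\n"

if __name__ == "__main__":
    print(translate_sequentially(SAMPLE_CSV, FakeTarget()))
```

Every record waits for the one before it to finish, so the total run time grows with the number of records multiplied by the cost of each look-up and write.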

You can, however, convert this single-threaded translation method into a multi-threaded translation very easily with Scribe Insight. The trick is to use a Scribe Query Publisher to break up your source data into separate XML documents. Scribe will automatically drop those XML documents into MSMQ. A Queue Based integration process can then be used to translate the XML data to Dynamics CRM instead of the .csv connection method. The Scribe Queue Based integration process can be assigned multiple threads, allowing multiple XML documents (records) to be translated simultaneously. Out of the box, you can assign seven threads, which can make the translation up to roughly seven times faster than the single-threaded .csv source method, provided the target system can keep up with the added concurrency.
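
The same idea can be sketched outside of Scribe. The Python example below is only an illustration of the publish-then-consume pattern, not Scribe's or MSMQ's API: a publisher splits the source rows into one XML message per record, places them on an in-process queue, and seven worker threads drain it. The function names, the simulated latency and the dictionary standing in for the target are all assumptions.

```python
import queue
import threading
import time
import xml.etree.ElementTree as ET

WORKER_COUNT = 7          # mirrors the seven threads mentioned above
SIMULATED_LATENCY = 0.01  # assumed per-record round trip to the target, in seconds


def publish(rows, message_queue):
    """Split the source rows into one XML message per record."""
    for row in rows:
        root = ET.Element("account")
        for field, value in row.items():
            ET.SubElement(root, field).text = value
        message_queue.put(ET.tostring(root, encoding="unicode"))


def worker(message_queue, store, lock):
    """Consume XML messages until a None sentinel arrives."""
    while True:
        message = message_queue.get()
        if message is None:
            return
        record = {child.tag: child.text for child in ET.fromstring(message)}
        time.sleep(SIMULATED_LATENCY)  # stand-in for the call to the target system
        with lock:
            store[record["account_number"]] = record  # insert or overwrite


def translate_with_workers(rows, worker_count=WORKER_COUNT):
    message_queue = queue.Queue()
    store = {}  # hypothetical stand-in for the target system
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(message_queue, store, lock))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    publish(rows, message_queue)
    for _ in threads:
        message_queue.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return store


if __name__ == "__main__":
    rows = [{"account_number": f"A{i}", "name": f"Account {i}"} for i in range(70)]
    started = time.perf_counter()
    result = translate_with_workers(rows)
    print(len(result), "records in", round(time.perf_counter() - started, 2), "seconds")
```

One design point to keep in mind with any multi-threaded approach: records are no longer guaranteed to reach the target in source order, so two messages that touch the same account may be applied in either order.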

If you are translating a large volume of records, a multi-threaded process can save you hours of translation time and greatly lessen the impact on the systems involved. This is definitely something you should consider.
