Migrating gigabytes of binaries into SDL Web 8


I recently worked with a small team within Content Bloom to migrate over 200 gigabytes of binary files into SDL Web. It was 244 GB, to be exact ☺

It was quite a learning experience and I wanted to share some of that learning.

The Tool

To get these files into SDL Web, we wrote a .NET console application that uses the SDL CoreService API. I'm pretty sure most people reading this blog have written a little app like this before to perform certain Tridion functions. If you haven't, there's a lot of information out there on how to get started.

There was, however, a little gotcha when using the CoreService API in Web 8: I now needed to provide an AccessToken to the UploadBinaryContent method. This is documented in the release docs here.

It gave me some grief at first, but thanks to some help on Stack Exchange, I managed to get up and running.

Here are the issues I encountered in getting used to the new AccessToken; if you're new to it, you will certainly find them useful.

Thanks, SDL community; please do upvote those fine answers.

Dealing with the files

The files we're importing are stored in a directory structure; they were exported from the Documentum system.

Initially we started working against the Content Manager machine remotely (the CM is in the Amazon cloud), but I found that the speed of uploading via the CoreService was slowing down my progress. Pushing 200 GB over the CoreService remotely is never going to be ideal.

From a time perspective, take note that unzipping thousands and thousands of images at this volume can take over a day. If you then need to get them onto your CMS server, you can easily lose a couple of days just getting them there to be processed; a good tip if you're planning your time.
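To put that planning point in rough numbers, here is a back-of-the-envelope calculation. This is only a sketch: the 244 GB total comes from this migration, but the throughput figure is an illustrative assumption, not a measurement from our environment.

```python
def transfer_hours(size_gb: float, throughput_mb_s: float) -> float:
    """Rough time in hours to move size_gb of data at throughput_mb_s."""
    size_mb = size_gb * 1024  # GB -> MB
    seconds = size_mb / throughput_mb_s
    return seconds / 3600

# Copying 244 GB to a cloud-hosted CM at an assumed sustained 10 MB/s
# already eats most of a working day, before any unzipping starts.
print(round(transfer_hours(244, 10), 1))  # roughly 6.9 hours
```

Halve the assumed throughput and the copy alone crosses into a second day, which is why staging the files on the CM server early is worth building into the plan.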

Importing

Now that we've moved our little tool and all of our files onto the actual CM server, we can import at speed. Another great tip here is to use the net.tcp endpoint rather than wsHttp; it's much faster and doesn't seem to drop connections as often.
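For reference, the two CoreService endpoints typically look something like the below. The host name, port, and service version are illustrative; check the CoreService configuration on your own CM server for the exact addresses.

```
http://YOUR-CM-SERVER/webservices/CoreService201603.svc/wsHttp
net.tcp://YOUR-CM-SERVER:2660/CoreService/201603/netTcp
```

Since the tool is now running on the CM server itself, net.tcp also avoids going through IIS for every upload call.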

So now we're running, creating thousands of components every couple of hours; life is great... but not so fast: we started seeing some weird errors in the log files.

The first one is:

Unable to save Component (tcm:0-0-0). The transaction has aborted.

To resolve this, we went into the Tridion_CM database (this is on Microsoft SQL Server) and ran the following query:

SELECT 1 FROM BINARIES WHERE ID = -1 AND CONTENT IS NULL

Then sometimes, we would see this issue:

The timeout period elapsed prior to completion of the operation or the server is not responding.
 A database error occurred while executing Stored Procedure "EDA_ITEMS_UPDATEBINARYCONTENT".

To solve this one, you simply need to run sp_updatestats against your database.
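Running it is a one-liner in SQL Server Management Studio (the database name below assumes the default Tridion_cm install; adjust to your own):

```sql
USE Tridion_cm;
EXEC sp_updatestats;  -- refresh query-optimizer statistics on all tables
```

With thousands of binaries being inserted per hour, the table statistics go stale quickly, which is a plausible reason the stored procedure starts timing out until they are refreshed.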

Given my volume and the amount of hammering on the database, I scheduled both of my DB fixes to run three times a day, and this solved any issues with the uploading.

Hope these tips help!

2 thoughts on “Migrating gigabytes of binaries into SDL Web 8”

  1. Hi John,

    Great to see you sharing these experiences. I just wondered if you had any insights into the mechanism behind that ‘select from binaries’ fix. How does that work?

    Cheers
    Dominic

  2. hi John, i had similar experience when uploading large binary files, database often goes nuts and have to run sp_updatestats as a part of the process within a timespan. also coreservice token expired and file name conflict… etc many fun work.

    Lucas
