I recently worked with a small team within Content Bloom to migrate over 200 gigabytes of binary files into SDL Web. Actually it was 244 Gigs to be exact ☺
It was quite a learning experience and I wanted to share some of that learning.
To get these files into SDL Web, we wrote a .NET console application that uses the SDL CoreService API. I’m pretty sure most people reading this blog have created a little app before to perform certain Tridion functions. If you haven’t, there’s a lot of information out there on how to get started.
There was, however, a little gotcha when using the CoreService API in Web 8: I now needed to provide an AccessToken to the UploadBinaryContent method. This is documented in the release docs here.
It gave me some grief at first, but thanks to some help on Stack Exchange, I managed to get up and running.
Here are the issues I ran into while getting used to the new AccessToken. If you’re new to it, you will certainly find them useful.
- Coreservice : Upload Binary File (Couldn’t upload the binary file Provided access token has expired)
- AccessToken – configure time?
Thanks, SDL community! Please do upvote those fine answers.
Dealing with the files
The files that we’re importing are stored in a directory structure; these have been exported from the Documentum system.
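Before kicking off an import of this size, it helps to know exactly how many files there are and how big they are in total. Here is a minimal sketch in Python (not part of our .NET tool) that walks an export directory and totals it up. The function name `inventory` and the idea of running it against the export folder are my own illustration, and it assumes every file under the root is readable.

```python
import os


def inventory(root):
    """Walk `root` recursively and return (file_count, total_bytes)."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            file_count += 1
            # getsize raises OSError for unreadable files or broken links;
            # this sketch assumes a clean export and lets that surface.
            total_bytes += os.path.getsize(path)
    return file_count, total_bytes
```

Calling `inventory` on the root of the export gives you a file count and a byte total that you can compare against what actually lands in the CMS once the import finishes.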
Initially we started working against the Content Manager machine remotely (the CM is in the Amazon Cloud), but I was finding that the upload speed via CoreService was slowing down my progress. 200 gigs over CoreService remotely is never going to be ideal.
From a time perspective, take note that unzipping thousands and thousands of images of this size can take over a day. If you then need to get them onto your CMS server, you can easily lose a couple of days getting them there to be processed… a good tip if you’re planning your time.
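If you want a rough number for the "getting them there" part, a back-of-the-envelope calculation goes a long way. The sketch below is only an illustration in Python: the function `transfer_hours`, the link speed you pass in, and the 70% efficiency factor are all assumptions of mine, not measurements from this project.

```python
def transfer_hours(size_gb, link_mbit_per_s, efficiency=0.7):
    """Estimate hours needed to move `size_gb` gigabytes over a link.

    `efficiency` is a guessed fraction of the nominal link speed that is
    actually usable (protocol overhead, contention and so on).
    """
    total_bits = size_gb * 8 * 1000 ** 3           # decimal gigabytes to bits
    usable_bits_per_s = link_mbit_per_s * 1_000_000 * efficiency
    return total_bits / usable_bits_per_s / 3600   # seconds to hours
```

For example, `transfer_hours(244, 50)` estimates how long 244 gigs would take over a hypothetical 50 Mbit/s connection. Plug in your own link speed and add the unzip time on top.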
Now that we’ve moved our little tool and all of our files onto the actual CME server, we can import at speed. Another great tip here is to use the net.tcp endpoint rather than the wsHttp one; it’s much faster and doesn’t seem to drop out as much.
So now that we’re up and running, we’re creating thousands of components every couple of hours and life is great... but not so fast: we’re seeing some weird errors in the log files.
The first one is:
Unable to save Component (tcm:0-0-0). The transaction has aborted.
To resolve this, we went into the Tridion_CM database (note: this is MS SQL Server) and ran the following query:
SELECT 1 FROM BINARIES WHERE ID = -1 AND CONTENT IS NULL
Then sometimes, we would see this issue:
The timeout period elapsed prior to completion of the operation or the server is not responding. A database error occurred while executing Stored Procedure "EDA_ITEMS_UPDATEBINARYCONTENT".EDA_ITEMS_UPDATEBINARYCONTENT
To solve this, you simply need to run sp_updatestats in your database.
Given my volume and the amount of hammering on the database, I scheduled both of my DB fixes to run three times a day, and this solved the issues with uploading.
Hope these tips help!