Generating Web.Sitemap from Tridion (SiteMapDataSource)

In the spirit of Nuno’s post last week (http://nunolinhares.blogspot.com/2012/01/its-little-things-creating-page.html) on website navigation, and more specifically on how to generate a breadcrumb from Structure Groups in Tridion, I thought it was time to publish a related post (and a TBB) I’ve been baking for a while now.  Behold: hooking up Tridion to .NET’s standard navigation controls.  I’m talking about Menu, SiteMapPath, and TreeView.  It turns out they can all feed off the same, rather basic, XML file called Web.sitemap.  Here is the C# code for a generic TBB that creates such a file from your SDL Tridion Structure Group and Page hierarchy.

Here is the basic structure of a .sitemap file:

<?xml version="1.0" encoding="utf-8" ?>
<siteMap xmlns="http://schemas.microsoft.com/AspNet/SiteMap-File-1.0">
  <siteMapNode title="Movies Homepage" url="/" >
    <siteMapNode title="Action" url="/action" …other attributes… >
      <siteMapNode title="MI4" url="mi4.aspx" …other attributes… />
    </siteMapNode>
    <siteMapNode title="Thrillers" url="/thrillers" …other attributes…/>
    <siteMapNode … />
  </siteMapNode>
</siteMap>

Naturally, in Tridion (as on almost any website) the Structure Groups are organized in much the same way, so it shouldn’t be too hard to generate one of these files by recursively traversing the SGs and Pages.  And writing a recursive algorithm is always fun.
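To make the recursion concrete before bringing Tridion into it, here is a minimal sketch that builds the same XML shape from a plain in-memory tree.  NavItem and SitemapSketch are hypothetical names made up for this illustration; they are not Tridion types, and the real TBB reads Structure Groups and Pages instead.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical stand-in for a Structure Group or Page; not a Tridion type.
public class NavItem
{
    public string Title;
    public string Url;
    public List<NavItem> Children = new List<NavItem>();

    public NavItem(string title, string url)
    {
        Title = title;
        Url = url;
    }
}

public static class SitemapSketch
{
    private const string Xmlns = "http://schemas.microsoft.com/AspNet/SiteMap-File-1.0";

    // Creates <siteMap> with a single root <siteMapNode>, then recurses into the children.
    public static XmlDocument Build(NavItem root)
    {
        XmlDocument doc = new XmlDocument();
        doc.AppendChild(doc.CreateXmlDeclaration("1.0", "utf-8", null));
        XmlElement siteMap = doc.CreateElement("siteMap", Xmlns);
        doc.AppendChild(siteMap);
        AppendNode(doc, siteMap, root);
        return doc;
    }

    // Adds one siteMapNode for the item, then one nested node per child.
    private static void AppendNode(XmlDocument doc, XmlElement parent, NavItem item)
    {
        XmlElement node = doc.CreateElement("siteMapNode", Xmlns);
        node.SetAttribute("title", item.Title);
        node.SetAttribute("url", item.Url);
        parent.AppendChild(node);
        foreach (NavItem child in item.Children)
        {
            AppendNode(doc, node, child);
        }
    }
}
```

Calling SitemapSketch.Build on a "Movies Homepage" item with an "Action" child should give one root siteMapNode with the child nested inside it, which is the structure the navigation controls expect.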

So here is a TBB that you can plug into a new Page Template; it generates a Web.sitemap file from your Structure Groups, ready to be published.

I do want to mention that if you have some funky business rules and handle them through Page or SG metadata fields (most implementations do), you can add a method to this code that gets the relevant Page or SG (e.g. engine.GetObject(tcmId); ), extracts the field value(s) and appends them as attributes to the siteMapNodes.
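As a rough sketch of that last step, the helper below copies a set of name/value pairs onto a siteMapNode as attributes.  The dictionary is an assumption standing in for whatever you read from the Page or SG metadata; SiteMapNodeMetadata and AppendAsAttributes are made-up names, and reading the actual field values from Tridion is left out.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class SiteMapNodeMetadata
{
    // Copies each non-empty metadata value onto the node as an attribute.
    // 'metadata' is a hypothetical stand-in for values read from a Page or SG.
    public static void AppendAsAttributes(XmlElement node, IDictionary<string, string> metadata)
    {
        foreach (KeyValuePair<string, string> field in metadata)
        {
            if (string.IsNullOrEmpty(field.Value))
            {
                continue; // skip empty fields rather than emit empty attributes
            }
            // EncodeLocalName escapes characters that are not legal in an XML name.
            node.SetAttribute(XmlConvert.EncodeLocalName(field.Key), field.Value);
        }
    }
}
```

On the ASP.NET side, custom attributes on a siteMapNode should be readable through the SiteMapNode indexer (for example SiteMap.CurrentNode["yourAttribute"]), so the values can drive whatever business rule you need.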

Just copy/paste the following code into a new class in your TBB project assembly.  Happy Tridionizing y’all!

using System.Xml;
using Tridion.ContentManager.Templating;
using Tridion.ContentManager;
using Tridion.ContentManager.CommunicationManagement;
using Tridion.ContentManager.Templating.Assembly;
using System;

namespace ContentBloom.Templates
{
    /// <summary>
    /// This SDL Tridion building block recursively iterates over the
    /// Structure Groups of a publication and generates an ASP.NET Web.sitemap
    /// file which can be plugged in as a DataSource into a TreeView, Breadcrumb
    /// and other .NET navigation User Controls.
    /// </summary>
    [TcmTemplateTitle("Generate Web.sitemap")]
    public class GenerateWebSitemap : ITemplate
    {
        private Engine engine;
        private Package package;
        private static readonly TemplatingLogger log = TemplatingLogger.GetLogger(typeof(GenerateWebSitemap));

        private static string xmlns = "http://schemas.microsoft.com/AspNet/SiteMap-File-1.0";

        public void Transform(Engine engine, Package package)
        {
            //Initialize
            this.engine = engine;
            this.package = package;

            Page page = null;
            Item pageItem = package.GetByType(ContentType.Page);
            if (pageItem != null)
            {
                page = engine.GetObject(pageItem.GetAsSource().GetValue("ID")) as Page;
            }
            else
            {
                throw new InvalidOperationException("No Page found.  Verify that this template is used with a Page.");
            }
            Publication mainPub = page.ContextRepository as Publication;
            StructureGroup rootSg = mainPub.RootStructureGroup;
            XmlDocument sitemapDoc = GenerateSitemap(rootSg);

            package.PushItem(Package.OutputName, package.CreateXmlDocumentItem(ContentType.Xml, sitemapDoc));
        }

        private XmlDocument GenerateSitemap(StructureGroup sg, XmlElement parent=null, XmlDocument doc=null)
        {
            if (doc == null)
            {
                //create base doc
                doc = new XmlDocument();
                XmlDeclaration xmlDeclaration = doc.CreateXmlDeclaration("1.0", "utf-8", null);
                doc.InsertBefore(xmlDeclaration, doc.DocumentElement);
            }
            if (parent == null)
            {
                //create root sitemap element and append to the doc.
                XmlElement root = doc.CreateElement("siteMap", xmlns);
                doc.AppendChild(root);

                parent = doc.CreateElement("siteMapNode", xmlns);
                root.AppendChild(parent);
                parent.SetAttribute("url", sg.PublishLocationUrl);
                parent.SetAttribute("title", RemoveNumberPrefix(sg.Title));
            }

            // Get the list of child structure groups and pages in this SG
            Filter filter = new Filter();
            filter.Conditions["ItemType"] = 68; // bit mask: ItemType.StructureGroup (4) + ItemType.Page (64)
            filter.BaseColumns = ListBaseColumns.Extended;

            XmlElement childItems = sg.GetListItems(filter);
            foreach (XmlElement item in childItems.SelectNodes("*"))
            {
                string isPublished = item.GetAttribute("IsPublished");
                string itemType = item.GetAttribute("Type");
                string itemId = item.GetAttribute("ID");
                if (int.Parse(itemType) == (int)ItemType.Page)
                {
                    //only include published pages.  we also exclude index pages
                    //because the parent structure group's URL will default to them.
                    if (bool.Parse(isPublished))
                    {
                        Page childPage = engine.GetObject(itemId) as Page;
                        if (!ShouldBeExcluded(childPage))
                        {
                            XmlElement pageElem = doc.CreateElement("siteMapNode", xmlns);
                            parent.AppendChild(pageElem);
                            pageElem.SetAttribute("url", childPage.PublishLocationUrl);
                            pageElem.SetAttribute("title", RemoveNumberPrefix(childPage.Title));
                        }
                    }
                }
                else
                {
                    log.Debug("sgId=" + itemId);
                    //if it's a structure group, then
                    //get the object, create a child sitemap node
                    StructureGroup childSg = engine.GetObject(itemId) as StructureGroup;
                    if (!ShouldBeExcluded(childSg))
                    {
                        XmlElement sgElem = doc.CreateElement("siteMapNode", xmlns);
                        parent.AppendChild(sgElem);
                        sgElem.SetAttribute("url", childSg.PublishLocationUrl);
                        sgElem.SetAttribute("title", RemoveNumberPrefix(childSg.Title));

                        GenerateSitemap(childSg, sgElem, doc);
                    }
                }
            }
            return doc;
        }

        /// <summary>
        /// Pages and Structure Groups are often named with a numeric prefix to help sort
        /// them in a particular order.  This prefix does not need to be included in the
        /// web.sitemap file, so we strip it out.
        /// </summary>
        /// <param name="p"></param>
        /// <returns></returns>
        private string RemoveNumberPrefix(string p)
        {
            string NUMBER_PREFIX_SEPARATOR = "_";
            return p.Substring(p.IndexOf(NUMBER_PREFIX_SEPARATOR) + 1);
        }

        /// <summary>
        /// This function checks if the page should be
        /// excluded from the sitemap. For example, it makes sense to filter
        /// out the index pages, since their parent structure group's url
        /// will point to them, and we don't want to include the index page
        /// twice.
        ///
        /// We also don't want to include user controls, i.e. ascx pages,
        /// so we'll filter those out too.
        /// </summary>
        /// <param name="page"></param>
        /// <returns></returns>
        private bool ShouldBeExcluded(Page page)
        {
            string url = page.PublishLocationUrl;
            string title = page.Title;

            //filter out user controls
            if (url.EndsWith("ascx") || url.EndsWith("sitemap"))
            {
                return true;
            }

            //filter out if index page
            string[] indexPageNames = new string[] {"index.html", "index.htm", "index.aspx", "default.aspx"};
            foreach (string indexName in indexPageNames)
                if (url.ToLower().EndsWith(indexName))
                {
                    return true;
                }

            return false;
        }

        /// <summary>
        /// It may make sense to exclude any structure groups that don't contain any pages,
        /// either directly or in their child structure groups.
        /// </summary>
        /// <param name="sg"></param>
        /// <returns></returns>
        private bool ShouldBeExcluded(StructureGroup sg)
        {
            Filter filter = new Filter();
            // Count pages (not structure groups); otherwise a leaf SG that only
            // contains pages would be dropped from the sitemap.
            filter.Conditions["ItemType"] = ItemType.Page;
            filter.Conditions["Recursive"] = true;
            filter.BaseColumns = ListBaseColumns.Id;
            int pages = sg.GetListItems(filter).SelectNodes("*").Count;
            return pages == 0;
        }
    }
}
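One thing to watch for: RemoveNumberPrefix above cuts everything up to the first underscore, so a title like "About_Us" with no numeric prefix would come out as "Us".  If that bites you, here is a stricter variant as a sketch.  It assumes your prefixes are digits followed by a single underscore (e.g. "010_About_Us"); TitleCleaner is a made-up class name, so adjust the pattern and naming to your own conventions.

```csharp
using System.Text.RegularExpressions;

public static class TitleCleaner
{
    // Matches one or more leading digits followed by an underscore, e.g. "010_".
    private static readonly Regex NumberPrefix = new Regex(@"^\d+_", RegexOptions.Compiled);

    // Strips the numeric sort prefix if present; other titles are returned unchanged.
    public static string RemoveNumberPrefix(string title)
    {
        if (string.IsNullOrEmpty(title))
        {
            return title;
        }
        return NumberPrefix.Replace(title, string.Empty);
    }
}
```

With this version "010_About_Us" becomes "About_Us", while "About_Us" stays exactly as it is.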

14 thoughts on “Generating Web.Sitemap from Tridion (SiteMapDataSource)”

  1. Great example and definitely on my list of things to try and break! :-)

    A nice feature on ASP.NET sitemaps is the ability to include references to other sitemap files. See http://msdn.microsoft.com/en-us/library/ie/ms178426.aspx.

    For example, one of the nodes could be:
    <siteMapNode siteMapFile="child.sitemap" />

    For organizations that are doing a phased or split implementation, this could be a way to have all the more content-heavy parts of the site (e.g. “AboutUs”) completely managed by typical Tridion published pages, but keep the more dynamic parts created using .NET with the content delivery APIs.

  2. That’s a great suggestion Alvin. I like the idea of using several sitemap files. The code above can be extended to read a Structure Group ID from the page metadata so that it generates the sitemap file starting from that node.

    It’s also possible to include the standard .NET navigation controls inside the Dreamweaver templates so that Tridion-managed pages also make use of this. Whatever you’d normally put in the code-behind of a page (such as Page_Load()), just put inside a

  3. I would like to know how I can create the same sitemap if I am designing a Java website.
    What are the steps to follow?
    1. What is the DLL file to be invoked?
    2. Is the C# code itself the DLL file, or what is it?
    3. Are there any simpler steps so I could proceed?

    I am going to develop a dynamic website based on JSP and wanted your help with this.

  4. Hi Raj,
    Tridion building blocks are developed using C#.NET. TBBs are part of a Compound Component Template or a Compound Page Template used to render either a component or a page, and it doesn’t matter whether the markup that is rendered is Java or .NET (by the same principle you can render plain HTML, PHP or any other markup you choose). I do suggest participating in the SDL Tridion training to gain a better understanding of creating Tridion templates. I also highly recommend getting very intimate with the SDL Tridion Templating Manual and the SDL Tridion Content Management Implementation Manual.

    You can certainly follow the same design pattern to create an XML file in a different format, one that your JSP or custom taglib can consume. Have a look at the code I’ve published; you’ll see that it’s just a matter of massaging the XML results returned by Tridion into any other format you choose.

    Regarding your questions:
    1) The DLL invoked would be a TBB of type .NET Assembly. You will also need a TBB of type C# fragment that references the class within the .NET assembly. Note: if you use the TCMUploadAssembly.exe it will create all the required TBBs. Refer to the Templating Manual for a detailed description of how to set this up.
    2) The C# TBB that generates the XML which gets published to the delivery server is part of a Page Template. As I mentioned in my response to #1, the DLL is simply a collection of TBB classes.
    3) Publishing a web.sitemap file does not require a dynamic template. The next Tridion Community Webinar on March 14th will have an in-depth discussion of various navigation approaches in Tridion. Join us and have a listen. Here is a link for more details: http://www.julianwraith.com/sdl-tridion-community-webinars/community-webinar-14th-march-2012

  5. Thanks Nickoli Roussakov,
    1. I added your C# code to my C# project as you said, then added all the Tridion reference DLLs.
    2. I was able to compile it successfully in Visual C#, but I didn’t find any DLL file in my debug folder. Does this only work with a higher version? I was using Visual Studio 2005; is that the reason?

  6. hi Nickoli Roussakov

    I generated the sitemap using the sample; now I am stuck on removing some Structure Group names which are for CSS, images and other stuff.

    I created a parameter schema for the folders to exclude. When running the Page Template in Template Builder, I add the values I need to remove, say css, images and so on.

    Now how do I exclude them using my C# code in Tridion? Any examples?

  7. Good question Raj. There are a couple of things you can do to filter out SGs and/or pages.

    You can filter them out by name. For example, if the SG has title that starts with an ‘_’ then don’t add that XML node.

    Another way you can do this is by adding a field to the Structure Group metadata which would serve as an indicator, eg “show in Sitemap” (true/false). In the code where you have the StructureGroup object read the meta field value (default to true) to determine if the SiteMap node should be created or not.

  8. This is my code below
    1. I have added a parameter schema called “exclude”.
    2. I have text field called “exclude_folders”.
    3. Now I have entered the text in my page template to remove these structure groups.
    say exclude_folder = css,images;

    assets,system,test

    4. Now I am getting my fields from the package in my C# code using:
    package.GetByName("exclude_folders");

    5. Here I would like your help: how do I get the list of exclude_folders and not include them in my sitemap.xml?

    code:
    using System.Xml;
    using Tridion.ContentManager.Templating;
    using Tridion.ContentManager;
    using Tridion.ContentManager.CommunicationManagement;
    using Tridion.ContentManager.Templating.Assembly;
    using System;
    using System.Collections;

    namespace Navigation
    {
        [TcmTemplateTitle("SiteMap")]
        public class SiteMap : ITemplate
        {
            private Engine engine;
            private Package package;
            private static readonly TemplatingLogger log = TemplatingLogger.GetLogger(typeof(SiteMap));

            private static string xmlns = "http://schemas.microsoft.com/Jsp/SiteMap-File-1.0";

            public void Transform(Engine engine, Package package)
            {
                this.engine = engine;
                this.package = package;

                Page page = null;

                Item pageItem = package.GetByType(ContentType.Page);

                if (pageItem != null)
                {
                    page = engine.GetObject(pageItem.GetAsSource().GetValue("ID")) as Page;
                }
                else
                {
                    throw new InvalidOperationException("No Page found. Verify that this template is used with a Page.");
                }
                Publication mainPub = page.ContextRepository as Publication;
                StructureGroup rootSg = mainPub.RootStructureGroup;
                TcmUri tcmObj = new TcmUri(pageItem.GetValue("ID"));

                XmlDocument sitemapDoc = GenerateSitemap(rootSg);

                package.PushItem(Package.OutputName, package.CreateXmlDocumentItem(ContentType.Xml, sitemapDoc));
                package.GetByName("exclude_folders");
            }

            private XmlDocument GenerateSitemap(StructureGroup sg, XmlElement parent = null, XmlDocument doc = null)
            {
                if (doc == null)
                {
                    doc = new XmlDocument();
                    XmlDeclaration xmlDeclaration = doc.CreateXmlDeclaration("1.0", "utf-8", null);
                    doc.InsertBefore(xmlDeclaration, doc.DocumentElement);
                }
                if (parent == null)
                {
                    XmlElement root = doc.CreateElement("siteMap", xmlns);
                    doc.AppendChild(root);
                    parent = doc.CreateElement("siteMapNode", xmlns);
                    root.AppendChild(parent);
                    parent.SetAttribute("path", sg.Path);
                    parent.SetAttribute("url", sg.PublishLocationUrl);
                    parent.SetAttribute("title", RemoveNumberPrefix(sg.Title));
                    parent.SetAttribute("tcm:uri", (sg.Id));
                }

                Filter filter = new Filter();
                filter.Conditions["ItemType"] = 68;

                filter.BaseColumns = ListBaseColumns.Extended;

                XmlElement childItems = sg.GetListItems(filter);
                foreach (XmlElement item in childItems.SelectNodes("*"))
                {
                    string itemType = item.GetAttribute("Type");
                    string itemId = item.GetAttribute("ID");

                    if (int.Parse(itemType) == (int)ItemType.Page)
                    {
                        Page childPage = engine.GetObject(itemId) as Page;

                        XmlElement pageElem = doc.CreateElement("siteMapNode", xmlns);
                        parent.AppendChild(pageElem);
                        pageElem.SetAttribute("path", childPage.Path);
                        pageElem.SetAttribute("url", childPage.PublishLocationUrl);
                        pageElem.SetAttribute("title", RemoveNumberPrefix(childPage.Title));
                        pageElem.SetAttribute("tcm:uri", childPage.Id);
                    }
                    else
                    {
                        log.Debug("sgId=" + itemId);

                        StructureGroup childSg = engine.GetObject(itemId) as StructureGroup;

                        XmlElement sgElem = doc.CreateElement("siteMapNode", xmlns);
                        parent.AppendChild(sgElem);
                        sgElem.SetAttribute("path", childSg.Path);
                        sgElem.SetAttribute("url", childSg.PublishLocationUrl);
                        sgElem.SetAttribute("title", RemoveNumberPrefix(childSg.Title));
                        sgElem.SetAttribute("tcm:uri", childSg.Id);
                        GenerateSitemap(childSg, sgElem, doc);
                    }
                }
                return doc;
            }

            private string RemoveNumberPrefix(string p)
            {
                string NUMBER_PREFIX_SEPARATOR = "_";
                return p.Substring(p.IndexOf(NUMBER_PREFIX_SEPARATOR) + 1);
            }
        }
    }

  9. I am not sure I follow the approach you’re describing. If you added the “exclude” parameter schema to this TBB, you still need a way to determine which SG has the items to exclude. Also, multimedia components (i.e. images) would not show up as pages in Structure Groups. They get published to a specified Structure Group as binaries. You could publish CSS files as pages, but definitely not images. Either way, you need to determine if a given structure group contains css and/or images, and the best way to do that is to set an indicator on the structure group itself as described in my previous reply.

  10. I used your TBB and the output is exactly what I need. However, I have a question. What is the best way of writing the output (the sitemap XML) to the Web.sitemap file on the publication? Right now I have made a custom deployer to do this. Is there any other way?

  11. Hi IonutC,
    This TBB is meant to be used in a Page Template, and therefore used to render a page. What you do is create a Page Template with the file extension .sitemap. In this PT add this TBB. Then in your (root) structure group create a Page that uses this PT. Give the page’s filename as “web” and the template will take care of the “.sitemap” extension. Now simply publish the page.

  12. Hi Nickoli,

    Thanks, your answer really helped me a lot, not only for the sitemap, but for other functionalities also. Thanks again.

  13. Hi Nickoli,

    I have completed the entire website using your sample examples. I would like to know whether we can split the sitemap into two depending on page metadata. For example, I have two cars, 1. Rolls-Royce and 2. Mercedes-Benz, and right now both have the same sitemap.
    In the Rolls-Royce navigation I have
    1. Speed
    2. Controls
    3. GPS
    4. Safety sensors
    In the Mercedes-Benz navigation I have
    1. Speed
    2. Controls
    3. Safety sensors
    Can I use the same sitemap for this by having page metadata at page level, where I tick a checkbox for the items I want for each of the two?
    If we can, can you please tell me how to do this? Do we need to write the .NET code based on these choices?

  14. Hi Raj,
    Yes, you can do this. ASP.NET lets you have multiple site map data sources. You can add a checkbox to your page metadata, or use any other metadata values. Then in the sitemap TBB, you’ll need to extract the values for those fields using the ItemFields API and apply your logic to the sitemap generation based on the values.
