Content indexes

A content index is an index of the page content on a solution – typically all pages, paragraphs, and item instances. This type of index is built by enumerating all available pages and then indexing all active paragraphs and items associated with each page. A content index is typically built to make it possible to search for content on a website.
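The enumeration described above can be pictured with a minimal sketch. This is not the ContentIndexBuilder itself: the `Page` and `Paragraph` classes, the `build_content_documents` function, and the document keys are hypothetical names chosen for illustration, and items are left out to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Paragraph:
    header: str
    text: str
    active: bool = True


@dataclass
class Page:
    page_id: int
    name: str
    paragraphs: list = field(default_factory=list)


def build_content_documents(pages):
    """Return one index document per page, skipping inactive paragraphs."""
    documents = []
    for page in pages:
        # Only active paragraphs contribute content to the document.
        active = [p for p in page.paragraphs if p.active]
        documents.append({
            "PageID": page.page_id,          # illustrative key names
            "PageName": page.name,
            "ParagraphHeaders": [p.header for p in active],
            "ParagraphTexts": [p.text for p in active],
        })
    return documents
```

Each page becomes one document, and an inactive paragraph simply never reaches the index, so it cannot be found by a search.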

To create a content index:

  1. Go to Settings > Repositories and open/create a repository
  2. Under the Indexes section, click Manage
  3. Click New index
  4. Provide a Name for the index
  5. Select a Balancer
    • Dynamicweb.indexing.balancing.ActivePassive - selects the next available instance on the list of instances – so if instance A is unavailable (because it is building or has failed), instance B will be used unless it is also unavailable, in which case instance C will be used, and so on
    • Dynamicweb.indexing.balancing.LastUpdated - directs operations to the most recently updated index, ensuring users interact with the freshest data
  6. Click Save and close
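The difference between the two balancers can be sketched as two selection functions. This is an illustration of the behaviour described in step 5, not the Dynamicweb implementation: the `Instance` class and both function names are hypothetical, and it is an assumption of this sketch that LastUpdated only considers instances that are currently available.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Instance:
    name: str
    available: bool
    last_updated: datetime


def pick_active_passive(instances):
    """Return the first available instance in list order, or None."""
    for instance in instances:
        if instance.available:
            return instance
    return None


def pick_last_updated(instances):
    """Return the available instance with the newest build, or None."""
    # Assumption: unavailable instances are never selected.
    candidates = [i for i in instances if i.available]
    return max(candidates, key=lambda i: i.last_updated, default=None)
```

With instances A (older) and B (newer) both available, the first function returns A because it comes first in the list, while the second returns B because it was built most recently.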

On solutions with heavy traffic and frequent content updates we recommend using the LastUpdated mode to ensure that visitors are always shown the most recently updated content. On solutions with only two instances (the vast majority of solutions) it is not necessary to select a balancer mode, as the “other index” will always be used when an index is unavailable.

When this is done, an empty index has been created. You should now add instances to it.

ContentIndex_01

Adding Instances

An instance refers to a specific file stored in the file archive. When a query is executed, it's this file that gets searched. It's common for instances to be rebuilt regularly to incorporate the latest changes to the content. For this reason, it's recommended to maintain at least two instances. Having multiple instances ensures that while one is being updated or rebuilt, the other remains available for searches.

  1. In the Indexes section in your repository, enter the index you want to add an instance to
  2. Click the Actions button in the top right corner and select Manage instances
  3. Click New instance
  4. Provide a name – you could call the first instance ‘A’ and the other instance ‘B’
  5. Select the LuceneIndexProvider
  6. Specify a folder to place the instance file under
  7. Click Save and close
  8. Repeat the process for the second instance

Once created, the instances will look like this:

ContentIndex_02

When an instance is built, a set of index files is generated under System > Indexes > YourIndexName > YourInstanceName – but before you can build it you must create a build configuration.

Adding a Build Configuration

Now that you have two instances you want to build them – to do so, you need to create a build configuration. Each type of index has a specific builder associated with it – in the case of a content index this builder is helpfully called the ContentIndexBuilder.

To add the build configuration:

  1. Enter the Index in which you want to create a Build
  2. Under the Builds section, select Manage
  3. Click New build
  4. Provide a name
  5. In the Builder section, select Dynamicweb.Content.ContentIndexBuilder
    • This opens a selection of builder settings
  6. Choose the Builder action - currently, only the Full build option, which rebuilds the entire index, is available
  7. Review the builder settings
  8. Set up Notifications if appropriate
  9. Click Save and close

The following builder settings are available – please review carefully to see if any of them are relevant for your setup:

| Setting | Value | Comments |
| --- | --- | --- |
| ExcludeItemsFromIndex | Boolean, defaults to false | Set to true to exclude item content from the index |
| SkipPageItemRelationLists | Boolean, defaults to false | |
| SkipParagraphItemRelationLists | Boolean, defaults to false | |
| AppsToHandle | - | Allows you to specify which ContentAppIndexProviders to include. Defaults to all, unless a comma-separated list of providers is set as the value. We only provide one (for the Forum app), but you can create a class inheriting from the ContentAppIndexProvider class and build your own. |
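The "defaults to all, unless a comma-separated list is set" behaviour of AppsToHandle can be illustrated with a short sketch. The function name `select_providers` and the provider names in the usage example are hypothetical; how the real builder parses the value is not documented here, so whitespace trimming and exact-name matching are assumptions.

```python
def select_providers(apps_to_handle, available_providers):
    """Return the providers to run for a given AppsToHandle value."""
    # An empty or missing value means "handle all providers".
    if not apps_to_handle or not apps_to_handle.strip():
        return list(available_providers)
    # Assumption: names are comma separated and surrounding spaces are ignored.
    wanted = {name.strip() for name in apps_to_handle.split(",") if name.strip()}
    return [p for p in available_providers if p in wanted]
```

For example, with two hypothetical providers `ForumProvider` and `MyCustomProvider` installed, a value of `"MyCustomProvider"` would select only the custom one, while an empty value would select both.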

Now, a build button has been added to your instances:

ContentIndex_03

Now you’ve specified how you want the index to be built – next, you should specify what you want to include in the index.

Adding Fields

Lucene indexes are composed of small documents, with each document divided into named fields which contain either content which can be searched or data which can be retrieved. Each field added to the index can therefore be stored, indexed, and analysed depending on what you want to use it for:

  • Stored fields have their values stored in the index
  • Indexed fields can be searched; unless the field is also analysed, the value is indexed as a single value
  • Analysed fields have their values run through an analyser and split into tokens (words)

Generally speaking:

  • A field you want to display in frontend must be indexed
  • A field where you want to search for part of the value in free-text search must be analysed
  • A field that is to be published using the Query publisher should be stored
  • A field you want to display as facets should be indexed, but not analysed
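To make the three field options concrete, here is a toy in-memory index. It is a simplified illustration, not how Lucene works internally: the `TinyIndex` class, the field names, and the lower-casing word tokenizer are all assumptions made for this sketch, and analysis is only applied to fields that are also indexed.

```python
import re
from collections import defaultdict


class TinyIndex:
    def __init__(self, schema):
        # schema maps a field name to a set of flags:
        # "stored", "indexed" and/or "analysed".
        self.schema = schema
        self.stored = {}                # doc id -> {field: original value}
        self.terms = defaultdict(set)   # (field, term) -> doc ids

    def add(self, doc_id, doc):
        self.stored[doc_id] = {}
        for name, value in doc.items():
            flags = self.schema.get(name, set())
            if "stored" in flags:
                # Stored: the original value can be retrieved later.
                self.stored[doc_id][name] = value
            if "indexed" in flags:
                if "analysed" in flags:
                    # Analysed: split into lower-cased word tokens.
                    terms = re.findall(r"\w+", str(value).lower())
                else:
                    # Not analysed: the whole value is a single term.
                    terms = [str(value)]
                for term in terms:
                    self.terms[(name, term)].add(doc_id)

    def search(self, name, term):
        return sorted(self.terms.get((name, term), set()))

    def retrieve(self, doc_id):
        return dict(self.stored.get(doc_id, {}))
```

In this sketch an analysed `Title` field matches on a single word, an indexed but not analysed `ItemType` field only matches on its complete value (which is what you want for facets), and a field that is only stored can be retrieved but not searched.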

To make things (a lot) easier for you, we’ve created a default set of fields typically used in content indexes – this default field set is defined in something called the ContentIndexSchemaExtender.

To add the fields from the schema extender to the index:

  1. Click the Fields tab
  2. Under the Fields section, click Manage
  3. Click New index field and select Schema extender
  4. Provide a name
  5. Provide a system name
  6. In the Field section, select ContentIndexSchemaExtender
  7. In the Settings section, select the fields you want to Include
  8. Click Save and close

This adds a whole bunch of fields to the index.

ContentIndex_04

The ContentIndexSchemaExtender contains the following types of fields:

  • All fields from the Page table – e.g. PageActive, PageID, PageItemType, etc.
  • A number of Page content fields:
    • Paragraph headers contains an array of all paragraph headers on a page
    • Paragraph texts contains an array of all paragraph text content on a page
    • Paragraph content contains an array of the item type properties for each item-based paragraph on a page
    • Page property item type contains the name of the item type used to extend the page properties of this page (if relevant)
  • All item type fields in the format [item.SystemName]_[itemField.SystemName] and Property_[item.SystemName]_[itemField.SystemName], except the fields marked as 'do not include in search' in the item field settings.
  • Possibly a number of App fields from a Forum app

Due to complexity issues, the ItemListEditor field type is never indexed.
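The naming format for item type fields can be sketched as follows. The function `item_field_names` and the dictionary keys `system_name`, `exclude_from_search`, and `editor` are hypothetical stand-ins for the item field settings; the text does not state when the Property_ prefix applies, so the sketch leaves that choice to the caller.

```python
def item_field_names(item_system_name, fields, with_property_prefix=False):
    """Build index field names in the [item]_[field] format."""
    prefix = "Property_" if with_property_prefix else ""
    names = []
    for item_field in fields:
        # Fields marked 'do not include in search' are left out.
        if item_field.get("exclude_from_search"):
            continue
        # The ItemListEditor field type is never indexed.
        if item_field.get("editor") == "ItemListEditor":
            continue
        names.append(f"{prefix}{item_system_name}_{item_field['system_name']}")
    return names
```

For a hypothetical item type with the system name `Article` and a field `Title`, this yields `Article_Title`, or `Property_Article_Title` when the prefix is requested.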

Building the Index

Once you’ve added instances, a build configuration, and a set of fields to the index, you should build it – to do so, click the Build button beneath each instance you want to build.

ContentIndex_03

Of course, you don’t want to do this manually every time – you want to do a combination of the following:

  • Rebuild the index every time an integration job has been executed
  • Rebuild the index every time a product is saved in Products
  • Rebuild the index on a schedule – see the article on tasks