Meta Tag Indexing


The Verity Spider automatically indexes HTML meta tags as collection fields when matching field names are defined in the collection schema. Once the fields are defined, you can let users search them by setting up field searching in your SEARCHScript templates.

During indexing, when the spider encounters a <META> tag, it produces a field token based on the tag and stores that token as a field in the collection. In the collection's internal documents table, the field name is the value of the meta tag's name attribute, and the field value is the value of the meta tag's content attribute.

To illustrate how meta tag filtering works, here's a sample <META> tag in HTML:

<META name="Abstract" content="This is a long document">

When filtering the HTML above, the spider produces a field token of this form:

ABSTRACT: This is a long document
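
The mapping from a <META> tag to a field token can be sketched in a few lines of Python. This is only an illustration built on Python's standard html.parser module, not the spider's actual filter. The class name MetaFieldTokens and the function field_tokens are hypothetical, and the uppercase field name simply mirrors the token shown above.

```python
from html.parser import HTMLParser


class MetaFieldTokens(HTMLParser):
    """Collect 'NAME: content' tokens from <META name=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lowercase.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            # Assumption: the field name is uppercased, as in the sample token.
            self.tokens.append(f"{name.upper()}: {content}")


def field_tokens(html):
    """Return the field tokens for every named <META> tag in the HTML text."""
    parser = MetaFieldTokens()
    parser.feed(html)
    parser.close()
    return parser.tokens


print(field_tokens('<META name="Abstract" content="This is a long document">'))
# prints ['ABSTRACT: This is a long document']
```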

A field definition that corresponds to the meta tag's name attribute must appear in the style.ufl file (the user fields file) in order for the field to be populated by the spider. The style.ufl file is one of the collection configuration files in the default collection style directory located in:

installdir/common/style

where installdir represents the name of the product installation directory.

Note that the spider parses only the <META> tags it finds within a <HEAD> tag. The spider stops parsing for <META> tags after it encounters </HEAD> within a document.
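
The effect of the <HEAD> restriction can be illustrated with a small Python sketch, again based on the standard html.parser module rather than on the spider itself. The names HeadMetaParser and head_meta_fields are hypothetical, and the sketch assumes an explicit <HEAD> tag is present in the document.

```python
from html.parser import HTMLParser


class HeadMetaParser(HTMLParser):
    """Collect <META> name/content pairs, but only inside <HEAD>...</HEAD>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.head_closed = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if self.head_closed:
            return  # </HEAD> already seen: ignore any later <META> tags
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attributes = dict(attrs)
            name = attributes.get("name")
            if name:
                self.fields[name] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
            self.head_closed = True


def head_meta_fields(html):
    """Return a dict of meta tag names and contents found within <HEAD>."""
    parser = HeadMetaParser()
    parser.feed(html)
    parser.close()
    return parser.fields


SAMPLE = """<HTML><HEAD>
<META name="Abstract" content="This is a long document">
</HEAD><BODY>
<META name="Author" content="ignored">
</BODY></HTML>"""

print(head_meta_fields(SAMPLE))
# prints {'Abstract': 'This is a long document'}
```

In this sketch the "Author" tag is skipped because it appears after </HEAD>, which matches the behavior described above.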

Adding a Field Definition

To add a field definition for a meta tag, first make a copy of the default style directory (that is, installdir/common/style) to preserve the original files. Verity recommends that you do not edit the style files in place. After copying the style directory, add a field definition that corresponds to the meta tag name to the style/style.ufl file in your copy.
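
One way to make the copy is with a short Python script. This is a sketch that uses only the standard library; the function name copy_style_dir and the destination path are hypothetical, and you can just as well copy the directory with your operating system's own tools.

```python
import shutil
from pathlib import Path


def copy_style_dir(installdir, destination):
    """Copy installdir/common/style to destination and return the new style.ufl path."""
    source = Path(installdir) / "common" / "style"
    target = Path(destination)
    if not source.is_dir():
        raise FileNotFoundError(f"style directory not found: {source}")
    # copytree raises FileExistsError if the target already exists,
    # so an earlier working copy is never silently overwritten.
    shutil.copytree(source, target)
    return target / "style.ufl"
```

For example, calling copy_style_dir with your installation directory and a new directory name such as "mystyle" (a hypothetical name) leaves the original files untouched and returns the path of the style.ufl file to edit.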

In the example above, you would add a field definition for the "Abstract" field as shown in the sample style.ufl syntax below:


# Copyright (C) 1987-1996 Verity, Inc.
# style.ufl - Application-specific User Fields
# These fields are included in the internal documents table. For
# more information about adding fields to the internal documents
# table, see the "Defining Custom Fields" chapter in the
# Collection Building Guide.
#
# Example:
#
# data-table: ddf
# {
# varwidth: MyTitle dxa
# }
# -----------------------------------------------------------------
# Specify additional application-specific fields here in their own
# data-table[s].
#
data-table: ddj
{
varwidth: Abstract ddk
}
$$

For information about making field definitions in the style.ufl file, refer to Chapter 6 of the Verity Collection Building Guide.
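
Before you index, it can be useful to check which meta tag names still lack a field definition. The Python sketch below is a rough aid, not a Verity tool. It recognizes only the varwidth lines shown in the sample above, skips comment lines that begin with #, and compares names without regard to case. That case-insensitive comparison is an assumption, as are the function names defined_fields and missing_fields.

```python
import re

# Matches lines such as "varwidth: Abstract ddk" and captures the field name.
VARWIDTH_LINE = re.compile(r"^\s*varwidth:\s*(\S+)")


def defined_fields(ufl_text):
    """Return the lowercased names of the varwidth fields defined in style.ufl text."""
    names = set()
    for line in ufl_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # comment line, including commented-out examples
        match = VARWIDTH_LINE.match(line)
        if match:
            names.add(match.group(1).lower())
    return names


def missing_fields(meta_names, ufl_text):
    """Return the meta tag names that have no matching field definition."""
    known = defined_fields(ufl_text)
    return [name for name in meta_names if name.lower() not in known]


SAMPLE_UFL = """\
# varwidth: MyTitle dxa
data-table: ddj
{
varwidth: Abstract ddk
}
$$
"""

print(missing_fields(["Abstract", "Author"], SAMPLE_UFL))
# prints ['Author']
```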





Copyright © 1998, Verity, Inc. All rights reserved.