Using extract and mkvdk


Using the extract and mkvdk tools, you can rebuild a collection much more quickly than by re-indexing it with the Verity Spider.

Extracting and Reindexing

Rebuild a meta-collection using the extract and mkvdk utilities, as follows.

1. On UNIX, make sure the shared library environment variable includes the directory that contains the Verity extract and mkvdk tools. During installation, a single line is added to the Web server start shell script to map the locations of the shared libraries:

library_path=/installdir/platform/bin:$library_path; export library_path

where installdir is the full path of the directory in which you installed Information Server, and platform and library_path are operating system-specific. For a complete list of supported operating systems and platforms, see the Information Server Product Notes.

2. Run the extract utility to generate a bulk submit file for the meta-collection, as follows:

extract -n basename collectionname.clm

where basename is the root file name for the bulk submit file, and collectionname.clm is the file name of the meta-collection's map file. It is recommended that you use collectionname as the basename.

Warning! A utility named extract already exists in Digital's Alpha OS/F V4.0. On that platform, specify the full path to Verity's extract utility so that the system utility is not run instead.

The result is a bulk submit file for the meta-collection. The extract utility automatically appends a .vdk extension to the basename you supplied, for example:

collectionname.vdk

3. In the bulk submit file, strip out all other fields, keeping only the VdkVgwKey values and the <<EOD>> symbol.

4. Run the mkvdk utility to index the documents for the collection:


mkvdk -create -style isstyle -collection newcoll -bulk collectionname.vdk

where isstyle is the location of the Verity Spider V3.6 style files (which are in installdir/common/style), newcoll is the name of the new collection to be built by mkvdk, and collectionname.vdk is the name of the bulk submit file generated by the extract utility.

NOTE: The newcoll name cannot end with a .clm extension, nor can you simply drop the .clm extension from the meta-collection name to create the new name. This is because the meta-collection structure already contains a subdirectory with that name: for a meta-collection named collstuff.clm, the subdirectory is named collstuff. Therefore, choose an entirely different name, or add a suffix such as .new to differentiate it.

Example

You want to upgrade the help.clm meta-collection to the single helpcoll collection.

1. Run extract on the meta-collection map file:

%extract -n help help.clm

2. In the bulk submit file, remove all fields but VdkVgwKey and the <<EOD>> symbol.

3. Run mkvdk:

%mkvdk -create -style /is/common/style -collection helpcoll -bulk help.vdk
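
Step 2 of the example can be done in a text editor, or it can be scripted. The following is a minimal shell sketch, not a Verity-supplied tool. The helper name strip_bulk_file is hypothetical, and the sketch assumes that the bulk submit file holds one field per line, that each key line begins with VdkVgwKey, and that each document ends with a line beginning with <<EOD>>. Check a few records of your own file before relying on it.

```shell
# Hypothetical helper: keep only the VdkVgwKey lines and the <<EOD>> markers.
# Assumption: one field per line; this layout may not match your bulk file.
# $1 = bulk submit file written by extract, $2 = stripped output file
strip_bulk_file() {
    grep -E '^(VdkVgwKey|<<EOD>>)' "$1" > "$2"
}
```

For the example above, this might be run as strip_bulk_file help.vdk help.stripped.vdk. Writing to a separate file leaves the original untouched, so you can review the result and then copy it over help.vdk before running mkvdk.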

Updating Information Server

With the meta-collections rebuilt into new collections, you need to update the Information Server configuration file, inetsrch.ini, to recognize the new collections.

1. Delete the meta-collection's files and directory structure. For more information, see "Deleting Meta-collection Files" below.

2. If you want to keep the new name you assigned above when you ran mkvdk, skip to the next step. Otherwise, rename the collection to the original meta-collection name, minus the .clm extension.

3. Edit the Path line of the CollectionList entry in inetsrch.ini. For example:


[S97IS\Primary\CollectionList\Collections(3)]
Alias=Developer's Kit
Name=Developer's Kit
Path=d:\search97\s97is\colls\develope.clm
Description=All of the available online DK docs.
State=Enabled
DefaultList=True
ReadOnly=False
Profile=False

For Path, enter the value for newcoll that you created when you ran mkvdk. If you renamed the collection to the original meta-collection name in the previous step, just delete the .clm from the file name.

4. Restart the Information Server admin server and your web server so that inetsrch.ini is reread into memory and the changes are recognized. This is required whenever you edit inetsrch.ini.
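
If you renamed the collection to the original base name, the Path edit in step 3 amounts to dropping the .clm suffix from one line. The following is a rough shell sketch of that edit. The helper name drop_clm_from_path is hypothetical, and the sketch assumes that inetsrch.ini uses LF line endings and that the base name contains no characters that are special to sed. It prints the edited text rather than changing the file, so you can compare before replacing anything.

```shell
# Hypothetical helper: print inetsrch.ini with ".clm" removed from the Path
# line of one collection. The file itself is not modified.
# Assumptions: LF line endings; base name has no sed metacharacters.
# $1 = path to inetsrch.ini, $2 = meta-collection base name (e.g. develope)
drop_clm_from_path() {
    sed "s/^\(Path=.*\)$2\.clm\$/\1$2/" "$1"
}
```

For the entry shown in step 3, drop_clm_from_path inetsrch.ini develope > inetsrch.new would write a copy in which only the Path line for develope.clm has changed. Path lines for other meta-collections are left alone.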

Deleting Meta-collection Files

Once you have the new collections in Information Server and you are sure everything is working properly, you can safely remove the old meta-collection files and directory structures.

For example:

Old meta-collection structure:

/colls/coll1.clm
      /coll1/
            /cache/
            /html/
            /mail/
            /pdf/
            /text/
            /wysiwyg

Old meta-collection structure with new collection:

/colls/newcoll1
/coll1.clm
      /coll1/
            /cache/
            /html/
            /mail/
            /pdf/
            /text/
            /wysiwyg

New collection only, with meta-collection removed:

/colls/newcoll1

NOTE: You can also rename the new collection to coll1 to retain the base name. However, you must make sure inetsrch.ini reflects the correct collection name.
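
The removal shown above can be sketched in shell as follows. This is an illustration only: the helper name remove_meta_collection is hypothetical, and it assumes the layout in the example, where the map file coll1.clm and the directory coll1 sit side by side under the collections directory. Because the deletion cannot be undone, confirm that the new collection works and back up the old files first.

```shell
# Hypothetical helper: remove a meta-collection's map file and directory tree.
# Assumption: <colls_dir>/<name>.clm and <colls_dir>/<name>/ are the only
# pieces of the meta-collection, as in the example layout.
# $1 = collections directory, $2 = meta-collection base name (e.g. coll1)
remove_meta_collection() {
    colls_dir=$1
    name=$2
    # Refuse empty arguments so the rm calls below cannot widen their target.
    [ -n "$colls_dir" ] && [ -n "$name" ] || return 1
    rm -f "$colls_dir/${name}.clm"
    rm -rf "$colls_dir/${name}"
}
```

With the example layout, remove_meta_collection /colls coll1 would leave only /colls/newcoll1 in place.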

Extract Utility Limitation

When using the extract utility, you should note the following limitations if your collections were built with custom style files, using features described in the Verity Collection Building Guide:





Copyright © 1998, Verity, Inc. All rights reserved.