Updating Only Certain Documents


You want to update a large collection, but only for documents that were last indexed at least 30 hours ago.

vspider -cmdfile /verity/spider/update.cmd

where update.cmd consists of:

-collection icd.coll
-refresh
-refreshtime 1 day 6 hours

Case-specific Options

Option: -refreshtime
Reason: Because the collection is large, you do not want to re-index documents that were indexed within the last 30 hours. The -refreshtime option sets a time threshold, so that only documents last indexed at least that long ago are refreshed.

Another way to specify 30 hours is to simply use only hours:

-refreshtime 30 hours
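
The two forms are equivalent by simple arithmetic: 1 day 6 hours is 24 + 6 = 30 hours. The Python sketch below checks this. The helper `to_hours` is hypothetical and handles only the day and hour units used on this page; it should not be read as a description of how vspider parses -refreshtime.

```python
# Hypothetical helper: converts a duration such as "1 day 6 hours" to hours.
# Only the units shown on this page (day/days, hour/hours) are handled;
# this is not vspider's parser.
HOURS_PER_UNIT = {"day": 24, "days": 24, "hour": 1, "hours": 1}

def to_hours(duration: str) -> int:
    tokens = duration.split()
    if len(tokens) % 2 != 0:
        raise ValueError(f"expected <number> <unit> pairs: {duration!r}")
    total = 0
    # Walk the string as (number, unit) pairs.
    for number, unit in zip(tokens[0::2], tokens[1::2]):
        total += int(number) * HOURS_PER_UNIT[unit.lower()]
    return total

# Both spellings of the threshold come out to the same number of hours.
print(to_hours("1 day 6 hours") == to_hours("30 hours"))
```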

Keep in mind that if you have recently performed a resync, documents may not have a last-indexed date, because -resync removes that information from the collection's persistent store.
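
The selection rule can be sketched in Python as follows. This illustrates the threshold described above and is not vspider's implementation; the function name `needs_refresh`, the document keys, and the timestamps are made up for the example. This page does not say how vspider treats a document with no last-indexed date, so the sketch reports those documents separately instead of guessing.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple

def needs_refresh(
    last_indexed: Dict[str, Optional[datetime]],
    now: datetime,
    threshold: timedelta,
) -> Tuple[List[str], List[str]]:
    """Split documents into (stale, undated).

    stale:   last indexed at least `threshold` ago.
    undated: no last-indexed date recorded (for example after -resync).
    """
    stale, undated = [], []
    for key, stamp in last_indexed.items():
        if stamp is None:
            undated.append(key)
        elif now - stamp >= threshold:
            stale.append(key)
    return stale, undated

# Illustrative data only.
now = datetime(1998, 6, 2, 12, 0)
docs = {
    "a.html": now - timedelta(hours=31),  # past the threshold
    "b.html": now - timedelta(hours=5),   # indexed recently, skipped
    "c.html": None,                       # no last-indexed date
}
stale, undated = needs_refresh(docs, now, timedelta(days=1, hours=6))
print(stale, undated)
```

With this sample data, only a.html is reported as stale, and c.html is reported as undated.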

Unnecessary Options for this Case

Option: -start, -restart
Reason: When you use the -refresh option, you cannot use the -start or -restart options.
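
A command file can be checked for this conflict before vspider is run. The sketch below is a hypothetical pre-flight check in Python, not part of vspider. It assumes, as in update.cmd above, that each line of the command file begins with an option name.

```python
from typing import List

# Options that, according to this page, cannot be combined with -refresh.
CONFLICTS_WITH_REFRESH = {"-start", "-restart"}

def refresh_conflicts(cmd_lines: List[str]) -> List[str]:
    """Return the options in a command file that conflict with -refresh."""
    # The first token of each non-empty line is taken as the option name.
    options = [line.split()[0] for line in cmd_lines if line.strip()]
    if "-refresh" not in options:
        return []
    return [opt for opt in options if opt in CONFLICTS_WITH_REFRESH]

# Contents of update.cmd from this page.
update_cmd = [
    "-collection icd.coll",
    "-refresh",
    "-refreshtime 1 day 6 hours",
]
print(refresh_conflicts(update_cmd))
```

An empty list means no conflicting option was found; any option returned should be removed from the command file before running vspider with -refresh.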





Copyright © 1998, Verity, Inc. All rights reserved.