/usr/verity on a UNIX machine named Goliath and your collections directory is /host1/usr/verity/collections
/host2/usr/verity with a collections directory of /host2/usr/verity/colls.
NOTE: All Verity Spider commands must be issued as a single line from the command line. They are broken up here for readability.
vspider -collection /host1/usr/verity/collections/internal.coll
    -start http://web.verity.com/docs/index.htm
    -domain verity.com
    -mimeinclude text/html -exclude 'http://www.verity.com/*'
    -jumps 5
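Because the spider expects the whole command on one line, backslash line continuation in a POSIX shell is one way to keep a script readable while still producing a single command. The sketch below is an assumption about how you might script this and is not part of the Verity tooling: the helper name print_vspider_cmd is hypothetical, and it only prints the assembled command rather than running vspider. The exclude pattern is wrapped in plain single quotes so the shell passes the * through unexpanded.

```shell
#!/bin/sh
# Hypothetical helper: assembles the crawl command from this example and
# prints it as one line. Nothing is executed; vspider is not invoked here.
print_vspider_cmd() {
    # Each trailing backslash joins the next line, so the shell sees one command.
    set -- vspider \
        -collection /host1/usr/verity/collections/internal.coll \
        -start http://web.verity.com/docs/index.htm \
        -domain verity.com \
        -mimeinclude text/html \
        -exclude 'http://www.verity.com/*' \
        -jumps 5
    # "$*" joins the arguments with single spaces. The single quotes around
    # the exclude pattern are consumed by the shell and do not appear here.
    printf '%s\n' "$*"
}

print_vspider_cmd
```

The printed line is for inspection only; to run the crawl, the same backslash-continued argument list would be given to vspider directly.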
vspider -collection /host1/usr/verity/collections/internal.coll
    -restart
    -domain verity.com
    -mimeinclude text/html -exclude 'http://www.verity.com/*'
    -jumps 5
-start entries and instead use the -restart option. You do need to include the other options, such as inclusion criteria.
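Since the restart must repeat the same inclusion criteria as the original run, one way to keep the two commands from drifting apart is to hold the shared options in a single place. The following is a minimal sketch under that assumption: the function name spider_cmd and the COLL variable are hypothetical, only options shown in this example are used, and the function prints the command line instead of invoking vspider.

```shell
#!/bin/sh
# Hypothetical wrapper: builds either the initial command or the restart
# command from one shared set of options. It only prints the command.
COLL=/host1/usr/verity/collections/internal.coll

spider_cmd() {
    # $1 selects the mode: "start" uses the -start entry, "restart" replaces
    # the -start entry with -restart.
    case "$1" in
        start)   mode="-start http://web.verity.com/docs/index.htm" ;;
        restart) mode="-restart" ;;
        *)       echo "usage: spider_cmd start|restart" >&2; return 1 ;;
    esac
    # The inclusion criteria are identical in both modes.
    echo "vspider -collection $COLL $mode -domain verity.com" \
         "-mimeinclude text/html -exclude 'http://www.verity.com/*' -jumps 5"
}

spider_cmd start
spider_cmd restart
```

Only the mode argument changes between the two calls, so editing the domain, MIME type, or exclude pattern once updates both commands.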
Use the Admin interface for Information Server to import the collection.
vspider -collection /host1/usr/verity/collections/internal.coll
    -refresh
    -refreshtime 4 hours
    -domain verity.com
    -mimeinclude text/html -exclude 'http://www.verity.com/*'
    -jumps 5
The -refreshtime option lets you pass over documents that were indexed recently. You know the documents change often, but not constantly.
vspider -collection /host1/usr/verity/collections/internal.coll
    -start /usr/docs/
    -indmimeinclude application/msword
vspider -collection /host1/usr/verity/collections/internal.coll
    -restart
    -indmimeinclude application/msword
The internal.coll collection has become very popular for searching, and requires refreshing every four hours and one second. To alleviate the strain on the current server, you decide to move the collection to a larger server running a different operating system. The instructions that follow assume that the directories are linked via NFS. To accomplish the move, do the following:
/host1/usr/verity/collections/internal.coll /host2/usr/verity/colls
vspider -collection /host2/usr/verity/colls/internal.coll
    -resync
internal.coll collection now on the new server, run the command:
vspider -collection /host2/usr/verity/colls/internal.coll
    -refresh -refreshtime 4 hours
    -domain verity.com
    -mimeinclude text/html -exclude 'http://www.verity.com/*'
    -mimeinclude application/msword
    -jumps 5
-mimeinclude application/msword to avoid also picking up the Word documents.