Acrobat Catalog 3.0 Online Guide


Adobe® Acrobat® Catalog Online Guide

• Commands

• Using Acrobat Catalog

• Preparing PDF document collections for indexing

• Building an index

• Distributing and maintaining an index

• Troubleshooting


Commands

The Macintosh version has a Preferences command. Windows preferences are set in the acrocat.ini file.

Windows

Index menu

• New...: To define and build an index

• Open...: To change an index definition

• Build...: To build an already-defined index

• Schedule...: Scheduling automatic…

• Purge...: Purging and rebuilding an index

Macintosh

File menu

• New: To define and build an index

• Open: To change an index definition

• Quit

Edit menu

• Preferences: Setting preferences

Index menu

• Build: To build an already-defined index

• Schedule: Scheduling…

• Purge: Purging and rebuilding an index


Using Acrobat Catalog

You use Adobe Acrobat Catalog to build full-text indexes of PDF document collections. A full-text index is a searchable database of all text in a document or set of documents. Readers of the documents you have indexed use the Acrobat Search plug-in for Acrobat Exchange or Acrobat Reader to search them. Acrobat Search appears in Exchange or Reader as the Search command on the Tools menu.

Index building has three phases:

1 Preparing documents for indexing

2 Building the index for the documents

3 Maintaining the index

The third phase is particularly important with dynamic indexes—indexes for constantly changing information such as PDF business documents stored on a network file server.


Preparing PDF document collections for indexing

Before you index a document collection, you need to organize the documents on the disk drive or network server volume.

If the documents have chapters or sections, consider breaking them up into smaller files. Consider using the Document Info fields in the documents to make them easier to search. If you are building the index on one platform and it will be used on another, name the documents carefully.

Organizing a PDF collection and its index

Separating PDF documents into parts

Adding information to PDF documents for efficient searching

Naming PDF documents


Organizing a PDF collection and its index

When you define and build an index, Catalog gives the index-definition file a .pdx extension and creates an index support folder with nine subfolders and the same name as the PDX file. It places the PDX file and the support folder in the same folder.


The simplest organization is to have the index itself—the PDX file and the support folder containing the nine subfolders—in the folder that contains the indexed document collection:

Legal
    leglindx.pdx
    leglindx
    breaches
    torts

Index for documents in torts and breaches


This structure is the easiest for users to understand and the easiest to move to another drive or server volume.

If this structure isn’t feasible, any structure in which the index and the indexed documents are in a folder branch that can be moved as a single unit simplifies the move.

However, almost any organization is possible. There is only one restriction on the Macintosh and three on other platforms. On all platforms, the entire index—both the PDX file and the support folder containing the nine subfolders—must be in a single folder. And on all platforms except the Macintosh, two further restrictions apply: The indexed documents must reside on a single disk drive or network server volume, and the index must be on the same drive or volume as the indexed documents.
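The single-folder rule above can be checked with a short script before an index is moved or distributed. The following Python sketch is illustrative only and is not part of Acrobat Catalog: the function name check_index_layout is hypothetical, and the sketch assumes only what this guide states, namely that the support folder shares the PDX file's name, sits beside it, and holds nine subfolders. It does not check the same-drive restrictions that apply outside the Macintosh.

```python
from pathlib import Path

EXPECTED_SUBFOLDERS = 9  # the guide says the support folder holds nine subfolders


def check_index_layout(pdx_path):
    """Return a list of layout problems for one index (empty list if none found)."""
    pdx = Path(pdx_path)
    problems = []
    if pdx.suffix.lower() != ".pdx":
        problems.append(f"{pdx.name}: index-definition file should end in .pdx")
    if not pdx.is_file():
        problems.append(f"{pdx.name}: index-definition file not found")
    # The support folder has the same name as the PDX file (without the
    # extension) and must be in the same folder as the PDX file.
    support = pdx.with_suffix("")
    if not support.is_dir():
        problems.append(f"support folder {support.name} is missing next to {pdx.name}")
    else:
        subfolder_count = sum(1 for entry in support.iterdir() if entry.is_dir())
        if subfolder_count != EXPECTED_SUBFOLDERS:
            problems.append(
                f"{support.name} has {subfolder_count} subfolders, "
                f"expected {EXPECTED_SUBFOLDERS}"
            )
    return problems
```

Calling check_index_layout("Legal/leglindx.pdx") on the structure shown earlier would return an empty list if leglindx.pdx and a complete leglindx folder sit side by side in Legal.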


Separating PDF documents into parts

Consider creating a separate PDF file for each chapter or section. When you separate a document into parts and then search it, the search results are more sharply focused.

For example, searching a travel book separated in this way for beach would return chapters about locations with beaches and probably give them a high relevance ranking. If the search returned chapters about locations without beaches at all, it would give them a low ranking. See Interpreting relevance ranking in the Acrobat Search Guide for details.


Adding information to PDF documents for efficient searching

When PDF documents are indexed, Acrobat Search users can limit searches to just those documents that contain specific Document Info field values. For example, a search could be limited to just those documents whose author is “Bob Jones” and that list “Status report” as the subject.

For this reason, encourage document publishers who use PDF Writer to enter Document Info field values for all their documents during conversion to PDF. If the documents are already converted, encourage them to use Acrobat Exchange.

The standard Document Info fields are Title, Subject, Author, and Keywords. PDF documents created with Acrobat Distiller already have Author and Title information. See Tips on filling in Document Info fields.


You can also define custom data fields such as Document Type, Document Number, and Document Identifier. But note that such fields appear only in custom versions of Acrobat Exchange and that you need a good understanding of the PDF format to customize Exchange. For details, see Supporting custom Document Info fields.


Tips on filling in Document Info fields

It’s a good idea to standardize usage in the Document Info fields across your organization. For example:

• Always put a descriptive title in the Title field. Even though the filename of the document appears in the Search Results dialog box if the Title field is empty, filenames are often not very descriptive.

• Always use the same field for category information. For example, don’t use the Subject field for some documents and the Keywords field for others.

• Always use the same word for the same category. For example, don’t use biology for some documents and life sciences for others.

• You might use the Author field to identify the group responsible for the document. For example, the author of a hiring policy document might be the Human Resources department.

• If documents are identified by part numbers, add the numbers as keywords. For example, add something like doc#=m234 to the Keywords field.

• To categorize documents by type, use the Subject or Keywords field, or both. For example, you might use status report as a Subject value and monthly or weekly as a Keywords field value for a single document.

• If you are publishing a large number of documents, make a table that shows which values are assigned to which documents. While you are developing the index, use the table to maintain consistency. When you publish the index, include the table as part of the documentation.


Supporting custom Document Info fields

To support a custom data field used in a customized version of Acrobat Exchange, you declare the field in the acrocat.ini file (Windows) or in the Catalog Preferences dialog box (Macintosh).

Note: For information on customizing Exchange, see the Acrobat Software Development Kit. (You can find the Kit at http://www.adobe.com/acrobat/moreinfo.) For information on the integer, date, and string data types mentioned in this topic, see Portable Document Format Reference Manual, version 2.

To define a custom field (Windows)

To define a custom field (Macintosh)

Sample custom fields


To define a custom field (Windows)

1 After the line containing [Fields] in the acrocat.ini file, insert a line with the following syntax:

Field0=CustomFieldName,DataType

where CustomFieldName is the name of the field and can be up to 64 characters and where DataType is one of the following:

• int for integer fields

• date for date fields

• str for string fields

For more details, see Sample custom fields.

2 To declare additional fields, insert similar lines. Begin each line with Field1, Field2, Field3, and so on. Do not skip field numbers.

3 Save the acrocat.ini file, and then create a new index-definition (PDX) file; existing PDX files won’t work. The new index definition will contain the custom field definitions.


To define a custom field (Macintosh)

1 Choose Edit > Preferences, and select Custom Fields in the left panel of the Preferences dialog box.

2 Type the name of the custom field in the Field Name text box. (For details on this step and the next, see Sample custom fields.)

3 Select the Field Type option (integer, date, string) from the Field Type menu.

4 Click Add to include the custom field in the scroll box.

5 Click OK to accept all of the changes you have made to the Catalog Preferences dialog box, including Custom Fields. Click Cancel to cancel all of the changes to this dialog box.

Note: You cannot edit a custom field. To change a field, remove it and add it again.


Sample custom fields

If you defined two custom fields, DocumentIdentifier and DocumentType, they might appear as follows in the Windows acrocat.ini file:

Field0=DocumentIdentifier,int

Field1=DocumentType,str

On a Macintosh, custom fields would appear as follows in the Custom Fields group of preferences:

DocumentIdentifier (Integer)

DocumentType (String)

The DocumentIdentifier field takes integer values from 0 to 65,535, and the DocumentType field takes strings from 0 to 256 characters. If you want to use document identifiers with letters and special characters, you must declare the DocumentIdentifier field as a string.
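If you maintain many custom fields, generating the Windows [Fields] lines from a list avoids skipped field numbers and misspelled data types. The following Python sketch is illustrative only: the function name fields_section is hypothetical, and the only rules it applies are the ones stated in this guide (names of up to 64 characters, data types int, date, or str, and consecutive numbering from Field0).

```python
VALID_TYPES = ("int", "date", "str")
MAX_NAME_LENGTH = 64  # limit on custom field names stated in the guide


def fields_section(fields):
    """Build [Fields] section text from a list of (name, data_type) pairs."""
    lines = ["[Fields]"]
    # enumerate() numbers the fields 0, 1, 2, ... so no field number is skipped.
    for number, (name, data_type) in enumerate(fields):
        if not name or len(name) > MAX_NAME_LENGTH:
            raise ValueError(f"field name must be 1 to {MAX_NAME_LENGTH} characters: {name!r}")
        if data_type not in VALID_TYPES:
            raise ValueError(f"data type must be one of {VALID_TYPES}: {data_type!r}")
        lines.append(f"Field{number}={name},{data_type}")
    return "\n".join(lines)


print(fields_section([("DocumentIdentifier", "int"), ("DocumentType", "str")]))
# [Fields]
# Field0=DocumentIdentifier,int
# Field1=DocumentType,str
```

The output would still need to be pasted into acrocat.ini by hand, and a new index-definition file created afterward, as described in the Windows procedure.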


Naming PDF documents

When you name PDF documents and build indexes for cross-platform document collections, the safest approach is to observe MS-DOS filenaming conventions.

Acrobat Catalog and Acrobat Search use unique document identifiers as well as pathnames to identify indexed documents, so MS-DOS filenames may not be absolutely necessary. However, ambiguities caused when names created for one platform are mapped to names usable on another can slow searches and even prevent documents from being located.


If you are using the Macintosh version of Catalog to build a cross-platform indexed document collection and you don’t want to change long PDF filenames to MS-DOS filenames, check Make Include/Exclude Folders DOS Compatible in the Index group of preferences before you build your index. If you check this preference, you must use MS-DOS filenaming conventions for the folder names. You do not have to use these conventions for the names of the files that the folders contain.

If you are using the Macintosh version with OS/2 LAN Server but want to be sure that the indexed files are searchable on all PCs, either configure LAN Server Macintosh (LSM) to enforce MS-DOS filenaming conventions or index only FAT volumes. (HPFS volumes may contain unretrievable long filenames.)


Do not alternate between using a Windows and a Macintosh version of Catalog when you build or update an index if you are indexing PDF documents with long filenames that will be truncated for Windows use.

Even for documents that will be searched only by Macintosh users, do not use deeply nested folders or pathnames longer than 256 characters.


If you are planning to deliver the document collection and index on an ISO 9660-formatted CD-ROM, you should use ISO 9660 filenames. With the Macintosh version of Catalog, check Log Compatibility Warnings in the Logging group of preferences to be warned of noncompliant filenames.

Avoid using high ANSI characters, such as some non-English characters, in the names of files and folders used for the index or the indexed files. The font used by Catalog does not support character codes 133 through 159.

For information on MS-DOS and ISO 9660 filename conventions, see Naming conventions.
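The naming advice in this topic can be turned into a rough pre-flight check on a list of pathnames. The following Python sketch is illustrative only and makes several assumptions that go beyond this guide: the function name filename_warnings is hypothetical, the MS-DOS test looks only at the 8.3 length pattern (up to eight characters, a period, up to three characters) rather than the full rules in Naming conventions, and character codes 133 through 159 are interpreted as byte values in the Windows-1252 code page.

```python
MAX_PATH_LENGTH = 256  # the guide advises against longer pathnames


def filename_warnings(pathname):
    """Return a list of naming warnings for one pathname (empty list if none)."""
    warnings = []
    if len(pathname) > MAX_PATH_LENGTH:
        warnings.append(f"pathname is longer than {MAX_PATH_LENGTH} characters")
    # Check every folder and file name in the path separately.
    for part in pathname.replace("\\", "/").split("/"):
        if not part:
            continue
        # Assumption: "character codes" means Windows-1252 byte values.
        encoded = part.encode("cp1252", errors="replace")
        if any(133 <= byte <= 159 for byte in encoded):
            warnings.append(f"{part}: uses character codes 133 through 159")
        # Simplified 8.3 test: lengths only, and at most one period.
        base, _, extension = part.partition(".")
        if len(base) > 8 or len(extension) > 3 or "." in extension:
            warnings.append(f"{part}: does not fit the 8.3 pattern")
    return warnings


print(filename_warnings("legal/torts/brief01.pdf"))  # []
```

A name such as quarterly-report.pdf would be reported because its base name is longer than eight characters.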


Building an index

Before you can build an index, you need to ensure enough free disk space to accommodate the index and the temporary files created during the build.

The index files require 10 to 30% of the space required by the documents being indexed, closer to 10% if the documents contain many graphics. The temporary files require 10 to 30% of the space required by the index files.
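The percentages above can be worked through for a given collection. The following Python sketch is a rough planning aid only: the function name estimate_build_space is hypothetical, and the sketch simply applies the 10 to 30% ranges quoted in this topic, in whatever unit you pass in.

```python
def estimate_build_space(documents_size):
    """Return (low, high) space estimates in the same unit as documents_size."""
    # Index files: 10 to 30% of the space used by the documents.
    index_low = documents_size * 10 // 100
    index_high = documents_size * 30 // 100
    # Temporary files: 10 to 30% of the space used by the index files.
    temp_low = index_low * 10 // 100
    temp_high = index_high * 30 // 100
    return {
        "index": (index_low, index_high),
        "temporary": (temp_low, temp_high),
        "total": (index_low + temp_low, index_high + temp_high),
    }


# A 500 MB collection: index 50 to 150 MB, temporary files 5 to 45 MB.
print(estimate_build_space(500)["total"])  # (55, 195)
```

For a collection that is mostly graphics, the low end of each range is the more likely figure, as noted above.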

Building an index

Choosing options for an index

Setting preferences for Acrobat Catalog


Building an index

You have to define an index before Acrobat Catalog can build it. The definition lists the folders containing the documents to be indexed and indicates any changes to the default settings for index options. It should also include an index title and a description of the index. These will be available, and useful, to users of the index.

On the Macintosh, you can simplify index building by dragging and dropping the folders rather than listing them.

To define and build an index

To build an already-defined index

To change an index definition

Drag-and-drop index building (Macintosh)

Stopping a build


To define and build an index

1 Choose Index > New (Windows) or File > New (Macintosh), and give the index a useful title.

2 Provide useful information about the index (up to 250 characters) in the Index Description text box.

3 For each folder that contains the documents to be indexed, click Add in the Include Directories box and use the Select dialog box to locate and select the folder. If an included folder contains a subfolder with PDF files that you do not want to index, click Add in the Exclude Directories box and select the folder. To select a folder:

• In Windows, open the folder by double-clicking the folder name; then click OK.

• On a Macintosh, select the folder by clicking the folder name; then click Select folder name. If you don’t plan to move the index and document collection, you can add folders from multiple servers or disk drives. Before doing so, however, select Allow Indexing On a Separate Drive in the Index group of preferences. (Choose Edit > Preferences to reach the group.)

To change the default index options for this index, click Options and select the index options you want. You can exclude specific words (stopwords) from the index, exclude numbers, remove some of the user’s search options (Case Sensitive, Sounds Like, Word Stemming), or adapt the index to documents created in Acrobat 1.0 or to CD-ROM use.

4 Click Build to display the Save Index File dialog box. The build begins only after you have saved the index definition (next step).


5 Name the index-definition file, select a folder for it, and click OK (Windows) or Save (Macintosh). For the filename, retain the .pdx extension provided. The pathname of the folder should not contain high ANSI characters (such as some foreign characters) or the slash (/) character. Acrobat Catalog begins building the index.

The folder you select for the PDX file will also contain the folder for the nine subfolders containing the index data files being built.

• In Windows, this folder must be on the disk or network server volume where the documents to be indexed are stored.

• On the Macintosh, if you don’t plan to move the index and documents, you can put the folder on a different disk or network server volume from the disk or volume where the documents to be indexed are stored. Before doing so, select Allow Indexing On a Separate Drive.


As Catalog builds the index, it displays messages that report the progress of the build. You can stop the build at any time by clicking Stop. If the message displayed reports an error, see Catalog error messages for help with correcting it. All messages are saved, with date and time stamps, in a log file.


About Catalog log files

Every time Acrobat Catalog builds or updates an index, it displays messages that report on the progress or failure of the build and writes these messages to a log file.

In the log file, each message is date and time stamped. Over time, the file compiles a record of every document that is indexed and records the dates and times when the index is updated. The file is deleted when it reaches a maximum size—1 MB by default.

The log file for an index is created the first time the index is built:

• In Windows, Catalog creates the log file in the same folder as the index-definition (PDX) file and gives it the same name, except that the extension is .log rather than .pdx.


• On the Macintosh, Catalog creates the log file in the Catalog application folder by default. You can use the Logging group of preferences to save the log file in the same folder as the index or in a folder you select. You can also change the name of the log file.
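For scripts that archive or inspect logs, the Windows naming rule above is easy to express in code. The following Python sketch is illustrative only: the function names windows_log_path and log_reached_limit are hypothetical, and the 1,000,000-byte limit is an assumption taken from the 1 MB default mentioned in this topic. It covers only the Windows rule, because the Macintosh location depends on the Logging preferences.

```python
from pathlib import PureWindowsPath

DEFAULT_MAX_LOG_BYTES = 1_000_000  # assumption: the 1 MB default, in bytes


def windows_log_path(pdx_path):
    """Return the log file path Catalog for Windows would use for a PDX file."""
    # Same folder and same name as the PDX file, with .log instead of .pdx.
    return str(PureWindowsPath(pdx_path).with_suffix(".log"))


def log_reached_limit(size_bytes, max_bytes=DEFAULT_MAX_LOG_BYTES):
    """Return True if a log of this size has reached the maximum size."""
    return size_bytes >= max_bytes


print(windows_log_path(r"D:\Legal\leglindx.pdx"))  # D:\Legal\leglindx.log
```

Because the log file is deleted once it reaches the maximum size, a script that needs a long-term record would have to copy the file before that point.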


To build an already-defined index

1 Choose Index > Build to open the Select Index File to Build dialog box.

2 Locate and select the PDX file for the index you want to build, and click OK. Acrobat Catalog builds the index and places it in the selected folder.

Note: On the Macintosh, you can also build an already-defined index by dragging and dropping. For information on building already-defined indexes in batches, see Purging and rebuilding an index.


To change an index definition

1 Choose File > Open (Macintosh) or Index > Open (Windows).

2 Locate and select the PDX file you want to revise, and click OK.

3 Make your changes in the dialog box called Edit Index Definition (Windows) or Index Definition (Macintosh).

4 If the changes are minor, click Build to rebuild the index. If the changes are major (for example, adding or removing support for search options), skip the build step at this time. Purge and rebuild after you have saved the revised definition.

5 Click Save to save the revised definition.

CATALOG Page 31 Wednesday, September 25, 1996 1:14 PM


Drag-and-drop index building (Macintosh)

You can control the details of a drag-and-drop index build by providing the index definition yourself or by altering Drop Folders preferences. Alternatively, you can leave the details to Acrobat Catalog.

To build an index:

Drag a folder containing PDF documents to the Catalog application icon. Or drag multiple folders or an entire disk.

When you release the mouse button, Catalog begins building the index.

• If a folder contains a PDX file, Catalog uses that definition to index the documents in the folder and in any other folders listed in the definition.

• If a folder does not contain a PDX file, Catalog places a new default index (named index.pdx) in the folder and uses it to index the documents in the folder.


To specify a new name for the PDX file and index support folder:

Type a name in the Default Index Name text box of Drop Folders Preferences.

To save the PDX file and index support folder outside of the document folder:

Select Outside Dropped Folder from the Save Index menu of Drop Folders Preferences.

To build an index from the contents of only the folder or folders that you just dropped:

Check Delete Existing Indexes in Drop Folders Preferences.

If you do not select this option, Catalog adds the dropped folder or folders to the Include Directories list of the PDX file, indexes the files that have not yet been indexed, and merges the indexes.


Stopping a build

When a build is in progress, you can stop it at any time. After a build is stopped, Acrobat Catalog is ready to build a new index, update an existing index, or process scheduled builds.

When you stop a build, Catalog maintains the partial results of the build. This way, the next time you update the index, the work that has already been done will be preserved.


To stop a build:

Click Stop in the Catalog window to stop a build. When you click Stop, an error message appears in the Message text box and is written to the log file as follows:

Search Engine Message: E3-0024 (VDK): [specific to error]

This message is normal and indicates only that the build was not completed. If necessary, the partial index can be searched.

To restart a stopped build:

1 Select Schedule from the Index menu.

2 Click Start to restart the scheduled builds. Even if the messages that appear indicate that the build can’t be restarted, give the process some time before stopping it again. It may be successful despite the messages.


Choosing options for an index

Choosing options for an index

Excluding words (stopwords) from an index

Excluding numbers from an index

Choosing not to support search options

Optimizing for CD-ROM

Adding unique document identifiers to 1.0 PDF files


Choosing options for an index

You use the Options dialog box to change Acrobat Catalog defaults for a particular index definition. You can exclude specified terms (stopwords) and numbers, and disable support for Acrobat Search’s Match Case, Sounds Like, and Word Stemming features. If the collection contains PDF files created by version 1.0 of Acrobat PDF Writer or Acrobat Distiller, select the Add IDs to Acrobat 1.0 PDF Files option.

In Windows, the defaults are fixed. You can change them for a particular definition, but not permanently. On the Macintosh, you can change the defaults for most of the options in the Index Defaults group of preferences.


To choose any index option:

1 In the dialog box called New Index Definition or Edit Index Definition (Windows) or Index Definition (Macintosh), click Options. You display the dialog box by choosing New or Open from the Index menu (Windows) or File menu (Macintosh).

2 Make additions or changes in the Options dialog box.

3 Click OK. The changes apply the next time the index is built or updated.


Excluding words (stopwords) from an index

You can exclude (“stop”) up to 500 words from an index. You might exclude articles such as “the” and “a”, conjunctions such as “but” and “or”, and prepositions such as “for” and “by.”

The advantage of excluding stopwords from an index is that it makes the index smaller—typically 10 to 15% smaller.

The disadvantage is that users of the index cannot find phrases that contain the stopwords. In all their searches, they have to work around them. To help them, you should provide a list of the stopwords with the index.


To exclude stopwords from an index:

• To exclude a stopword, type the word in the Word text box of the Options dialog box and click Add. Stopwords can be up to 24 characters long and are case sensitive. (To stop the word “the”, you need to enter both “The” and “the”.)

• To remove a word from the list of stopwords, select a word in the Word To Not Include In Index text box and click Remove.
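Because stopwords are case sensitive, it can help to prepare the list of entries before typing them into the Options dialog box. The following Python sketch is illustrative only: the function name stopword_entries is hypothetical, it generates just the lowercase and capitalized forms used in the example above (all-uppercase forms are not covered), and it enforces the 24-character and 500-word limits stated in this guide.

```python
MAX_STOPWORDS = 500        # limit on stopwords per index stated in the guide
MAX_STOPWORD_LENGTH = 24   # limit on stopword length stated in the guide


def stopword_entries(words):
    """Return the stopword entries to add, with both case forms of each word."""
    entries = []
    for word in words:
        if len(word) > MAX_STOPWORD_LENGTH:
            raise ValueError(f"stopword longer than {MAX_STOPWORD_LENGTH} characters: {word!r}")
        # Stopwords are case sensitive, so "the" and "The" are separate entries.
        for variant in (word.lower(), word.capitalize()):
            if variant not in entries:
                entries.append(variant)
    if len(entries) > MAX_STOPWORDS:
        raise ValueError(f"{len(entries)} entries exceed the limit of {MAX_STOPWORDS}")
    return entries


print(stopword_entries(["the", "but"]))  # ['the', 'The', 'but', 'But']
```

The resulting list can double as the stopword list you provide to users of the index.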


Excluding numbers from an index

To reduce the size of an index, typically by 10 to 20%, you can exclude numbers from it.

In Windows, the default is to include numbers; you need to specify exclusion in the index definition. On a Macintosh, you can change the default in the Index Defaults group of preferences as well as specify exclusion for a particular index.

The disadvantage of excluding numbers from an index is that users of the index cannot find phrases that contain numbers. For this reason, you should inform users of an index when numbers are excluded from it.

To exclude numbers from an index:

Check Do Not Include Numbers in the Options dialog box.


Choosing not to support search options

The three Acrobat Catalog word options (Case Sensitive, Sounds Like, and Word Stemming) support options used with the Search command in Acrobat Exchange and Reader. But they increase the time required for index updates, the time required for searches, and the size of the index. (The Case Sensitive and Sounds Like options increase the size of the index by 5 to 10% apiece, and the Word Stemming option increases it by 10 to 20%.)

You can disable support for any or all of the options. In Windows, the Catalog options are enabled by default and you need to disable them in the index definition. On a Macintosh, you can change option defaults in the Index Defaults group of preferences as well as disable them for a particular index.


• Case Sensitive supports the Match Case option in Acrobat Search.

• Sounds Like supports the Sounds Like option in Acrobat Search. The option expands searches for proper names.

• Word Stemming supports the Word Assistant in Acrobat Search when it previews a search with the Word Stemming option. The Word Stemming option finds words that share a word stem with the search term. (Searching for manage also locates managed and managing, but not manager.) The option works even if it is not supported; only its Word Assistant preview is eliminated.

To change a word option in an index:

To disable support for an Acrobat Search option, deselect it in the Options dialog box. To enable support for an option, select it.


Optimizing for CD-ROM

The Optimize for CD-ROM option arranges index files for the fastest possible access on a CD-ROM.

In addition, the option makes it easier for you to modify Document Info fields or security settings after you have indexed a document. Normally, when a user searches a document that has been modified after it was indexed, a message indicates that the document was changed and the user has to choose whether to use the index nevertheless. When you select the option, the message and choice are bypassed.

To optimize an index for CD-ROM:

Check Optimize for CD-ROM in the Options dialog box.


Adding unique document identifiers to 1.0 PDF files

You may need to add unique document identifiers to PDF documents created with version 1.0 of Acrobat Distiller or PDF Writer and used in cross-platform environments. Version 2 and later of these programs add the identifiers themselves.

The need arises when Macintosh or UNIX filenames are shortened to become DOS filenames and filenaming ambiguities result. Acrobat Search uses the unique identifiers to resolve the ambiguities.

To add Acrobat identifiers to 1.0 PDF files:

Check Add IDs to Acrobat 1.0 PDF Files in the Options dialog box.


Setting preferences for Acrobat Catalog

Setting preferences (Macintosh)

Setting preferences (Windows)

Tips on settings for efficient indexes


Setting preferences (Macintosh)

The Edit > Preferences dialog box has five groups of preferences:

• The Index group has general Acrobat Catalog preferences.

• The Index Defaults group has preferences for options used in index definitions.

• The Logging group has preferences for logging index builds.

• The Drop Folders group has preferences for building an index by dragging and dropping.

• The Custom Fields group has preferences for custom document information fields used with customized versions of Acrobat Exchange.


To change preferences (Macintosh):

1 Choose Edit > Preferences.

2 Click the icon for one of the five groups of preferences in the left panel to access the group. You can also use the Up and Down Arrow keys to select an icon.

3 Change the preference settings by typing a new value in the text box or by clicking an option to select or deselect it.

4 To complete the task, choose one of these options:

• Click OK to accept the changes and close the dialog box.

• Click Default; then click OK to restore the preferences to their default settings and close the dialog box.

• Click Cancel to revert to the previous values and close the dialog box.


Setting preferences (Windows)

You change preferences by editing the acrocat.ini file located in the Windows folder. For a list of Windows preferences and defaults, see The acrocat.ini file.

To change preferences (Windows):

1 Using a text editor such as Notepad, or a word processor, open the acrocat.ini file in the Windows directory. If you use a word processor, open the file as a text file.

2 Edit the preference settings you want to change.

3 Save the file. If you are using a word processor, save the file as a text file.

4 Restart Acrobat Catalog.


The acrocat.ini file

Windows preferences appear in the [Options] section of the acrocat.ini file. Settings that might require changing follow. With each item, the value following the equal sign (=) is the default.

• DocumentWordSections=1. 0=small, 1=medium, and 2=large. Selects which of the three size settings in the file (DocumentWordSectionsSmall, DocumentWordSectionsMedium, or DocumentWordSectionsLarge) is used. The selected setting determines the maximum size (in words and terms) of a document before Acrobat Catalog creates two or more indexes for the document. Consider changing for small- or large-memory machines.

• IndexAvailableGroupSize=1024. The number of PDF files Catalog processes before making a partial index available or before updating the current index with entries for new and changed documents. The larger the number, the faster the index is built or updated and the faster searches with the index are completed. The smaller the number, the more frequently the index is updated with the partial results of the current build and the more current is the information available to Acrobat Exchange users.

• WindowsOnlyFilenames=No. Yes is appropriate only if all searchers use Windows 3.1.

• MemoryPercent=20. If the percentage of memory available at the start of a build drops below this figure, the build stops.

• DocumentWordSectionsSmall=200000

• DocumentWordSectionsMedium=400000

• DocumentWordSectionsLarge=800000

• MaxLogFileSize=1000000 (1024K). When the maximum is reached, the log file is deleted and a new file created.

• GroupSizeForCDROM=4000. Anything above 4000 documents would become unreliable.
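To review the current settings against the defaults listed above, the [Options] section can be read with a script. The following Python sketch is illustrative only: the function names read_options and word_section_limit are hypothetical, the defaults are copied from the list above, and the sketch assumes that the file contains only plain name=value lines that Python's standard configparser module can parse.

```python
import configparser

# Defaults as listed in this topic.
DEFAULTS = {
    "DocumentWordSections": "1",
    "IndexAvailableGroupSize": "1024",
    "WindowsOnlyFilenames": "No",
    "MemoryPercent": "20",
    "DocumentWordSectionsSmall": "200000",
    "DocumentWordSectionsMedium": "400000",
    "DocumentWordSectionsLarge": "800000",
    "MaxLogFileSize": "1000000",
    "GroupSizeForCDROM": "4000",
}


def read_options(ini_text):
    """Return the [Options] settings, falling back to the listed defaults."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep mixed-case setting names as written
    parser.read_string(ini_text)
    options = dict(DEFAULTS)
    if parser.has_section("Options"):
        options.update(parser["Options"])
    return options


def word_section_limit(options):
    """Return the word limit selected by DocumentWordSections (0, 1, or 2)."""
    size_name = {"0": "Small", "1": "Medium", "2": "Large"}[options["DocumentWordSections"]]
    return int(options["DocumentWordSections" + size_name])


options = read_options("[Options]\nIndexAvailableGroupSize=100\n")
print(options["IndexAvailableGroupSize"], word_section_limit(options))  # 100 400000
```

The sketch only reads settings. Changes still have to be made in the file itself, followed by a restart of Acrobat Catalog.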


Tips on settings for efficient indexes

• For small indexes and fast searches, specify the largest possible build-group size, 1024 files, with Index Available in the Index group of preferences (Macintosh) or with IndexAvailableGroupSize in acrocat.ini (Windows).

• To make partial indexes available quickly during large updates, specify a small build-group size (100 or fewer) with Index Available in the Index group of preferences (Macintosh) or with the IndexAvailableGroupSize option in acrocat.ini (Windows). However, note that decreasing this setting slows the update and the execution of search queries.

• For fast updates, use the largest setting for the Document Section Size preference in the Index group of preferences (Macintosh) or for the DocumentWordSections preference in acrocat.ini (Windows). The memory required to process a document is 10 times the number of words in the document (that is, about 10 bytes per word). For example, the largest setting for a computer with 24 megabytes of memory would be 2,400,000 (2.4 million) words.

Note: The Document Section Size/DocumentWordSections setting determines the maximum size of a document before Acrobat Catalog creates two or more indexes for the document.

• For fast updates on a Macintosh, increase the Index Disk Cache Size in the Index group of preferences as much as possible.
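The arithmetic behind the tips above can be sketched in a few lines of Python. This is only an illustration: it assumes the "10 times" rule is measured in bytes and that a megabyte is counted as 1,000,000 bytes (which reproduces the 24-megabyte example), and the function names and the 5,000-file collection are hypothetical.

```python
import math

BYTES_PER_WORD = 10  # rule of thumb from the tip: memory = 10 x words


def max_section_words(memory_bytes):
    """Largest document-section size (in words) that fits in memory."""
    return memory_bytes // BYTES_PER_WORD


def build_group_count(file_count, group_size):
    """Number of build groups, each of which makes a partial index available."""
    return math.ceil(file_count / group_size)


# 24 megabytes, counted here as 24,000,000 bytes.
print(max_section_words(24_000_000))   # 2400000

# A hypothetical 5,000-file collection with a large and a small group size.
print(build_group_count(5000, 1024))   # 5
print(build_group_count(5000, 100))    # 50
```

The smaller group size makes partial results available ten times as often in this example, which is the trade-off against build and search speed described above.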

Distributing and maintaining an index

Providing useful information about the index

Keeping dynamic indexes up to date

Moving a document collection and its index

Deleting an index

Providing useful information about the index

Index users need the following information:

• The kind of documents indexed

• The search options supported

• The person to contact or phone number to call with questions

• Whether numbers or stopwords are excluded from the index, and a list of any stopwords

• The location of any index-description document

The primary index-description document is the index-definition (PDX) file itself. When you define an index, you can put up to 250 characters in the Index Description text box. When index users list available indexes, they can read these descriptions.

Even if you can fit all the necessary information into 250 characters, consider providing a separate index-description document. Such a document could do the following:

• List the folders containing documents included in a LAN-based index, or list the documents included in a CD-ROM-based index. You might also include a brief description of the contents of each folder or document.

• If Document Info field values are assigned to indexed documents, list the values for each document.

You can place index-description documents in the same folders as the indexes they describe. Alternatively, place all the index-description documents in a central location. That way users can easily find descriptions of all the indexes without knowing where the indexes themselves are located.

Keeping dynamic indexes up to date

You can schedule one-time updates, schedule updates at regular intervals, or arrange for updating to go on continuously.

Acrobat Catalog updates are incremental, to minimize updating time and permit searching to go on uninterrupted during updates. However, this technique causes the index to grow with each update, so you need to purge and rebuild the index periodically to reclaim disk space and speed up searches.

Scheduling automatic builds

Purging and rebuilding an index

Tips on updating

Scheduling automatic builds

You can set up Acrobat Catalog to build (or update) an index or a batch of indexes automatically. You can arrange for the build to take place in any of the following ways:

• At a specified interval, such as every hour or every seven days, and at a particular time

• Once only, and immediately

• Continuously

Use the once-only method, rather than the normal way to build immediately, to build several indexes in a single batch.

To schedule an automatic build:

1 Choose Index > Schedule.

2 For each index you want to build, click Add, and select the name of the index-definition (PDX) file. Then click OK (Windows) or Open (Macintosh).

3 Click Every, Once, or Continuously.

4 If you clicked Every, select a measure such as hours for the interval and enter the interval. If you want to delay processing the selected indexes until a specified time, select Starting At and use the up and down arrows to select the time.

5 Click Start. If you selected the Starting At option, Catalog waits for the time you specified before building or updating the indexes. If you selected Once or Continuously, it starts immediately. If you selected Continuously, it updates the indexes in the order they are listed in the Indices to Build list. It continues to update the indexes until you click Stop.
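To picture what an Every schedule with a Starting At time amounts to, here is a small Python sketch that lists the resulting build times. It assumes the first build happens at the Starting At time and later builds follow at fixed intervals; the guide does not state how Catalog computes the times internally, and the build_times function and the sample date are hypothetical.

```python
from datetime import datetime, timedelta


def build_times(starting_at, interval, count):
    """Return the first `count` build times for an Every-style schedule."""
    return [starting_at + i * interval for i in range(count)]


# Hypothetical schedule: every 6 hours, starting at 10:00 p.m.
start = datetime(1996, 9, 1, 22, 0)
for when in build_times(start, timedelta(hours=6), 3):
    print(when.isoformat(sep=" ", timespec="minutes"))
```

Under these assumptions the builds would fall at 22:00, 04:00 the next morning, and 10:00.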

To add a new index to the existing schedule:

1 Create a new index definition.

2 Click Save As.

3 Before you save the PDX file, select Add Index to Schedule.

Purging and rebuilding an index

When you update an index simply by rebuilding it, entries for deleted documents and for the original versions of changed documents remain in the index but are marked as invalid. This incremental updating slightly increases the time required for searches that use the index, and it can greatly increase the disk space required by the index. (For example, if every document indexed has changed since an initial build, the space required for the index is doubled.)

Because these increases accumulate over time, you occasionally need to purge the index before rebuilding it.
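The growth described above can be estimated with a simple model. The Python sketch below assumes that each update adds index entries in proportion to the fraction of the collection that changed, while the old entries stay in place until a purge. That linear model, the function name, and the sample figures are assumptions for illustration, not measurements of Catalog.

```python
def index_size_before_purge(initial_size, changed_fractions):
    """Estimate index size after a series of incremental updates.

    initial_size: size of the index right after a clean build.
    changed_fractions: for each update, the fraction of the collection
    (0.0 to 1.0) that changed since the previous build.
    """
    size = initial_size
    for fraction in changed_fractions:
        # New entries are added; the invalid old entries are not removed.
        size += initial_size * fraction
    return size


# Every document changed once since the initial build: the index doubles.
print(index_size_before_purge(100, [1.0]))               # 200.0

# Three updates, each touching a quarter of the collection.
print(index_size_before_purge(100, [0.25, 0.25, 0.25]))  # 175.0
```

A purge followed by a rebuild returns the index to roughly its initial size in this model.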

You should also purge and rebuild if you change the optional search features supported by an index or change the stopwords list used to build an index. Otherwise, search performance may be slowed or search results distorted.

To purge and rebuild an index:

1 Choose Index > Purge.

2 Locate and select the name of an index-definition (PDX) file for the index you want to purge and rebuild.

3 Click OK (Windows) or Open (Macintosh). Acrobat Catalog purges the existing index. If the index is currently in use, users are given time to complete queries in progress before the purge begins. (The default “time before purge” is 905 seconds, or about 15 minutes.) Users receive an “Index unavailable for searching” message if they attempt to enter a new query. If a message indicating that the purge has failed to complete appears, look up the message in Troubleshooting for help.

4 After the purge completes, choose Index > Build.

5 Locate and select the PDX file for the index, and click OK (Windows) or Open (Macintosh). Catalog rebuilds the index.

Tip: A faster way to purge an index is simply to delete the nine subfolders of the index folder: assists, morgue, parts, pdd, style, temp, topicidx, trans, and work. Do this only if you are sure the index is not in use.
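For administrators who prefer to script this shortcut, a minimal Python sketch follows. The nine subfolder names come from the tip above; the purge_index_folder function is hypothetical, and the sketch does not check whether the index is in use, so that remains your responsibility.

```python
import shutil
from pathlib import Path

# The nine subfolders of the index folder named in the tip above.
INDEX_SUBFOLDERS = (
    "assists", "morgue", "parts", "pdd", "style",
    "temp", "topicidx", "trans", "work",
)


def purge_index_folder(index_folder):
    """Delete the nine index subfolders; return the names actually removed.

    Anything else in the index folder is left untouched.
    """
    removed = []
    for name in INDEX_SUBFOLDERS:
        target = Path(index_folder) / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

Calling purge_index_folder with the path of an index folder removes whichever of the nine subfolders exist and reports their names. As with the manual method, rebuild the index afterward.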

Tips on updating

You must update an index if either of the following occurs:

• Documents are added to or removed from the collection.

• The hierarchy of the indexed folders changes.

Consider updating an index when either of the following occurs:

• Documents in the indexed document collection change.

• A new Document Info field is defined for the documents in the collection, and data values for the new field are added to the documents.

Here are some ways to cut updating time:

• Don’t support the Sounds Like, Case Sensitive, or Word Stemming search options.

• Use stopwords and exclude numbers.

• Install Acrobat Catalog on the computer where the indexed documents are stored; make sure this computer is the fastest available. If the program and documents are on different computers and it is feasible to move the documents temporarily, move them to the Catalog computer for updating and then move them back.

Note: For information on Catalog preferences that affect updating time, see Tips on settings for efficient indexes.

Moving a document collection and its index

You can develop and test an indexed document collection on a local hard drive and then move the finished document collection to a network server or CD-ROM. You can also move a collection from one server to another, for example because the first server is full.

An index definition contains relative paths between the index-definition (PDX) file and the folders containing the indexed documents. You don’t have to rebuild the index after moving the indexed document collection if you have maintained these relative paths. If the PDX file and the folders containing the indexed documents are in the same folder, you can maintain the relative path simply by moving that folder.
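One way to check before a move whether the relative paths will be maintained is sketched below in Python. The guide does not describe how Catalog stores or compares paths, so this only compares the paths from the PDX file's folder to each document folder before and after the move; the function names and the sample paths are hypothetical.

```python
import os.path


def relative_paths(pdx_file, document_folders):
    """Paths from the PDX file's folder to each indexed document folder."""
    base = os.path.dirname(pdx_file)
    return [os.path.relpath(folder, start=base) for folder in document_folders]


def relative_paths_maintained(old_pdx, old_folders, new_pdx, new_folders):
    """True if every document folder keeps its position relative to the PDX file."""
    return relative_paths(old_pdx, old_folders) == relative_paths(new_pdx, new_folders)


# Hypothetical layout: the PDX file and the document folders share one parent
# folder, and that whole folder is moved from a local drive to a server.
old_pdx = "/local/collection/index.pdx"
old_folders = ["/local/collection/manuals", "/local/collection/specs"]
new_pdx = "/server/docs/collection/index.pdx"
new_folders = ["/server/docs/collection/manuals", "/server/docs/collection/specs"]

print(relative_paths_maintained(old_pdx, old_folders, new_pdx, new_folders))
```

If the check reports that the paths are not maintained, plan on creating a new index after the move.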

If the relative path changes, you must create a new index after you move the indexed document collection. However, you can still use the original PDX file. To use the original PDX file, first move the indexed documents. Then copy the PDX file to the folder where you want to create the new index and update the Include and Exclude lists as necessary.

On a Macintosh, if the index resides on a drive or server volume separate from any part of the collection it applies to, moving either the collection or the index will break the index. If you intend to move a document collection either to another network location or onto a CD, create and build the index in the same location as the collection.

Deleting an index

If you need to delete an index entirely rather than just purging it, delete the index-definition (PDX) file, the log file for the index, and the index folder and all of its nested subfolders. Use the normal file-deletion procedures for your operating system.

© 1996 Adobe Systems Incorporated. All rights reserved.

Adobe Acrobat 3.0 Catalog Online Guide

This manual, as well as the software described in it, is furnished under license and may be used or copied only in accordance with the terms of such license. The content of this manual is furnished for informational use only, is subject to change without notice, and should not be construed as a commitment by Adobe Systems Incorporated. Adobe Systems Incorporated assumes no responsibility or liability for any errors or inaccuracies that may appear in this book.

The copyrighted software that accompanies this manual is licensed to the End User for use only in strict accordance with the End User License Agreement, which the Licensee should read carefully before commencing use of the software. Except as permitted by such license, no part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, recording, or otherwise, without the prior written permission of Adobe Systems Incorporated.

Adobe, the Adobe logo, Acrobat, Acrobat Capture, the Acrobat logo, Distiller, Acrobat Exchange, Adobe Type Manager, PostScript, and the tagline “If you can dream it, you can do it” are trademarks of Adobe Systems Incorporated. Microsoft and Windows are registered trademarks and ActiveX and Windows NT are trademarks of Microsoft Corporation in the U.S. and other countries. Apple, Macintosh, Power Macintosh, and QuickTime are registered trademarks and AppleScript and TrueType are trademarks of Apple Computer, Inc. Lotus Notes is a registered trademark of Lotus Development Corporation. Netscape and Netscape Navigator are trademarks of Netscape Communications Corporation. UNIX is a registered trademark in the U.S. and other countries, licensed exclusively through X/Open Company, Ltd. Pentium is a trademark of Intel Corporation. All other products or name brands are trademarks of their respective owners.

This product contains an implementation of the LZW algorithm licensed under U.S. Patent 4,558,302.

This software includes software licensed from Verity, Inc., copyright 1994. All rights reserved. The address of Verity, Inc., is 894 Ross Drive, Sunnyvale, California 94089. Verity® and TOPIC® are registered trademarks of Verity, Inc. in the United States and other countries. English Electronic Thesaurus copyright 1993 by INSO Corporation. Adapted from the Oxford Thesaurus copyright 1991 by Oxford University Press and from Roget's II: The New Thesaurus copyright 1980 by Houghton Mifflin Company. All rights reserved. Reproduction or disassembly of embodied programs and databases prohibited.

1994 This software includes software licensed from RSA Data Security, Inc.

Written and designed at Adobe Systems Incorporated, 345 Park Ave, San Jose, CA 95110-2704.

Adobe Systems Europe Limited, Adobe House, 5 Mid New Cultins, Edinburgh EH11 4DU, Scotland, United Kingdom

Adobe Systems Co., Ltd., Yebisu Garden Place Tower, 4-20-3 Ebisu, Shibuya-ku, Tokyo 150, Japan

For defense agencies: Restricted Rights Legend. Use, reproduction, or disclosure is subject to restrictions set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at 252.227-7013.

For civilian agencies: Restricted Rights Legend. Use, reproduction, or disclosure is subject to restrictions set forth in subparagraphs (a) through (d) of the commercial Computer Software Restricted Rights clause at 52.227-19 and the limitations set forth in Adobe’s standard commercial agreement for this software. Unpublished rights reserved under the copyright laws of the United States.

(9/96)

How to use this online guide

Page back or page forward.

Undoes a change of page or view, or redoes a change (Go Back/Go Forward).

Go to the Contents.

Go to the Index.

Go to the how-to page (this page).

Go to the “parent” of the current topic.

Go to the indicated topic.

Go to the next page of a continued topic.

End of a continued topic.

For instructions on printing this guide, go to the next page.

How to print this online guide

You can print separate topics or the entire guide. Since the pages of the guide have been made small for online viewing, Windows and Macintosh users may prefer to print them two to a page of paper (“two up”).

To print pages two up:

1 Choose File > Print Setup (Windows) or File > Page Setup (Macintosh).

2 Follow the instructions for your platform:

• In Windows, click Options, select 2 up on the Paper tab, click OK to return to the Print Setup dialog box, and click OK again to close it.

• On a Macintosh, choose 2 Up from the Layout menu and click OK.

Note: If you can’t perform step 2, you may not be using an Adobe or PostScript printer driver. If you are and you still can’t perform the step, install the Adobe printer driver on the Acrobat CD-ROM. See the Acrobat Getting Started guide for installation instructions.

3 Choose File > Print.

4 Indicate the page range.

5 Click OK (Windows) or Print (Macintosh).