Configuring the Enterprise Portal Search Infrastructure

This chapter provides an overview of the Enterprise Portal search infrastructure and discusses how to define search indexes, define search index groups, and build search indexes.

Defining Search Indexes

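This section provides an overview of search indexes and discusses how to:

  • Administer search index definitions.

  • Edit a record-based search index definition.

  • Edit keys.

  • Edit a file system search index definition.

  • Edit an HTTP spider search index definition.

  • Define what to include in file system and HTTP search indexes.

  • Define search index security.

  • Define search index filters.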

Understanding Search Indexes in Enterprise Portal

A search index is a collection of files that is used during a search to quickly find documents of interest. You build a search index to enable searching on a given set of documents. The set of files that make up the index is a collection. This collection contains a list of words in the indexed documents, an internal documents table containing document field information, and logical pointers to the actual document files. Most content in Enterprise Portal can be searched after creating indexes.

Search Limitations

Managed Content that has been imported into another feature is not searchable from that feature. You can find it with the Content Management search, but searches in the following features do not return imported Managed Content: Action Items, Calendar Events, and Discussions. Content that has been created directly in these features is searchable, as are attachments that have been added in the feature. In addition, when a website has been added directly to the Calendar feature, the feature indexes the website's metadata but not the website itself.

Pages Used to Define Search Indexes

| Page Name | Object Name | Navigation | Usage |
|---|---|---|---|
| Index Administration | EO_PE_SIDX_SUMMARY | Portal Administration, Search, Administer Indexes | Administer all PeopleSoft Enterprise Portal search indexes. |
| Add Index | EO_PE_ADD_INDEX | Click the Add Index button on the Index Administration page. | Add a record-based, file system spider, or HTTP spider index. |
| Record Indexes | EO_PE_RECD | Click the Edit Properties link for a record-based index on the Index Administration page. | Create and build record-based search indexes. |
| Filesystem Index | EO_PE_FSYS | Click the Edit Properties link for a file system spider index on the Index Administration page. | Create and build file system spider search indexes. |
| HTTP Index | EO_PE_HTTP | Click the Edit Properties link for an HTTP spider index on the Index Administration page. | Create and build HTTP spider search indexes. |
| Edit Key | VEGGIE_SEC | Click the Edit Key link on the Record Indexes page. | Change the results that are returned by the Key returned in search results functionality. We recommend that you retain the <pairs/> value in the Key returned in search results field. |
| What To Index | EO_PE_WHATTOINDEX | Portal Administration, Search, Administer Indexes, What To Index | Define the MIME types and file names you want to include in the search index. This page is available only for file system spider and HTTP spider indexes. |
| Subrecords | EO_PE_RGW_SUBRECRD | Portal Administration, Search, Administer Indexes. Click Edit Properties. Select the Subrecords tab. | Define the subrecords that you want to include in the search index. This page is available only for record-based indexes. |
| Security | EO_PE_SIDXPERM | Portal Administration, Search, Administer Indexes. Click Edit Properties. Select the Security tab. | Define security access for the search index. |
| Filters | EO_PE_SIDX_PKG | Portal Administration, Search, Administer Indexes. Click Edit Properties. Select the Filters tab. | Define application classes to use as filters for the search index. |

Administering Search Index Definitions

Access the Index Administration page.

 

Index

Displays the name of the search index. To select an index, select the check box to the left of the index name. Delivered indexes are unavailable for selection because they should not be altered.

Gateway Type

Displays the type of gateway the search index uses to access its content.

Portal Index: Based on the portal registry.

Record-based Index: Based on records.

HTTP Spider Index: Based on a URL.

Filesystem Spider Index: Based on a file system location.

Edit Properties

Click for a record-based index to access the Record Indexes page, where you can edit index properties.

Click for an HTTP spider index to access the HTTP Index page, where you can edit index properties.

Click for a file system index to access the Filesystem Index page, where you can edit index properties.

Add Index

Click to access the Add Index page, where you can add a new index.

Delete Selected Indexes

Click to delete any indexes you have selected. Deleting an index definition also removes the actual collections stored in the file system, if any have been built.

Schedule Indexes

Click to access the Build Search Indexes page, where you can configure and launch the Build Search Indexes process (EO_PE_IBLDR).

See Also

Building Search Indexes

Editing a Record-Based Search Index Definition

Access the Record Indexes page.

Build Index

Click to run the Build Search Indexes process (EO_PE_IBLDRB) for the selected search index.

System Index

This is a delivered index and is not available for editing.

Index Location

Displays the current location of the index.

By default, the files for an index are located in <PS_HOME>/data/search/<INDEXNAME>/<db name>/<language cd>. However, you can change this location by specifying the search index location property in the application server and Process Scheduler configuration files.

See Enterprise PeopleTools 8.48 PeopleBook: System and Server Administration, “Building and Maintaining Search Indexes,” Specifying the Index Location
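
The default location pattern can be illustrated with a short Python sketch. It only joins the path segments named in the pattern; it assumes that the final segment is the language code, and the function name and sample values are hypothetical rather than part of the product.

```python
from pathlib import PurePosixPath


def default_index_dir(ps_home: str, index_name: str, db_name: str, language_cd: str) -> PurePosixPath:
    """Join the path segments named in the default-location pattern."""
    return PurePosixPath(ps_home) / "data" / "search" / index_name / db_name / language_cd


# Hypothetical values, for illustration only.
print(default_index_dir("/opt/psft", "MY_INDEX", "EPDMO", "ENG"))
```

If the search index location property has been changed in the server configuration files, the actual location differs from what this sketch computes.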

Menu Name

Select the menu name that is associated with the records you want to include in the index.

Market

Select the market that is associated with the records you want to include in the index.

Component

Select the component that is associated with the records you want to include in the index.

Key returned in search results

Displays information that you have entered on the Edit Key page.

This data is used to synthesize the VdkVgwKey, which supports an XML-like syntax enabling you to modify the tag that is returned by Verity. We recommend that you retain the <pairs/> value, which means that the format of the Verity entry key will be FIELDNAME=VALUE.

Edit Key

Click to access the Edit Key page, where you can change the results that are returned by the Key returned in search results functionality. We recommend that you retain the default value delivered.

Parent Data Record

Record

Enter records or views that contain data. Only one record is allowed in a record search index definition. To create a record search index definition that includes multiple records, create a view of multiple records and select the view here.

WHERE clause to append

Enter an SQL WHERE clause that you want to use to fine-tune the search result data. For example, if you are indexing a table of all counties in all states in the United States, but you want only counties in California in this particular index, you could add an SQL WHERE clause of STATE = 'CA'.
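
The effect of an appended WHERE clause can be tried out with a small Python sketch against an in-memory SQLite table. The table name COUNTY_TBL, its columns, and the helper function are hypothetical, and the way the index builder actually assembles its SQL is not described here; the sketch only shows how the clause narrows the rows that would be indexed.

```python
import sqlite3


def build_selection_sql(record: str, where_clause: str = "") -> str:
    """Return a SELECT over the record, with the optional WHERE clause appended."""
    sql = f"SELECT * FROM {record}"
    if where_clause.strip():
        # The clause is trusted administrator input in this sketch.
        sql += f" WHERE {where_clause.strip()}"
    return sql


# Hypothetical data standing in for a table of counties.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COUNTY_TBL (STATE TEXT, COUNTY TEXT)")
conn.executemany(
    "INSERT INTO COUNTY_TBL VALUES (?, ?)",
    [("CA", "Alameda"), ("CA", "Marin"), ("NV", "Washoe")],
)

query = build_selection_sql("COUNTY_TBL", "STATE = 'CA'") + " ORDER BY COUNTY"
rows = conn.execute(query).fetchall()
print(rows)
```

Only the two California rows are selected; the Nevada row is left out of the result.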

Fields

How to zone the index

Field zone. Select to create one zone for each PeopleSoft field on the record. Applications can specify that they want to access that particular zone in their searches.

One zone. Select to put all of the data into one zone. With this option, the index builds more quickly, but the application can't restrict searches to the portions of the index that come from a particular field.

Click here for help with the Field Columns

Displays a page of help text.

Record and Field Name

After you select a value in the Record field, the record name and record fields appear in this grid.

Verity Field

Select to indicate that you want the field to be included in style.ufl and indexed as a Verity field. Verity fields are returned with search results and can be compared numerically.

Generally, PeopleSoft fields that contain metadata about what is being indexed (such as ProductID) should be indexed as Verity fields.

Word Index

Select to indicate that you want the field to be included in the word index. Anything that is not included in the word index cannot be searched for as plain text, although it may still be returned in a Verity field if you have selected the Verity Field option.

In general, PeopleSoft fields that contain a lot of descriptive text, such as description fields, should be included in the word index.

Has attachment

Select to indicate that the field contains binary large object (BLOB) data that will be detached and indexed along with the record. You should not select this option unless the attachment is stored as a BLOB.

Select this option if the selected field contains the URL to an attachment. The option enables you to index attachments that are referenced by URL and to include their stored data in the Verity collection. Refer to the PeopleCode Developer's Guide for a description of file attachments.

The indexer downloads the attachment and indexes it as part of the Verity search collection document. This option is available for selection only if the selected field contains character data. It is not available for selection for numeric fields, as numeric fields cannot contain URLs.

You must use this option with a record that was designed for use with this feature. In the record, each row has a text field that contains a URI or an empty string.

The text must be in one of the following formats:

  • A valid File Transfer Protocol (FTP) URI, including the login and password string, of the form ftp://user:pass@host/path/to/filename.doc.

  • A valid record URI of the form record://RECORDNAME/path/to/file.doc.

  • A string of the form <urlid name="A_URLID"/>/path/to/file.doc.

The third form references an entry in the URL table defined on the URL Maintenance page. If the URL ID that is named in the name attribute is valid, the entire URI is rewritten with the part in brackets replaced by the actual URI.

For example, if A_URLID is equal to ftp://anonymous:[email protected], the entire string in the previous example becomes ftp://anonymous:[email protected]/path/to/file.doc and is treated like any other FTP URI.

Rows of data with empty strings in the URI field are ignored with no error.

If the string is in one of these three valid URI formats and a document can be retrieved at the URI, the document is indexed with the same key as the rest of the row of data and is searchable.
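
The handling of the three formats listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not the indexer's implementation: the URL_TABLE dictionary is a hypothetical stand-in for the entries defined on the URL Maintenance page, the function name is invented, and raising an error for an unknown URL ID or an unrecognized format is an assumption, because that behavior is not described here.

```python
import re
from typing import Dict, Optional

# Hypothetical stand-in for the URL table defined on the URL Maintenance page.
URL_TABLE: Dict[str, str] = {"A_URLID": "ftp://user:pass@host"}

# Matches the third form: <urlid name="SOME_ID"/> followed by the rest of the path.
URLID_PATTERN = re.compile(r'^<urlid name="([^"]+)"/>(.*)$')


def resolve_attachment_uri(text: str, url_table: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the URI to fetch, or None when the field is empty."""
    table = URL_TABLE if url_table is None else url_table
    text = text.strip()
    if not text:
        return None  # rows with an empty URI field are ignored with no error
    match = URLID_PATTERN.match(text)
    if match:
        url_id, rest = match.groups()
        if url_id not in table:
            # Assumption: the behavior for an unknown URL ID is not documented.
            raise ValueError(f"unknown URL ID: {url_id}")
        # Replace the bracketed part with the actual URI from the table.
        return table[url_id] + rest
    if text.startswith(("ftp://", "record://")):
        return text
    # Assumption: anything else is treated as invalid in this sketch.
    raise ValueError(f"unsupported attachment URI: {text!r}")


print(resolve_attachment_uri('<urlid name="A_URLID"/>/path/to/file.doc'))
```

After the rewrite, a urlid string is handled like any other FTP or record URI, which is why the sketch returns a plain URI string in all three cases.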

Append to Verity Command Line

This control is intended for PeopleSoft internal use only, but it can be used by users with adequate Verity knowledge.

See Also

Enterprise PeopleTools 8.48 PeopleBook: System and Server Administration, “Using PeopleTools Utilities,” URL Maintenance

Editing Keys

Access the Edit Key page.

Key returned in search results

Enter information to change the results that are returned by the Key returned in search results functionality. You can enter the following values to derive results:

<pairs/>. Inserts a string of NAME=VALUE;. One such pair is returned for each key of the record.

<row/>. Inserts the record keys in a SQL-like syntax.

<field fieldname='MYFIELD'/>. Inserts the value of MYFIELD, if it exists in the record.

<sql stmt='SQL STATEMENT'/>. Inserts the value that is returned by the SQL statement. The system accepts only the first row that is returned. PeopleSoft does not support SQL statements returning more than one column.
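
As a rough illustration of how the <pairs/> and <field/> values shape the returned key, the following Python sketch expands them against one row of data. It is not the Verity or PeopleTools implementation: the function name, the sample record fields, and the order of the pairs are assumptions, and the <row/> and <sql/> values are deliberately left out.

```python
import re
from typing import Dict, List

# Recognizes <pairs/> and <field fieldname='NAME'/> inside a key template.
TAG_PATTERN = re.compile(r"<pairs/>|<field fieldname='([^']+)'/>")


def render_key(template: str, row: Dict[str, str], key_names: List[str]) -> str:
    """Expand <pairs/> and <field/> tags in the template for one row."""
    pairs = "".join(f"{name}={row[name]};" for name in key_names)

    def expand(match: "re.Match[str]") -> str:
        if match.group(0) == "<pairs/>":
            return pairs  # one NAME=VALUE; pair per record key
        # <field/>: insert the value if the field exists in the record.
        return str(row.get(match.group(1), ""))

    return TAG_PATTERN.sub(expand, template)


# Hypothetical record with two key fields and one descriptive field.
row = {"SETID": "SHARE", "PRODUCT_ID": "1001", "DESCR": "Widget"}
print(render_key("<pairs/>", row, ["SETID", "PRODUCT_ID"]))
```

With the recommended <pairs/> value, each key of the record contributes one NAME=VALUE; pair to the result.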

Test VdkVgwKey (save first)

Click to test the search results returned by the values you entered in the Key returned in search results field.

Before clicking this button, make sure that a record is selected in the Record field on the Record Indexes page.

Editing a File System Search Index Definition

Access the Filesystem Index page.

 

Build Index

Click to run the Build Search Indexes process (EO_PE_IBLDRB) for the selected search index.

System Index

This is a delivered index and is not available for editing.

Index Location

Displays the current location of the index.

By default, the files for an index are located in <PS_HOME>/data/search/<INDEXNAME>/<db name>/<language cd>. However, you can change this location by specifying the search index location property in the application server and Process Scheduler configuration files.

See Enterprise PeopleTools 8.48 PeopleBook: System and Server Administration, “Building and Maintaining Search Indexes,” Specifying the Index Location

Start Location

Specify the network file system path that contains the documents to index. Ensure that the local application server has the proper access to the file systems that you specify.

For Microsoft Windows, this means that the drive mappings must be set up from the application server. For UNIX, this means that the correct network file system (NFS) mappings must be set on the application server.

Remap to URL

Enter the HTTP alias that you want to assign to the file system crawl results.

Append to Verity Command Line

This control is intended for PeopleSoft internal use only, but it can be used by users with adequate Verity knowledge.

Editing an HTTP Spider Search Index Definition

Access the HTTP Index page.

Build Index

Click to run the Build Search Indexes process (EO_PE_IBLDRB) for the selected search index.

System Index

This is a delivered index and is not available for editing.

Index Location

Displays the current location of the index.

By default, the files for an index are located in <PS_HOME>/data/search/<INDEXNAME>/<db name>/<language cd>. However, you can change this location by specifying the search index location property in the application server and Process Scheduler configuration files.

See Enterprise PeopleTools 8.48 PeopleBook: System and Server Administration, “Building and Maintaining Search Indexes,” Specifying the Index Location

Start Location

Enter the URL of the content you want to include in the index. You can include one URL per search index definition. URLs should contain only the characters that RFC 1738 permits; any other character must be encoded. For example, encode a space character as %20, and encode a < as %3c. RFC 1738 describes which characters must be encoded.

See http://www.w3.org/Addressing/rfc1738.txt
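
One way to produce an encoded start location before entering it is the quote function in Python's standard urllib.parse module, as in this minimal sketch. The helper name and the sample URL are hypothetical, and the set of characters left unencoded is a choice made for this example. Python emits uppercase hexadecimal digits (%3C rather than %3c); the two forms are equivalent.

```python
from urllib.parse import quote


def encode_start_location(url: str) -> str:
    """Percent-encode characters that should not appear literally in a URL.

    The scheme, path, and query delimiters, plus existing percent signs,
    are left as they are.
    """
    return quote(url, safe=":/?&=#%")


# Hypothetical URL containing a space and angle brackets.
print(encode_start_location("http://www.example.com/my docs/<draft>.html"))
```

Because the percent sign is treated as safe here, a URL that is already encoded passes through unchanged.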

Stay in Domain

Select to limit indexing to a single domain. For example, suppose that you are indexing http://www.peoplesoft.com. If you select this option and a link points to a site outside the PeopleSoft domain, the indexing ignores the link.

Stay in Host

Select to further limit indexing within a single server. If you select this option, the index contains references to content only on the current web server or host. Links to content on other web servers within the domain are ignored. For example, if you are indexing http://www.peoplesoft.com and you select this option, the index will include documents on http://www.peoplesoft.com, but not on http://www1.peoplesoft.com.
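
The difference between the Stay in Domain and Stay in Host options can be sketched as a link filter in Python. This is a simplified illustration, not the spider's logic: the function names are invented, and treating the last two labels of a host name as its domain is a naive assumption that does not hold for every top-level domain.

```python
from urllib.parse import urlsplit


def naive_domain(host: str) -> str:
    """Assumption: the domain is the last two labels of the host name."""
    return ".".join(host.lower().split(".")[-2:])


def should_follow(start_url: str, link_url: str,
                  stay_in_domain: bool = False, stay_in_host: bool = False) -> bool:
    """Decide whether a link found while indexing start_url should be followed."""
    start_host = (urlsplit(start_url).hostname or "").lower()
    link_host = (urlsplit(link_url).hostname or "").lower()
    if stay_in_host and link_host != start_host:
        return False  # a different server, even if it is in the same domain
    if stay_in_domain and naive_domain(link_host) != naive_domain(start_host):
        return False  # a site outside the starting domain
    return True


start = "http://www.peoplesoft.com"
print(should_follow(start, "http://www1.peoplesoft.com/docs", stay_in_domain=True))
print(should_follow(start, "http://www1.peoplesoft.com/docs", stay_in_host=True))
```

With only Stay in Domain selected, the www1 link is followed; with Stay in Host selected, it is ignored.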

Link Depth

Set the level of detail to which you want to index a certain site. If you enter 1, the indexing starts at the homepage, follows each link on that page, indexes all of the data on the target pages, and then stops. If you enter 2, the indexing follows the links on the target pages and indexes one more level into the website.

As you increase the number, the number of links that the indexing follows increases geometrically. Do not set this value too high, as it can impact performance negatively. You should not need to set this value higher than 10.
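
The effect of the Link Depth value can be seen in a small breadth-first traversal over an in-memory link graph. The sketch below is an illustration only: the site map is hypothetical, no network access takes place, and it assumes that the starting page itself is indexed along with the pages reached from it.

```python
from collections import deque
from typing import Dict, List


def crawl(start: str, links: Dict[str, List[str]], link_depth: int) -> List[str]:
    """Return the pages indexed when links are followed up to link_depth levels."""
    indexed = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == link_depth:
            continue  # depth limit reached: do not follow links on this page
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                indexed.append(target)
                queue.append((target, depth + 1))
    return indexed


# Hypothetical site: each key is a page, each value lists the pages it links to.
SITE = {"home": ["a", "b"], "a": ["c"], "b": ["c", "d"], "c": ["e"]}

print(crawl("home", SITE, 1))
print(crawl("home", SITE, 2))
```

Even in this tiny graph, going from depth 1 to depth 2 adds two more pages; on a real site each extra level can multiply the number of pages visited.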

Proxy Host and Proxy Port

Enter a host and port for the indexing to use. Enter the same settings that you would use in your web browser if you need a proxy to access the internet.

Append to Verity Command Line

This control is intended for PeopleSoft internal use only, but it can be used by users with adequate Verity knowledge.

Defining What to Include in File System and HTTP Search Indexes

Access the What To Index page.

Mime Types (Multipurpose Internet Mail Extension types)

Index all Mime-types

Select to index all MIME types on a website.

Index only these Mime-types

Select to index only certain MIME types. Specify the MIME types to include in the MIME/Types Allowed list box. Use a space to separate multiple MIME types.

Exclude these Mime-types

Select to exclude a set of MIME types. Specify the MIME types to exclude in the MIME/Types Allowed list box. Use a space to separate multiple MIME types.

File Names

Index all filenames

Select to index all file names.

Index only these filenames

Select to index only certain file types. Specify the file types to include in the Pathname Globs List list box. Use a space to separate multiple file types.

Exclude these filenames

Select to exclude certain file types, such as temporary files. Specify the file types to exclude in the Pathname Globs List list box. Use a space to separate multiple file types.

Pathname Globs List

Specify the file name patterns you have chosen to include or exclude. Use an asterisk (*) to match any string of characters and a question mark (?) to match a single character. For example, the string *.doc 19??.excel selects all files that end with the .doc suffix and all files with the .excel suffix whose names start with 19 followed by exactly two characters.
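
The glob patterns can be tried out with Python's standard fnmatch module, which supports the same * and ? wildcards. This sketch is an approximation: the function names and mode labels are invented, and the matching rules that Verity applies (for example, whether patterns are compared with full path names or with file names only) may differ from fnmatch.

```python
from fnmatch import fnmatchcase


def matches_any(filename: str, globs_list: str) -> bool:
    """True when the file name matches any space-separated pattern."""
    return any(fnmatchcase(filename, pattern) for pattern in globs_list.split())


def should_index(filename: str, mode: str, globs_list: str = "") -> bool:
    """Apply one of the three file name options: 'all', 'only', or 'exclude'."""
    if mode == "all":
        return True
    if mode not in ("only", "exclude"):
        raise ValueError(f"unknown mode: {mode}")
    hit = matches_any(filename, globs_list)
    return hit if mode == "only" else not hit


patterns = "*.doc 19??.excel"
for name in ("report.doc", "1999.excel", "2001.excel", "19999.excel"):
    print(name, should_index(name, "only", patterns))
```

The first two names match a pattern; 2001.excel does not start with 19, and 19999.excel has three characters after 19 instead of two.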

Defining Search Index Security

Access the Security page.

Access Type

Public. Select to indicate that you want the search index to be searchable by all users.

Roles. Select to indicate that you want the search index to be searchable only by the roles you define in the Role Name field. The search index will be included in a user's PeopleSoft Enterprise Portal search only if the user is a member of a specified role.
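
The two access types amount to a simple membership test, sketched here in Python. The function name and the role names are hypothetical; the sketch only restates the rule that a role-restricted index is included in a search when the user belongs to at least one of the specified roles.

```python
from typing import Iterable


def index_visible_to(user_roles: Iterable[str], access_type: str,
                     index_roles: Iterable[str] = ()) -> bool:
    """Decide whether a search index is included in a user's search."""
    if access_type == "Public":
        return True  # searchable by all users
    if access_type == "Roles":
        # Included only if the user is a member of a specified role.
        return not set(user_roles).isdisjoint(index_roles)
    raise ValueError(f"unknown access type: {access_type}")


print(index_visible_to(["EMPLOYEE"], "Roles", ["MANAGER", "EMPLOYEE"]))
```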

Defining Search Index Filters

Access the Filters page.

App Class Type (application class type)

Select the application class type you want to use as a filter for the search index. Available values include:

Index Builder Callout. The application class that the index builder uses to extend its processing when it creates the index. At build time, the index builder attempts to call the specified application class to perform any custom processing during the creation of the search collection.

Search Query Filter. The application class used to process and/or filter search results returned from this search collection. The search query filters can be used to post-process raw Verity search results, as well as apply security and prevent certain search results from being returned to certain users.

Package Name

Enter the package name you want to use.

Package Path

Enter the path to the package you want to use.

Application Class Name

Enter the application class name you want to use.

CallOut Type

Lists the SQL object used to select the URLs to be indexed. Enables indexing of the actual website, not just the metadata that lists the website's URL.

Selection SQL

Corresponds to the Selection SQL. The types are predefined to uniquely select the content rows that are to be indexed.

  • Content. Selects unique content IDs.

  • Folder. Selects the content ID/folder ID.

  • Portal. Selects the content ID/folder ID/portal name.

Defining Search Index Groups

This section provides an overview of search index groups and discusses how to use the Define Index Groups component (EO_PE_SIDX_GROUPS) to define a search index group.

Understanding Search Index Groups

A search index group is made up of individual search indexes. PeopleCode developers can use search index group names to reference the named groups from the search logic. Because users can configure which indexes belong to which groups, search index groups can be set up for functional areas such as PeopleSoft Enterprise Portal search and Collaborative Workspaces without requiring PeopleCode changes. Search index groups may also be created and maintained for use with custom development projects on customer sites, if desired.

Pages Used to Define Search Index Groups

| Page Name | Object Name | Navigation | Usage |
|---|---|---|---|
| Search Index Group | EO_PE_SIDX_GROUPS | Portal Administration, Search, Define Index Groups, Search Index Group | Create search index groups, which are made up of individual search indexes. |
| Search Tester | EO_PE_SIDX_GTST | Portal Administration, Search, Define Index Groups, Search Tester | Test the functionality of a search index group, and ensure that the indexes are returning results and that the proper filters are being used. Be sure to build your search indexes before using this page. |

Defining a Search Index Group

Access the Search Index Group page.

Index Group Name

The system displays the name of the index group.

Description

Enter a description of the contents of the search indexes in the search index group.

Max Search Results (maximum search results)

Enter the maximum number of search results you want returned for the search index group. Leaving this field blank is equivalent to entering a maximum of 0. To indicate that you do not want to limit the number of search results, select the Unlimited Results option.

Unlimited Results

Select if you do not want to limit the number of search results returned for the search index group.
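
The interaction between Max Search Results and Unlimited Results can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that a maximum of 0 means that no results are returned unless Unlimited Results is selected, which is how the field descriptions above read.

```python
from typing import List, Sequence


def cap_results(results: Sequence[str], max_results: int = 0,
                unlimited: bool = False) -> List[str]:
    """Limit a result list the way the two group settings describe."""
    if unlimited:
        return list(results)  # Unlimited Results: no cap applied
    # A blank Max Search Results field is treated as 0 in this sketch.
    return list(results)[:max_results]


hits = ["doc1", "doc2", "doc3"]
print(cap_results(hits, max_results=2))
print(cap_results(hits, unlimited=True))
```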

Use Thesaurus

Select if you want to use thesaurus capabilities to derive search results for the search index group.

Note. To use this option you must first create your own Verity thesaurus file.

See Enterprise PeopleTools 8.48 PeopleBook: Verity Locale Configuration Guide v 6.1 for PeopleSoft, “Creating a Custom Thesaurus”

Search Indexes

Index Name

Select a search index you want to include in the search index group.

Description

Enter a description of the search index as it is to be used with the search index group.

Override Default Results Link

Select to indicate that you want to override the results link defined on the Record Indexes page. This option only applies to record-based indexes.

Search Query Filters

Package Name

If you have entered filter data for the index definition, this value will be populated automatically upon saving this page. If you have not entered filter data for the index definition, you can enter the package name you want to use.

Package Path

If you have entered filter data for the index definition, this value will be populated automatically upon saving this page. If you have not entered filter data for the index definition, you can enter the path to the package you want to use.

Application Class Name

If you have entered filter data for the index definition, this value will be populated automatically upon saving this page. If you have not entered filter data for the index definition, you can enter the application class name you want to use.

Testing a Search Index Group

Access the Search Tester page.

Index Group Name

The system displays the name of the index group.

Description

Enter a description of the contents of the search indexes in the search index group.

Max Search Results (maximum search results)

Enter the maximum number of search results you want returned for the search index group. Leaving this field blank is equivalent to entering a maximum of 0. To indicate that you do not want to limit the number of search results, select the Unlimited Results option.

Unlimited Results

Select if you do not want to limit the number of search results returned for the search index group.

Use Thesaurus

Select if you want to use thesaurus capabilities to derive search results for the search index group.

Note. To use this option you must first create your own Verity thesaurus file.

See Enterprise PeopleTools 8.48 PeopleBook: Verity Locale Configuration Guide V5.0 for PeopleSoft, “Creating a Custom Thesaurus”

Search Text

Enter the text for which you want to search in the search index group.

Search

Click to execute the search for the search text in the search index group.

Display All Result Attributes

Select to display all Verity search fields returned from the search, as opposed to only what the filter returned.

Building Search Indexes

This section provides an overview of the Build Search Indexes Application Engine process (EO_PE_IBLDR), portal registry search indexes, and delivered portal search indexes. This section also discusses how to use the Build Search Indexes component (EO_PE_IBLDR_RUN) to run the Build Search Indexes process.

Understanding the Build Search Indexes Process

Use the Build Search Indexes Application Engine process (EO_PE_IBLDR) to build search indexes you have defined in the Administer Indexes component. PeopleSoft Enterprise Portal delivers preconfigured run control IDs for the Build Search Indexes process so that you can conveniently build search indexes for the following portal areas:

Action Items

To build a search index for the Action Items feature, select the PAPP_ACTION_ITEMS run control ID on the Build Search Indexes page.

The following delivered search index definitions will be used to build the search index:

Important! Do not delete or modify these indexes. They are delivered as system data, and cannot be easily restored if deleted.

Calendar Events

To build a search index for the Calendar Events feature, select the PAPP_CALENDAR_EVENTS run control ID on the Build Search Indexes page.

The following delivered search index definitions will be used to build the search index:

Important! Do not delete or modify these indexes. They are delivered as system data, and cannot be easily restored if deleted.

Collaborative Workspaces

To build a search index for the Collaborative Workspaces feature, select the PAPP_COLLABORATIVE_WORKSPACES run control ID on the Build Search Indexes page.

The following delivered search index definitions will be used to build the search index:

Important! Do not delete or modify these indexes. They are delivered as system data, and cannot be easily restored if deleted.

Content Management

To build a search index for the Content Management feature, select the PAPP_CONTENT_MANAGEMENT run control ID on the Build Search Indexes page.

The following delivered search index definitions will be used to build the search index:

Important! Do not delete or modify these indexes. They are delivered as system data, and cannot be easily restored if deleted.

Discussions

To build a search index for the Discussions feature, select the PAPP_DISCUSSION_FORUMS run control ID on the Build Search Indexes page.

The following delivered search index definitions will be used to build the search index:

Important! Do not delete or modify these indexes. They are delivered as system data, and cannot be easily restored if deleted.

Resource Finder

To build a search index for the Resource Finder feature, select the PAPP_RESOURCE_FINDER run control ID on the Build Search Indexes page.

The CPX_EPX_SRCHDB search index definition will be used to build the search index.

Important! Do not delete or modify this index. It is delivered as system data, and cannot be easily restored if deleted.

Portal Registry

To build a search index for all content references in the main portal registry, select the PAPP_PORTAL_REGISTRY run control ID on the Build Search Indexes page and select the Build Portal Search Index option.

Important! Do not delete or modify this index. It is delivered as system data, and cannot be easily restored if deleted.

Sites

To build all search indexes for the main portal registry and for all sites created with Site Wizard, select the PAPP_SITES run control ID on the Build Search Indexes page and select the Build All Site Search Indexes option.

See Also

Enterprise PeopleTools 8.48 PeopleBook: Internet Technology, “Building Registry Search Indexes”

Creating Sites

Pages Used To Build Search Indexes

| Page Name | Object Name | Navigation | Usage |
|---|---|---|---|
| Build Search Indexes | EO_PE_IBLDR_RUN | Portal Administration, Search, Build Search Indexes | Set up and run the Build Search Index process. |

Running the Build Search Index Process

Access the Build Search Indexes page.

 

Run

Click to access the Process Scheduler Request page, where you can run the Build Search Indexes process using the settings you have defined on this page.

Site Index Options

Build All Site Search Indexes

Select to build all search indexes for the current portal registry and for all sites created with Site Wizard.

See Understanding the Build Search Indexes Process.

Portal Index Options

Build Portal Search Index

Select to build a search index for all content references in the current portal registry.

See Understanding the Build Search Indexes Process.

Index Name

Select a search index that you want the process to build.

Gateway Type

Describes the type of index.

Description

Displays a brief summary of the index contents.

Language

Select the language code for which you want to build the search index. This field value defaults to the database's base language.

Include In Log

This option becomes available for selection if you have selected the Create Log File option. Select to include the selected search index in the log file.

Create Log File

Select to create a log file for the process run. Select the Include In Log option for the search indexes you want to be included in the log.

The log is stored in the <PS_HOME>/appserv/prcs/<PRCS_SERVER_NAME>/files directory and contains debugging information about the various stages of the index processing and collection build. Any output from any callouts will also be in this log.