Indexing Data ColdFusion Help

Once a collection exists, you can index data for that collection, and this is where things get interesting. After all, a newly defined collection remains inert and of little use because it is not associated with any data. The indexing process makes that association.

Indexing can be done through the ColdFusion Administrator or by using the CFINDEX tag. The processes are not identical, however, and meet different needs. Indexing with the ColdFusion Administrator can be used to index only documents (HTML files or other document files) and is best suited to collections whose data won’t be updated frequently or require re-indexing. The CFINDEX tag, on the other hand, allows the indexing of query results as well as document files and is well suited to collections that will be updated frequently or that users must update and re-index. You will look at indexing documents first, followed by indexing query results.

Indexing Document Files

The easiest way to index a set of document files is through the ColdFusion Administrator. To do this, open the Verity page in the Administrator, select the collection to index from the list at the top of the page, and click the Index button. This opens the indexing page for that collection.

On this page, you can specify the following information:

File Extensions Enter the extensions, separated by commas, of all file types to be included in the index. The default includes all HTML and ColdFusion template files, but this can be changed to include any file type supported by the Verity system, including:
• HTML
• ColdFusion templates
• Plain text
• Adobe Acrobat
• Adobe FrameMaker MIF Format
• Applix Words
• Corel WordPerfect for Windows and Macintosh
• Lotus Ami Pro
• Lotus Word Pro
• Rich Text Format
• Microsoft Word for Windows, Macintosh, and DOS
• Microsoft Write
• Microsoft Works
• XYWrite
• Corel QuattroPro
• Lotus 1-2-3 for DOS, Windows, and OS/2
• Microsoft Excel
• Corel Presentations
• Lotus Freelance
• Microsoft PowerPoint

Directory Path Indicate the top-level directory to index. To find a directory, click the Browse Server button to open a Select Directory on the Server window, find the desired directory, and click OK. By default, the Recursively Index Subdirectories box is checked. This causes all files matching the specified extensions in the directory and its subdirectories to be indexed. To index only the files in the directory itself, clear this option.

Return URL If the files you are indexing are accessible via a URL, that URL can be specified in this field. Consider an example: The directory d:\html is being indexed. This directory is accessible under the URL http://some.host/. By providing http://some.host/ as the Return URL value, it is possible to quickly build links to found files when searching the index.

Language English is the default language. If the International Languages Search Pack is installed, another language can be chosen.

After this information is provided, click the Update button to index the specified files. Depending on the number of files being indexed and the speed of the server, this process can take some time. After the process is completed, the Verity Collections page is displayed. After an index for a collection has been created, it can be updated whenever the content of the collection changes (documents are added, deleted, or edited) by selecting the collection again and repeating the process.

Using CFINDEX to Index a Collection

As an alternative, you can use the CFINDEX tag to create an index of document files. To do this, the following attributes should be used:

COLLECTION Specifies the name of the collection being indexed.
ACTION Specifies the action to take. Possible values are Update, Delete, Purge, Refresh, and Optimize. Update updates an index, whereas Refresh clears an index and re-creates it. The latter is usually used when indexing for the first time.
TYPE Specifies the type of index being created. Possible values are File to index a specific file, Path to index a set of files in a specific directory that are of the types specified in the EXTENSIONS attribute, and Custom to index query results (as will be discussed later in this chapter).
KEY When indexing a specific file, this attribute specifies the filename. When indexing a directory of files, this attribute specifies the path.
EXTENSIONS Specifies a comma-separated list of file extensions to index when using TYPE=Path.
RECURSE Indicates whether subdirectories should be indexed when using TYPE=Path. Possible values are Yes and No.
EXTERNAL Indicates whether the collection was created outside of the ColdFusion environment using Verity’s native indexing tools. Possible values are Yes and No.

Optionally, you may need to specify an index language using the LANGUAGE attribute and a return URL using the URLPATH attribute. For instance, the following tag creates an index for the collection test, indexes all HTML and ColdFusion documents in the d:\html directory and its subdirectories, and uses the return URL of http://some.host/ for access to those files as URLs:
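A tag matching this description could be written as the following sketch. The extension list (.htm, .html, .cfm) and the choice of ACTION="Refresh" are assumptions drawn from the attribute descriptions above; adjust them to suit the files actually being indexed.

```cfml
<!--- Sketch only: the extension list and the ACTION value are assumed --->
<CFINDEX COLLECTION="test"
         ACTION="Refresh"
         TYPE="Path"
         KEY="d:\html"
         EXTENSIONS=".htm, .html, .cfm"
         RECURSE="Yes"
         URLPATH="http://some.host/">
```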

This tag can be used anytime the index needs to be re-created because the indexed data has changed.
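When only one document has changed, re-creating the whole index may be more work than necessary. As a sketch, the same attributes could instead be combined to update a single file with TYPE=File and ACTION=Update. The filename news.html is hypothetical and used only for illustration.

```cfml
<!--- Sketch: re-index one changed file rather than the whole directory.
      The file news.html is a hypothetical example. --->
<CFINDEX COLLECTION="test"
         ACTION="Update"
         TYPE="File"
         KEY="d:\html\news.html"
         URLPATH="http://some.host/">
```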

Indexing Query Results

A powerful use of the Verity search system in ColdFusion is to index the results of queries. At first, this may seem redundant. After all, queries, especially those against databases using the CFQUERY tag, are searches of the contents of a specific data source. However, databases are optimized to retrieve records based on keys and on exact matches for the contents of fields. Searches that attempt to retrieve records with specific text fields containing one or more words are inefficient. Verity indexes, on the other hand, are optimized for this type of text-based search and retrieval.

By indexing the results of a query, these searches can be performed quickly and efficiently against the Verity index, and after the required record is found, it can be retrieved in full from the database by using an efficient query against a primary key or other specific record identifier.

Consider as an example a database table of people’s names and addresses. One field is a large text field for making notes about the people in the database. Although finding a specific person by their first name, last name, or country of residence is easy enough, attempting to find all the records containing references to a specific word in the notes field is inefficient. The Verity search engine is better suited to the task.

To index the results of a query, you first need to execute a query that retrieves the required data from the table. For instance, in the preceding example, the query would select the primary key, the name, and the notes from the table, where People is the table name, Person_ID is the primary key, and Notes is the field to be indexed. The name is included so that when a search is conducted, the information is available to display in the results without retrieving it from the database again.

To index the data, you use the CFINDEX tag. There is no way to index database or other query results by using the ColdFusion Administrator. These attributes should be used when indexing a query:

COLLECTION Should be set to the name of the collection being indexed.
ACTION Should be set to Update.
TYPE Should be set to Custom.
BODY Should contain a list of column names to index, separated by commas.
KEY Should contain the name of the column containing the primary key for the result set.
TITLE Should contain the column name containing the title for each record.
QUERY Should contain the name of the query to be indexed.

For example, to index the preceding query example, the following CFINDEX tag could be used:
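A sketch of the query and a matching CFINDEX tag follows. The table and column names come from the example above; the query name GetPeople, the data source name PeopleDB, and the collection name people are hypothetical and stand in for whatever names your application uses.

```cfml
<!--- Retrieve the rows to index.
      GetPeople and PeopleDB are hypothetical names. --->
<CFQUERY NAME="GetPeople" DATASOURCE="PeopleDB">
    SELECT Person_ID, Name, Notes
    FROM People
</CFQUERY>

<!--- Index the Notes column of each row, keyed by the primary key
      and titled by the Name column.
      The collection name "people" is assumed. --->
<CFINDEX COLLECTION="people"
         ACTION="Update"
         TYPE="Custom"
         QUERY="GetPeople"
         KEY="Person_ID"
         TITLE="Name"
         BODY="Notes">
```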

Notice the use of TITLE="Name" to indicate that, when searching, the title of each item returned will be taken from the content of the Name column.
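To illustrate the search-then-retrieve pattern described earlier, the following sketch searches the index with the CFSEARCH tag and then fetches the full record for the first match by its primary key. It assumes the hypothetical names people, PeopleDB, NoteMatches, and GetPerson, the sample search word gardening, and a numeric Person_ID column (hence the use of Val).

```cfml
<!--- Search the Verity index rather than the database.
      The collection name and the search word are hypothetical. --->
<CFSEARCH NAME="NoteMatches"
          COLLECTION="people"
          TYPE="Simple"
          CRITERIA="gardening">

<!--- Title holds the Name column, so the result list can be
      displayed without touching the database. --->
<CFOUTPUT QUERY="NoteMatches">
    #NoteMatches.Title# (score: #NoteMatches.Score#)<BR>
</CFOUTPUT>

<!--- Fetch the full record for the first match using the primary key
      stored in the index. Val() assumes Person_ID is numeric. --->
<CFIF NoteMatches.RecordCount GT 0>
    <CFQUERY NAME="GetPerson" DATASOURCE="PeopleDB">
        SELECT Person_ID, Name, Notes
        FROM People
        WHERE Person_ID = #Val(NoteMatches.Key)#
    </CFQUERY>
</CFIF>
```

The text search runs against the index, and the database is asked only for an exact match on its primary key, which is the kind of lookup it handles efficiently.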

Posted on November 16, 2015 in Implementing a Search Engine
