.
contact contact

B.08.20 / Release Notes / Database / Using FTS

Using FTS indexing

 
.
 

This document shows a series of examples how to make use of FTS searches in your application. Please note that most of the examples below are shown using Eloquence Basic code snippets, but the same features are also available to users of the image3k API, for example COBOL programs.


Using basic FTS features

The following examples use the CUSTOMERS dataset of a small DEMO-DB database.
Here is some dbtables info:

                         D A T A   S E T   F I E L D S

Data Set Name    Set  Item Name        Item  Item  Recd   Item  Item  Item
                 Num                   Num   Type  Ofs    Cnt   Len   Role

CUSTOMERS          2  CUSTOMER-NO         5  X6        0     1     6  Srch
                      CUSTOMER-NAME       4  X30       6     1    30
                      CITY                3  X30      36     1    30
                      STATE              19  X30      66     1    30
                      ZIPCODE            23  X6       96     1     6
                      ADDRESS             1  X30     102     1    30
                      SALES-AREA         17  X16     132     1    16
                      TURNOVER           20  P8      148    12    96
                      TURNOVER-PY        22  P8      244     1     8
                      TURNOVER-MTD       21  P8      252     1     8

The basic FTS functionality allows us to create FTS indexes on text fields using dbutil and then perform FTS searches with dbfind calls and retrieve FTS results with dbget calls.

Using a simple FTS index

Our first FTS index is intended to allow keyword searches on customer names, so we create it using the following dbutil commands:
database "demo-db";
upgrade database;
CREATE FTS INDEX in CUSTOMERS on CUSTOMER-NAME;
exit;

Note that we use the upgrade database command above, because our DEMO-DB has been created on a older Eloquence version. The upgrade database command extends the internal database catalog tables for use with FTS indexing. This is a one-time operation and not needed for databases that were created with Eloquence B.08.20 or newer.

After creating the FTS index, we may use the prdbutil command to review FTS related info of our database:

$ prdbutil -l fts demo-db

### database demo-db ###

### fts ###

create fts index in CUSTOMERS                               # refid 16
 on CUSTOMER-NAME
;

The prdbutil utility is helpful for reviewing FTS settings (among other things) because prschema does not display FTS related infos.

Now that the FTS index is created, we can use dbfind calls to perform FTS searches in our programs. The following code snippet shows three different dbfind call examples: one searches for a simple keyword ("design"), one searches for a partial keyword ("fash@") and one uses a boolean expression to search for a combination ("design or fash@"):

! search for a keyword
DBFIND (Db$,"CUSTOMERS",12,S(*),"CUSTOMER-NAME","design")

! search for a partial keyword
DBFIND (Db$,"CUSTOMERS",12,S(*),"CUSTOMER-NAME","fash@")

! use a boolean expression
DBFIND (Db$,"CUSTOMERS",12,S(*),"CUSTOMER-NAME","design or fash@")

Note that all the above dbfind call examples use the FTS specific mode 12. We could also use mode 1 as long as the FTS index names do not conflict with existing key item, search item or index item names. For retrieving the FTS search results, we use dbget mode 25 (forward) or dbget mode 26 (backward):

REPEAT
  DBGET (Db$,"CUSTOMERS",25,S(*),"@",Buffer$,0)
  IF S(1)=0 THEN
    ! display relevant buffer fields
  END IF
UNTIL S(1)<>0

Note that we could also use dbget mode 5 or dbget mode 6 if there is no concurrent chained read active.

Using a composite FTS index

Creating FTS indexes is not limited to text fields as a whole. We can also select field substrings for indexing or combine multiple fields or substrings into a composite index field (similar to what was already possible with regular index items before FTS was introduced).

Here is a dbutil example that combines the first two characters from the SALES-AREA and the first four characters from the ZIPCODE to form a location code index called CUST-LOC:

database "demo-db";
CREATE FTS INDEX CUST-LOC in CUSTOMERS
  on SALES-AREA:2,ZIPCODE:4 option NP;
exit;

Note that the above FTS index definition also uses the "no parse" option NP to change the indexing behavior to not parse for words, i.e. spaces or non-alphanumeric characters are indexes "as is" (instead of being interpreted as keyword-delimiters).

Searching above FTS index for customers from the Western sales area excluding ZIP code area 4xxxx would look like this:

DBFIND (Db$,"CUSTOMERS",12,S(*),"CUST-LOC","WE@ and not WE4@")

Using FTS search refinements

You may be wondering why we would want to use a composite FTS index like the above instead of simply implementing it using a regular index item (which was already available before Eloquence 8.20)...

The benefit of creating it as an FTS index is that this allows use to use it in "refined searches", where a sequence of dbfind calls specifies a combination of FTS search criteria.

The following dbfind sequence, for example, would search for customers from the desired sales areas and zipcode region that also have they keywords design or fash@ in their customer name:

DBFIND (Db$,"CUSTOMERS",12,S(*),"CUST-LOC","(WE@ and not WE4@) or SO@")
DBFIND (Db$,"CUSTOMERS",12,S(*),"CUSTOMER-NAME","AND (design or fash@)")

Note the leading AND operator in the second dbfind call above. This results in the FTS search refinement.

If the result of such a refined search does not match our expectations, we may even use dbfind mode 20 to undo the most recent search refinement and repeat it with different search criteria:

DBFIND (Db$,"CUSTOMERS",12,S(*),"CUST-LOC","(WE@ and not WE4@) or SO@")
DBFIND (Db$,"CUSTOMERS",12,S(*),"CUSTOMER-NAME","AND (design or fash@)")

! undo previous refinement and retry with new criteria
DBFIND (Db$,"CUSTOMERS",20,S(*),"CUSTOMER-NAME","")
DBFIND (Db$,"CUSTOMERS",12,S(*),"CUSTOMER-NAME","AND inc@")

Note that the above undo-and-retry is typically only used in interactive programs and based on user input.

Using a grouped FTS index

By grouping FTS index fields, we can perform FTS searches across multiple fields without having to specify each one individually.

For example, it may be convenient to search a customer by keywords from the address, city or state fields:

database "demo-db";
CREATE FTS INDEX in CUSTOMERS on ADDRESS option GROUP=1;
CREATE FTS INDEX in CUSTOMERS on CITY option GROUP=1;
CREATE FTS INDEX in CUSTOMERS on STATE option GROUP=1;
exit;

The above grouped index allows to search by city or state in a single dbfind call instead of using a sequence of dbfind calls for a refined search (but would also qualify entries with a street name like California Road, for example):

! grouped index allows
DBFIND (Db$,"CUSTOMERS",12,S(*),"ADDRESS","chicago or california")

! instead of having to refine
DBFIND (Db$,"CUSTOMERS",12,S(*),"CITY","chicago")
DBFIND (Db$,"CUSTOMERS",12,S(*),"STATE","OR california")

Using advanced FTS features

The following examples use additional datasets (PRODUCTS, ORDER-DETAILS and VENDORS) from the DEMO-DB database.

                         D A T A   S E T   F I E L D S

Data Set Name    Set  Item Name        Item  Item  Recd   Item  Item  Item
                 Num                   Num   Type  Ofs    Cnt   Len   Role

PRODUCTS           1  PRODUCT-NO         15  X6        0     1     6  Srch
                      PRODUCT-NAME       14  X16       6     1    16
                      PRICE              12  P4       22     1     4
                      PRODUCT-LINE       13  X2       26     1     2
                      QUANTITY           16  I2       28     1     2
 . . .

ORDER-DETAILS      6  ORDER-NO           10  X8        0     1     8  Srch
                      PRODUCT-NO         15  X6        8     1     6  Srch
                      QUANTITY           16  I2       14     1     2
                      SHIPMENT-DATE      18  X4       16     1     4

VENDORS            7  PRODUCT-NO         15  X6        0     1     6
                      VENDOR-NAME        24  X20       6     1    20

The advanced FTS functionality allows us to also create FTS indexes on numeric or array items, as well as using soundex or fuzzy searches on text fields. Beyond that it also allows to create linked or aggregated indexes, which permit FTS searches using criteria across multiple related datasets. Furthermore it also allows to use column values from an FTS search result as selection criteria for a subsequent FTS search on another dataset.

Using an FTS index on a numeric item

By creating an FTS index on a numeric item, we enable searching for item values or item ranges. We can also include the numeric item in more complex FTS search conditions using boolean expressions or refined searches (with a sequence of dbfind calls).

Here is an example that creates an FTS index on the PRICE item in the PRODUCTS dataset:

database "demo-db";
CREATE FTS INDEX in PRODUCTS on PRICE;  # item type P4 (packed decimal)
exit;

The above index may be used to search using values, relational expressions or ranges. Here are a few examples:

! search for a value
DBFIND (Db$,"PRODUCTS",12,S(*),"PRICE","123")

! search using a relational expression
DBFIND (Db$,"PRODUCTS",12,S(*),"PRICE",">=100")

! search for a range
DBFIND (Db$,"PRODUCTS",12,S(*),"PRICE",">=100<=500")

! alternative range syntax
DBFIND (Db$,"PRODUCTS",12,S(*),"PRICE","100:500")

Note that a range search is more efficient than a boolean expression combining two relational expressions:

! less efficient than a range search
DBFIND (Db$,"PRODUCTS",12,S(*),"PRICE",">=100 AND <=500")

Using an FTS index on an array item

It is also possible to create an FTS index on a numeric array item (to apply search criteria to each array element):

database "demo-db";
CREATE FTS INDEX in CUSTOMERS on TURNOVER;  # (array of 12 packed decimals)
exit;

This results in FTS searches qualifying entries where one or more array element satisfies the search criteria:

! searches individual array elements
DBFIND (Db$,"CUSTOMERS",12,S(*),"TURNOVER",">=5000"

Using linked or aggregated FTS indexes

Linked or aggregated FTS indexes may be used on detail sets and their associated master sets. They enable FTS searches using criteria across the corresponding datasets. For example, we may qualify (and retrieve) master set entries using search criteria on the associated detail set. Or we may combine criteria on detail set and master set within a refined search (with dbfind sequences).

Linked FTS indexes on a detail set keep track of individual entries as well as their corresponding master entries. Aggregated FTS indexes on a detail set, on the other hand, only keep track of the corresponding master entries. This is much more efficient but it also exhibits a different search behavior regarding the detail set: an FTS search on a linked index is able to retrieve individual detail set entries matching the search criteria, whereas an FTS search on an aggregated index returns all chain entries of those chains where one or more members satisfy the search criteria. It depends on the individual use case, which FTS index type is preferred.

Here is an example that creates both types of FTS indexes, linked and aggregated, on the SHIPMENT-DATE in the details set ORDER-DETAILS. It uses the search item PRODUCT-NO to link to the corresponding master set PRODUCTS.

database "demo-db";

# linked index
CREATE FTS INDEX ship-date in ORDER-DETAILS
  on SHIPMENT-DATE option path=PRODUCT-NO;

# aggregated index
CREATE FTS INDEX ship-date-agg in ORDER-DETAILS
  on SHIPMENT-DATE option path=PRODUCT-NO,agg;

exit;

Here are a few examples for using the linked index: the first one retrieves detail set entries matching a given shipment date range; the second one searches for products in the master set that satisfy a given price range and also have one or more shipments during the above date range; the third one searches for individual detail set entries for products in the given price range that shipped during the given date range.

! search detail entries for given date range
DBFIND (Db$,"ORDER-DETAILS",12,S(*),"SHIP-DATE","0620:0625")

! search master entries for given shipment date and price range
DBFIND (Db$,"ORDER-DETAILS",12,S(*),"SHIP-DATE","0620:0625")
DBFIND (Db$,"PRODUCTS",12,S(*),"PRICE","AND 100:500")

! search detail entries for given price and shipment date range
DBFIND (Db$,"PRODUCTS",12,S(*),"PRICE","100:500")
DBFIND (Db$,"ORDER-DETAILS",12,S(*),"SHIP-DATE","AND 0620:0625")

Here are a few examples for using the aggregated index: the first one retrieves detail set chains with one or more members matching a given shipment date range; the second one searches for products in the master set that satisfy a given price range and also have one or more shipments during the above date range (this search behaves like the same search using the linked index); the third one searches for detail set chains for products in the given price range where one or more shipments occured during the given date range.

! search detail CHAINS for given date range
DBFIND (Db$,"ORDER-DETAILS",12,S(*),"SHIP-DATE-AGG","0620:0625")

! search master entries for given shipment date and price range
DBFIND (Db$,"ORDER-DETAILS",12,S(*),"SHIP-DATE-AGG","0620:0625")
DBFIND (Db$,"PRODUCTS",12,S(*),"PRICE","AND 100:500")

! search detail CHAINS for given price and shipment date range
DBFIND (Db$,"PRODUCTS",12,S(*),"PRICE","100:500")
DBFIND (Db$,"ORDER-DETAILS",12,S(*),"SHIP-DATE-AGG","AND 0620:0625")

Using FTS result values to search another dataset

While linked and aggregated FTS indexes allow searches using criteria acrcoss multiple datasets, they do require that the corresponding datasets are related through an image path. Another feature of the advanced FTS functionality does not have this prerequisite, but still may help to perform searches across multiple datasets...

The "multi-find" operator ("&") allows to use column values of an FTS result set as criteria for a subsequent FTS search on a totally unrelated data set (from an image path viewpoint). In our DEMO-DB database this could be the standalone VENDORS detail set, which has PRODUCT-NO and VENDOR-NAME items, but has no image path to the PRODUCTS master set.

Here we create two additional FTS indexes on the VENDORS dataset:

database "demo-db";
CREATE FTS INDEX in VENDORS on PRODUCT-NO;
CREATE FTS INDEX in VENDORS on VENDOR-NAME;
exit;

These FTS indexes allow us to perform an FTS search on the PRODUCTS dataset and use values from the search results as selection criteria for an FTS search on the VENDORS dataset, i.e. find products in a given price range that shipped in a given date range and look up the corresponding vendors (optionally refining by excluding a specific vendor name keyword):

! search products for given shipment date and price range
DBFIND (Db$,"ORDER-DETAILS",12,S(*),"SHIP-DATE-AGG","0620:0625")
DBFIND (Db$,"PRODUCTS",12,S(*),"PRICE","AND 100:500")

! use the PRODUCT-NO in result set to look up corresponding vendors
DBFIND (Db$,"VENDORS",12,S(*),"PRODUCT-NO","&PRODUCT-NO")

! and optionally refine the resulting vendors list by vendor name
DBFIND (Db$,"VENDORS",12,S(*),"VENDOR-NAME","AND NOT acme")

Using soundex or fuzzy FTS searches

By using the soundex option, you can cause the keyword indexing to not only record the keywords in its internal keyword dictionary but also keep track of corresponding soundex values. This allows to perform phonetic searches using the soundex operator if exact spelling of a keyword is not known.

Here is an example of the dbutil command using the "soundex" option SX:

database "demo-db";
CREATE FTS INDEX in CUSTOMERS on CUSTOMER-NAME option SX;
exit;

The dbfind call could then optionally use the soundex search operator (as an alternative to searching for an exact match):

! perform soundex phonetic search
DBFIND (Db$,"CUSTOMERS",12,S(*),"CUSTOMER-NAME","dessign!")

If the database does not have an FTS index with soundex option, we may still use the "fuzzy search" operator for finding results where we are not sure about the exact spelling. Note, however, that the fuzzy search operation is far less efficient than a regular keyword search or soundex search, because it cannot look up the search word in the keyword dictionary but has to perform the fuzzy matching on all keywords in the dictionary (which is typically still more efficient than scanning the whole dataset).

! perform fuzzy search
DBFIND (Db$,"CUSTOMERS",12,S(*),"CUSTOMER-NAME","dessing!")

! perform another fuzzy search
DBFIND (Db$,"CUSTOMERS",12,S(*),"CUSTOMER-NAME","desing!"

COBOL example using FTS via the image3k API

The following COBOL source code snippets resemble the above example "Using a simple FTS index" from an image3k API perspective. Please see the discussion of the above example for general comments beyond the source code.

       . . .

       WORKING-STORAGE SECTION.

       01  DB-STATUS.
           05  STATUS-1               PIC S9(4) COMP.
           05  ENTRY-LENGTH           PIC S9(4) COMP.
           05  RECORD-NUMBER          PIC S9(9) COMP.
           05  CHAIN-COUNT            PIC S9(9) COMP.
           05  BACK-POINTER           PIC S9(9) COMP.
           05  FORWARD-POINTER        PIC S9(9) COMP.

       01  DB-MODE-12                 PIC S9(4) COMP VALUE 12.
       01  DB-MODE-25                 PIC S9(4) COMP VALUE 25.

       01  customers-set              PIC X(16) VALUE "CUSTOMERS;".

       01  SEARCH-ITEM                PIC X(80).
       01  SEARCH-EXPR                PIC X(80).

       . . .

       PROCEDURE DIVISION.

       . . .

      *  search customers using FTS index on CUSTOMER-NAME

           MOVE "CUSTOMER-NAME;" TO SEARCH-ITEM
           MOVE "design or fash@;" TO SEARCH-EXPR

           CALL "DBFIND" USING
             DB-NAME, customers-set, DB-MODE-12, DB-STATUS,
             SEARCH-ITEM, SEARCH-EXPR

           IF STATUS-1  = 0 THEN

      *  retrieve results from above FTS search ...

             MOVE "@;" TO DB-LIST

             PERFORM WITH TEST AFTER UNTIL STATUS-1  NOT = 0

               CALL "DBGET" USING
                 DB-NAME, customers-set, DB-MODE-25, DB-STATUS,
                 DB-LIST, customers-buf, dummy

               IF STATUS-1  = 0 THEN
      *          ... display appropriate fields from customers-buf
               END-IF

             END-PERFORM

           END-IF

       . . .

Note that the above dbfind call example uses the FTS specific mode 12. We could also use mode 1 as long as the FTS index names do not conflict with existing key item, search item or index item names.

For retrieving the FTS search results, we use dbget mode 25 (forward) or dbget mode 26 (backward), but we could also use dbget mode 5 or dbget mode 6 if there is no concurrent chained read active.


 
 
.
 
 
  Privacy | Webmaster | Terms of use | Impressum Revision:  2013-08-05  
  Copyright © 2013 Marxmeier Software AG