Eloquence B.07.10 Release Notes

Eloquence B.07.10 Release Notes - Database

Overview / Auditing database changes

Auditing database changes

The database was enhanced to support auditing database changes. Committed transactions may be logged and associated with application processes and batch jobs.

To enable auditing of database changes multiple components of the Eloquence database were enhanced:

The database client library was enhanced to collect and upload application process information upon connecting to the database server. In addition application specific information (such as current batch job) may be specified using the EQ_AUDIT_INFO environment variable.
The database server was modified to collect audit information from applications and associate it with commited transactions. The previous content of modified and deleted records are also saved in the forward-log file in addition to information on record structure to allow analysis of changes even after the database structure was changed.
The new fwaudit utility may be used to extract audit information from the forward-log file into a documented and a more compact file format, suitable for long term archival.
In addition this utility provides a filter expression engine to analyze the audit information contained in forward-log or binary audit files.

Server configuration changes

To enable recording audit information, forward-logging must be configured. In addition, the new EnableAudit config item in the section [ForwardLog] must be set to a nonzero value.

For example:

 [ForwardLog]
 EnableAudit = 1

Setting EnableAudit causes additional audit information to be recorded in the server forward-log file. This includes the following:

session sign-on and sign-off data containing the user name and client process information
mapping of internal file id numbers to database, data set names
record structure, including item names and types
before-image data to track record update and delete operations
user-defined 'memo' information (currently unused)

Enabling auditing increases disk space requirements for forward-log files. The amount of additional disk space is application specific and mostly depends on the amount of before-image data. Enabling auditing should have no noticeable performance implications besides writing additional information to the forward-log files.

If only audit information is wanted but no forward-log is needed for recovery purposes, forward-logging may be configured to only record such data relevant to auditing. This option significantly reduces disk space requirements for the forward-log files. However, the forward-log files created in this mode cannot be used with the dbrecover utility to perform a rollforward recovery of the database environment. The dbrecover utility rejects the audit-only forward-log files as invalid.

To enable the audit-only forward-log mode, the new AuditOnly config item in the section [ForwardLog] must be set to a nonzero value.

For example:

 [ForwardLog]
 AuditOnly = 1

It is not required to configure the [ForwardLog] EnableAudit item in addition.

Specifying application specific audit information

The database client library uploads some process information when connecting to the database server. In addition the environment variable EQ_AUDIT_INFO may be used to specify application specific information.

For example:

 export EQ_AUDIT_INFO="Month-end"

Currently, the process specific information includes the following items:

os - operating system
ip - ip address
user - OS name of the user
login - database login name
uid - numeric user id (HP-UX, Linux)
pid - process id
pname - command line
info - additional application information (EQ_AUDIT_INFO)

The fwaudit utility

The fwaudit utility may be used to extract audit information from the forward-log files. The audit information may be converted into the Eloquence binary audit file format and/or output as text.

The Eloquence binary audit file format is publicly documented.

 usage: fwaudit [options] file [...]
 options:
  -help       - show usage (this list)
  -o filename - write binary audit output to file (- for stdout)
  -c comment  - write clear-text comment into binary audit file
  -e expr     - specify filter expression
  -f filename - read filter expression from file
  -r          - print clear-text report
  -i number   - print values of the first 'number' items
  -I itemlist - print values of specified items
  -v          - print progress info / report details

The fwaudit utility processes one or multiple forward-log files that are specified on the command line. In addition, it can read and process its own binary audit file format, so it is typically used to create Eloquence audit files as well as to later analyze them.

If multiple forward-log files are specified, fwaudit automatically recognizes their volume generation and sequence numbers and sorts them appropriately so that they are processed in the correct order. This is done by reading the file headers, therefore this works even if the file names do not indicate the generation and sequence.

The fwaudit utility should be supplied with the complete set of forward-log files which belong to one or more volume generations (for details, please refer to the section about forward-log files and volume generations below).

Although possible, it is not recommended to mix forward-log and binary audit files on the command line.

Options:

The -o option may be used to specify the name of an output file. If present, the audit information is written to this file encoded in the Eloquence binary audit format. A - as the output file name specifies the standard output.

If an output file is created, the -c option may be used to write an user-defined comment to the beginning of the file, for example to include a label that may later be helpful to associate the file's creator.

The -e and -f options allow to specify one or more filter expressions to qualify the information which is output. This affects both the information written to an audit file as well as any clear-text output. The filter expression syntax is documented in the next section.

The -r option activates the clear-text reporting to analyze the contents of forward-log or binary audit files. It may be combined with the -i, -I and -v options to specify the level of detail:

If the -r option is specified alone (that is, neither -i nor -I nor -v are specified), the DBPUT, DBUPDATE or DBDELETE actions are printed indicating the data set, record number, timestamp and session number they belong to. Whenever a new session number occurs, the associated session sign-on record is printed which indicates the session's client process information.
The -i option specifies the number of item values which should be printed for each action (DBPUT, DBUPDATE or DBDELETE) in addition. For example, -i3 specifies that in addition to the -r output the first three item values should be printed.
The -I option allows to specify a list of item names which values should be printed for each action in addition. For example, -I 'custno, matchcode, name' specifies that in addition to the -r output the values of the CUSTNO, MATCHCODE and NAME items should be printed if they are part of the particular data set.
If a single array element should be printed, specify the 1-based element index in square brackets. For example, the first element in the TURNOVER array is specified as TURNOVER[1]. To specify that all elements of an array should be printed, use the array item name, for example TURNOVER.
For DBUPDATE, if -i or -I is specified modified item values are always printed regardless whether they are covered by the -i or -I specification. For example, if you specify -i1 but the value of the third item was modified its old and new values are printed.
If the -r option is not specified, the -i and -I options have no effect.
If the -v option is specified alone (that is, -r is not specified) the progress is indicated by printing the name of each processed file. Specifying -v twice (-vv) prints additional file header detail information.
If both the -v and -r options are specified, the values of all items are printed for each action unless the items to be printed are qualified with -i or -I.

If neither -o nor -r nor -v are specified, fwaudit just silently checks the forward-log or binary audit files for consistency and prints any error or warning messages to the standard error output.

For example:

 fwaudit -o 20050705.audit /data/fwlog/*

 fwaudit -rvv -i'albumcode,composername' \
         -e'timestamp >= 2005-07-05 14:09:06' /data/fwlog/*

Example output:

processing file: 222-1.log
 version: 1.01
 byte order: 4321
 character set: hp-roman8 (0)
 volume release: 11
 volume set id: 0x395e413e
 volume generation: 222
 file sequence: 1

SIGN-ON session:7
 protocol{7}os{HPUX}ip{127.0.0.1}user{mike}
 uid{102}pid{12283}pname{../putdel 5}

DBPUT MUSIC.SELECTIONS (#489) recno:148065 session:7
 timestamp: 2005-07-05 14:09:06
  ALBUMCODE             : 17358
  COMPOSERNAME          : "Ludwig Beethoven"

DBDELETE MUSIC.SELECTIONS (#489) recno:13343 session:7
 timestamp: 2005-07-05 14:09:06
  ALBUMCODE             : 17358
  COMPOSERNAME          : "Edvard Grieg"

DBUPDATE MUSIC.SELECTIONS (#489) recno:148344 session:7
 timestamp: 2005-07-05 14:09:06
  ALBUMCODE             : 27625
  COMPOSERNAME          : "Amadeus Mozart"
 -COMMENT               : "Comments"
 +COMMENT               : "Comments Updated"

Usage of filter expressions

A filter expression qualifies the information that is processed by the fwaudit utility. It affects both the data written to a binary audit file and the clear-text reporting output. This may be used for various purposes, for example:

to extract a subset of the original information from either forward-log or binary audit files and write it to a new binary audit file
to create a clear-text report about specific actions in a database from either forward-log or binary audit files

Filter expressions follow a syntax that allows to specify different categories:

filter by timestamp
Each action contains a timestamp which may be evaluated in a filter expression. The syntax is:
TIMESTAMP relational-op date-time-value
TIMESTAMP BETWEEN date-time-value AND date-time-value
- relational-op:
  < <= = <> >= >
- date-time-value:
    YYYY-MM-DD HH:MM:SS
    MM/DD/YYYY HH:MM:SS
    DD.MM.YYYY HH:MM:SS
In date-time-value, the time part is optional (defaults to 00:00:00).
If using BETWEEN, the AND keyword may be omitted.
For example:
```
 TIMESTAMP >= 2005-07-26
 timestamp between 07/26/2005 10:00 and 07/26/2005 12:00
 TIMESTAMP BETWEEN 26.07.2005 15:31:40 26.07.2005 15:31:50
```
filter by the type of database operation
You can use the keywords DBPUT, DBUPDATE and DBDELETE to qualify the type of database operations.
filter by database and data set
You can use an expression of the form database.dataset to qualify a specific database and data set combination.
The rightmost dot character (in case the database name contains dot characters) is used to separate the database name from the data set name. The comparison is not case sensitive. Wildcards may be used (see notes below).
For example:
```
 DB.CUSTOMERS
 db.*
 *.customers
 db.*[0-9]
```
filter by record number
An action's record number may be used in a filter expression. The syntax is:
RECNO relational-op integral-value
RECNO BETWEEN integral-value AND integral-value
- relational-op:
  < <= = <> >= >
- integral-value:
  a positive integral number
If using BETWEEN, the AND keyword may be omitted.
For example:
```
 RECNO <> 1000
 recno between 100 and 150
 RECNO BETWEEN 2500 2600
```

filter by process specific information

Each action contains a reference to the session it belongs to. This session data may be used in a filter expression. The syntax is:

session-item relational-op {text}
session-item BETWEEN {text} AND {text}

session-item:

`os`	-	operating system
`ip`	-	ip address
`user`	-	OS name of the user
`login`	-	database login name
`uid`	-	numeric user id (HP-UX, Linux)
`pid`	-	process id
`pname`	-	command line
`info`	-	additional application information

relational-op:
< <= = <> >= >
text:
text value to compare with

The comparison is not case sensitive. Wildcards may be used if either the = or the <> operator is applied (see notes below).
If using BETWEEN, the AND keyword may be omitted.

For example:

 login = {public}
 pname = {query*}
 UID BETWEEN {110} AND {120}

filter by item value
An item value may be evaluated in a filter expression. The syntax is:
item-name relational-op value
item-name BETWEEN value AND value
- item-name:
  item name (case insensitive)
- relational-op:
  < <= = <> >= >
- value:
  text for X, U and B items (in single or double quotes)
  numeric for I, K, E, P and Z items
In a DBUPDATE operation, the item value is compared with both the before-image and after-image data. If this is not desired, the item name may be prefixed with - or + to specify that only the before-image (-) or after-image (+) data should be used.
To compare the value of an array element, specify the 1-based element index in square brackets. For example, the first element in the TURNOVER array is specified as TURNOVER[1].
A text comparison is case sensitive. Wildcards may be used if either the = or the <> operator is applied (see notes below).
If using BETWEEN, the AND keyword may be omitted.
For example:
```
 NAME = "*M?LLER"
 orderdate >= 20050101
 +price between 12.5 12.8
 -turnover[2] < 5000
```

Notes:

Keywords are not case sensitive, so that TIMESTAMP, DBPUT and RECNO are equivalent to timestamp, dbput and recno.
Wildcards include * (matches any number of characters), ? (matches a single character) and character classes enclosed in square brackets (for example: [abc] [A-Z] [A-Za-z0-9_]).
If an item name equals a keyword (for example TIMESTAMP) it must be enclosed in square brackets, for example:
```
 [TIMESTAMP] = 25001
 [+TIMESTAMP] > 88957
```
The keywords are: NOT AND OR BETWEEN DBPUT DBUPDATE DBDELETE RECNO TIMESTAMP

Filter expressions may be combined with AND, OR and NOT. The order of precedence is NOT -> AND -> OR (that is, NOT has the highest and OR has the lowest precedence). To change the precedence, sub-expressions may be put into parantheses.

The examples below illustrate the use of combined filter expressions.

Find out who created a specific customer record:

 dbput and *.customers and custno="090667"

Find out who altered the price of a specific item below a certain value (note that with +price only the after-image value is evaluated):

 dbupdate and db.items and itemcode="77901" and +price<1.5

Report all modifications or deletions of customer records during the last month (note that parantheses must be used to change the AND -> OR order of precedence):

 (dbupdate or dbdelete) and *.customers
 and timestamp between 06/01/2005 07/01/2005

Report all database modifications of a certain user in a specified period except those which affected the STATISTICS or LOGBOOK data sets (note that parantheses must be used to change the NOT -> OR order of precedence):

 login={jdoe} and timestamp between 2005-07-27 2005-07-28
 and not (*.statistics or *.logbook)

Specifying multiple filter expressions with the -e command line option effectively combines them with AND, so that for example:

 -e dbput -e '*.customers' -e 'custno="090667"'

becomes:

 -e '(dbput) AND (*.customers) AND (custno="090667")'

With the -f command line option, a filter expression may be read from a text file, for example:

 # Extract all June 2005 database modifications
 timestamp between 06/01/2005 07/01/2005
 
 # Exclude anything affecting
 # the STATISTICS or LOGBOOK data sets
 and not (*.statistics or *.logbook)

As illustrated, everything after a # character until the next line break is considered a comment.

Forward-log files and volume generations

A volume generation begins with either

the start of the eloqdb6 database server or
the start of an on-line backup (dbctl backup start) or
the execution of the dbctl forwardlog restart command.

A volume generation ends with either the next shutdown of the eloqdb6 or the next dbctl backup start or dbctl forwardlog restart command.

If forward-logging is configured to use automatically managed files, the %N token in the configured file name is replaced with the volume generation number followed by the sequence number of a file within one generation.

A file with a sequence number higher than 1 might reference information contained in a previous file of the same generation. Therefore it is important that all files of a generation are processed, otherwise information may be missing in the resulting audit file, in which case fwaudit issues a warning message.

Integrating fwaudit into the on-line backup procedure

A possible way to integrate the fwaudit utility into a regular backup procedure on HP-UX or Linux could be:

Create a new temporary subdirectory and move all existing forward-log files into this directory.
Note that this includes the current forward-log file which the eloqdb6 will continue to write although it has been moved to a different directory. This works because the file's inode does not change if the file is moved within the same file system.
Start a new on-line backup with the dbctl backup start command. This starts a new volume generation and creates a new forward-log file in the configured forward-log directory.
Now all forward-log files referring to the previously successful backup are located in the temporary subdirectory and the new forward-log file referring to the yet-to-be-done current backup is located in the configured forward-log directory.
Perform the backup. When finished, stop the on-line backup mode with the dbctl backup stop command.
If the backup has been successful, the previous forward-log files located in the temporary subdirectory are now obsolete.
They are now used for a last time with the fwaudit utility which extracts their audit information to a new audit file (see example below).
After that, the temporary subdirectory along with all contained files may be removed and the new audit file should be archived.
If the backup failed (e.g. if a tape device error occurred), move the previous forward-log files from the temporary subdirectory back to their original location so that, based on the previously successful backup, the forward-log is still complete.

For example:

 # Change to the directory containing the forward-log files
 cd directory-containing-fwlog-files

 # Create new audit file name from current date/time
 audit_fname="fwaudit-`date '+%Y%m%d-%H%M%S'`"

 # Create a comment to be written into the audit file
 comment="Created by $LOGNAME at `date '+%Y-%m-%d %H:%M:%S'`"

 # Process all files in this directory
 /opt/eloquence6/bin/fwaudit -o "$audit_fname" -c "$comment" *

(Please note that "directory-containing-fwlog-files" must be replaced with the real location of the directory containing the forward-log files.)

Audit file format

The audit file format is publicly documented to be used by interested third parties. It is expected to be stable. An example program to read the audit file format is available in C source code upon request (please contact support@marxmeier.com).

Overview

An audit file starts with a file header containing a signature, a version number and the byte order and character set encoding used with the file.

Audit data is organized in records, each prefixed with a tag which describes the type of the record and its size.

Record types are:

comment
This is a general-purpose record for any kind of informational comment text. The fwaudit -c option writes a comment record to the beginning of the audit file. Also, during fwaudit processing warning and informational messages are written as comment records for later reference.
session sign-on
A 32bit session number is associated with information about the application, such as operating system, IP address, login and process information.
session sign-off
This marks the end of a session's lifetime and could be used to free the data associated with a session.
node schema
Internally, Eloquence manages its data storage in 'node' units which are equivalent to files. Each data set has an unique node number. The node schema records associate a node number with the corresponding data set name and record structure.
IMAGE op
This contains the IMAGE operations DBUPDATE, DBPUT and DBDELETE including the session number, the node number, a timestamp and the record number. Depending on the type of operation, before- image data (DBUPDATE, DBDELETE) and/or after-image data (DBUPDATE, DBPUT) are included as well.

File header

The audit file header has the following structure:

 - 10 byte                 : signature 'ELOQ.AUDIT'
 - 5 byte                  : file version, currently '01.00'
 - 1 byte                  : NUL (0x00) termination byte
 - 2 byte (unsigned 16bit) : byte order (*1)
 - 2 byte (unsigned 16bit) : character set (*2)

 (*1) Byte order encoding:
     - 4321 (0x10e1) : big endian
     - 1234 (0x04d2) : little endian

 (*2) Character set encoding:
     - 0 : hp-roman8
     - 1 : iso-8859-1

Record tag

Each record starts with the following information:

 - 1 byte (unsigned 8bit)  : record type (*3)
 - 4 byte (unsigned 32bit) : size of record

 (*3) Type encoding:
     - '1' (49 / 0x31) : comment
     - '2' (50 / 0x32) : session sign-on
     - '3' (51 / 0x33) : session sign-off
     - '4' (52 / 0x34) : node schema
     - '5' (53 / 0x35) : IMAGE op

Please note:

The size field specifies the size of the information following the record tag and does not include the size of the record tag (5 bytes).
For upwards compatibility, unknown record types should be skipped (which is possible because their size is known).

Comment record

A comment record consists entirely of the comment text. There is no termination character.

Session sign-on record

Format of session sign-on record:

 fixed part
 * 4 byte (unsigned 32bit) : session number

 variable part
 * 2 byte (unsigned 16bit) : number of entries
 * each entry consists of:
   - 2 byte (unsigned 16bit) : size of data
   - [size of data] bytes    : data (text)

The sign-on data uses the following encoding:

 item1{value1}item2{value2}...

The item name is followed by the associated value enclosed in curly braces. A brace or backslash character that is part of the item value is escaped by a backslash character.

The following items are currently defined:

For example:

 SIGN-ON session:3
  protocol{7}os{HPUX}ip{127.0.0.1}user{mike}login{public}
  uid{102}pid{12282}pname{../putdel 5}info{benchmark}

Session sign-off record

Format of session sign-off record:

 - 4 byte (unsigned 32bit) : session number

For example:

 SIGN-OFF session:3

Node schema record

Format of node schema record:

 fixed part
 * 4 byte (unsigned 32bit) : node number
 * 2 byte (unsigned 16bit) : length of data set name
 * 2 byte (unsigned 16bit) : size of IMAGE data record
 * 2 byte (unsigned 16bit) : number of items in data set
 * 2 byte (unsigned 16bit) : reserved for future use

 variable part
 * [length of data set name] bytes : data set name
 * for each item in data set:
   - 1 byte (unsigned 8bit)      : length of item name
   - [length of item name] bytes : item name
   - 1 byte (unsigned 8bit)      : item data type
   - 2 bytes (unsigned 16bit)    : number of members (1..n)
   - 2 bytes (unsigned 16bit)    : size per member in bytes
   - 4 bytes (unsigned 32bit)    : format code

For example:

SCHEMA MUSIC.COMPOSERS (#485) record size: 48 bytes
 'COMPOSERNAME' type:X count:1 size:16 fmt:0x0
 'BIRTH' type:X count:1 size:16 fmt:0x0
 'DEATH' type:X count:1 size:16 fmt:0x0

SCHEMA MUSIC.ALBUMS (#483) record size: 102 bytes
 'ALBUMCODE' type:I count:1 size:4 fmt:0x0
 'ALBUMTITLE' type:X count:1 size:40 fmt:0x0
 'MEDIUM' type:X count:1 size:2 fmt:0x0
 'ALBUMCOST' type:I count:1 size:4 fmt:0x0
 'RECORDINGCO' type:X count:1 size:16 fmt:0x0
 'DATERECORDED' type:X count:1 size:16 fmt:0x0
 'MFGCODE' type:X count:1 size:20 fmt:0x0

IMAGE op record

Format of IMAGE op record:

 fixed part
 * 4 byte (unsigned 32bit) : session number
 * 4 byte (unsigned 32bit) : node number
 * 4 byte (unsigned 32bit) : timestamp (seconds since 1970-01-01 UTC)
 * 4 byte (unsigned 32bit) : record number
 * 1 byte (unsigned 8bit)  : operation code (*4)
 * 1 byte (unsigned 8bit)  : nonzero if before-image data is present
 * 1 byte (unsigned 8bit)  : nonzero if after-image data is present
 * 1 byte (unsigned 8bit)  : reserved for future use

 variable part
 * if before-image data is present:
   - [size of IMAGE data record (*5)] bytes : before-image data
 * if after-image data is present:
   - [size of IMAGE data record (*5)] bytes : after-image data

 (*4) Operation encoding:
     - '1' (49d / 0x31) : DBUPDATE
     - '2' (50d / 0x32) : DBPUT
     - '3' (51d / 0x33) : DBDELETE

 (*5) The 'size of IMAGE data record' should be obtained from
      the associated node schema record.

For example:

DBPUT MUSIC.COMPOSERS (#485) recno:1 session:2
 timestamp: 2005-07-05 14:06:40
 after-image: 48 bytes
  000: 4c 75 64 77 69 67 20 42 65 65 74 68 6f 76 65 6e |Ludwig Beethoven|
  010: 31 37 37 30 00 00 00 00 00 00 00 00 00 00 00 00 |1770............|
  020: 31 38 32 37 00 00 00 00 00 00 00 00 00 00 00 00 |1827............|

DBDELETE MUSIC.SELECTIONS (#489) recno:1 session:3
 timestamp: 2005-07-05 14:06:54
 before-image: 74 bytes
  000: 00 00 03 8e 53 45 4c 20 4e 41 4d 45 20 00 4c 75 |....SEL NAME .Lu|
  010: 64 77 69 67 20 42 65 65 74 68 6f 76 65 6e 00 00 |dwig Beethoven..|
  020: 07 76 50 65 72 66 6f 72 6d 65 72 73 20 00 00 00 |.vPerformers ...|
  030: 00 00 00 00 00 00 43 6f 6d 6d 65 6e 74 73 20 20 |......Comments  |
  040: 00 00 00 00 00 00 00 00 00 00                   |..........      |

DBUPDATE MUSIC.SELECTIONS (#489) recno:28 session:3
 timestamp: 2005-07-05 14:06:54
 before-image: 74 bytes
  000: 00 00 b2 46 53 45 4c 20 4e 41 4d 45 20 00 4c 75 |...FSEL NAME .Lu|
  010: 64 77 69 67 20 42 65 65 74 68 6f 76 65 6e 00 00 |dwig Beethoven..|
  020: b6 2e 50 65 72 66 6f 72 6d 65 72 73 20 00 00 00 |..Performers ...|
  030: 00 00 00 00 00 00 43 6f 6d 6d 65 6e 74 73 20 20 |......Comments  |
  040: 00 00 00 00 00 00 00 00 00 00                   |..........      |
 after-image: 74 bytes
  000: 00 00 b2 46 53 45 4c 20 4e 41 4d 45 20 00 4c 75 |...FSEL NAME .Lu|
  010: 64 77 69 67 20 42 65 65 74 68 6f 76 65 6e 00 00 |dwig Beethoven..|
  020: b6 2e 50 65 72 66 6f 72 6d 65 72 73 20 00 00 00 |..Performers ...|
  030: 00 00 00 00 00 00 43 6f 6d 6d 65 6e 74 73 20 55 |......Comments U|

os	-	operating system
ip	-	ip address
user	-	OS name of the user
login	-	database login name
uid	-	numeric user id (HP-UX, Linux)
pid	-	process id
pname	-	command line
info	-	additional application information (EQ_AUDIT_INFO)