com.billpringle.utils.wrputils
Class WrpValidateCsv

java.lang.Object
  extended by com.billpringle.utils.wrputils.WrpValidateCsv

public class WrpValidateCsv
extends java.lang.Object

This class implements an Application Specific Language (ASL) that can be used to validate a CSV file.

Validation consists of examining the individual fields within a CSV file, applying various rules to determine if the content of the CSV file is syntactically and possibly semantically correct. The validation routines can generate a series of messages that can be used to communicate the validation results. Typically these messages are only generated in the case of errors or warnings, but they could also be used for other purposes.

Examples of validation rules are testing the number of fields in the CSV file against a minimum or maximum value, testing the length of a given field against a minimum or maximum value, etc. The language also provides the ability to test the pattern of the field values, such as all numeric, or following a more specific format. For example, a US Zip Code could be tested using the patterns Z5 or Z5-9999, where the first pattern indicates a numeric string of at least five digits, with leading zeros supplied if needed, while the second pattern could be used to test for the nine-digit Zip Code.

Commands have the following components, which are optional unless noted otherwise:

Sequence number
This is a required parameter, containing the unique line number of the command. When the rules are retrieved, they are sorted by sequence number.
Label
This is an optional name for the statement, which can be used as a GoTo target.
InVar
This is the input variable for the command. If present, the command will perform an operation on this variable, and possibly store the result in the OutVar variable.
Cmd
This is a required parameter, and contains the name of the command to be executed.
Each command has the potential to "fire", usually indicating an exception. For example, the minlen command fires if the field length is less than the specified value. When a command fires, a message is displayed (if present), the status of the record is set to Result (if present), and if GoTo Destination is present, it is executed next.
Arg1
This is an optional literal argument for the command. The type of this literal is determined by its content: if it is a numeric string, then it is marked as an Integer or Float, depending on the presence of a period; otherwise it is marked as a String. To force a numeric string to be considered a string, enclose it in double quotes (e.g., "9").
There should be only one literal entry for any given value. If more than one command uses the same literal value, each command should be pointing to the same literal value.
Arg2
This is an optional literal argument for those commands that require more than one argument. See Arg1 for more details. If the command only uses one literal argument, then Arg1 will be used.
OutVar
This is an optional variable name. If the command returns a value, it will be stored in OutVar. The type of this variable depends on the command.
Result
This is an optional argument that specifies the status of the record. Valid values are:

Typically, this field will be left blank or set to BAD, indicating an error was encountered. The record status is affected only if the specified status is worse than the current value. (For example, if the current status is "GOOD", and a command fires with a status of "BAD", then the status of the record will be changed from "GOOD" to "BAD". The status would not be changed from "BAD" to "GOOD".)

Message
This is an optional component containing an error message for the command. If the command fires, this message is displayed.
This string can contain variables of the form $[name], where name is the name of the variable to be displayed.
The string can also be of the form {{litval}}, where litval is a named literal value. (See the method litVal.)
GoTo Destination
This is an optional component that is only used if the command fires. It contains the label of the statement to be executed next when the command fires. If the command doesn't fire (or the destination isn't present), then the next statement is executed.
Comment
This is an optional component that can be used to explain the purpose of the command, etc.
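
The typing rule described for Arg1 can be illustrated with a small sketch. This is not the class's own code: the class name LiteralTypeSketch, the enum LitType, and the method typeOf are hypothetical, and the handling of signs or exponents (which the text above does not mention) is simply left out.

```java
/** Hypothetical helper that classifies a literal the way the Arg1 text describes. */
class LiteralTypeSketch {
    enum LitType { INTEGER, FLOAT, STRING }

    /** Returns the inferred type of a literal exactly as it is written in a rule. */
    static LitType typeOf(String raw) {
        // A literal enclosed in double quotes is always a string, even "9".
        if (raw.length() >= 2 && raw.startsWith("\"") && raw.endsWith("\"")) {
            return LitType.STRING;
        }
        // Digits only: an integer. Signs are not described in the source, so they are not handled.
        if (raw.matches("\\d+")) {
            return LitType.INTEGER;
        }
        // Digits with a single period: a float.
        if (raw.matches("\\d*\\.\\d+") || raw.matches("\\d+\\.\\d*")) {
            return LitType.FLOAT;
        }
        // Anything else is treated as a string.
        return LitType.STRING;
    }

    public static void main(String[] args) {
        for (String lit : new String[] {"9", "9.5", "\"9\"", "PA"}) {
            System.out.println(lit + " -> " + typeOf(lit));
        }
    }
}
```

Under these assumptions, 9 is an Integer, 9.5 is a Float, and both "9" (quoted) and PA are Strings.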

The following commands are supported. All string comparisons are case sensitive.

Command | Syntax | Description
Control commands
NOP | nop | A do-nothing command that never fires.
ALWAYS | always | This command always fires, so any message is displayed, and if a GoTo Destination is present, it will be the next command to be executed.
DONE | done | This command terminates processing normally. The status of the record will be set to the current status value set by Result.
ABORT | abort | This command terminates with an error condition. It is typically used for logic errors, and issues a Message.
STAT | stat arg1 | This command fires if the current status of the record matches arg1.
WRITE | write | This command causes a normalized form of the record to be written to the previously specified output file.
Data definition commands
MAP | map arg1 outvar | Map field number arg1 to the variable outvar. Each time a new record is read, this field is copied to this variable. The argument arg1 can also be NFLDS, indicating the number of fields in the input line, or REC for the entire input line.
DECLARE | invar declare arg1 [arg2] | Declare variable invar with data type arg1. All variables must be declared, but the declarations don't need to appear before they are referenced. If arg2 is present, the initial value for the variable is set to arg2.
Assignment commands
COPY | invar copy outvar | Copy the contents of the variable invar into the variable outvar. Both invar and outvar must be the same data type.
SET | set arg1 outvar | Copy the value of the literal arg1 to the variable outvar. The data types of arg1 and outvar must be the same.
String manipulation and validation commands
ASCII | invar ascii | This command fires if any non-ASCII characters are found in the string variable invar.
NODATA | nodata arg1 arg2 | This command fires if all of the fields in the specified range are empty or contain only whitespace. The data types of arg1 and arg2 must be integer.
MATCH | invar match arg1 | This command fires if the string value of invar is the same as the string literal arg1.
STARTS | invar starts arg1 | This command fires if the string in invar begins with the literal string arg1.
ENDS | invar ends arg1 | This command fires if the string in invar ends with the literal string arg1.
CONTAINS | invar contains arg1 [outvar] | This command fires if the string in invar contains the literal string arg1. If outvar is present, the index of the literal into the string invar is stored in the numeric variable outvar.
PATT | invar patt arg1 outvar | This command applies the pattern indicated by arg1 to the string variable invar and places the resulting string in outvar.
SUBSTR | invar substr arg1 [arg2] outvar | This command extracts the substring starting at location arg1 (where zero is the start of the string). If arg2 is present, the substring terminates at that location (the character at that position is not included in the output string). If arg2 is not present, then the rest of the string is included.
APPEND | invar append outvar | The string variable invar is appended to the current value of outvar.
REPLACE | invar replace arg1 arg2 outvar | Replace arg1 with arg2 in invar, storing the results in outvar.
TRIM | invar trim outvar | Trim leading and trailing spaces in invar and store the results in outvar.
TRIMALL | trimall arg1 | If arg1 is "y" or "t", then all fields will be trimmed; otherwise none of them will.
NORM | invar norm arg1 outvar | This command normalizes the string value of invar using the pattern indicated by arg1 and places the result in outvar. If the string value doesn't match the normalization pattern, the command will fire.
Numeric test and manipulation commands
MINLEN | invar minlen arg1 | This command fires if the length of the string in invar is less than the numeric literal arg1.
MAXLEN | invar maxlen arg1 | This command fires if the length of the string in invar is greater than the numeric literal arg1.
EQ | invar eq arg1 | This command fires if the numeric value of invar equals the numeric literal arg1.
LT | invar lt arg1 | This command fires if the numeric value of invar is less than the numeric literal arg1.
GT | invar gt arg1 | This command fires if the numeric value of invar is greater than the numeric literal arg1.
LEN | invar len outvar | This command stores the length of the string variable invar into the numeric variable outvar.
INTVAL | invar intval outvar | This command converts the numeric string in invar into the numeric variable outvar. The command fires if the string is non-numeric.
Miscellaneous commands
ZIP | invar zip outvar | This command validates the city/state/zip combination in invar, in the form "city|state|zip", with the fields separated by vertical bars. The command fires if the zipcode combination is not valid, and outvar will contain an error message.
ZIPCITY | invar zipcity outvar | This command finds the cities that are defined for the zipcode in invar and places them in outvar.
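
How firing, Result, Message and GoTo Destination interact can be sketched with a toy interpreter. This is only an illustration under stated assumptions, not the class's implementation: the names FireSketch, Rule, Status, fires and decode are hypothetical, only minlen, maxlen, always, nop and done are modelled, and the ordering GOOD, WARN, FUZZY, BAD from best to worst is assumed rather than taken from the real REC_ constants.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical mini-interpreter; not the WrpValidateCsv implementation. */
class FireSketch {
    // Assumed order from best to worst; the real REC_* values are not shown in the source.
    enum Status { GOOD, WARN, FUZZY, BAD }

    /** Cut-down stand-in for a validation rule. */
    static class Rule {
        final int seq;
        final String label, inVar, cmd;
        final int arg1;
        final Status result;
        final String message, goTo;

        Rule(int seq, String label, String inVar, String cmd, int arg1,
             Status result, String message, String goTo) {
            this.seq = seq; this.label = label; this.inVar = inVar; this.cmd = cmd;
            this.arg1 = arg1; this.result = result; this.message = message; this.goTo = goTo;
        }
    }

    final List<String> messages = new ArrayList<>();
    Status status = Status.GOOD;

    Status run(List<Rule> rules, Map<String, String> vars) {
        List<Rule> sorted = new ArrayList<>(rules);
        sorted.sort(Comparator.comparingInt((Rule r) -> r.seq)); // execute in sequence order
        Map<String, Integer> labels = new HashMap<>();
        for (int i = 0; i < sorted.size(); i++) {
            if (sorted.get(i).label != null) {
                labels.put(sorted.get(i).label, i);
            }
        }
        int ic = 0; // instruction counter: index of the next rule to execute
        while (ic < sorted.size()) {
            Rule r = sorted.get(ic++);
            if (r.cmd.equals("done")) {
                break;
            }
            if (!fires(r, vars)) {
                continue; // did not fire: fall through to the next statement
            }
            if (r.message != null) {
                messages.add(decode(r.message, vars));
            }
            if (r.result != null && r.result.compareTo(status) > 0) {
                status = r.result; // the status can only get worse
            }
            Integer dest = (r.goTo == null) ? null : labels.get(r.goTo);
            if (dest != null) {
                ic = dest; // jump to the labelled statement
            }
        }
        return status;
    }

    static boolean fires(Rule r, Map<String, String> vars) {
        switch (r.cmd) {
            case "always": return true;
            case "minlen": return vars.get(r.inVar).length() < r.arg1;
            case "maxlen": return vars.get(r.inVar).length() > r.arg1;
            default:       return false; // nop, and anything this sketch does not model
        }
    }

    /** Replaces each $[name] with the value of the variable called name. */
    static String decode(String msg, Map<String, String> vars) {
        String out = msg;
        for (Map.Entry<String, String> e : vars.entrySet()) {
            out = out.replace("$[" + e.getKey() + "]", e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        vars.put("zip", "123");
        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule(20, "end", null, "done", 0, null, null, null));
        rules.add(new Rule(10, null, "zip", "minlen", 5, Status.BAD, "Zip $[zip] is too short", "end"));
        FireSketch fs = new FireSketch();
        System.out.println(fs.run(rules, vars) + " " + fs.messages);
    }
}
```

In the demo, the minlen rule fires because "123" is shorter than five characters, so its message is recorded, the status is raised to BAD, and execution jumps to the statement labelled end.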

The following patterns are supported for the patt command:

p
This is the "normal" pattern, which replaces characters in the input string with the following characters in the output string. The lengths of the input and output strings will be identical.
For example, the pattern for the input string "(123)555-1234" will be "-999-999-9999".
9
Only the numeric digits in the input string are replaced by a "9". Non-numeric characters are not copied.
For example, "(123)555-1234" will be changed to "9999999999".
n
This pattern will copy only numeric digits to the output string, dropping any non-numeric character.
For example, the pattern "n" on "(123)555-1234" will result in "1235551234".
zN
Similar to pattern "9", except that additional "9" characters are added if the input string has fewer than "N" digits.
For example, a pattern of "z5" on "123" will result in "99999". The pattern "z5" for the string "123456" will result in "999999".
nN
Similar to pattern "n", except that leading zeros are supplied to result in a numeric string with at least "N" digits.
For example, the pattern "n5" for the string "123" will result in "00123". The pattern "n5" for the string "123456" will result in "123456".
a
This pattern will copy only letters from the input string to the output string, dropping any other type of character.
For example, the pattern of "a" for the string "U.S." will be "US".
u
This is similar to the "a" pattern, except that it will replace any lower case character with its upper case character.
For example, the pattern "u" on input string "pa" will result in "PA".
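
The pattern descriptions above can be checked against a short sketch. It is a reading of the documented examples, not the WrpPattern code: the class name PattSketch is hypothetical, the "p" pattern is left out because its character mapping is not listed here, and padding on the left for "zN" is an assumption that the all-nines output cannot confirm or refute.

```java
/** Hypothetical re-creation of the documented patt patterns (all except "p"). */
class PattSketch {
    static String patt(String str, String pattern) {
        char kind = pattern.charAt(0);
        // Optional minimum width, as in "z5" or "n5"; zero when absent.
        int width = pattern.length() > 1 ? Integer.parseInt(pattern.substring(1)) : 0;
        StringBuilder out = new StringBuilder();
        for (char c : str.toCharArray()) {
            switch (kind) {
                case '9':
                case 'z': // each digit becomes "9"; everything else is dropped
                    if (Character.isDigit(c)) out.append('9');
                    break;
                case 'n': // digits are copied; everything else is dropped
                    if (Character.isDigit(c)) out.append(c);
                    break;
                case 'a': // letters are copied; everything else is dropped
                    if (Character.isLetter(c)) out.append(c);
                    break;
                case 'u': // letters are copied in upper case
                    if (Character.isLetter(c)) out.append(Character.toUpperCase(c));
                    break;
                default:
                    throw new IllegalArgumentException("pattern not modelled: " + pattern);
            }
        }
        // "zN" pads with "9" and "nN" pads with leading zeros up to N characters.
        if (kind == 'z' || kind == 'n') {
            char pad = (kind == 'z') ? '9' : '0';
            while (out.length() < width) {
                out.insert(0, pad);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(patt("(123)555-1234", "n"));
        System.out.println(patt("123", "n5"));
        System.out.println(patt("pa", "u"));
    }
}
```

With this reading, every example given in the list above (for "9", "n", "z5", "n5", "a" and "u") produces the documented result.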

The following patterns are supported for the norm command:

*
Copy the entire input string (or the remainder of it) directly to the output string.
X
Copy the current character.
9
Copy a numeric digit from the input string.
If the next input character is not a digit, an error is triggered.
a
Copy an alphabetic character from the input string.
If the next input character is not a letter, an error is triggered.
Any other character
The pattern character is copied to the output string.
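
A minimal sketch of how these norm pattern characters might be applied follows. The class name NormSketch is hypothetical, and several behaviours are assumptions because the text above does not settle them: a literal pattern character is taken not to consume any input, running out of input or leaving input unread counts as a mismatch, and a mismatch is reported by returning null where the real command would fire.

```java
/** Hypothetical reading of the norm pattern characters. */
class NormSketch {
    /** Returns the normalized string, or null where the real command would fire. */
    static String norm(String in, String pattern) {
        StringBuilder out = new StringBuilder();
        int pos = 0; // index of the next unread input character
        for (int i = 0; i < pattern.length(); i++) {
            char p = pattern.charAt(i);
            if (p == '*') {
                // Copy whatever input is left.
                out.append(in.substring(pos));
                pos = in.length();
            } else if (p == 'X' || p == '9' || p == 'a') {
                if (pos >= in.length()) {
                    return null; // assumption: running out of input is a mismatch
                }
                char c = in.charAt(pos++);
                if (p == '9' && !Character.isDigit(c)) {
                    return null; // expected a digit
                }
                if (p == 'a' && !Character.isLetter(c)) {
                    return null; // expected a letter
                }
                out.append(c);
            } else {
                // Any other pattern character is copied to the output;
                // assumed here not to consume an input character.
                out.append(p);
            }
        }
        // Assumption: unread input means the value did not match the pattern.
        return pos == in.length() ? out.toString() : null;
    }

    public static void main(String[] args) {
        System.out.println(norm("1234567890", "999-999-9999"));
        System.out.println(norm("12345", "999-999-9999"));
    }
}
```

Under these assumptions, the ten digits 1234567890 normalized with 999-999-9999 give 123-456-7890, while the five digits 12345 do not match the same pattern.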

This program assumes that the CSV file has been parsed and stored in an instance of the WrpCsv class.

Author:
Bill Pringle

Nested Class Summary
 class WrpValidateCsv.GoToLoc
          Inner class for statement label
 class WrpValidateCsv.ValidationRule
          Inner class for validation rule
 class WrpValidateCsv.ValidationRuntimeException
          Inner class for validation run time exception
 class WrpValidateCsv.Variable
          Inner class for variable.
 
Field Summary
protected  int breakpt
          breakpoint sequence number (place actual breakpoint in executeCommand).
protected  int compileErrors
          compilation errors for rules
 WrpCsv csv
          the current CSV file that is being validated
private  int csvLine
          line number of this CSV
(package private)  java.util.Vector<java.lang.String> errMsgs
          list of error messages
private  WrpCsv headerCsv
          parsed header line
private  java.lang.String headerLine
          expected header line
protected  int IC
          instruction counter - location of next instruction to be executed
protected  java.util.Vector<WrpValidateCsv.GoToLoc> labels
          list of statement labels, created at compile time.
protected  int lastseq
          previous sequence number.
protected  java.util.Vector<WrpValidateCsv.Variable> literals
          defined literals (arg1 and arg2)
static int MAP_ERR
          invalid map
static int MAP_NFLDS
          number of fields
static int MAP_REC
          entire record
protected  java.util.Vector<WrpValidateCsv.Variable> maps
          field mapping for each record
 java.util.Vector<java.lang.String> normalized
          array of normalized fields for this record
 java.lang.String normString
          normalized form of the record, as a CSV line
 WrpPattern patt
          pattern building object
private  int prevErr
          previous error line (if any)
(package private)  java.io.PrintStream prtLog
          Log file, if logging is enabled.
static int REC_BAD
          record has errors
static int REC_ERROR
          record status unknown
static int REC_FUZZY
          record is suspicious
static int REC_GOOD
          record status is good
static int REC_NONE
          record is missing
static int REC_WARN
          record has warnings
 java.util.Vector<WrpValidateCsv.ValidationRule> rules
          the set of validation rules to be used for validation
protected  int runErrors
          the number of run-time errors for this CSV validation
protected  int runFlg
          run flag - zero means done; non-zero, keep running
protected  int runTot
          the total number of run-time errors for this compilation
protected  int Stat
          current status of the current record
 boolean trimall
          flag to trim all fields (default true)
static int TYPE_BOOL
          data type boolean
static int TYPE_ERR
          data type in error
static int TYPE_FLT
          data type float
static int TYPE_INT
          data type integer
static int TYPE_STR
          data type string
static java.lang.String UNKNOWN
          Used as String representation of unknown / error values
protected  java.util.Vector<WrpValidateCsv.Variable> variables
          list of declared variables
protected  WrpZipCode zipCodes
           
 
Constructor Summary
WrpValidateCsv()
          Default constructor.
WrpValidateCsv(java.util.Vector<WrpValidateCsv.ValidationRule> rules)
          Constructor with validation rules.
WrpValidateCsv(java.util.Vector<WrpValidateCsv.ValidationRule> rules, int csvLine, WrpCsv csv)
          Create a validation object with the given rules and then validate the given CSV string.
 
Method Summary
 void addRule(WrpValidateCsv.ValidationRule rule)
          Add a new rule to the current list of rules.
 boolean ascii(java.lang.String fld)
          Test the input string for non-ASCII characters
 void checkSyntax(int ndx)
          Check the syntax of the specified command. There is a validation segment for each instruction / command.
 void chkCmdArg1(int ndx)
          Check syntax for the form: cmd arg1
 void chkCmdArg1Outvar(int ndx)
          Check syntax for form: cmd arg1 outvar
 void chkInvarCmdArg1(int ndx)
          Check syntax for form: invar cmd arg1. This command can be used for any type of literal.
 void chkInvarCmdArg1Outvar(int ndx)
          Check syntax for form: invar cmd arg1 outvar
 void chkInvarCmdOutvar(int ndx)
          Check syntax for form: invar cmd outvar
 int compileRules()
          Check the rules for syntax, build the lists of destinations and variables, and validate GoTo destinations
 void declareVariable(int ndx)
          Add variable declaration to list of variables
 java.lang.String decodeMsg(int ndx, java.lang.String msg)
          Replace variable names in a string with their values.
 java.lang.Boolean executeCommand(int ndx)
          Execute the current command
private  int executeRules()
          Validate supplied CSV using supplied rules
 int executeRules(int csvLine)
          Execute the current list of rules, with the current line number
 int findGoTo(java.lang.String label)
          Find target destination of GoTo in list of statement labels
 int findLiteral(java.lang.String litval)
          Find (or create) literal for the given value
 int findMap(java.lang.String name)
          Find the map entry for the specified field
 int findVariable(java.lang.String vname)
          Find the specified variable in the list of declared variables
 void fireCommand(int ndx)
          Fire the current command. If a command returns true, then the command fired.
 void firstPass(int ndx)
          Perform first pass of compile
private  WrpValidateCsv.Variable getArg1(WrpValidateCsv.ValidationRule rule)
          Get the Arg1 variable for the specified rule
private  WrpValidateCsv.Variable getArg2(WrpValidateCsv.ValidationRule rule)
          Get the Arg2 variable for the specified rule
 int getCompileErrors()
          get the number of compile-time errors for the most recent compile
 WrpCsv getCsv()
          Return the current CSV object to be (or was) validated
 java.util.Vector<java.lang.String> getErrMsgs()
          Return the validation messages from the most recent validation.
 java.lang.String getHeaderLine()
          Get the expected header line.
private  WrpValidateCsv.Variable getInVar(WrpValidateCsv.ValidationRule rule)
          Obtain the InVar variable for this rule
 java.util.Vector<java.lang.String> getNormalized()
          Get the list of normalized strings.
 java.lang.String getNormString()
          Get the normalized form of the record.
 int getNumberRules()
          Get the current number of rules.
private  WrpValidateCsv.Variable getOutVar(WrpValidateCsv.ValidationRule rule)
          Obtain the OutVar variable for the specified rule
 java.util.Vector<WrpValidateCsv.ValidationRule> getRules()
          Return the current list of validation rules.
 int getRunErrors()
          get the number of run-time errors for the latest execution
 int getRunTot()
          get the total number of run-time errors since compilation
 int getStat()
          Get the validation status for the record that was most recently validated.
 java.lang.String litVal(java.lang.String litval)
          Resolve any variable expressions.
 void loadRules(java.sql.ResultSet rs)
          Obtain rules from database and load into list using the given ResultSet.
 void logError(int ndx, java.lang.String msg)
          Add error message to list of messages
static void main(java.lang.String[] args)
          Example driver for this class.
 void mapField(int ndx)
          Map a field into a local variable.
 void mapFields()
          Map the defined fields in CSV to variables.
 WrpValidateCsv.ValidationRule newRule()
          A factory method for the ValidationRule inner class.
 boolean nodata(int n1, int n2)
          Determine if the specified fields are empty or contain only white space
 java.lang.String norm(int ndx, java.lang.String strIn, java.lang.String pattern)
          Normalize the input string using the specified pattern.
protected  void nullArg(int ndx)
          Raise exception if a null argument is passed for a required argument.
 int numErrors()
          Return the number of compile errors that were encountered during the most recent compilation.
 java.lang.String patt(java.lang.String str, java.lang.String pattern)
          Apply the specified pattern to the specified string
static java.lang.String recStat(int num)
          convert record status number to a string
static int recStat(java.lang.String name)
          Convert record status from string to integer
 void secondPass(int ndx)
          Perform second compile pass
 void setCsv(WrpCsv csv)
          Specify the CSV object to be validated.
 void setCsvLine(int csvLine)
          Set the CSV line number.
 void setErrMsgs(java.util.Vector<java.lang.String> msgs)
          Set the error messages for this validation.
 void setHeaderLine(java.lang.String headerLine)
          Set the expected header line.
 void setLog(java.io.PrintStream log)
          Set the log file for debugging.
 void setRules(java.util.Vector<WrpValidateCsv.ValidationRule> rules)
          Specify the validation rules to be used when validating lines
 void setZipCodes(WrpZipCode zc)
          This method imports a list of ZipCodes that can be used to validate city/state/zipcode combinations.
 void storeLabel(int ndx)
          Add statement label of this rule to list of labels
 void typeArg1(int ndx, int type)
          Check data type for literal Arg1
 void typeArg2(int ndx, int type)
          Check data type for literal Arg2
 void typeInvar(int ndx, int type)
          Check data type for InVar
 void typeOutvar(int ndx, int type)
          Check data type for OutVar
 int validate()
          Validate the given CSV record using the specified rules
 int validate(int csvLine)
          Validate the current record using the specified line number.
 int validate(java.util.Vector<WrpValidateCsv.ValidationRule> rules, int csvLine, WrpCsv csv)
          Validate the given CSV record using the given rules
 int validate(WrpCsv csv)
          Validate the specified CSV using the previously supplied rules
 int validateHeader()
          Validate the header line.
protected  boolean validateZipCode(java.lang.String str, WrpValidateCsv.Variable outvar)
          Validate the city/state/zip combination.
static java.lang.String varType(int num)
          Convert type number to name
static int varType(java.lang.String name)
          Convert type name to integer value
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

TYPE_STR

public static final int TYPE_STR
data type string

See Also:
Constant Field Values

TYPE_INT

public static final int TYPE_INT
data type integer

See Also:
Constant Field Values

TYPE_FLT

public static final int TYPE_FLT
data type float

See Also:
Constant Field Values

TYPE_BOOL

public static final int TYPE_BOOL
data type boolean

See Also:
Constant Field Values

TYPE_ERR

public static final int TYPE_ERR
data type in error

See Also:
Constant Field Values

MAP_REC

public static final int MAP_REC
entire record

See Also:
Constant Field Values

MAP_NFLDS

public static final int MAP_NFLDS
number of fields

See Also:
Constant Field Values

MAP_ERR

public static final int MAP_ERR
invalid map

See Also:
Constant Field Values

REC_GOOD

public static final int REC_GOOD
record status is good

See Also:
Constant Field Values

REC_WARN

public static final int REC_WARN
record has warnings

See Also:
Constant Field Values

REC_FUZZY

public static final int REC_FUZZY
record is suspicious

See Also:
Constant Field Values

REC_BAD

public static final int REC_BAD
record has errors

See Also:
Constant Field Values

REC_NONE

public static final int REC_NONE
record is missing

See Also:
Constant Field Values

REC_ERROR

public static final int REC_ERROR
record status unknown

See Also:
Constant Field Values

UNKNOWN

public static final java.lang.String UNKNOWN
Used as String representation of unknown / error values

See Also:
Constant Field Values

csv

public WrpCsv csv
the current CSV file that is being validated


trimall

public boolean trimall
flag to trim all fields (default true)


patt

public WrpPattern patt
pattern building object


rules

public java.util.Vector<WrpValidateCsv.ValidationRule> rules
the set of validation rules to be used for validation


normalized

public java.util.Vector<java.lang.String> normalized
array of normalized fields for this record


normString

public java.lang.String normString
normalized form of the record, as a CSV line


csvLine

private int csvLine
line number of this CSV


prevErr

private int prevErr
previous error line (if any)


headerLine

private java.lang.String headerLine
expected header line


headerCsv

private WrpCsv headerCsv
parsed header line


labels

protected java.util.Vector<WrpValidateCsv.GoToLoc> labels
list of statement labels, created at compile time. This variable contains a list of statement labels that were found during the first pass of the compile step. It is used to reset the IC (instruction counter) for a GoTo destination. This variable is also tested to determine whether the rules were compiled. If the value is null, then the rules must be compiled before they can be executed.


variables

protected java.util.Vector<WrpValidateCsv.Variable> variables
list of declared variables


maps

protected java.util.Vector<WrpValidateCsv.Variable> maps
field mapping for each record


literals

protected java.util.Vector<WrpValidateCsv.Variable> literals
defined literals (arg1 and arg2)


compileErrors

protected int compileErrors
compilation errors for rules


errMsgs

java.util.Vector<java.lang.String> errMsgs
list of error messages


lastseq

protected int lastseq
previous sequence number. Used to check for duplicate sequence numbers.
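
Once the rules are sorted by sequence number, a single remembered value is enough to spot duplicates, which is presumably why lastseq exists. The sketch below illustrates the idea only; the class name SeqCheckSketch and the method duplicates are hypothetical, and it assumes sequence numbers are non-negative so that -1 can serve as the starting value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical duplicate check using a remembered previous sequence number. */
class SeqCheckSketch {
    /** Returns each sequence number that occurs more than once, in ascending order. */
    static List<Integer> duplicates(List<Integer> seqs) {
        List<Integer> sorted = new ArrayList<>(seqs);
        Collections.sort(sorted); // duplicates become neighbours after sorting
        List<Integer> dups = new ArrayList<>();
        int lastseq = -1; // assumes real sequence numbers are never negative
        for (int seq : sorted) {
            boolean alreadyReported = !dups.isEmpty() && dups.get(dups.size() - 1) == seq;
            if (seq == lastseq && !alreadyReported) {
                dups.add(seq);
            }
            lastseq = seq;
        }
        return dups;
    }

    public static void main(String[] args) {
        System.out.println(duplicates(Arrays.asList(30, 10, 20, 10)));
    }
}
```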


zipCodes

protected WrpZipCode zipCodes

prtLog

java.io.PrintStream prtLog
Log file, if logging is enabled. This variable can be set by the client using the setLog(PrintStream) method.


runFlg

protected int runFlg
run flag - zero means done; non-zero, keep running


runErrors

protected int runErrors
the number of run-time errors for this CSV validation


runTot

protected int runTot
the total number of run-time errors for this compilation


IC

protected int IC
instruction counter - location of next instruction to be executed


Stat

protected int Stat
current status of the current record


breakpt

protected int breakpt
breakpoint sequence number (place actual breakpoint in executeCommand). In the executeCommand() routine, there is an if statement that tests whether the current sequence number is equal to the value of this variable. If the two are equal, a debug line is printed. You can use your own debugger to place a breakpoint at the print statement, which will pause the program when the specified line is reached. You can then use the debugger to examine variables, begin stepping through instructions, etc.

You can modify the source code so that breakpt isn't initialized to -1 (which effectively turns off this feature) or you can place a breakpoint at some place within the program, and then use the debugger to change the value of breakpt.

Constructor Detail

WrpValidateCsv

public WrpValidateCsv()
Default constructor. This constructor creates an object without any rules or CSV string defined.

Once created, the object can be used to validate CSV strings by first specifying a list of rules with the setRules method, and then calling the validate method for each string.


WrpValidateCsv

public WrpValidateCsv(java.util.Vector<WrpValidateCsv.ValidationRule> rules)
Constructor with validation rules. After this constructor creates the validation object, any number of CSV strings can be validated with the validate method.

Parameters:
rules - the validation rules to use

WrpValidateCsv

public WrpValidateCsv(java.util.Vector<WrpValidateCsv.ValidationRule> rules,
                      int csvLine,
                      WrpCsv csv)
Create a validation object with the given rules and then validate the given CSV string. The csvLine parameter is used to determine if the text of the line should be added to the list of validation messages. See the method logError.

Parameters:
rules - a vector containing the validation rules.
csvLine - the sequential line number of this string within the CSV file
csv - the CSV object to be validated
Method Detail

newRule

public WrpValidateCsv.ValidationRule newRule()
A factory method for the ValidationRule inner class.

This method can be used to create a new empty validation rule. It is most likely used while reading validation rules from the database (or other source).

Returns:
an empty ValidationRule object

setCsv

public void setCsv(WrpCsv csv)
Specify the CSV object to be validated.

This method is called by the validate(WrpCsv csv) method, followed by a call to validate(). It isn't clear why a user would want to call this method, but at the same time there is no reason to prevent them.

Parameters:
csv - the WrpCsv object that is to be validated

getCsv

public WrpCsv getCsv()
Return the current CSV object to be (or was) validated

Returns:
the current WrpCsv object for validation

setCsvLine

public void setCsvLine(int csvLine)
Set the CSV line number.

The csvLine variable is used to determine if the original input line is to be included in the validation messages. The previous value is stored internally so that the original line is only included the first time that logError is called with that line number. This prevents the input line from being repeated if a line contains more than one validation error (or generates more than one validation message).

Parameters:
csvLine - the line number of the current CSV line
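
The once-per-line behaviour described for csvLine and logError can be sketched as follows. This is an illustration only: the class LogErrorSketch, the setLine helper, and the message layout are hypothetical, and the real logError also takes a rule index that is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical logger that echoes the input line once per line number. */
class LogErrorSketch {
    private final List<String> errMsgs = new ArrayList<>();
    private int csvLine = 0;         // line number currently being validated
    private int prevErr = -1;        // line number of the last logged error, -1 if none
    private String currentText = ""; // text of the current input line

    void setLine(int csvLine, String text) {
        this.csvLine = csvLine;
        this.currentText = text;
    }

    void logError(String msg) {
        // Include the original line only the first time this line number is seen.
        if (csvLine != prevErr) {
            errMsgs.add("Line " + csvLine + ": " + currentText);
            prevErr = csvLine;
        }
        errMsgs.add("  " + msg);
    }

    List<String> getErrMsgs() {
        return errMsgs;
    }

    public static void main(String[] args) {
        LogErrorSketch log = new LogErrorSketch();
        log.setLine(7, "Smith,,123");
        log.logError("first name is missing");
        log.logError("zip is too short");
        for (String m : log.getErrMsgs()) {
            System.out.println(m);
        }
    }
}
```

Two errors on line 7 therefore produce three messages: the input line once, followed by the two error texts.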

getHeaderLine

public java.lang.String getHeaderLine()
Get the expected header line. The first line of the CSV file should contain the fields contained in this string.

Returns:
the expected header line

setHeaderLine

public void setHeaderLine(java.lang.String headerLine)
Set the expected header line. The first line of the CSV file should contain the same fields as are contained in the expected header line.

Parameters:
headerLine - the expected header line

getStat

public int getStat()
Get the validation status for the record that was most recently validated.

The status value is normally returned from the various validate methods.

Returns:
the saved validation status

setRules

public void setRules(java.util.Vector<WrpValidateCsv.ValidationRule> rules)
Specify the validation rules to be used when validating lines

Parameters:
rules - a vector of validation rules

setZipCodes

public void setZipCodes(WrpZipCode zc)
This method imports a list of ZipCodes that can be used to validate city/state/zipcode combinations.

Parameters:
zc - a ZipCode structure that has been loaded with ZipCodes.

setLog

public void setLog(java.io.PrintStream log)
Set the log file for debugging. This method can be used to toggle debugging on and off. If the argument is a PrintStream object, debugging information will be written to that stream. If the argument is null, then debugging will be disabled.

Parameters:
log - a PrintStream, or null to disable debugging
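
The null-to-disable convention can be illustrated with a few lines. The class LogToggleSketch and its debug method are hypothetical; only the setLog signature and the meaning of null come from the description above.

```java
import java.io.PrintStream;

/** Hypothetical illustration of toggling debug output with a nullable PrintStream. */
class LogToggleSketch {
    private PrintStream prtLog; // null means logging is disabled

    void setLog(PrintStream log) {
        this.prtLog = log;
    }

    void debug(String msg) {
        if (prtLog != null) {
            prtLog.println(msg);
        }
    }

    public static void main(String[] args) {
        LogToggleSketch v = new LogToggleSketch();
        v.debug("not printed: logging is off by default");
        v.setLog(System.out);
        v.debug("executing rule 10");
        v.setLog(null);
        v.debug("not printed: logging was switched off again");
    }
}
```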

getNormalized

public java.util.Vector<java.lang.String> getNormalized()
Get the list of normalized strings. This method can be used if you want to access the individual strings. You could also obtain the same information by getting the normalized string, which is a CSV line with the same info.

Returns:
the list of normalized field strings

getNormString

public java.lang.String getNormString()
Get the normalized form of the record. This method can be used to extract a normalized form of a CSV line. The normalized form has certain fields formatted in a standardized manner. For example, a variety of phone number formats can be accepted (e.g., 123-456-7890, (123)456-7890, 123.456.7890, etc.) but the normalized form can be made into a specific format (e.g., 123-456-7890).

Returns:
the normalized record, as a CSV line

numErrors

public int numErrors()
Return the number of compile errors that were encountered during the most recent compilation.

Returns:
the number of compile errors, or -1 if no compile was performed since the validation rules were obtained.

getErrMsgs

public java.util.Vector<java.lang.String> getErrMsgs()
Return the validation messages from the most recent validation.

Returns:
a vector of strings containing the validation messages.

setErrMsgs

public void setErrMsgs(java.util.Vector<java.lang.String> msgs)
Set the error messages for this validation. This method is used when validating new-format lists, in which the first line is not a shipper line but a header. The header is not validated, but a placeholder for its messages is still needed.

Parameters:
msgs - the list of error messages

getCompileErrors

public int getCompileErrors()
Get the number of compile-time errors for the most recent compile.


getRunErrors

public int getRunErrors()
Get the number of run-time errors for the latest execution.


getRunTot

public int getRunTot()
Get the total number of run-time errors since compilation.


loadRules

public void loadRules(java.sql.ResultSet rs)
               throws java.sql.SQLException
Obtain rules from database and load into list using the given ResultSet.

This method loops through the ResultSet and reads the resulting validation rules into the instance vector rules. Any previous set of rules is replaced by these rules.

Parameters:
rs - ResultSet containing the result of the rules query
Throws:
java.sql.SQLException

getRules

public java.util.Vector<WrpValidateCsv.ValidationRule> getRules()
Return the current list of validation rules.

Returns:
the current validation rules

addRule

public void addRule(WrpValidateCsv.ValidationRule rule)
Add a new rule to the current list of rules. The given rule is added to the instance vector rules. These additions must be in sequential order, since validation will be performed sequentially by executing the rules, starting with the first rule.

Parameters:
rule - the rule to add

getNumberRules

public int getNumberRules()
Get the current number of rules. This method returns how many rules are currently stored for execution.

Returns:
the number of rules

compileRules

public int compileRules()
Check the rules for syntax, build the lists of destinations and variables, and validate GoTo destinations.

Compilation takes place in two passes. The first pass is performed on all the rules, followed by the second pass.


firstPass

public void firstPass(int ndx)
Perform first pass of compile

This pass will build lists of statement labels and declared variables. It will generate error messages for invalid declare statements.

Parameters:
ndx - index into rules for this rule

storeLabel

public void storeLabel(int ndx)
Add statement label of this rule to list of labels

Parameters:
ndx - index into rules of the rule

declareVariable

public void declareVariable(int ndx)
Add variable declaration to list of variables

Parameters:
ndx - index into rules for this declaration

mapField

public void mapField(int ndx)
Map a field into a local variable.

This routine is called during the first pass when a map command is encountered. Create a new local variable with the specified name (if it doesn't already exist), and map it to field arg1, with type arg2. The default field type is String, except for NFLDS, which is type Integer. The following special field names are reserved words:

For each new string to be validated, the string is parsed (if not a WrpCsv object), and the input fields are copied to the mapped variables.

Parameters:
ndx - the location of the map statement

findVariable

public int findVariable(java.lang.String vname)
Find the specified variable in the list of variables.

Parameters:
vname - name of variable to find
Returns:
index of variable, or -1 if not found

findLiteral

public int findLiteral(java.lang.String litval)
Find (or create) literal for the given value

Search for the specified literal value. If not found, create a new literal for it. Return the location of a variable for this literal.

This implies that literals are constant values. There is currently no command that will allow the user to modify these values.

Literal values can have encoded values of the form {{value}}, where value is the name of the literal code. See the litVal method.

Parameters:
litval - literal value to find
Returns:
the location in literals for this value
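
The find-or-create behaviour can be sketched as below. This is a minimal illustration, not the class's implementation: LiteralTableSketch is a hypothetical name, a plain list of strings stands in for the literals storage, and the {{value}} decoding step is omitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical literal table illustrating find-or-create lookup. */
final class LiteralTableSketch {
    private final List<String> literals = new ArrayList<>();

    /** Return the index of litval, appending it first if it is not yet stored. */
    int findLiteral(String litval) {
        int ndx = literals.indexOf(litval);
        if (ndx < 0) {
            literals.add(litval);
            ndx = literals.size() - 1;
        }
        return ndx;
    }

    /** Number of distinct literals stored so far. */
    int size() {
        return literals.size();
    }
}
```

Because a repeated value always resolves to the same slot and nothing ever overwrites a slot, the stored literals behave as constants.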

litVal

public java.lang.String litVal(java.lang.String litval)
Resolve any encoded literal expressions. Strings of the form {{str}} are converted into the character represented by str.

Special literal values can be encoded by enclosing a key word between curly braces. These notations are then replaced by the string they represent. Currently defined literals are:

Parameters:
litval - the value of the literal
Returns:
the input argument, if no encoding literals; otherwise, the translated string
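
A minimal sketch of this kind of substitution follows. The keywords comma, space, and tab are assumptions made purely for illustration, since the defined literal names are not listed here; LitValSketch is likewise a hypothetical name, and leaving unknown keywords untouched is an assumed behaviour.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical decoder for {{keyword}} notations inside a literal. */
final class LitValSketch {
    // Assumed keyword table; the actual keywords are not documented in this section.
    private static final Map<String, String> CODES = Map.of(
            "comma", ",",
            "space", " ",
            "tab", "\t");

    private static final Pattern CODE = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    /** Return litval with every known {{keyword}} replaced by the text it stands for. */
    static String litVal(String litval) {
        Matcher m = CODE.matcher(litval);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String repl = CODES.get(m.group(1));
            if (repl == null) {
                repl = m.group(); // unknown keyword: keep the text as-is (assumption)
            }
            m.appendReplacement(out, Matcher.quoteReplacement(repl));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

A string with no encoded keywords passes through unchanged, matching the documented return value.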

findMap

public int findMap(java.lang.String name)
Find the map entry for the specified field

Parameters:
name - the name of the map field to find
Returns:
the index into maps for the field, or -1 if not found

findGoTo

public int findGoTo(java.lang.String label)
Find target destination of GoTo in list of statement labels

Parameters:
label - the statement label to find
Returns:
the location of the label in labels, or -1 if not found

secondPass

public void secondPass(int ndx)
Perform second compile pass

This pass will validate commands and resolve any references (variables or goto) by inserting the index to the variable or rule. It will also detect any undeclared variables or undefined goto targets. It will also generate errors for invalid syntax.

Parameters:
ndx - index into rules for the rule to be compiled
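
The reason for two passes is that a GoTo may refer to a label defined later in the rule list. The sketch below illustrates the idea with a pared-down, hypothetical Rule type (label and GoTo target only); variable handling and command syntax checks are left out, and the duplicate-label error is an assumption rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical two-pass label resolution over a list of rules. */
final class TwoPassSketch {
    /** Pared-down rule: an optional label and an optional GoTo target name. */
    static final class Rule {
        final String label;
        final String gotoLabel;
        int gotoIndex = -1; // filled in by the second pass

        Rule(String label, String gotoLabel) {
            this.label = label;
            this.gotoLabel = gotoLabel;
        }
    }

    /** Compile the rules and return the list of error messages. */
    static List<String> compile(List<Rule> rules) {
        List<String> errors = new ArrayList<>();
        Map<String, Integer> labels = new HashMap<>();
        // First pass: record where each label is defined.
        for (int ndx = 0; ndx < rules.size(); ndx++) {
            String label = rules.get(ndx).label;
            if (label != null && labels.putIfAbsent(label, ndx) != null) {
                errors.add("rule " + ndx + ": duplicate label " + label);
            }
        }
        // Second pass: replace each GoTo name with the index of its target rule.
        for (int ndx = 0; ndx < rules.size(); ndx++) {
            Rule r = rules.get(ndx);
            if (r.gotoLabel == null) {
                continue;
            }
            Integer target = labels.get(r.gotoLabel);
            if (target == null) {
                errors.add("rule " + ndx + ": undefined GoTo target " + r.gotoLabel);
            } else {
                r.gotoIndex = target;
            }
        }
        return errors;
    }
}
```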

checkSyntax

public void checkSyntax(int ndx)
Check the syntax of the specified command. There is a validation segment for each instruction / command.

Parameters:
ndx - the index into rules for the specified command

chkInvarCmdArg1

public void chkInvarCmdArg1(int ndx)
Check syntax for form: invar cmd arg1. This check can be used for any type of literal. If the argument should be numeric or string, then test for the specific forms.

Parameters:
ndx - the index into rules for this rule

chkInvarCmdOutvar

public void chkInvarCmdOutvar(int ndx)
Check syntax for form: invar cmd outvar

Parameters:
ndx - index into rules for this rule

chkInvarCmdArg1Outvar

public void chkInvarCmdArg1Outvar(int ndx)
Check syntax for form: invar cmd arg1 outvar

Parameters:
ndx - index into rules for this rule

chkCmdArg1

public void chkCmdArg1(int ndx)
Check syntax for the form: cmd arg1

Parameters:
ndx - index into rules for this rule

chkCmdArg1Outvar

public void chkCmdArg1Outvar(int ndx)
Check syntax for form: cmd arg1 outvar

Parameters:
ndx - index into rules for this rule

typeInvar

public void typeInvar(int ndx,
                      int type)
Check data type for InVar

Parameters:
ndx - index into variables for this variable
type - expected data type

typeArg1

public void typeArg1(int ndx,
                     int type)
Check data type for literal Arg1

Parameters:
ndx - index into literals for element
type - expected data type

typeArg2

public void typeArg2(int ndx,
                     int type)
Check data type for literal Arg2

Parameters:
ndx - index into literals for this element
type - expected data type

typeOutvar

public void typeOutvar(int ndx,
                       int type)
Check data type for OutVar

Parameters:
ndx - index into variables for element
type - expected data type

executeRules

private int executeRules()
                  throws WrpValidateCsv.ValidationRuntimeException
Validate supplied CSV using supplied rules

The rules should be compiled before this routine is called. The supplied rules are executed against the supplied CSV

Returns:
the number of run-time validation errors
Throws:
WrpValidateCsv.ValidationRuntimeException

executeRules

public int executeRules(int csvLine)
Execute the current list of rules, with the current line number

The line number can be used if the same compilation will be used to execute against multiple CSV entries. The line number is used to determine if the input line should be displayed.

Parameters:
csvLine - the line number for the current CSV entry
Returns:
the record status of the CSV

mapFields

public void mapFields()
Map the defined fields in CSV to variables. Loop through the defined field maps and copy the corresponding CSV field to the mapped variable.
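
The copy loop might look like the sketch below. Everything here is an assumption for illustration: MapFieldsSketch is a hypothetical name, the field map is modelled as variable name to either a 1-based field number or the reserved name NFLDS, and a field past the end of the record is treated as empty.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of copying CSV fields into mapped variables. */
final class MapFieldsSketch {
    /**
     * fieldMap maps a variable name to a field spec: a 1-based field number
     * (assumed) or the reserved name NFLDS for the field count.
     */
    static Map<String, Object> mapFields(Map<String, String> fieldMap, String[] fields) {
        Map<String, Object> vars = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : fieldMap.entrySet()) {
            if ("NFLDS".equals(e.getValue())) {
                // NFLDS is Integer-typed: the number of fields in the record.
                vars.put(e.getKey(), Integer.valueOf(fields.length));
            } else {
                int pos = Integer.parseInt(e.getValue()) - 1;
                // A field past the end of the record is treated as empty (assumption).
                vars.put(e.getKey(), pos >= 0 && pos < fields.length ? fields[pos] : "");
            }
        }
        return vars;
    }
}
```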


executeCommand

public java.lang.Boolean executeCommand(int ndx)
                                 throws WrpValidateCsv.ValidationRuntimeException
Execute the current command

Parameters:
ndx - index into Rules of the command to execute
Returns:
the result of the command (true means the command fired)
Throws:
WrpValidateCsv.ValidationRuntimeException

fireCommand

public void fireCommand(int ndx)
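Fire the current command. If a command returns true, then the command fired. When this happens, the message is displayed (if present), the status of the record is set to Result (if present), and the GoTo destination (if present) is executed next.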
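
The effects of firing, as described in the class overview, can be sketched as follows. FireSketch and its Rule type are hypothetical; unlike fireCommand, which returns void, this sketch returns the index of the next rule so the control flow is visible.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of what happens when a command fires. */
final class FireSketch {
    /** Pared-down rule holding only the fields used when firing. */
    static final class Rule {
        final String message;  // null if absent
        final Integer result;  // null if absent
        final int gotoIndex;   // -1 if there is no GoTo destination

        Rule(String message, Integer result, int gotoIndex) {
            this.message = message;
            this.result = result;
            this.gotoIndex = gotoIndex;
        }
    }

    final List<String> messages = new ArrayList<>();
    int recordStatus = 0;

    /** Apply the effects of a fired rule and return the index of the rule to run next. */
    int fire(List<Rule> rules, int ndx) {
        Rule r = rules.get(ndx);
        if (r.message != null) {
            messages.add(r.message);   // record the message, if present
        }
        if (r.result != null) {
            recordStatus = r.result;   // set the record status, if present
        }
        // Branch to the GoTo destination if present; otherwise fall through.
        return r.gotoIndex >= 0 ? r.gotoIndex : ndx + 1;
    }
}
```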


decodeMsg

public java.lang.String decodeMsg(int ndx,
                                  java.lang.String msg)
Replace variable names in a string with their values.

If a string of the form "$[varname]" is present, replace it with the current value of the variable.

Parameters:
ndx - index into rules for the current rule
msg - text of the message string to be decoded
Returns:
the resulting string, with values replaced
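
A minimal sketch of the $[varname] substitution is shown below. It assumes variables are available as a simple name-to-value map and that unknown names are left untouched; DecodeMsgSketch is a hypothetical name, and the rule index parameter of the real method is omitted.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of replacing $[varname] with the variable's value. */
final class DecodeMsgSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\[([^\\]]+)\\]");

    static String decodeMsg(String msg, Map<String, String> vars) {
        Matcher m = VAR.matcher(msg);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.get(m.group(1));
            if (value == null) {
                value = m.group(); // unknown variable: leave the text untouched (assumption)
            }
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```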

getInVar

private WrpValidateCsv.Variable getInVar(WrpValidateCsv.ValidationRule rule)
Obtain the InVar variable for this rule

Parameters:
rule - the rule containing the specified invar
Returns:
Variable for InVar

getOutVar

private WrpValidateCsv.Variable getOutVar(WrpValidateCsv.ValidationRule rule)
Obtain the OutVar variable for the specified rule

Parameters:
rule - the rule containing the OutVar
Returns:
Variable for OutVar

getArg1

private WrpValidateCsv.Variable getArg1(WrpValidateCsv.ValidationRule rule)
Get the Arg1 variable for the specified rule

Parameters:
rule - the rule containing the Arg1 literal
Returns:
Variable for Arg1

getArg2

private WrpValidateCsv.Variable getArg2(WrpValidateCsv.ValidationRule rule)
Get the Arg2 variable for the specified rule

Parameters:
rule - the rule containing the Arg2 literal
Returns:
Variable for Arg2

ascii

public boolean ascii(java.lang.String fld)
Test the input string for non-ASCII characters

Parameters:
fld - input string to be tested
Returns:
true if a non-ASCII character found; false otherwise
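
Note the polarity: the result is true when a problem is found, consistent with a command that fires on an exception. A minimal sketch of such a test, using the hypothetical name AsciiSketch and treating anything above 0x7F as non-ASCII:

```java
/** Hypothetical sketch of a non-ASCII test. */
final class AsciiSketch {
    /** Return true if fld contains any character outside the 7-bit ASCII range. */
    static boolean hasNonAscii(String fld) {
        for (int i = 0; i < fld.length(); i++) {
            if (fld.charAt(i) > 0x7F) {
                return true;
            }
        }
        return false;
    }
}
```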

nodata

public boolean nodata(int n1,
                      int n2)
Determine if the specified fields are empty or contain only white space

Parameters:
n1 - lowest field to check (base 1)
n2 - highest field to check (base 1)
Returns:
true if no data present; false otherwise
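
The base-1, inclusive range check can be sketched as below. NoDataSketch is a hypothetical name; the record is passed in as an array for the sake of a self-contained example, and treating fields beyond the end of the record as empty is an assumption.

```java
/** Hypothetical sketch of the empty-range test. */
final class NoDataSketch {
    /** Return true if fields n1..n2 (1-based, inclusive) are all empty or whitespace. */
    static boolean nodata(String[] fields, int n1, int n2) {
        for (int n = n1; n <= n2; n++) {
            // Fields beyond the end of the record count as empty (assumption).
            if (n >= 1 && n <= fields.length && !fields[n - 1].trim().isEmpty()) {
                return false;
            }
        }
        return true;
    }
}
```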

patt

public java.lang.String patt(java.lang.String str,
                             java.lang.String pattern)
Apply the specified pattern to the specified string

Parameters:
str - input string
pattern - the pattern code to be applied to the string
Returns:
the resulting output string with pattern codes

norm

public java.lang.String norm(int ndx,
                             java.lang.String strIn,
                             java.lang.String pattern)
Normalize the input string using the specified pattern. Each character in the pattern string is compared with the next character from the input string. If the pattern character is a match character, then the input character is either copied or transformed to the output string. Some match characters apply to a single character in the input string, while others match the entire string. If the pattern character is not a match character, then the pattern character is copied to the output string, and the input character is compared with the next pattern character. For example, the pattern "999-999-9999" would normalize the input string "1234567890" to "123-456-7890". Also, the pattern "u" will normalize the input string "usa" to "USA".

Supported match characters:

Notice that the characters "X", "9", and "a" can't be inserted into the output string using the pattern.

Parameters:
ndx - index into rules for current rule
strIn - input string
pattern - normalization pattern
Returns:
the resulting string
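
The two examples in the description can be reproduced with the sketch below. It handles only the "9" and "u" match characters; the other match characters are not sketched. NormSketch is a hypothetical name, the rule index parameter is omitted, and skipping non-digit input characters when the pattern expects a digit is an assumption, not documented behaviour.

```java
import java.util.Locale;

/** Hypothetical sketch of pattern-driven normalization ("9" and "u" only). */
final class NormSketch {
    static String norm(String strIn, String pattern) {
        StringBuilder out = new StringBuilder();
        int in = 0; // position of the next unread input character
        for (int p = 0; p < pattern.length(); p++) {
            char pc = pattern.charAt(p);
            if (pc == 'u') {
                // Whole-string match character: upper-case the remaining input.
                out.append(strIn.substring(in).toUpperCase(Locale.ROOT));
                in = strIn.length();
            } else if (pc == '9') {
                // Single-character match: copy the next digit, skipping non-digits (assumption).
                while (in < strIn.length() && !Character.isDigit(strIn.charAt(in))) {
                    in++;
                }
                if (in < strIn.length()) {
                    out.append(strIn.charAt(in++));
                }
            } else {
                // Not a match character: copy the pattern character to the output.
                out.append(pc);
            }
        }
        return out.toString();
    }
}
```

Under these assumptions, differently punctuated phone numbers with the same ten digits normalize to the same string.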

nullArg

protected void nullArg(int ndx)
                throws WrpValidateCsv.ValidationRuntimeException
Raise an exception if a null argument is passed for a required argument. This function should never be called if the rules are correct. This method reports that a null argument was passed to a command that requires the argument to function properly. The method throws a runtime error with the appropriate error message.

Parameters:
ndx - the index into Rules of the command being executed.
Throws:
WrpValidateCsv.ValidationRuntimeException

validateZipCode

protected boolean validateZipCode(java.lang.String str,
                                  WrpValidateCsv.Variable outvar)
Validate the city/state/zip combination. This method determines if the zip code is valid for the city and state combination. This method uses the USPS database.

Parameters:
str - input string: "city|state|zip"
outvar - error message, if any
Returns:
true if valid; false otherwise (with error message in outvar)
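
The "city|state|zip" input format can be illustrated with the sketch below. It is a simplification under stated assumptions: ZipCheckSketch is a hypothetical name, a small in-memory map from 5-digit zip to a single "CITY|STATE" entry stands in for the USPS data, and a StringBuilder plays the role of outvar.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of a city/state/zip consistency check. */
final class ZipCheckSketch {
    /** Assumed lookup table: 5-digit zip -> "CITY|STATE" (stand-in for the USPS data). */
    private final Map<String, String> zipToCityState;

    ZipCheckSketch(Map<String, String> zipToCityState) {
        this.zipToCityState = zipToCityState;
    }

    /** Validate "city|state|zip"; on failure, append an explanation to errOut and return false. */
    boolean validate(String str, StringBuilder errOut) {
        String[] parts = str.split("\\|", -1);
        if (parts.length != 3) {
            errOut.append("expected city|state|zip");
            return false;
        }
        String city = parts[0].trim();
        String state = parts[1].trim();
        String zip = parts[2].trim();
        // Compare on the first five digits so nine-digit zip codes are accepted.
        String zip5 = zip.length() >= 5 ? zip.substring(0, 5) : zip;
        String expected = zipToCityState.get(zip5);
        if (expected == null) {
            errOut.append("unknown zip code ").append(zip5);
            return false;
        }
        String actual = city.toUpperCase(Locale.ROOT) + "|" + state.toUpperCase(Locale.ROOT);
        if (!expected.equals(actual)) {
            errOut.append("zip code ").append(zip5).append(" does not match ")
                  .append(city).append(", ").append(state);
            return false;
        }
        return true;
    }
}
```

A real lookup would need to allow several city names per zip code; the single-entry map is only for brevity.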

logError

public void logError(int ndx,
                     java.lang.String msg)
Add error message to list of messages

Parameters:
msg - an error message explaining the problem
ndx - location of invalid rule

validate

public int validate(java.util.Vector<WrpValidateCsv.ValidationRule> rules,
                    int csvLine,
                    WrpCsv csv)
Validate the given CSV record using the given rules

Parameters:
rules - the validation rules to use
csvLine - the line number to use for this validation
csv - the CSV record to validate
Returns:
the number of run-time validation errors

validate

public int validate(int csvLine)
Validate the current record using the specified line number. This method sets the line number to the specified value, then validates the current record using the existing rules.

Parameters:
csvLine - the line number to use for this validation
Returns:
the number of run-time errors

validate

public int validate()
Validate the current CSV record using the current rules

Returns:
the number of compilation or run-time validation errors

validate

public int validate(WrpCsv csv)
Validate the specified CSV using the previously supplied rules

Parameters:
csv - CSV line to be validated
Returns:
the number of run-time errors

validateHeader

public int validateHeader()
Validate the header line. This method compares the current line with the header string. It should be called by validate() when the first line is read, if the headerLine string has been set.

Returns:
Record status (0 if good - see constants REC_*)

varType

public static java.lang.String varType(int num)
Convert type number to name

Parameters:
num - type number (TYPE_something)
Returns:
a string representation of the type

varType

public static int varType(java.lang.String name)
Convert type name to integer value

Parameters:
name - the name of the type
Returns:
the integer value for the type (TYPE_something)

recStat

public static int recStat(java.lang.String name)
Convert record status from string to integer

Parameters:
name - the record status name
Returns:
an integer representation for the record status

recStat

public static java.lang.String recStat(int num)
Convert record status number to a string

Parameters:
num - a numerical representation of the record status
Returns:
the string representation for the record status

main

public static void main(java.lang.String[] args)
Example driver for this class. It is assumed that the user will create their own main program that will use their own database rather than the example UtilDb.

Parameters:
args - not used