com.billpringle.utils.wrputils
Class WrpWordCount

java.lang.Object
  extended by com.billpringle.utils.wrputils.WrpWordCount

public class WrpWordCount
extends java.lang.Object

This class identifies which words appear in a string and how many times each one occurs. A key string identifies the specific entity that is being counted. For example, you can count words in several text documents, using a different key for each, and report how many times a given word appears in each document.

Unless noted otherwise, all materials available for download from my site are copyrighted by Bill Pringle and are licensed under a Creative Commons License.

Author:
Bill Pringle

Nested Class Summary
 class WrpWordCount.WordCount
          Internal class for word count information
 
Field Summary
protected  java.lang.String key
          unique key
protected  java.io.BufferedReader rdr
          reader for file
protected  java.util.Vector<WrpWordCount.WordCount> words
          collection of words
 
Constructor Summary
WrpWordCount()
          Default constructor
WrpWordCount(java.lang.String key)
          Constructor with key string
 
Method Summary
 void dump()
          Dump the word count information. This method is useful for debugging.
 void dump(java.io.PrintStream prt)
          Dump the word count information using the specified stream
 int getCount(int cnt)
          Get the word count information for specified word
 int getCount(java.lang.String key, java.lang.String word)
          Get the count associated with the given key / word pair.
 java.lang.String getKey()
          Get the current key string
 java.lang.String getKey(int cnt)
          Get the key associated with the specified word
 int getNumberWords()
          Get the number of words that have been found
 java.lang.String getWord(int cnt)
          Get the specified word
 WrpWordCount.WordCount getWordCount(int cnt)
          Get the word count information associated with a specific word
 java.lang.String lettersOnly(java.lang.String str)
          Extract only letters and digits from string, removing punctuation, etc.
static void main(java.lang.String[] args)
          Test driver for the WrpWordCount class
 void parseFile()
          Parse the file using the predefined BufferedReader
 void parseFile(java.io.BufferedReader rdr)
          Parse a file using the specified reader
 void parseFile(java.lang.String fname)
          Parse the specified file and count words
 void parseString(java.lang.String str)
          Parse the specified string into words
 void updateCount(java.lang.String key, java.lang.String word)
          Update count of words.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

key

protected java.lang.String key
unique key


rdr

protected java.io.BufferedReader rdr
reader for file


words

protected java.util.Vector<WrpWordCount.WordCount> words
collection of words

Constructor Detail

WrpWordCount

public WrpWordCount()
Default constructor


WrpWordCount

public WrpWordCount(java.lang.String key)
Constructor with key string

Parameters:
key - the key string
Method Detail

parseFile

public void parseFile(java.lang.String fname)
Parse the specified file and count words

Parameters:
fname - the name of the file to be read

parseFile

public void parseFile(java.io.BufferedReader rdr)
Parse a file using the specified reader

Parameters:
rdr - the BufferedReader to be used

parseFile

public void parseFile()
Parse the file using the predefined BufferedReader


parseString

public void parseString(java.lang.String str)
Parse the specified string into words

Parameters:
str - the string to parse
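
The documentation does not say how parseString splits its input. As a minimal sketch, assuming words are separated by runs of whitespace, the splitting step might look like the following. The class ParseStringSketch and the helper splitWords are hypothetical names used only for illustration; they are not part of WrpWordCount.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the library's implementation.
class ParseStringSketch {

    // Split a string into words, assuming whitespace is the only separator.
    static List<String> splitWords(String str) {
        List<String> words = new ArrayList<>();
        for (String token : str.trim().split("\\s+")) {
            // An empty or all-blank input yields one empty token; skip it.
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Repeated spaces between words do not produce empty entries.
        System.out.println(splitWords("the quick  brown fox"));
    }
}
```

In the real class, each token would presumably then be cleaned and counted, but that ordering is an assumption as well.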

lettersOnly

public java.lang.String lettersOnly(java.lang.String str)
Extract only letters and digits from string, removing punctuation, etc.

Parameters:
str - the string to extract from
Returns:
string with only letters or digits
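
One way to get the behavior described here is to keep each character for which Character.isLetterOrDigit returns true. The sketch below is an assumption about the approach, not the actual source of lettersOnly; in particular, it preserves letter case, which the documentation does not specify. LettersOnlySketch is a hypothetical class name.

```java
// Hypothetical illustration; not the library's implementation.
class LettersOnlySketch {

    // Return a copy of str containing only its letters and digits.
    static String lettersOnly(String str) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                sb.append(c); // punctuation and whitespace are dropped
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("Hello, world! 42")); // prints Helloworld42
    }
}
```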

updateCount

public void updateCount(java.lang.String key,
                        java.lang.String word)
Update count of words. If the word is present, increment its count. If the word is not present, set its count to one.

Parameters:
key - key string
word - the word whose count is to be updated
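
The increment-or-insert rule above can be sketched with a Vector of entries, mirroring the words field. This is an illustrative guess at the structure: the Entry class stands in for WrpWordCount.WordCount, whose fields are not documented here, and returning 0 for an unseen key / word pair is an assumption.

```java
import java.util.Objects;
import java.util.Vector;

// Hypothetical illustration; not the library's implementation.
class UpdateCountSketch {

    // Stand-in for WrpWordCount.WordCount (field names are assumed).
    static class Entry {
        final String key;
        final String word;
        int count = 1; // a newly seen word starts at one

        Entry(String key, String word) {
            this.key = key;
            this.word = word;
        }
    }

    private final Vector<Entry> words = new Vector<>();

    // If the key / word pair is present, increment it; otherwise add it.
    void updateCount(String key, String word) {
        for (Entry e : words) {
            if (Objects.equals(e.key, key) && Objects.equals(e.word, word)) {
                e.count++;
                return;
            }
        }
        words.add(new Entry(key, word));
    }

    // Look up the count for a key / word pair; 0 if never seen (assumed).
    int getCount(String key, String word) {
        for (Entry e : words) {
            if (Objects.equals(e.key, key) && Objects.equals(e.word, word)) {
                return e.count;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        UpdateCountSketch counts = new UpdateCountSketch();
        counts.updateCount("a.txt", "the");
        counts.updateCount("a.txt", "the");
        counts.updateCount("b.txt", "the");
        System.out.println(counts.getCount("a.txt", "the")); // prints 2
        System.out.println(counts.getCount("b.txt", "the")); // prints 1
    }
}
```

A linear scan keeps the sketch close to the documented Vector field; a map keyed on the key / word pair would be faster for large inputs.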

getCount

public int getCount(java.lang.String key,
                    java.lang.String word)
Get the count associated with the given key / word pair.

Parameters:
key - key string
word - desired word
Returns:
the number of times the word was found

getNumberWords

public int getNumberWords()
Get the number of words that have been found

Returns:
the number of words

getWordCount

public WrpWordCount.WordCount getWordCount(int cnt)
Get the word count information associated with a specific word

Parameters:
cnt - which word to extract
Returns:
word count information for the specified word

getWord

public java.lang.String getWord(int cnt)
Get the specified word

Parameters:
cnt - the word number
Returns:
the actual word

getCount

public int getCount(int cnt)
Get the word count information for specified word

Parameters:
cnt - the number of the word
Returns:
the count for that word

getKey

public java.lang.String getKey()
Get the current key string

Returns:
the key string

getKey

public java.lang.String getKey(int cnt)
Get the key associated with the specified word

Parameters:
cnt - the number of the word to extract
Returns:
the key string associated with that word

dump

public void dump()
Dump the word count information. This method is useful for debugging.


dump

public void dump(java.io.PrintStream prt)
Dump the word count information using the specified stream

Parameters:
prt - the PrintStream to use for the dump

main

public static void main(java.lang.String[] args)
Test driver for the WrpWordCount class

Parameters:
args - command line arguments