This is levenshtein.egg.info, produced by makeinfo version 4.7 from
eggdoc-output.texi.

INFO-DIR-SECTION The Algorithmic Language Scheme
START-INFO-DIR-ENTRY
* levenshtein.egg: (levenshtein.egg).		Levenshtein edit distance
END-INFO-DIR-ENTRY


File: levenshtein.egg.info,  Node: Top,  Next: About this egg,  Up: (dir)

levenshtein egg
***************

Levenshtein edit distance

Written by Kon Lovett (mailto:klovett@pacbell.net)

   This manual corresponds to version 1.3 of the levenshtein extension
library for Chicken Scheme.

* Menu:

* About this egg::
* Documentation::
* Examples::
* References::
* License::
* Index::


File: levenshtein.egg.info,  Node: About this egg,  Next: Documentation,  Prev: Top,  Up: Top

1 About this egg
****************

* Menu:

* Version history::
* Requirements::


File: levenshtein.egg.info,  Node: Version history,  Next: Requirements,  Up: About this egg

1.1 Version history
===================

`1.3'
     Major changes

`1.2'
     Switched to array-lib

`1.1'
     Requirement for srfi-43 was wrong [Thanks to Benedikt Rosenau]

`1.0'
     Initial release


File: levenshtein.egg.info,  Node: Requirements,  Prev: Version history,  Up: About this egg

1.2 Requirements
================

This egg requires the following extensions:

   `utf8', `numbers', `procedure-surface', `miscmacros', `misc-extn',
`syntax-case', `vector-lib', `array-lib'


File: levenshtein.egg.info,  Node: Documentation,  Next: Examples,  Prev: About this egg,  Up: Top

2 Documentation
***************

Levenshtein is a collection of procedures providing various
forms of the Levenshtein edit distance calculation.

   The Levenshtein edit distance has been used for areas as
diverse as soil sample and language dialect analysis. Not just for 			text
strings.

* Menu:

* 8-bit Characters Only::
* Procedure Means::
* Generic String::
* Generic Sequence::
* Only Vector - Baroque & Slow::
* Edit operations::
* Version 1.2 Compatibility::


File: levenshtein.egg.info,  Node: 8-bit Characters Only,  Next: Procedure Means,  Up: Documentation

2.1 8-bit Characters Only
=========================

Performs edit distance calculation for byte strings. All 				return
the total edit cost.


(require-extension levenshtein-byte)

 -- procedure: levenshtein-distance/byte
          (levenshtein-distance/byte SOURCE TARGET)

     Calculates the edit distance from the `SOURCE' to the `TARGET'.
     All costs are unitary.


(require-extension levenshtein-transpose-byte)

 -- procedure: levenshtein-distance/transpose-byte
          (levenshtein-distance/transpose-byte SOURCE TARGET)

     Calculates the edit distance from the `SOURCE' to the `TARGET',
     taking into 						account the Transpose operation. All costs are unitary.


File: levenshtein.egg.info,  Node: Procedure Means,  Next: Generic String,  Prev: 8-bit Characters Only,  Up: Documentation

2.2 Procedure Means
===================

Supplies the arithmetic and string operations for the 				distance
algorithms below.

   Should you wish to supply your own 'means' please see the
procedure-surface egg documentation, 				"levenshtein-*-surface.scm", and
"levenshtein-*-means.scm" 				in this egg for more information.


(require-extension levenshtein-utf8-means)

 -- procedure-means: levenshtein-utf8-means
          (levenshtein-utf8-means (levenshtein-string-surface))

     Uses procedures from the utf8 egg.


(require-extension levenshtein-octet-means)

 -- procedure-means: levenshtein-octet-means
          (levenshtein-octet-means (levenshtein-string-surface))

     Uses procedures from SRFI-13.


(require-extension levenshtein-numbers-means)

 -- procedure-means: levenshtein-numbers-means
          (levenshtein-numbers-means (levenshtein-numeric-surface))

     Uses procedures from the numbers egg.


(require-extension levenshtein-gennum-means)

 -- procedure-means: levenshtein-gennum-means
          (levenshtein-gennum-means (levenshtein-numeric-surface))

     Uses only fixnum and flonum procedures.


(require-extension levenshtein-fixnum-means)

 -- procedure-means: levenshtein-fixnum-means
          (levenshtein-fixnum-means (levenshtein-numeric-surface))

     Uses only fixnum procedures.


File: levenshtein.egg.info,  Node: Generic String,  Next: Generic Sequence,  Prev: Procedure Means,  Up: Documentation

2.3 Generic String
==================

Performs edit distance calculation for strings.


(require-extension levenshtein-generic-string)

 -- procedure: levenshtein-distance/generic-string
          (levenshtein-distance/generic-string SOURCE TARGET [#:insert-cost 1] [#:delete-cost 1] [#:substitute-cost 1] [#:=? char=?] [#:limit-cost #f] [#:numeric-means levenshtein-fixnum-means] [#:string-means levenshtein-octet-means])

     Calculates the edit distance from the `SOURCE' to the `TARGET'.

    `SOURCE'
          A string.

    `TARGET'
          A string.

    `insert-cost:'
          A number. The edit cost of an insert.

    `delete-cost:'
          A number. The edit cost of a delete.

    `substitute-cost:'
          A number. The edit cost of a substitute.

    `=?:'
          A procedure; (-> object object boolean). The equality
          predicate. Probably not useful to override the default
          predicate.

    `limit-cost:'
          A number or #f. Limit edit distance calculation to cost.
          Number type must be compatible with the numeric-means.

    `numeric-means:'
          A procedure-means. The arithmetic means.

    `string-means:'
          A procedure-means. The string means.


File: levenshtein.egg.info,  Node: Generic Sequence,  Next: Only Vector - Baroque & Slow,  Prev: Generic String,  Up: Documentation

2.4 Generic Sequence
====================


(require-extension levenshtein-generic-sequence)

 -- procedure: levenshtein-distance/generic-sequence
          (levenshtein-distance/generic-sequence SOURCE TARGET [#:insert-cost 1] [#:delete-cost 1] [#:substitute-cost 1] [#:get-work-vector make-vector] [#:=? eqv?] [#:limit-cost #f] [#:numeric-means levenshtein-fixnum-means] [#:string-means levenshtein-octet-means])

     Calculates the edit distance from the `SOURCE' to 					the `TARGET'.

    `SOURCE'
          A vector, string, or list.

    `TARGET'
          A vector, string, or list.

    `insert-cost:'
          A number. The edit cost of an insert.

    `delete-cost:'
          A number. The edit cost of a delete.

    `substitute-cost:'
          A number. The edit cost of a substitute.

    `get-work-vector:'
          A procedure; (-> number vector). Returns a work vector of
          length 'number', or larger.

    `=?:'
          A procedure; (-> object object boolean). The equality
          predicate. Must handle the types of the source & target
          elements in either argument position!

    `limit-cost:'
          A number or #f. Limit edit distance calculation to cost.
          Number type must be compatible with the numeric-means.

    `numeric-means:'
          A procedure-means. The arithmetic means.

    `string-means:'
          A procedure-means. The string means. Only used when
          source & target are strings.

     The conversion from string to vector is dependent on the
     binding of 'string->list' at the time of the call to
     'levenshtein-distance/generic-sequence. Can be an issue when
     argument types are mixed; string and vector, or string and 					list.
     String and string are passed on to
     'levenshtein-distance/generic-string'


File: levenshtein.egg.info,  Node: Only Vector - Baroque & Slow,  Next: Edit operations,  Prev: Generic Sequence,  Up: Documentation

2.5 Only Vector - Baroque & Slow
================================

Performs edit distance calculation for vectors. Allows 				definition of
new edit operations. Will keep track of edit 				operations
performed.


(require-extension levenshtein-vector)

 -- procedure: levenshtein-distance/vector*
          (levenshtein-distance/vector* SOURCE TARGET [EDIT-OPER ...] [#:=? char=?] [#:operations #f] [#:numeric-means levenshtein-fixnum-means])

     Calculates the edit distance from the source vector `SOURCE' to
     the target vector `TARGET'.  						Returns the total edit cost or
     (values <total edit cost> 						<performed operations matrix>).

    `SOURCE'
          A vector.

    `TARGET'
          A vector.

    `EDIT-OPER'
          Edit operation definitions to apply. Defaults are the 								basic
          Insert, Delete, and Substitute.

    `=?:'
          A procedure; (-> object object boolean). The equality
          predicate.

    `operations:'
          A boolean. Include the matrix of edit operations
          performed?

    `numeric-means:'
          A procedure-means. The arithmetic means.


(require-extension levenshtein-path-iterator)

 -- procedure: levenshtein-path-iterator
          (levenshtein-path-iterator MATRIX)

     Creates an optimal edit distance operation path iterator
     over the performed operations matrix `MATRIX'. The 						matrix
     is usually the result of an invocation of
     '(levenshtein-distance/vector* ... operations: #t)'.

     Each invocation of the iterator will generate a list of
     the form: ((cost source-index target-index 						levenshtein-operator)
     ...). The last invocation will return 						#f.


File: levenshtein.egg.info,  Node: Edit operations,  Next: Version 1.2 Compatibility,  Prev: Only Vector - Baroque & Slow,  Up: Documentation

2.6 Edit operations
===================

Edit operation specification. A set of base operations is
predefined, but may be overridden. The base set is identified by 				the
keys Insert, Delete, Substitute, and Transpose. A printer 				and
reader are provided for edit operations.


(require-extension levenshtein-operators)

 -- record: levenshtein-operator
    `key'
          A symbol. Key for the operation.

    `name'
          A string. Describes the operation.

    `cost'
          A number. The cost of the operation.

    `above'
          A non-negative fixnum. How far back in the source.

    `left'
          A non-negative fixnum. How far back in the target

 -- procedure: make-levenshtein-operator
          (make-levenshtein-operator KEY NAME COST ABOVE LEFT)

     Returns a new edit operator.

 -- procedure: levenshtein-operator?
          (levenshtein-operator? OBJECT)

     Is the `OBJECT' a levenshtein operator?

 -- procedure: clone-levenshtein-operator
          (clone-levenshtein-operator EDIT-OPERATION [#:key] [#:name] [#:cost] [#:above] [#:left])

     Returns a duplicate of the `EDIT-OPERATION', 						with field values
     provided by the optional keyword 						arguments. EDIT-OPERATION may be
     the key of the already 						defined edit operation.

 -- procedure: levenshtein-operator-ref
          (levenshtein-operator-ref KEY)

     Get the definition of an edit operation.

 -- procedure: levenshtein-operator-set!
          (levenshtein-operator-set! EDIT-OPERATION)

     Define an edit operation.

 -- procedure: levenshtein-operator-delete!
          (levenshtein-operator-delete! EDIT-OPERATION)

     Removes the `EDIT-OPERATION' 						definition. EDIT-OPERATION may be the
     key of the already 						defined edit operation.

 -- procedure: levenshtein-operator-reset
          (levenshtein-operator-reset)

     Restore defined edit operations to the base set.

 -- procedure: levenshtein-operator-equal?
          (levenshtein-operator-equal? A B)

     Are the levenshtein operators `A' & `B' equal for all fields?


File: levenshtein.egg.info,  Node: Version 1.2 Compatibility,  Prev: Edit operations,  Up: Documentation

2.7 Version 1.2 Compatibility
=============================


(require-extension levenshtein)

Warning - The numbers and utf8 eggs will be used!

 -- macro: levenshtein-distance/string
          (levenshtein-distance/string SOURCE TARGET [EQ INSERT-COST DELETE-COST SUBSTITUTE-COST])

     Calculates the edit distance from the source string `SOURCE' to
     the target string `TARGET'. The 						default equivalence procedure `EQ'
     is 'char=?'. The 						default costs `INSERT-COST', `DELETE-COST', and
     `SUBSTITUTE-COST' are unitary.

 -- macro: levenshtein-distance/list
          (levenshtein-distance/list SOURCE TARGET [EQ INSERT-COST DELETE-COST SUBSTITUTE-COST])

     Calculates the edit distance from the source list `SOURCE' to the
     target list `TARGET'. The 						default equivalence procedure
     `EQ' is 'equal?'. The 						default costs `INSERT-COST',
     `DELETE-COST', and `SUBSTITUTE-COST' are unitary.

 -- macro: levenshtein-distance/vector
          (levenshtein-distance/vector SOURCE TARGET [EQ INSERT-COST DELETE-COST SUBSTITUTE-COST])

     Calculates the edit distance from the source vector `SOURCE' to
     the target vector `TARGET'. The 						default equivalence procedure `EQ'
     is 'equal?'. The 						default costs `INSERT-COST', `DELETE-COST', and
     `SUBSTITUTE-COST' are unitary.

 -- macro: levenshtein-distance/sequence
          (levenshtein-distance/sequence SOURCE TARGET [EQ INSERT-COST DELETE-COST SUBSTITUTE-COST])

     Calculates the edit distance from the source sequence `SOURCE' to
     the target sequence `TARGET'.  						The default equivalence procedure
     `EQ' is 'char=?'.  						The default costs `INSERT-COST',
     `DELETE-COST', and `SUBSTITUTE-COST' are 						unitary.

 -- macro: levenshtein-distance/scratch
          (levenshtein-distance/scratch SOURCE TARGET [GET-SCRATCH EQ INSERT-COST DELETE-COST SUBSTITUTE-COST])

     Calculates the edit distance from the source vector `SOURCE' to
     the target vector `TARGET'. The 						default equivalence procedure `EQ'
     is 'equal?'. The 						default costs `INSERT-COST', `DELETE-COST', and
     `SUBSTITUTE-COST' are unitary. The default 						get-scratch procedure
     `GET-SCRATCH' is 						make-vector.

     Few, if any, programs will use this procedure directly.
     This is like levenshtein-distance/vector, but allows
     get-scratch to be specified. get-scratch is a procedure of 						one
     term, n, that yields a vector of length n or greater,
     which is used for record-keeping during execution of the
     Levenshtein algorithm. make-vector can be used for
     get-scratch, although some programs comparing a large size 						or
     quantity of vectors may wish to reuse a record-keeping
     vector, rather than each time allocating a new one that will 						need
     to be garbage-collected.


File: levenshtein.egg.info,  Node: Examples,  Next: References,  Prev: Documentation,  Up: Top

3 Examples
**********


(use levenshtein-vector levenshtein-path-iterator)

(define iter
	(levenshtein-path-iterator
		(levenshtein-distance/vector* "YWCQPGK" "LAWYQQKPGKA" operations: #t))
; ignoring interpreter feedback
(define r0 (iter))
(define t0 r0)
(define r1 (iter))
(define r2 (iter))
(define r3 (iter))
(define r4 (iter))
(define r5 (iter))
(iter)
; r0 now has #f, since the iterator finishes by returning to the initial caller, which is the
; body of '(define r0 (iter))', thus re-binding r0. However, t0 has the original returned value.

; Please see the 'levenshtein*test.scm' files in the levenshtein egg for more examples.


File: levenshtein.egg.info,  Node: References,  Next: License,  Prev: Examples,  Up: Top

4 References
************

Pictures of Pr. Levenstein
(http://www.uni-bielefeld.de/(en)/ZIF/FG/2002Combinatorics/02-reception.html)

   Levenshtein distance @ Wikipedia
(http://en.wikipedia.org/wiki/Levenshtein_distance)

   Levenshtein Distance
(http://www-igm.univ-mlv.fr/~lecroq/seqcomp/node2.html)

   Levenshtein distance
(http://www.reference.com/browse/wiki/Levenshtein_distance)

   Talk:Levenshtein distance @ Wikipedia
(http://en.wikipedia.org/wiki/Talk:Levenshtein_distance)


File: levenshtein.egg.info,  Node: License,  Next: Index,  Prev: References,  Up: Top

5 License
*********


Copyright (c) 2005, 2006, Kon Lovett.  All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the Software),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ASIS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.


File: levenshtein.egg.info,  Node: Index,  Prev: License,  Up: Top

Index
*****

 [index ]
* Menu:

* clone-levenshtein-operator:            Edit operations.      (line 41)
* levenshtein-distance/byte:             8-bit Characters Only.
                                                               (line 13)
* levenshtein-distance/generic-sequence: Generic Sequence.     (line 10)
* levenshtein-distance/generic-string:   Generic String.       (line 12)
* levenshtein-distance/list:             Version 1.2 Compatibility.
                                                               (line 20)
* levenshtein-distance/scratch:          Version 1.2 Compatibility.
                                                               (line 44)
* levenshtein-distance/sequence:         Version 1.2 Compatibility.
                                                               (line 36)
* levenshtein-distance/string:           Version 1.2 Compatibility.
                                                               (line 12)
* levenshtein-distance/transpose-byte:   8-bit Characters Only.
                                                               (line 22)
* levenshtein-distance/vector:           Version 1.2 Compatibility.
                                                               (line 28)
* levenshtein-distance/vector*:          Only Vector - Baroque & Slow.
                                                               (line 14)
* levenshtein-fixnum-means:              Procedure Means.      (line 49)
* levenshtein-gennum-means:              Procedure Means.      (line 41)
* levenshtein-numbers-means:             Procedure Means.      (line 33)
* levenshtein-octet-means:               Procedure Means.      (line 25)
* levenshtein-operator:                  Edit operations.      (line 15)
* levenshtein-operator-delete!:          Edit operations.      (line 58)
* levenshtein-operator-equal?:           Edit operations.      (line 69)
* levenshtein-operator-ref:              Edit operations.      (line 48)
* levenshtein-operator-reset:            Edit operations.      (line 64)
* levenshtein-operator-set!:             Edit operations.      (line 53)
* levenshtein-operator?:                 Edit operations.      (line 36)
* levenshtein-path-iterator:             Only Vector - Baroque & Slow.
                                                               (line 45)
* levenshtein-utf8-means:                Procedure Means.      (line 17)
* make-levenshtein-operator:             Edit operations.      (line 31)



Tag Table:
Node: Top244
Node: About this egg630
Node: Version history809
Node: Requirements1109
Node: Documentation1397
Node: 8-bit Characters Only1969
Node: Procedure Means2759
Node: Generic String4212
Node: Generic Sequence5560
Node: Only Vector - Baroque & Slow7508
Node: Edit operations9334
Node: Version 1.2 Compatibility11552
Node: Examples14515
Node: References15252
Node: License15831
Node: Index17022

End Tag Table
