MtSeg
The Multext multilingual segmenter tools
Di Cristo Philippe, CNRS
The purpose of the segmenter is to split a text into words and special tokens such as abbreviations and numbers, as well as certain multi-word units, and to detect and mark sentence boundaries.
The segmenter has been developed in the context of the MULTEXT project.
Please send comments and suggestions to multext@lpl.univ-aix.fr.
Copyright © Centre National de la Recherche Scientifique, 1996.
This document is only a draft and should be cited as such. Creators of WWW documents pointing to it are warned that its content and location may change without notice. This document is provided as is without any express or implied warranties. While every effort has been taken to ensure the accuracy of the information contained, the authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein. Permission is granted to make and distribute verbatim copies of this document for non-commercial purposes provided this copyright, disclaimer and permission notice are preserved on all copies.