Text Summarizer is provided for the CLARIN VLO. The service automatically summarizes arbitrary text documents: it takes any text as input and returns a summary of it as output.
The summarization task can be defined as follows: given one or more input documents, produce a concise and coherent summary that contains the most important information from them. In this definition, “concise” means that the summary should be shorter than the input documents, and “coherent” means that it should be grammatically correct and readable. The “importance” of information is determined by context and subject area.
The main application areas of automatic summarization are:
- Solving problems of classification and clustering of texts.
- In library and search systems, getting acquainted with the content of a document and deciding whether to access the original text.
- Summarizing news headings.
- In online search engines, creating snippets of input text that contain user query words and are used to describe links.
- Compiling excerpts from email correspondence.
- In dialog systems, generating answers to questions using a summary of several documents.
The developed summarization system is based on the extractive method. With this method, the document is first split into a set of sentences. From this set, the system selects the sentences that best meet the specified criteria, i.e. the most relevant ones. The result is a subset of the sentences of the source text.
The details are presented here.
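To make the extractive approach concrete, the following is a minimal Python sketch, not the service's actual implementation. The selection criterion used here (average word frequency per sentence), the small stop-word list, the naive sentence splitter, and the names `split_sentences`, `tokenize`, and `summarize` are all assumptions made for illustration; the criteria used by the real system are not specified in this description.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "it", "on", "for", "that"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def summarize(text, max_sentences=2):
    """Return up to max_sentences source sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole document serve as the relevance signal.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so that long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score; ties keep document order (sorted is stable).
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sample = "Cats sleep a lot. Cats and dogs are pets. The weather is nice today."
    for sentence in summarize(sample, max_sentences=2):
        print(sentence)
```

Because the output is built only from sentences that already occur in the input, the summary is always a subset of the source text, which is the defining property of the extractive method. A production system would likely use a more robust sentence splitter and a richer relevance criterion.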