Package boilerpipe: Information
Source package: boilerpipe
Version: 1.2.0-alt1_13jpp8
Build time: May 27, 2019, 04:08 AM in the task #230257
Category: Development/Java
Home page: https://github.com/kohlschutter/boilerpipe
License: ASL 2.0
Summary: Boilerplate Removal and Fulltext Extraction from HTML pages
Description:
The boilerpipe library provides algorithms to detect and remove the surplus "clutter" (boilerplate, templates) around the main textual content of a web page. The library already provides specific strategies for common tasks (for example, news article extraction) and can be easily extended for individual problem settings. Extraction is very fast (milliseconds), needs only the input document (no global or site-level information is required), and is usually quite accurate.
Maintainer: Igor Vlasenko
Last changed
May 25, 2019 Igor Vlasenko 1.2.0-alt1_13jpp8
- new version
Feb. 5, 2019 Igor Vlasenko 1.2.0-alt1_12jpp8
- fc29 update
April 15, 2018 Igor Vlasenko 1.2.0-alt1_11jpp8
- java update