Package boilerpipe: Information
Source package: boilerpipe
Version: 1.2.0-alt1_12jpp8
Build time: Feb 6, 2019, 11:05 AM in task #220657
Category: Development/Java
FTBFS (fails to build from source):
Architecture | FTBFS since | Update |
---|---|---|
x86_64 | Aug. 1, 2021 | May 12, 2024 |
i586 | Aug. 1, 2021 | Aug. 1, 2021 |
Home page: https://github.com/kohlschutter/boilerpipe
License: ASL 2.0
Summary: Boilerplate Removal and Fulltext Extraction from HTML pages
Description:
The boilerpipe library provides algorithms to detect and remove the surplus "clutter" (boilerplate, templates) around the main textual content of a web page. The library already provides specific strategies for common tasks (for example, news article extraction) and can be easily extended for individual problem settings. Extraction is very fast (on the order of milliseconds), needs only the input document (no global or site-level information is required), and is usually quite accurate.
Maintainer: Igor Vlasenko
Last changed
Feb. 5, 2019 Igor Vlasenko 1.2.0-alt1_12jpp8
- fc29 update
April 15, 2018 Igor Vlasenko 1.2.0-alt1_11jpp8
- java update
Nov. 9, 2017 Igor Vlasenko 1.2.0-alt1_10jpp8
- fc27 update