QuickPeep Seed Formats
Motivation
The QuickPeep seed format is a simple textual format that can be used to house seeds (initial URLs for the raker and categories for the indexer).
The main seed pack will be tracked in Git and released under an open data licence.
Such a data set may also be useful to other projects. Contributions to the set of seeds will help the search engine return more results.
The format
# Remark
Category1, Category2:
https://example.org
https://example.com/blah/* [Tag3, Tag4]
Category4, Category5:
https://blahblahblah.com
A file consists of one or more blocks. A block starts with a header line: a comma-separated list of tags (usually broad categories) followed by a colon. The block then continues with one URL or URL pattern per line. A URL or URL pattern may optionally be followed by a square-bracketed, comma-separated list of additional tags.
A block should ideally end on a blank line, but this is not required.
Blank lines and lines beginning with # are ignored.
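The rules above can be sketched as a small parser. This is only an illustration in Python, not QuickPeep's actual implementation: the names `Seed`, `parse_seeds` and `SAMPLE` are hypothetical, and several behaviours are assumptions rather than part of the format description. These are that a header is recognised as a line ending in a colon that does not contain `://`, that a line's additional tags are appended to the tags of its block, that leading and trailing whitespace is tolerated, and that a URL appearing before any header is an error.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Seed:
    """One URL or URL pattern with its tags (hypothetical type)."""
    url: str
    tags: list[str] = field(default_factory=list)


# A URL or URL pattern, optionally followed by "[Tag, Tag]".
_SEED_LINE = re.compile(r"^(?P<url>\S+)(?:\s+\[(?P<tags>[^\]]*)\])?$")


def _split_tags(text: str) -> list[str]:
    """Split a comma-separated tag list, dropping empty entries."""
    return [tag.strip() for tag in text.split(",") if tag.strip()]


def parse_seeds(text: str) -> list[Seed]:
    """Parse seed-format text into a flat list of Seed entries."""
    seeds: list[Seed] = []
    block_tags = None  # tags of the current block; None before any header

    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()

        # Blank lines and lines beginning with # are ignored.
        if not line or line.startswith("#"):
            continue

        # Assumption: a header ends with ":" and is not itself a URL.
        if line.endswith(":") and "://" not in line:
            block_tags = _split_tags(line[:-1])
            continue

        # Assumption: a URL outside any block is rejected.
        if block_tags is None:
            raise ValueError(f"line {lineno}: URL before any header line")

        match = _SEED_LINE.match(line)
        if match is None:
            raise ValueError(f"line {lineno}: cannot parse {line!r}")

        # Assumption: additional tags extend the block's tags.
        extra = _split_tags(match.group("tags") or "")
        seeds.append(Seed(match.group("url"), block_tags + extra))

    return seeds


SAMPLE = """\
# Remark
Category1, Category2:
https://example.org
https://example.com/blah/* [Tag3, Tag4]

Category4, Category5:
https://blahblahblah.com
"""

for seed in parse_seeds(SAMPLE):
    print(seed.url, seed.tags)
```

Because blank lines are skipped rather than treated as terminators, a block here simply runs until the next header line, which matches the note that ending a block on a blank line is recommended but not required.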