Update the README a little bit

2022-07-02 22:55:18 +01:00 · d8d6f13f7e
commit d8d6f13f7e
parent 09f70ad8ce
1 changed file with 11 additions and 5 deletions
--- a/README.md
+++ b/README.md
@@ -26,11 +26,11 @@ If you need to fall back to a conventional search engine, this will eventually b

 *Crossed-out things are aspirational and not yet implemented.*

- ~~Shareable 'rakepacks', so that anyone can run their own search instance without needing to rake (crawl) themselves~~
-    - ~~Dense encoding to minimise disk space usage; compressed with Zstd?~~
+- Shareable 'rakepacks', so that anyone can run their own search instance without needing to rake (crawl) themselves
+    - Dense encoding to minimise disk space usage; compressed with Zstd.
 - Raking (crawling) support for
    - HTML (including redirecting to Canonical URLs)
-        - ~~Language detection~~
+        - Language detection for when the metadata is absent.
    - Redirects
    - ~~Gemtext over Gemini~~
    - RSS, Atom and JSON feeds
@@ -43,9 +43,9 @@ If you need to fall back to a conventional search engine, this will eventually b
 - Article content extraction, to provide more weight to words found within the article content (based on a Rust version of Mozilla's *Readability* engine)
 - (Misc)
    - ~~Use of the Public Suffix List~~
-    - ~~Tagging URL patterns; e.g. to mark documentation as 'old'.~~
+    - Tagging URL patterns; e.g. to mark documentation as 'old'.
 - ~~Page duplicate content detection (e.g. to detect `/` and `/index.html`, or non-HTTPS and HTTPS, or non-`www` and `www`...)~~
- ~~Language detection for pages that don't have that metadata available.~~
+


 ## Limitations
@@ -62,11 +62,17 @@ If you need to fall back to a conventional search engine, this will eventually b

 *Not written yet.*

+The stages of the QuickPeep pipeline are briefly described in [an introductory blog post][qp_intro_blog].
+
+[qp_intro_blog]: https://o.librepush.net/blog/2022-07-02-quickpeep-small-scale-web-search-engine
+

 ## Development and Running

 *Not written yet.*

+Some hints may be obtained from the introductory blog post mentioned in the 'Architecture' section, but it's probably quite difficult to follow right now.
+

 ### Helper scripts