Commit Graph

225 Commits (main)

Author SHA1 Message Date
Olivier 'reivilibre' 3b1eae7f7e Remove forgotten println statements 2022-06-14 23:07:58 +01:00
Olivier 'reivilibre' 4896ecd426 Add tests and fixes 2022-06-12 16:30:36 +01:00
Olivier 'reivilibre' aa4567c623 Use the sniffed encoding in page extraction 2022-06-12 15:49:02 +01:00
Olivier 'reivilibre' 5701b1e6d8 Add full-procedure sniffer 2022-06-12 15:41:17 +01:00
Olivier 'reivilibre' c451a12e44 Pass the bytes through when extracting HTML 2022-06-12 15:26:46 +01:00
Olivier 'reivilibre' c783f89f72 Create a crate for HTML charset detection 2022-06-12 14:47:42 +01:00
Olivier 'reivilibre' b08a883831 Update lock 2022-06-11 20:40:23 +01:00
Olivier 'reivilibre' d1bbb91477 Decrease default crawl delay a little bit 2022-06-11 00:58:00 +01:00
Olivier 'reivilibre' 504be33b8a Deny content based on content-type before downloading it 2022-06-11 00:57:24 +01:00
Olivier 'reivilibre' 5d1f35a8ee Deny content based on content-length header 2022-06-11 00:12:15 +01:00
Olivier 'reivilibre' bb396dfb5b Reinstate backoffs on startup 2022-06-10 23:35:24 +01:00
Olivier 'reivilibre' fc69b1b192 Add backoff reinstatement function to store 2022-06-10 23:02:13 +01:00
Olivier 'reivilibre' 75afb8b559 Change lack of content-type to be a permanent failure. In practice, I see this happening on URLs with unknown filetypes 2022-06-05 10:16:17 +01:00
Olivier 'reivilibre' e88bf6cb44 Convert size limit hits into permanent failures 2022-06-05 10:12:18 +01:00
Olivier 'reivilibre' e66ac80484 Allow passing permanent failures up as errors 2022-06-05 10:09:03 +01:00
Olivier 'reivilibre' fb3eae9226 Accept forbidden robots.txt — if they forbid us from knowing about things, we will be cheeky 2022-06-04 23:54:26 +01:00
Olivier 'reivilibre' d18d0635d7 Don't hammer robots.txt 2022-06-04 23:54:22 +01:00
Olivier 'reivilibre' d8f4baf9a3 Fix the database storage size limit 2022-06-04 23:50:10 +01:00
Olivier 'reivilibre' d3600bfb73 Add metric for new enqueued URLs 2022-06-04 23:38:12 +01:00
Olivier 'reivilibre' aa08463499 Update the metrics more frequently to prevent spiking in rates 2022-06-04 23:36:59 +01:00
Olivier 'reivilibre' bde4a7e5e2 Allow inspecting more domains 2022-06-04 23:14:57 +01:00
Olivier 'reivilibre' f8756e1359 Implement --prefix for the DB inspector 2022-06-04 23:12:00 +01:00
Olivier 'reivilibre' 3d3ab4a580 Add a grafana dashboard 2022-06-04 23:09:46 +01:00
Olivier 'reivilibre' e2b4b3127d Simplify Flake (remove 'src' input that causes problem on 'old' Nixes) 2022-06-04 22:30:34 +01:00
Olivier 'reivilibre' 8e3a44ee5e Start moving Nix Flake into the root 2022-06-04 22:30:17 +01:00
Olivier 'reivilibre' 23733efd3f Add prototype raker module to the Nix flake 2022-04-26 20:55:41 +01:00
Olivier 'reivilibre' 588d5bdf54 Sort out lack of Python (only in ARM64 build for some reason) 2022-04-07 23:07:15 +01:00
Olivier 'reivilibre' e154835f77 Fix asset build 2022-04-07 23:05:51 +01:00
Olivier 'reivilibre' 33b3bd8e49 Build the static assets prior to release 2022-04-07 23:03:58 +01:00
Olivier 'reivilibre' d8f96895e6 Don't add too many hyphenminuses 2022-04-07 22:36:45 +01:00
Olivier 'reivilibre' e1c302eebf Fix CI tagging information? 2022-04-07 22:23:55 +01:00
Olivier 'reivilibre' 17f4c03617 Fix release CI? 2022-04-07 21:00:50 +01:00
Olivier 'reivilibre' 41066948fc Update Nix Flake to use unified config :-) 2022-04-06 22:01:58 +01:00
Olivier 'reivilibre' 8ac99c154f Fix unified config in web 2022-04-06 21:50:17 +01:00
Olivier 'reivilibre' df356da498 Use the unified config in the indexer 2022-04-06 21:49:40 +01:00
Olivier 'reivilibre' 4fd2dc393e Use the unified config in the raker 2022-04-05 17:50:55 +01:00
Olivier 'reivilibre' dd0097c1aa Use the unified config in the web UI 2022-04-05 22:17:28 +01:00
Olivier 'reivilibre' 340b4e29a6 Simplify web config 2022-04-05 17:45:24 +01:00
Olivier 'reivilibre' 3c3e2fc0bf Delete the old config files 2022-04-05 17:45:24 +01:00
Olivier 'reivilibre' 616db3d633 Make a sample configuration file 2022-04-05 22:15:56 +01:00
Olivier 'reivilibre' bc2801d7f0 Fix the flake! 2022-04-05 21:40:45 +01:00
Olivier 'reivilibre' ca45371d40 stash 2022-04-05 19:41:14 +01:00
Olivier 'reivilibre' 360a902263 Allow specifying the listener binding as CLI arg 1 2022-04-04 21:04:43 +01:00
Olivier 'reivilibre' 1c0b501a21 Fix libclang dep? 2022-04-04 19:46:12 +01:00
Olivier 'reivilibre' 62c4e41162 Add libclang1 to the CI config 2022-04-03 18:27:11 +01:00
Olivier 'reivilibre' 96a01e0aaa Dissolve links before emitting documents to the pack store. Fixes #9 2022-04-03 10:47:18 +01:00
Olivier 'reivilibre' 6c2ff9daec Add minimum free space cutoff feature for the raker 2022-04-03 10:18:41 +01:00
Olivier 'reivilibre' f335d0daaa Working Nix flake based on Naersk 2022-04-02 21:27:25 +01:00
Olivier 'reivilibre' 001357c825 Have a good crack at trying to get buildRustPackage-based flake working 2022-04-01 23:50:36 +01:00
Olivier 'reivilibre' 99a4c91ac3 Have a good crack at trying to get a naersk-based flake working. It falls over because it needs the dev DB creating before building :-( 2022-04-01 23:24:27 +01:00