Olivier 'reivilibre'
|
9866da2d16
|
Load documents and index them!
|
2022-03-24 23:45:36 +00:00 |
Olivier 'reivilibre'
|
5418afe8dd
|
Read rake pack records in the indexer
|
2022-03-24 23:37:11 +00:00 |
Olivier 'reivilibre'
|
4f4c3e36d1
|
Prepare indexer
|
2022-03-24 23:19:32 +00:00 |
Olivier 'reivilibre'
|
7fa302b2c2
|
Support flush() for Tantivy
|
2022-03-24 22:57:55 +00:00 |
Olivier 'reivilibre'
|
139fe380bc
|
Cargo fmt
|
2022-03-24 22:57:13 +00:00 |
Olivier 'reivilibre'
|
f43424de94
|
Add an open function for the Tantivy Backend
|
2022-03-24 22:55:21 +00:00 |
Olivier 'reivilibre'
|
73154e7e34
|
Move the indexer around
|
2022-03-24 19:50:20 +00:00 |
Olivier 'reivilibre'
|
7aa5521c5d
|
Think a bit about how indexers will fit together
continuous-integration/drone the build failed
|
2022-03-23 20:57:51 +00:00 |
Olivier 'reivilibre'
|
1773ba4f44
|
Start fleshing out the indexer
|
2022-03-23 20:11:12 +00:00 |
Olivier 'reivilibre'
|
0060ec0764
|
Add seed sorting tool in order to approach first proof of concept
continuous-integration/drone the build failed
|
2022-03-22 23:54:28 +00:00 |
Olivier 'reivilibre'
|
528e0bbf43
|
Cargo fix and fmt
|
2022-03-22 23:23:31 +00:00 |
Olivier 'reivilibre'
|
753d03327a
|
Fix weeds slipping in as seeds
|
2022-03-22 23:23:17 +00:00 |
Olivier 'reivilibre'
|
be84c0e1cc
|
Add DB inspection tool
|
2022-03-22 23:23:15 +00:00 |
Olivier 'reivilibre'
|
db9fe77c16
|
Re-apply seeds and weeds at import time, to on-hold URLs
|
2022-03-22 20:01:26 +00:00 |
Olivier 'reivilibre'
|
2f5131e690
|
Don't enqueue references if they're weeds
|
2022-03-22 19:56:10 +00:00 |
Olivier 'reivilibre'
|
641c575660
|
Allow importing 'weeds' as opposed to 'seeds'
|
2022-03-22 19:52:50 +00:00 |
Olivier 'reivilibre'
|
05ebfc8998
|
Add process metrics
continuous-integration/drone the build failed
|
2022-03-21 20:19:59 +00:00 |
Olivier 'reivilibre'
|
2d35298a2e
|
Add some metrics for emitted packs
|
2022-03-21 19:56:29 +00:00 |
Olivier 'reivilibre'
|
806192fab5
|
Shut down faster (don't wait for crawl delays)
|
2022-03-21 19:39:31 +00:00 |
Olivier 'reivilibre'
|
a60ace0482
|
Fix bug where Ctrl+C wouldn't hang up the emitters
|
2022-03-21 19:38:03 +00:00 |
Olivier 'reivilibre'
|
649dec7fa9
|
Add tool for viewing what's on hold
|
2022-03-21 19:33:07 +00:00 |
Olivier 'reivilibre'
|
fcc1f517af
|
Support pages where we can't extract the article
|
2022-03-21 19:25:55 +00:00 |
Olivier 'reivilibre'
|
e0fb714f7a
|
Some bugfixes that get the raker mostly going
|
2022-03-21 19:24:20 +00:00 |
Olivier 'reivilibre'
|
51d5b9208b
|
Put URLs on hold rather than the queue if they are not allowed
|
2022-03-21 19:16:48 +00:00 |
Olivier 'reivilibre'
|
9ef4fef858
|
Shut down gently on SIGINT or SIGTERM (supposedly)
|
2022-03-21 19:11:07 +00:00 |
Olivier 'reivilibre'
|
71c22daf0d
|
Emit rakepacks from the raker
|
2022-03-21 19:07:35 +00:00 |
Olivier 'reivilibre'
|
f60031a462
|
Improve domain acquisition and shutdown logic
|
2022-03-20 23:01:56 +00:00 |
Olivier 'reivilibre'
|
06b3c54b81
|
Remove active domain after all pages are raked
continuous-integration/drone the build failed
|
2022-03-20 22:24:35 +00:00 |
Olivier 'reivilibre'
|
6d109632a3
|
Sort through TODO items
|
2022-03-20 22:23:12 +00:00 |
Olivier 'reivilibre'
|
120702ce0e
|
Don't forget to commit after R/W operations
continuous-integration/drone the build failed
|
2022-03-20 22:01:57 +00:00 |
Olivier 'reivilibre'
|
f6efc7a4e5
|
Fix seed finder
|
2022-03-20 21:57:04 +00:00 |
Olivier 'reivilibre'
|
173b8a4de1
|
Comment out tagging code from the rake seeder
|
2022-03-20 21:51:33 +00:00 |
Olivier 'reivilibre'
|
179f04b2dd
|
Import the seeds and show stats
|
2022-03-20 21:46:50 +00:00 |
Olivier 'reivilibre'
|
abf814550a
|
Import seeds (theoretically)
|
2022-03-20 21:41:32 +00:00 |
Olivier 'reivilibre'
|
8df430c7f1
|
Load and parse seeds
|
2022-03-20 20:50:31 +00:00 |
Olivier 'reivilibre'
|
39aa4eb9b7
|
Add seed file parser
|
2022-03-20 20:29:32 +00:00 |
Olivier 'reivilibre'
|
5e61386a83
|
Use ArcIntern and CompactString as needed
|
2022-03-20 15:49:00 +00:00 |
Olivier 'reivilibre'
|
fc90ea4e1f
|
Set migration version to something a little bit more intuitive
|
2022-03-20 15:44:53 +00:00 |
Olivier 'reivilibre'
|
6bdc505394
|
Rename qp-seeds to qp-seedrake
|
2022-03-20 15:42:06 +00:00 |
Olivier 'reivilibre'
|
5be6cade11
|
STASH notes about seeds
continuous-integration/drone the build failed
|
2022-03-20 15:20:13 +00:00 |
Olivier 'reivilibre'
|
c3ccd64d5f
|
Add way of periodically tracking database metrics
|
2022-03-20 14:24:30 +00:00 |
Olivier 'reivilibre'
|
e651a953f6
|
STASH to calculate datastore metrics
continuous-integration/drone the build failed
|
2022-03-20 13:26:34 +00:00 |
Olivier 'reivilibre'
|
410a4e962b
|
Ignore the workbench directory
continuous-integration/drone the build failed
|
2022-03-20 12:24:52 +00:00 |
Olivier 'reivilibre'
|
f9aac34104
|
Add support for Prometheus metrics
|
2022-03-20 12:24:35 +00:00 |
Olivier 'reivilibre'
|
a907817831
|
Use mold linker for faster compilation
|
2022-03-20 12:24:19 +00:00 |
Olivier 'reivilibre'
|
4f85aebd38
|
Theoretically allow graceful stop
|
2022-03-20 06:33:39 +00:00 |
Olivier 'reivilibre'
|
085020b80d
|
Get ever closer to a raker being usable
continuous-integration/drone the build failed
|
2022-03-20 00:08:37 +00:00 |
Olivier 'reivilibre'
|
ea4f2d1332
|
Some partial progress towards raking pages
|
2022-03-19 22:57:36 +00:00 |
Olivier 'reivilibre'
|
5bab279cc2
|
STASH
continuous-integration/drone the build failed
|
2022-03-19 15:39:59 +00:00 |
Olivier 'reivilibre'
|
ab0b1e84ee
|
STASH work on Raking
continuous-integration/drone the build failed
|
2022-03-19 21:04:12 +00:00 |