827 commits

Author SHA1 Message Date
Markus Heiser
edfbf1e118 [refactor] typification of SearXNG (initial) / result items (part 1)
Typification of SearXNG
=======================

This patch introduces the typing of the results.  The why and how are described
in the documentation; please generate the documentation:

    $ make docs.clean docs.live

and read the following articles in the "Developer documentation":

- result types --> http://0.0.0.0:8000/dev/result_types/index.html

The result types are available from the `searx.result_types` module.  The
following have been implemented so far:

- base result type: `searx.result_types.Result`
  --> http://0.0.0.0:8000/dev/result_types/base_result.html

- answer results
  --> http://0.0.0.0:8000/dev/result_types/answer.html

including the type for translations (inspired by #3925).  For all other
types (which still need to be set up in subsequent PRs), template documentation
has been created for the transition period.

Doc of the fields used in Templates
===================================

The template documentation is the basis for the typing and is the first complete
documentation of the results (needed for engine development).  It is the
"working paper" (the plan) with which further typifications can be implemented
in subsequent PRs.

- https://github.com/searxng/searxng/issues/357

Answer Templates
================

With the new (sub) types for `Answer`, the templates for the answers have also
been revised; `Translation` results are now displayed with collapsible entries
(inspired by #3925).

    !en-de dog

Plugins & Answerer
==================

The implementation for `Plugin` and `Answer` has been revised, see
documentation:

- Plugin: http://0.0.0.0:8000/dev/plugins/index.html
- Answerer: http://0.0.0.0:8000/dev/answerers/index.html

With `PluginStorage` and `AnswerStorage` to manage those items (in follow-up
PRs, `ArticleStorage`, `InfoStorage` and .. will be implemented)

Autocomplete
============

The autocompletion had a bug where the results from `Answer` had not been shown
in the past.  To test, activate autocompletion and try search terms for which we
have answerers:

- statistics: type `min 1 2 3` .. in the completion list you should find an
  entry like `[de] min(1, 2, 3) = 1`

- random: type `random uuid` .. in the completion list, the first item is a
  random UUID
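
As an illustration of the kind of answer expected in the completion list, here
is a minimal sketch of a statistics-style answerer.  The function name
`answer` and the `FUNCTIONS` table are hypothetical and not the SearXNG
answerer API; the sketch only reproduces the `min(1, 2, 3) = 1` format
mentioned above (without the locale prefix).

```python
from typing import Optional

# Hypothetical keyword table; the real answerer may support other functions.
FUNCTIONS = {"min": min, "max": max, "sum": sum}


def answer(query: str) -> Optional[str]:
    """Return an answer for queries like ``min 1 2 3``, otherwise None."""
    parts = query.split()
    if len(parts) < 2 or parts[0] not in FUNCTIONS:
        return None
    try:
        numbers = [int(p) for p in parts[1:]]
    except ValueError:
        # Not a list of integers, so this sketch gives no answer.
        return None
    args = ", ".join(str(n) for n in numbers)
    return f"{parts[0]}({args}) = {FUNCTIONS[parts[0]](numbers)}"


print(answer("min 1 2 3"))  # min(1, 2, 3) = 1
```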

Extended Types
==============

SearXNG extends e.g. the request and response types of flask and httpx; a module
has been set up for type extensions:

- Extended Types
  --> http://0.0.0.0:8000/dev/extended_types.html

Unit-Tests
==========

The unit tests have been completely revised.  In the previous implementation,
the runtime (the global variables such as `searx.settings`) was not initialized
before each test, so the runtime environment with which a test ran was always
determined by the tests that ran before it.  This was also the reason why we
sometimes had to observe non-deterministic errors in the tests in the past:

- https://github.com/searxng/searxng/issues/2988 is one example for the Runtime
  issues, with non-deterministic behavior ..

- https://github.com/searxng/searxng/pull/3650
- https://github.com/searxng/searxng/pull/3654
- https://github.com/searxng/searxng/pull/3642#issuecomment-2226884469
- https://github.com/searxng/searxng/pull/3746#issuecomment-2300965005
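
The idea of initializing the runtime before each test can be sketched as
follows.  This is a simplified illustration, not the SearXNG test suite: the
`settings` dict stands in for a global such as `searx.settings`, and the class
names and values are assumptions.

```python
import copy
import unittest

# Stand-in for a module-level global such as ``searx.settings`` (hypothetical values).
DEFAULT_SETTINGS = {"search": {"autocomplete": ""}}
settings = copy.deepcopy(DEFAULT_SETTINGS)


class RuntimeTestCase(unittest.TestCase):
    """Base class that resets the global runtime before every test."""

    def setUp(self):
        settings.clear()
        settings.update(copy.deepcopy(DEFAULT_SETTINGS))


class TestAutocomplete(RuntimeTestCase):
    def test_enable(self):
        # Mutates the global state ...
        settings["search"]["autocomplete"] = "duckduckgo"
        self.assertEqual(settings["search"]["autocomplete"], "duckduckgo")

    def test_default(self):
        # ... but this test still sees the defaults, whatever ran before it.
        self.assertEqual(settings["search"]["autocomplete"], "")
```

Without the reset in `setUp`, `test_default` would pass or fail depending on
whether `test_enable` ran first.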

Why msgspec.Struct
==================

We have already discussed typing based on e.g. `TypedDict` or `dataclass` in the past:

- https://github.com/searxng/searxng/pull/1562/files
- https://gist.github.com/dalf/972eb05e7a9bee161487132a7de244d2
- https://github.com/searxng/searxng/pull/1412/files
- https://github.com/searxng/searxng/pull/1356

In my opinion, TypedDict is unsuitable because the objects are still dictionaries
and not instances of classes / the `dataclass` are classes but ...
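
The point about `TypedDict` can be seen with the standard library alone.  The
class names `ResultDict` and `ResultClass` below are made up for this sketch
and are not part of SearXNG.

```python
from dataclasses import dataclass
from typing import TypedDict


class ResultDict(TypedDict):
    url: str
    title: str


@dataclass
class ResultClass:
    url: str
    title: str


as_dict = ResultDict(url="https://example.org", title="Example")
as_obj = ResultClass(url="https://example.org", title="Example")

# A TypedDict "instance" is a plain dict at runtime, not an instance of a class.
print(type(as_dict) is dict)            # True
print(isinstance(as_obj, ResultClass))  # True

# The declared field types of a TypedDict are not checked at runtime either.
unchecked = ResultDict(url=42, title=None)
print(unchecked["url"])                 # 42
```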

`msgspec.Struct` combines the advantages of typing and runtime behaviour and
also offers the option of (fast) serialization (incl. type check) of the objects.

Currently not possible but conceivable with `msgspec`: outsourcing the engines
into separate processes.  What possibilities this opens up in the future is left
to the imagination!

Internally, we have already defined that it is desirable to decouple the
development of the engines from the development of the SearXNG core / The
serialization of the `Result` objects is a prerequisite for this.

HINT: The threads listed above were the template for this PR, even though the
implementation here is based on msgspec.  They should also be an inspiration for
the following PRs of typification, as the models and implementations can provide
a good direction.

Why just one commit?
====================

I tried to create several (thematically separated) commits, but gave up at some
point ... there are too many things to tackle at once / The comprehensibility of
the commits would not be improved by a thematic separation. On the contrary, we
would have to make multiple changes at the same places and the goal of a change
would be vaguely recognizable in the fog of the commits.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-01-28 07:07:08 +01:00
Denperidge
70f1b65008 [feat] engines: add NixOS Wiki
Co-authored-by: Bnyro <bnyro@tutanota.com>
2025-01-26 20:12:19 +01:00
Bnyro
c1fcee9d9f [docs] settings_search.rst: add missing autocompletion providers 2025-01-26 08:38:29 +01:00
Bnyro
f766faca3f [feat] engines: add ipernity (images) 2025-01-20 17:22:32 +01:00
Denperidge
3333d9f385 [feat] engines: public domain image archive 2025-01-20 12:56:15 +01:00
DanielMowitz
272e39893d [feat]: engines: add astrophysical data system 2025-01-16 20:27:55 +01:00
Austin-Olacsi
73e395c8ce [feat] engines: re-add alexandria.org 2024-12-25 13:13:18 +01:00
Bnyro
a7537a6935 [feat] search: add url formatting preference 2024-12-01 13:08:50 +01:00
Bnyro
5a9c1c6b5b [fix] crowdview engine: html tags in title and content 2024-11-28 06:19:55 +01:00
Markus Heiser
78f5300830 [chore] drop sjp engine: WEB side has changed a long time ago
The web page (PL only) has changed and there is now also a kind of CAPTCHA.
There is currently no way to restore the function of this engine.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-26 15:45:02 +01:00
Markus Heiser
ac0c6cc2d1 [chore] remove invalid base_url from settings.yml engines
The engines do not have / do not need a property `base_url`, let's remove it from
the settings.yml

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-26 10:06:07 +01:00
Bnyro
8744dd3c71 [feat] metrics: support for open metrics 2024-11-24 14:25:49 +01:00
Markus Heiser
0253c10b52 [feat] engine: add adobe stock video and audio engines
The engine has been revised; there is now the option ``adobe_content_types``
with which it is possible to configure engines for video and audio from the
adobe stock.  BTW this patch adds documentation to the engine.

To test all three engines in one use a search term like::

    !asi !asv !asa sound

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-24 11:56:12 +01:00
Bnyro
f20a7632f1 [feat] engine: add adobe stock photos 2024-11-24 11:56:12 +01:00
Markus Heiser
0f9694c90b [clean] Internet Archive Scholar search API no longer exists
Engine was added in #2733 but the API no longer exists. Related:

- https://github.com/searxng/searxng/issues/4038

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-23 17:59:38 +01:00
Leo Liu
dfaf5868e2 [fix] settings.yml - enabled_plugins: document to reflect default settings
Remove 'Autodetect search language', which is no longer valid, from settings,
and add 'Unit converter plugin', which is now default enabled, to settings.
2024-11-10 16:09:41 +01:00
Bnyro
9f48d5f84f [feat] engine: support for openlibrary 2024-10-15 13:06:00 +02:00
Brock Vojkovic
e17d7632d0 [feat] add favicons to result urls 2024-10-05 08:18:28 +02:00
Grant Lanham
2a29e16d25 [feat] implement mariadb engine 2024-10-03 13:04:06 +02:00
Zhijie He
6be56aee11 add Cloudflare AI Gateway engine
add settings for Cloudflare AI Gateway engine

set utf8 encode for data, fix non english char cause 500 error

format json data

fixed indentation and config format error

fix line-length limitation in CI

reformatted code for CI

reformatted code for CI

limit system prompts to less than 120 chars

cleanup unused variable & format code
2024-09-23 07:02:10 +02:00
Markus
28dc623785 [fix] drop engine alexandria.org
The origin alexandria.org is broken:

  https://www.alexandria.org/?c=&r=&a=0&q=foo

returns "504 Gateway Time-out"

- Closes: https://github.com/searxng/searxng/issues/3786

Signed-off-by: Markus <markus@venom.fritz.box>
2024-09-15 14:45:48 +02:00
Markus
3630e464b3 [fix] drop engine gpodder
gpodder is ultra slow on search terms like foo

  https://gpodder.net/search.json?q=foo

takes up to a minute to return an empty json response.

- Closes: https://github.com/searxng/searxng/issues/3785
Signed-off-by: Markus <markus@venom.fritz.box>
2024-09-15 14:45:38 +02:00
Bnyro
84e2f9d46a [feat] gitlab: implement dedicated module
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-09-15 08:04:21 +02:00
Finn Steffens
9e2bfe14db [feat] engine: add right dao
* [feat] engine: add right dao

* [enh] right dao engine: allow additional classes

Allow additional classes while parsing to prevent the engine from breaking in the future if additional classes are added to the elements.

Co-authored-by: Bnyro <bnyro@tutanota.com>
2024-09-12 17:51:47 +02:00
Bnyro
94a1f39bde [engine] bahnhof.de: remove engine 2024-09-03 18:52:54 +02:00
Markus Heiser
b774ee04ba [mod] enable calculator and allow plugin on public instances
Remove quirks that prevented the Calculator from being used on public instances.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-09-03 18:36:28 +02:00
Austin-Olacsi
e45b771ffa [feat] engine: implementation of yandex (web, images)
It's set to inactive in settings.yml because of CAPTCHA.  To use the engine, you
need to remove that setting from the settings.yml.

Closes: https://github.com/searxng/searxng/issues/961
2024-08-21 12:08:35 +02:00
Austin-Olacsi
9f47bdefc6 [feat] engine: implementation of encyclosearch 2024-07-28 10:45:51 +02:00
Markus Heiser
d7bb97b616 [fix] engine yacy images: increase timout from 3 to 5sec
It's a leftover from 657dcb97

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-27 17:54:41 +02:00
Bnyro
84abab0808 [feat] engine: implementation of geizhals.de 2024-07-27 11:46:25 +02:00
Markus Heiser
657dcb973a [fix] engine yacy: update list of base URLs
https://search.lomig.me
  Poor results / tested `!yacy :en hello` and got zero results

https://yacy.ecosys.eu
  Slow response (> 6sec for trivial search terms)

https://search.webproject.link
  Dead instance / URL offline

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-20 09:59:43 +02:00
Bnyro
e4da22ee51 [feat] engine: implementation of alpine linux packages
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-14 17:57:58 +02:00
Grant Lanham
ef103ba80a Implement google/brave switch in Mullvad Leta
cleanup

Import annotations
2024-07-07 08:08:11 +02:00
Bnyro
4eaa0dd275 [fix] gentoo: use mediawiki engine 2024-07-03 10:24:03 +02:00
Markus Heiser
a5f8e0899c [fix] disable Reddit engine by default
Reddit is enabled by default .. many bot requests go through Reddit .. we
should disable Reddit by default to cool down the IP [1].

[1] https://github.com/searxng/searxng/issues/3444#issuecomment-2180415057

Closes: https://github.com/searxng/searxng/issues/3444
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-28 08:48:52 +02:00
holysoles
7be468d213 [feat] docker: add env vars for common public instance settings 2024-06-14 14:58:02 +02:00
Bnyro
aa59bfbf60 [feat] hostname replace plugin: support for external list file 2024-06-07 14:42:52 +02:00
Bnyro
3bec04079c [feat] hostname replace plugin: possibility to prioritize certain websites
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-07 14:42:52 +02:00
Bnyro
46c5309888 [feat] mojeek: implement dedicated module 2024-06-07 11:31:05 +02:00
Markus Heiser
32a2175f38 [feat] add engines for discourse forums (python, caddy, pi-hole)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-07 10:16:09 +02:00
Markus Heiser
5fc93b6c34 [fix] comment in settings.yml 'Calculator plugin' --> 'Basic Calculator'
Reported by @GitTimeraider in [1]

[1] https://github.com/searxng/searxng/discussions/3529#discussioncomment-9605018
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-05-30 14:34:25 +02:00
Austin-Olacsi
9bb75a6644 [feat] engine: implementation of findthatmeme 2024-05-28 18:18:13 +02:00
Daniel Kukula
87165ac532 [mod] engine hex: add sort_criteria & page_size to configuration 2024-05-28 11:55:59 +02:00
Daniel Kukula
a49232ee29 [feat] engine: implementation of cargo search (crates.io) 2024-05-17 16:37:39 +02:00
Bnyro
645a840d82 [refactor] codeberg: use gitea engine 2024-05-15 07:23:57 +02:00
Bnyro
82b6c0d05f [feat] engine: implementation of gitea 2024-05-15 07:23:57 +02:00
Markus Heiser
fb32425d78 [mod] yacy engine: pick base_url randomly from a list of instances
Inspired by post [1] in the discussion we had, while yacy.searchlab.eu was
broken.

[1] https://github.com/searxng/searxng/issues/3428#issuecomment-2101080101
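
A minimal sketch of picking a base URL at random from a list of instances.
The helper name `pick_base_url` and the example URLs are assumptions made for
illustration; they are not taken from the engine's code.

```python
import random

# Hypothetical instance list; the real URLs are configured in settings.yml.
base_url = ["https://yacy.example.org", "https://search.example.net"]


def pick_base_url(configured):
    """Accept a single URL or a list of URLs and return one without trailing slash."""
    if isinstance(configured, str):
        return configured.rstrip("/")
    # Choosing per call spreads requests across instances, so a single
    # broken instance does not take the whole engine down.
    return random.choice(list(configured)).rstrip("/")


print(pick_base_url(base_url))
```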

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-05-09 17:29:15 +02:00
Bnyro
72be98e12f [feat] plugins: new calculator plugin 2024-05-09 17:23:38 +02:00
Bnyro
78077126f2 [feat] wikimedia commons: support for videos, audio and other files 2024-05-04 06:23:04 +02:00
Daniel Kukula
46d7a8289b [feat] engine: implementation of https://hex.pm
The package manager for the Erlang ecosystem; find packages.

Co-authored-by: Bnyro <82752168+Bnyro@users.noreply.github.com>
2024-05-03 21:37:37 +02:00