History log of /plugin/aichat/Storage/SQLiteStorage.php (Results 1 – 22 of 22)
Revision Date Author Comments
# 31a78876 12-Mar-2025 Andreas Gohr <andi@splitbrain.org>

sqlite: avoid warnings on too short vectors

This should not happen in the real world. But when embeddings were
created with a shorter vector model than the model that is used to
embed the query, the cosineSimilarity method threw a whole bunch of
warnings. We now stop the comparison at the vector length.

In the real world, the same model should be used for the embeddings and
the query; results are unpredictable otherwise. So this is mostly a cosmetic
change for messed up states during development.
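
To illustrate the fix described above, here is a minimal Python sketch of a
cosine similarity that stops at the length of the shorter vector. It is not
the plugin's PHP code; the function name cosine_similarity and the choice to
return 0.0 for zero-length vectors are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity over the shared prefix of two vectors."""
    n = min(len(a), len(b))  # stop the comparison at the shorter vector
    dot = sum(a[i] * b[i] for i in range(n))
    norm_a = math.sqrt(sum(x * x for x in a[:n]))
    norm_b = math.sqrt(sum(x * x for x in b[:n]))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat degenerate vectors as dissimilar
    return dot / (norm_a * norm_b)
```

Truncating only avoids out-of-range accesses. A score computed across two
different models remains meaningless, as the message itself notes.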


# 42b2c6e8 12-Mar-2025 Andreas Gohr <andi@splitbrain.org>

add remote component to ask questions to the bot

The endpoint allows overriding model and language settings on demand.


# 8c08cb3f 27-Mar-2024 Andreas Gohr <andi@splitbrain.org>

auto style fixes


# ab1f8dde 26-Mar-2024 Andreas Gohr <andi@splitbrain.org>

emit the INDEXER_PAGE_ADD event

This allows plugins that add data to the fulltext index to add the same
data to the embeddings. This improves embedding searches with struct
data for example.


# 720bb43f 25-Mar-2024 Andreas Gohr <andi@splitbrain.org>

make threshold configurable


# 04afb84f 19-Mar-2024 Andreas Gohr <andi@splitbrain.org>

correctly use storage setting


# 34a1c478 19-Mar-2024 Andreas Gohr <andi@splitbrain.org>

more refactoring on chat and embed model support

* differentiate between input and output tokens
* make use of much larger input contexts


# 441edf84 08-Nov-2023 Andreas Gohr <andi@splitbrain.org>

fixed overlong lines


# 30b9cbc7 08-Nov-2023 splitbrain <splitbrain@users.noreply.github.com>

Automatic code style fixes


# f8d5ae01 13-Sep-2023 Andreas Gohr <andi@splitbrain.org>

codesniffer cleanups


# 7ebc7895 13-Sep-2023 splitbrain <splitbrain@users.noreply.github.com>

Automatic code style fixes


# adfc5429 29-Aug-2023 Andreas Gohr <andi@splitbrain.org>

generate clusters only if more than 3 clusters would be created


# e33a1d7a 28-Aug-2023 Andreas Gohr <andi@splitbrain.org>

optionally search one language only


# 8c8b7ba6 16-Aug-2023 Andreas Gohr <andi@splitbrain.org>

Added dumping of TSV files to SQLite store

This allows visualizing the embed vectors
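
As a rough illustration of such a dump, the Python sketch below writes one
vector per line with tab-separated components. The function name dump_tsv and
the exact column layout are assumptions; the plugin's actual output format is
not described in this log.

```python
import csv
import io


def dump_tsv(vectors, out):
    """Write each vector as one tab-separated line to the file-like object out."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for vec in vectors:
        writer.writerow(vec)


# Usage: dump two small example vectors into an in-memory buffer.
buf = io.StringIO()
dump_tsv([[0.1, 0.2], [0.3, 0.4]], buf)
```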


# 8285fff9 15-Aug-2023 Andreas Gohr <andi@splitbrain.org>

Merge branch 'pineconestorage'

* pineconestorage:
  implement Pinecone based storage
  First go at syntax to display similar pages


# 3379af09 15-Aug-2023 Andreas Gohr <andi@splitbrain.org>

use a k-means based cluster approach to speed up similarity searches
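
The general idea behind a cluster-based speed-up can be sketched in Python as
follows. This shows only the lookup side and assumes the centroids were
already computed by k-means; the names nearest_cluster and search, and the
shape of the clusters mapping, are hypothetical and not taken from the plugin.

```python
def dot(a, b):
    """Dot product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


def nearest_cluster(query, centroids):
    """Return the index of the centroid most similar to the query."""
    return max(range(len(centroids)), key=lambda i: dot(query, centroids[i]))


def search(query, centroids, clusters, limit=3):
    """Rank only the chunks that belong to the closest cluster.

    clusters maps a centroid index to a list of (chunk_id, vector) pairs.
    """
    members = clusters[nearest_cluster(query, centroids)]
    scored = [(dot(query, vec), chunk_id) for chunk_id, vec in members]
    scored.sort(reverse=True)  # highest similarity first
    return [chunk_id for _, chunk_id in scored[:limit]]
```

The saving comes from comparing the query against a handful of centroids and
one cluster's members instead of every stored chunk. The trade-off is that a
relevant chunk in a neighbouring cluster can be missed.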


# 35555bac 15-Aug-2023 Andreas Gohr <andi@splitbrain.org>

simplify cosine distance calculation

Since all OpenAI vectors are normalized, only the dotproduct needs to be
calculated for the distance. This saves a couple of floating point ops
per chunk, but doesn't make a huge difference overall.
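
The reasoning can be checked with a small Python sketch: for vectors of
length 1, the denominator of the cosine formula is 1, so the dot product alone
gives the same value. The helper names dot and normalize are made up for this
example.

```python
import math


def dot(a, b):
    """Dot product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])

# Full cosine similarity versus the dot-product shortcut.
full = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
shortcut = dot(a, b)
```

Both values agree up to floating point rounding, which is why the two square
roots and the division can be skipped for normalized vectors.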


# 01f06932 10-Aug-2023 Andreas Gohr <andi@splitbrain.org>

First go at syntax to display similar pages


# 68b6fa79 10-Aug-2023 Andreas Gohr <andi@splitbrain.org>

First go at syntax to display similar pages


# 81b450c8 14-Jun-2023 Andreas Gohr <andi@splitbrain.org>

use a cut-off point when considering similar documents


# 9b3d1b36 14-Jun-2023 Andreas Gohr <andi@splitbrain.org>

show similarity scores in CLI


# f6ef2e50 14-Jun-2023 Andreas Gohr <andi@splitbrain.org>

refactoring to make models selectable

This makes it much easier to add new models. Models can now be selected
via the configuration.