optionally search one language only
use a k-means-based clustering approach to speed up similarity searches
first go at abstracting the storage backend and using sqlite. Requires the dev branch of the sqlite plugin
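
To illustrate the k-means entry above, here is a minimal sketch of how clustering can narrow a similarity search: vectors are grouped once, and a query is compared only against the members of the cluster with the most similar centroid. This is not the project's actual code. The names `cosine`, `mean`, `kmeans` and `search` are hypothetical, and the use of cosine similarity and a fixed iteration count are assumptions made for this example.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def assign(vectors, centroids):
    """Index of the most similar centroid for each vector."""
    return [
        max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
        for v in vectors
    ]


def kmeans(vectors, k, iterations=10, seed=0):
    """Plain k-means with cosine similarity; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Assumption: start from k randomly chosen input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        assignments = assign(vectors, centroids)
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:  # keep the old centroid if the cluster is empty
                centroids[c] = mean(members)
    return centroids, assign(vectors, centroids)


def search(query, vectors, centroids, assignments, limit=3):
    """Rank only the vectors in the cluster nearest to the query."""
    best = max(range(len(centroids)), key=lambda c: cosine(query, centroids[c]))
    candidates = [i for i, a in enumerate(assignments) if a == best]
    candidates.sort(key=lambda i: cosine(query, vectors[i]), reverse=True)
    return candidates[:limit]


if __name__ == "__main__":
    data = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0], [0.0, 1.0], [0.1, 0.9], [0.2, 1.0]]
    cents, assigned = kmeans(data, k=2)
    print(search([1.0, 0.0], data, cents, assigned))
```

The trade-off is that a match sitting in a neighbouring cluster can be missed, so this kind of search is approximate. Scanning the two or three nearest clusters instead of one is a common way to reduce that risk.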