Lines Matching full:token
69 * A single value, e.g. an entity or a token
76 …ficiency and access speed, token and frequency indexes can be split into multiple physical files u…
82 > Note: token lengths are counted in bytes, not characters. This means that for languages with mult…
145 * **token** - The actual information strewn across the entities, e.g. words. token.RID -> token
146 …* **frequency** - Maps tokens to entities and records their frequency. token.RID -> entity.RID*fre…
150 …* **Split collections**: Each entry is a ''tokenLength*tokenId'' pair because the token length is …
151 …* **Non-split collections**: Only the token ID is needed since all tokens live in a single file. F…
157 …* frequency collections - The same token can appear multiple times in the same entity and searches…
…ut each token appears only once per entity, thus all frequencies are 1. Searches do not care for th…
…ty and a token exists. For example, a page has exactly one title. Direct collections only use entit…
161 Independently of the collection type, a collection can use **split or non-split token indexes**. Se…
163 ^ Name ^ Type ^ Split? ^ Entity ^ Token ^ Frequency …
171 …. It replaces all previously stored tokens for the given entity. An empty token list removes the e…
180 …entity's old tokens, resolves the new tokens to IDs (creating them in the token index if needed), …
182 …t collections, ''addEntity()'' simply writes the first token at the entity's position in the token…
189 …* For direct collections, ''getToken($entity)'' retrieves the single token stored for an entity (e…
236 … metadata/title searches where indexed values preserve case (the fulltext token index is already l…
238 …rs long. Each length group can be looked up in the corresponding suffixed token index, allowing ef…
241 * Token "wiki" appears 5 times on page "start"
242 * Token "wikitext" appears 3 times on page "start"
246 Term does not enforce minimum token length. For fulltext search, callers should filter short words …