| #
a05e297a |
| 23-Feb-2008 |
Andreas Gohr <andi@splitbrain.org> |
use fulltext index to search for used media files FS#1336 FS#1275
This changes how DokuWiki looks for references to a media file which is about to be deleted. Instead of doing a full grep through all pages, it now uses the fulltext index first, then does an exact match on the found pages.
This speeds up the search significantly on larger wikis. However, the fulltext search limits now apply: images with names shorter than 3 characters may not be found.
This needs extensive testing!
darcs-hash:20080223205254-7ad00-486de0a4125d51b4e7999827f710d1d9de8bc60d.gz
|
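The two-stage lookup described in a05e297a can be sketched roughly as follows. This is a minimal illustration in Python, not DokuWiki's PHP code: the in-memory `pages` dict, the tokenizer, and the function names are assumptions made for the example, and only the 3-character minimum comes from the commit message.

```python
import re

MIN_TOKEN_LEN = 3  # assumed index limit, taken from the commit note above


def tokenize(text):
    """Split text into lowercase word tokens long enough to be indexed."""
    parts = re.split(r"[^0-9a-z]+", text.lower())
    return [t for t in parts if len(t) >= MIN_TOKEN_LEN]


def build_index(pages):
    """Map each token to the set of page ids that contain it."""
    index = {}
    for page_id, text in pages.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(page_id)
    return index


def find_media_references(media_id, pages, index):
    """Stage 1: candidate pages from the index. Stage 2: exact match."""
    tokens = tokenize(media_id)
    if not tokens:
        # Every part of the name is too short to be indexed, so the
        # index cannot produce candidates (the limitation noted above).
        return []
    candidates = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(p for p in candidates if media_id in pages[p])


pages = {
    "start": "See {{wiki:logo.png}} here",
    "other": "logo without file",
    "misc": "{{ab.x}}",
}
index = build_index(pages)
print(find_media_references("wiki:logo.png", pages, index))
print(find_media_references("ab.x", pages, index))
```

The second call returns nothing even though the page `misc` contains the reference, which is the trade-off the commit message warns about.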
| #
60c15d7d |
| 15-Feb-2008 |
Andreas Gohr <andi@splitbrain.org> |
better highlighting for phrase searches FS#1193
This patch makes the highlighting of phrases in search snippets and on the pages themselves much better.
Now a regexp gets passed to the ?s
darcs-hash:20080215174653-7ad00-cd2d6f7d408db7b7dd3cb9974c3eb27f3a9baeac.gz
|
| #
103c256a |
| 30-Sep-2007 |
Chris Smith <chris@jalakai.co.uk> |
add page_exists function (inc/pageutils.php)
bool page_exists($id, $rev
darcs-hash:20070930021040-d26fc-e3847bfdd20a36154685262eca94211cfd461e83.gz
|
| #
00976812 |
| 30-Sep-2007 |
Andreas Gohr <andi@splitbrain.org> |
don't use realpath() anymore (FS#1261 and others)
The use of realpath() to clean up relative file names caused some trouble in certain setups relying on symlinks or having restrictive file structure setups.
This patch replaces all realpath() calls with a PHP-only replacement which should solve those problems.
darcs-hash:20070930184250-7ad00-512ff04c95f57fc9eaf104f80372237a3c94286f.gz
|
| #
a21136cd |
| 04-Aug-2007 |
Andreas Gohr <andi@splitbrain.org> |
fulltext search fixes FS#1191 FS#1192
darcs-hash:20070804081226-7ad00-a8e7127c7122a96f9817158d87e1a364d8cdbc9f.gz
|
| #
235cf363 |
| 18-Jul-2007 |
Andreas Gohr <andi@splitbrain.org> |
fix for phrase search FS#1189
darcs-hash:20070718104839-7ad00-50348c1834c78e891f049023d2e8894d6bb0a00b.gz
|
| #
ed7ecb79 |
| 14-May-2007 |
Anika Henke <a.c.henke@arcor.de> |
FS#744 (template developers, heed the changes)
darcs-hash:20070514222527-d5083-53ed619daf07d0a84c52161465d163abf1400529.gz
|
| #
09c27a6d |
| 30-Mar-2007 |
Guy Brand <gb@isis.u-strasbg.fr> |
Fix backlinks - See FS#1040
darcs-hash:20070330215042-19e2d-3528f2412ff044eb45158f349db5bbb5e32d907b.gz
|
| #
05082375 |
| 03-Mar-2007 |
Andreas Gohr <andi@splitbrain.org> |
fixed warning with no search results FS#1088
darcs-hash:20070303220143-7ad00-5d592dbebaae371c03102b20ae7e0d9e433b378b.gz
|
| #
d7d7bed5 |
| 05-Feb-2007 |
Andreas Gohr <andi@splitbrain.org> |
fix for slashes in phrase search #1066
darcs-hash:20070205191848-7ad00-77ad5a398534a7a64884e155c4607350e0f25a7c.gz
|
| #
6798a86a |
| 24-Nov-2006 |
Andreas Gohr <andi@splitbrain.org> |
trim pagename returned by ft_pageLookup
darcs-hash:20061124215413-7ad00-f2bd46b7edf70660cc3e0274bd222eafba1edbc6.gz
|
| #
579b0f7e |
| 12-Nov-2006 |
TNHarris <telliamed@fastmail.us> |
Word-Length Indexer
A modification to the indexer that sorts words based on length. This should make searching a little bit more efficient. After the patch is applied, your old index will be automatically converted to the new format (when you visit a page). The new index format is:
1. Index files are stored in savedir/index
2. Word lists are stored as wlen.idx. This used to be word.idx.
3. Word indexes are stored as ilen.idx. This used to be index.idx.
4. The page list, page.idx, is simply copied to the new location.
Any plugins you have, such as the blog plugin, that read the index files need to be updated.
darcs-hash:20061112194900-2b9f0-a975498ccf0a1d39c6df73b79bcd028d5e81c389.gz
|
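The per-length file layout from 579b0f7e can be illustrated with a short sketch. It assumes that "len" in wlen.idx and ilen.idx stands for the word length, so that 4-letter words would land in w4.idx and i4.idx. The commit message does not say whether length is counted in bytes or characters, so the use of Python's `len()` here is also an assumption, and the function names are hypothetical.

```python
from collections import defaultdict


def group_words_by_length(words):
    """Bucket unique words by their length, sorted within each bucket."""
    groups = defaultdict(list)
    for word in sorted(set(words)):
        groups[len(word)].append(word)
    return dict(groups)


def index_file_names(length):
    """Return the (word list, word index) file names for one word length."""
    return (f"w{length}.idx", f"i{length}.idx")


groups = group_words_by_length(["wiki", "page", "index", "wiki"])
for length, words in sorted(groups.items()):
    print(length, index_file_names(length), words)
```

A lookup for a word of known length then only has to read the one word list that can contain it, which is presumably where the efficiency gain mentioned in the commit comes from.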
| #
6b06b652 |
| 05-Nov-2006 |
chris <chris@jalakai.co.uk> |
backlinks fixes (bugs #795 & #937)
- add deaccented and romanised page names to index word list
- remove stop words from tokens used in backlink search
darcs-hash:20061105195453-9b6ab-6c4989eb75782af60a3de3bddbc99a83de2b4c80.gz
|
| #
10ffc9dd |
| 08-Oct-2006 |
Andreas Gohr <andi@splitbrain.org> |
remove unused code
This patch removes some commented code fragments and alternative snippet generators
darcs-hash:20061008090624-7ad00-14bfee2ded6c6c8ef43ad02a4c02a5d95ee9daf7.gz
|
| #
2626ee0c |
| 28-Sep-2006 |
chris <chris@jalakai.co.uk> |
more utf8_substr improvements (re FS#891 and yesterday's patch)
- rework utf8_substr() NOMBSTRING code to always use pcre
- remove work around for utf8_substr() and large strings from ft_snippet()
darcs-hash:20060928165122-9b6ab-0eefc216f07f9d7e7d8eb62ce26605c28ee340fa.gz
|
| #
4b5f4f4e |
| 11-Sep-2006 |
chris <chris@jalakai.co.uk> |
parser caching update
This patch primarily updates p_cached_xhtml() and p_cached_instructions() to allow their caching logic to be surrounded by an event trigger.
p_cached_xhtml() has been rewritten as the more general p_cached_output() to support other render output formats besides 'xhtml'. All calls to p_cached_xhtml() have been changed to refer to the new function.
New event:
name: PARSER_CACHE_USE
data: cache object (see below)
action: determine if cache file can be used
preventable: yes
result: bool, true to use cache file, false otherwise
Cache operations have been generalised in a new class, cache, extended to cache_parser, cache_renderer & cache_instructions. Details can be found in inc/cache.php
For handling of above event, key properties are:
- page, if present the wiki page id, may not always be present, e.g. when called for locale xhtml files
- file, source file
- mode, renderer mode (e.g. 'xhtml') or 'i' for instructions
Other changes:
- cache class counts cache hits against attempts, results are stored in {cache_dir}/cache_stats.txt
- adds metadata dependency to renderer page cache
- replaces purgefile dependency for renderer cache with metadata 'relation references' (internal link) dependency for wiki pages only
darcs-hash:20060911021418-9b6ab-19601ed194b8c8e45236ab72c3e23d78bf777e6c.gz
|
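The preventable PARSER_CACHE_USE event from 4b5f4f4e follows a common pattern: handlers run first, and the default action only runs if no handler has prevented it. The sketch below shows that pattern in Python under stated assumptions. The `Event` class, the mtime-based default check, and the draft-page handler are hypothetical and do not reproduce DokuWiki's actual event API; only the event name and the page/file/mode properties come from the commit message.

```python
class Event:
    """A preventable event: handlers run before the default action."""

    def __init__(self, name, data):
        self.name = name
        self.data = data
        self.result = None
        self._prevented = False

    def prevent_default(self):
        self._prevented = True

    def trigger(self, handlers, default_action):
        for handler in handlers.get(self.name, []):
            handler(self)
        if not self._prevented:
            self.result = default_action(self.data)
        return self.result


def default_use_cache(cache):
    """Assumed default rule: use the cache if it is not older than the source."""
    return cache["cache_mtime"] >= cache["source_mtime"]


def never_cache_drafts(event):
    """Hypothetical handler: refuse the cache for pages in a 'draft' namespace."""
    # 'page' may be absent (e.g. for locale files), so read it defensively.
    if event.data.get("page", "").startswith("draft:"):
        event.prevent_default()
        event.result = False


handlers = {"PARSER_CACHE_USE": [never_cache_drafts]}
cache = {"page": "wiki:start", "file": "start.txt", "mode": "xhtml",
         "cache_mtime": 20, "source_mtime": 10}
print(Event("PARSER_CACHE_USE", cache).trigger(handlers, default_use_cache))
```

A handler that prevents the default is responsible for setting the boolean result itself, matching the "result: bool" contract in the event description.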
| #
3be6e394 |
| 01-Sep-2006 |
chris <chris@jalakai.co.uk> |
update backlink search to use metadata
darcs-hash:20060901002016-9b6ab-716518138edf541a869510d7c2934b9474547fc3.gz
|
| #
0e70946d |
| 31-Aug-2006 |
chris <chris@jalakai.co.uk> |
add unittests for bug#891
darcs-hash:20060831092146-9b6ab-b00aa29c982ab18117f476b3d01d5111915c9d4b.gz
|
| #
9ee93076 |
| 31-Aug-2006 |
chris <chris@jalakai.co.uk> |
search improvements
ft_snippet()
- make utf8 algorithm default
- add workaround for utf8_substr() limitations, bug #891
- fix some indexes which missed out on conversion to utf8 character counts
- minor improvements
idx_lookup()
- minor changes to wildcard matching code to improve performance (changes based on profiling results)
utf8
- specifically set mb_internal_coding to utf-8 when mb_string functions will be used.
darcs-hash:20060831003413-9b6ab-712021eda3c959ffe79d8d3fe91d2c9a8acf2b58.gz
|
| #
ced0762e |
| 26-Aug-2006 |
chris <chris@jalakai.co.uk> |
ft_snippet() update
- correct "opt1" algorithm for multibyte utf8
- minor improvement to "opt2" for short pages
- add "utf8" algorithm, this algorithm endeavours to work with whole utf8 characters as much as possible. The resulting snippet will tend to 100 characters, rather than the 100 bytes of "opt1" and "opt2".
darcs-hash:20060826234333-9b6ab-ae4c60c8855a92b133cb8d5a230098203f610e7b.gz
|
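The difference between a snippet measured in characters and one measured in bytes, as mentioned in ced0762e, can be shown with a small sketch. It is written in Python for illustration and does not reproduce the "opt1", "opt2", or "utf8" algorithms themselves. Both function names are hypothetical.

```python
def snippet_by_chars(text, limit):
    """Cut after `limit` characters; multibyte characters stay whole."""
    return text[:limit]


def snippet_by_bytes(text, limit):
    """Cut after `limit` UTF-8 bytes, dropping any partial trailing character."""
    return text.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


text = "äöüß"  # four characters, two UTF-8 bytes each
print(snippet_by_chars(text, 3))
print(snippet_by_bytes(text, 3))
```

With a limit of 3, the character-based cut keeps three letters, while the byte-based cut lands in the middle of the second letter and keeps only the first. This is why a byte-oriented algorithm needs extra care with multibyte text.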
| #
5953e889 |
| 26-Aug-2006 |
chris <chris@jalakai.co.uk> |
ft_snippet() update, fix utf8 problems
darcs-hash:20060826095311-9b6ab-9a6f272cc7c7532eb2bad8f7b4404c5a16b71109.gz
|
| #
0eac1afb |
| 26-Aug-2006 |
Andreas Gohr <andi@splitbrain.org> |
code to remove bad UTF-8 bytes added
This adds code to remove or replace invalid UTF-8 bytes and uses it in the ft_snippets function.
darcs-hash:20060826082919-7ad00-a94004de159ae93ff5b7270fd3e631ff467233cd.gz
|
| #
95a12943 |
| 25-Aug-2006 |
chris <chris@jalakai.co.uk> |
update to previous ft_snippet() patch, improve snippet text selection
darcs-hash:20060825134730-9b6ab-086ee0647af39c4398cf1726324d8215722a39db.gz
|
| #
bd2cb6fc |
| 25-Aug-2006 |
chris <chris@jalakai.co.uk> |
ft_snippet optimisations
This patch includes two alternative algorithms for ft_snippet(), the code which prepares the snippets seen on the search page - and the most time consuming part of the production of that page.
If you have $conf['allowdebug'] on, you can specify the search algorithm to use by adding &_search
darcs-hash:20060825104046-9b6ab-942d81a43cf0f85bfd235cabf6c35dd4b20e0b71.gz
|
| #
a219c1f0 |
| 18-May-2006 |
Michael Klier chi@chimeric.de <andi@splitbrain.org> |
namespace-restricted fulltext-search part2
- it is now possible to restrict the fulltext search to multiple namespaces
Examples:
searchword @ns1 @ns2 @ns3
"exact phrase" @ns1 @ns2 @ns3
darcs-hash:20060518204647-484ab-061521a81f13360e33496e5163e3cd263a9c1ad6.gz
|
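The query syntax shown in a219c1f0 (search words, quoted phrases, and `@namespace` restrictions) could be parsed along these lines. This is a minimal Python sketch with hypothetical function names, not DokuWiki's parser. The rule that a page belongs to a namespace when its id starts with `ns:` is an assumption for the example.

```python
import re

# Either a double-quoted phrase or a run of non-whitespace characters.
TOKEN_RE = re.compile(r'"([^"]*)"|(\S+)')


def parse_query(query):
    """Split a query into plain words, exact phrases, and namespaces."""
    words, phrases, namespaces = [], [], []
    for phrase, word in TOKEN_RE.findall(query):
        if phrase:
            phrases.append(phrase)
        elif word.startswith("@") and len(word) > 1:
            namespaces.append(word[1:])
        elif word:
            words.append(word)
    return {"words": words, "phrases": phrases, "namespaces": namespaces}


def in_namespaces(page_id, namespaces):
    """True if no restriction is given or the page is in a listed namespace."""
    if not namespaces:
        return True
    return any(page_id.startswith(ns + ":") for ns in namespaces)


parsed = parse_query('"exact phrase" @ns1 @ns2 @ns3')
print(parsed)
print(in_namespaces("ns1:page", parsed["namespaces"]))
```

Matching on `ns + ":"` rather than the bare prefix keeps a restriction to `ns1` from also matching pages in a sibling namespace such as `ns10`.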