| #
699b8a0b |
| 18-Jul-2007 |
Andreas Gohr <andi@splitbrain.org> |
fix asian word search FS#1188
darcs-hash:20070718073121-7ad00-60e45fb3913fa3745511c640a55aa1b7446a3657.gz
|
| #
15018435 |
| 01-Mar-2007 |
Andreas Gohr <andi@splitbrain.org> |
fix pass by reference problem in indexer.php
darcs-hash:20070301211751-7ad00-d4212a363176501a31a0971a00f81e18ee00fab3.gz
|
| #
a6a30c1a |
| 27-Feb-2007 |
Esther Brunner <wikidesign@gmail.com> |
INDEXER_PAGE_ADD event
darcs-hash:20070227124424-20862-78b4e1863830e88aa9564e6b9c58fa0cdf03d41c.gz
|
| #
adb16d4f |
| 26-Feb-2007 |
Andreas Gohr <andi@splitbrain.org> |
sorted indexer is now default
darcs-hash:20070226175529-7ad00-4d3d984da1edbf2ded546cfbd7374f97f032d032.gz
|
| #
d5b23302 |
| 17-Nov-2006 |
Tom N Harris <tnharris@whoopdedo.org> |
Indexer asian language fixes and speed-ups
Make Chinese and Japanese work better with the new indexer. Some missing punctuation added to utf8_stripspecials. Misc. other changes to make indexing faster. The indexes will expire on backend upgrades, so you don't have to delete *.indexed
darcs-hash:20061117123032-6942e-774b38e08234928c49b37e40addba375acf67ac0.gz
|
| #
b2bc63f0 |
| 14-Nov-2006 |
Andreas Gohr <andi@splitbrain.org> |
bracket fix in inc/indexer.php
darcs-hash:20061114210440-7ad00-841acaf84e77e7bea16b96317531bd502ee44938.gz
|
| #
3fc667cf |
| 13-Nov-2006 |
chris <chris@jalakai.co.uk> |
fixes for stricter php5 typing (bug#978)
darcs-hash:20061113122645-9b6ab-e5f5be2e88eea7eb00643e6a5210086f46191c30.gz
|
| #
579b0f7e |
| 12-Nov-2006 |
TNHarris <telliamed@fastmail.us> |
Word-Length Indexer
A modification to the indexer that sorts words based on length. This should make searching a little bit more efficient. After the patch is applied, your old index will be automatically converted to the new format (when you visit a page). The new index format is:
1. Index files are stored in savedir/index
2. Word lists are stored as wlen.idx. This used to be word.idx.
3. Word indexes are stored as ilen.idx. This used to be index.idx.
4. The page list, page.idx, is simply copied to the new location.
Any plugins you have, such as the blog plugin, that read the index files need to be updated.
darcs-hash:20061112194900-2b9f0-a975498ccf0a1d39c6df73b79bcd028d5e81c389.gz
|
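The word-length layout described in the commit above can be sketched roughly as follows. This is an illustration in Python, not DokuWiki's PHP code. It assumes that "len" in wlen.idx stands for the word length, so the file names `w4.idx`, `w5.idx` and the helper `group_words_by_length` are hypothetical.

```python
from collections import defaultdict


def group_words_by_length(words):
    """Group unique words into per-length word lists.

    Returns a dict mapping an assumed file name (w<length>.idx) to the
    sorted list of words of that length. A search for a 5-letter word
    then only has to scan the list for length 5.
    """
    groups = defaultdict(list)
    for word in set(words):  # drop duplicates before grouping
        groups[len(word)].append(word)
    return {f"w{length}.idx": sorted(ws) for length, ws in sorted(groups.items())}
```

The point of the grouping is that a lookup for an exact word can skip every list whose length does not match.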
| #
6b06b652 |
| 05-Nov-2006 |
chris <chris@jalakai.co.uk> |
backlinks fixes (bugs #795 & #937)
- add deaccented and romanised page names to index word list
- remove stop words from tokens used in backlink search
darcs-hash:20061105195453-9b6ab-6c4989eb75782af60a3de3bddbc99a83de2b4c80.gz
|
| #
9ee93076 |
| 31-Aug-2006 |
chris <chris@jalakai.co.uk> |
search improvements
ft_snippet()
- make utf8 algorithm default
- add workaround for utf8_substr() limitations, bug #891
- fix some indexes which missed out on conversion to utf8 character counts
- minor improvements
idx_lookup()
- minor changes to wildcard matching code to improve performance (changes based on profiling results)
utf8
- specifically set mb_internal_encoding to utf-8 when mb_string functions will be used.
darcs-hash:20060831003413-9b6ab-712021eda3c959ffe79d8d3fe91d2c9a8acf2b58.gz
|
| #
0d8ea614 |
| 25-Aug-2006 |
chris <chris@jalakai.co.uk> |
update wikiFN with third parameter, $clean
value defaults to true
patch also includes an update to idx_parseIndexLine to make use of the new parameter - the index file (if built by DokuWiki's methods) will contain already "clean" IDs.
darcs-hash:20060825144112-9b6ab-55adc71cf55bb58468fb3f0b03b9001ab149a82b.gz
|
| #
4efb9a42 |
| 18-Jun-2006 |
Andreas Gohr <andi@splitbrain.org> |
fixed stupid bug in search query parser
darcs-hash:20060618134515-7ad00-3097e310ccdaf793b5da3bd49a54723fea7ec260.gz
|
| #
3aee4c27 |
| 07-May-2006 |
Andreas Gohr <andi@splitbrain.org> |
changed all occurrences of rename() to io_rename()
darcs-hash:20060507101333-7ad00-e687d797fbee26e0b0bc7741ff8a1af496c101bf.gz
|
| #
98c86858 |
| 17-Feb-2006 |
Andreas Gohr <andi@splitbrain.org> |
file cleanups
This patch cleans up the source code to satisfy the coding guidelines (see http://wiki.splitbrain.org/wiki:development#coding_style)
It converts files to UNIX line endings and removes tabs and trailing whitespace. Not all files have been cleaned yet.
darcs-hash:20060217222040-7ad00-bba3d2bee3b5aa7cbb5184258abd50805cd071bf.gz
|
| #
63201c6e |
| 26-Jan-2006 |
Osamu Higuchi <osamu@higuchi.com> |
fixed indexer word counts for UTF-8 words #653
darcs-hash:20060126233702-87e23-9382dd77b66f263fa51ad02dc31264c667fdbc70.gz
|
| #
ad81d431 |
| 27-Nov-2005 |
Andreas Gohr <andi@splitbrain.org> |
Wildcardsearch added #552 #632
Searching for word parts is now possible by appending or prepending a * character to the search word:
'foo*' searches for words beginning with 'foo', e.g. 'foobar'
'*foo' looks for words ending in 'foo', e.g. 'barfoo'
'*foo*' gets anything with 'foo' in it, e.g. 'barfoobaz'
darcs-hash:20051127180723-7ad00-1eb29e812ddaf38d9812697bb1cffffe9a5fb330.gz
|
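The three wildcard forms from the commit above can be illustrated with a small matcher. This is a minimal sketch in Python, not the actual idx_lookup() code; the function name `wildcard_match` is hypothetical, and treating a term without any * as an exact match is an assumption.

```python
def wildcard_match(pattern, word):
    """Match a word against a search term with an optional leading/trailing '*'."""
    leading = pattern.startswith("*")
    trailing = pattern.endswith("*") and len(pattern) > 1
    core = pattern.strip("*")  # the literal part of the search term
    if not core:
        return False  # a bare '*' matches nothing in this sketch
    if leading and trailing:
        return core in word           # '*foo*': anywhere in the word
    if trailing:
        return word.startswith(core)  # 'foo*': word begins with 'foo'
    if leading:
        return word.endswith(core)    # '*foo': word ends in 'foo'
    return word == core               # no wildcard: exact match (assumed)
```

For example, `wildcard_match("foo*", "foobar")` is true, while `wildcard_match("foo*", "barfoo")` is not.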
| #
91bb5faa |
| 09-Oct-2005 |
Andreas Gohr <andi@splitbrain.org> |
ignore regexp failures when handling asian chars
The new handling of asian chars as single words needs a recent PCRE library (PHP 4.3.10 is known to work). If this support isn't available, the regexp compilation will fail. This patch adds a workaround that ignores the failure, which means the search will not work as expected with asian words on older PHP versions.
darcs-hash:20051009124833-7ad00-1319829be5cb73246e13eb65e4c950d43c6ce5bf.gz
|
| #
93a60ad2 |
| 25-Sep-2005 |
Andreas Gohr <andi@splitbrain.org> |
asian language support for the indexer #563
Asian languages do not use spaces to separate words. The indexer, however, does a word-based lookup. Splitting, for example, Japanese texts into real words is only possible with complicated natural language processing, something completely out of scope for DokuWiki.
This patch solves the problem by treating all asian characters as single words. When an asian word (consisting of multiple characters) is searched, it is treated as a phrase search: each character is looked up by itself first, then the found documents are checked for the phrase.
darcs-hash:20050925175451-7ad00-933b33b51b5f2fa05e736c18b8db58a5fdbf41ce.gz
|
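The approach from the commit above (index every asian character as its own word, then run a phrase check on the candidate pages) might look roughly like this. It is a sketch in Python rather than DokuWiki's PHP; the character ranges (Hiragana, Katakana, CJK Unified Ideographs) are an assumption and narrower than what a real indexer would cover, and all function names are hypothetical.

```python
import re

# Assumed ranges: Hiragana/Katakana and CJK Unified Ideographs only.
ASIAN = r"\u3040-\u30ff\u4e00-\u9fff"
# Either one asian character, or a run of non-asian word characters.
TOKEN_RE = re.compile(rf"[{ASIAN}]|[^\W{ASIAN}]+")


def tokenize(text):
    """Split text into words; each asian character becomes its own token."""
    return TOKEN_RE.findall(text.lower())


def build_index(pages):
    """Map each token to the set of page IDs that contain it."""
    index = {}
    for page_id, text in pages.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(page_id)
    return index


def search_asian_word(query, pages, index):
    """Look up each character, intersect the hits, then check the phrase."""
    chars = tokenize(query)
    if not chars:
        return []
    candidates = set.intersection(*(index.get(c, set()) for c in chars))
    # A page may contain all characters without containing the word itself,
    # so the candidate pages are checked for the full phrase.
    return sorted(p for p in candidates if query.lower() in pages[p].lower())
```

The per-character lookup only narrows the candidate set; the final substring check is what rejects pages that contain the same characters in a different order.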
| #
3cbaa9a4 |
| 21-Sep-2005 |
Andreas Gohr <andi@splitbrain.org> |
backlinkfix for pages with special characters #548
darcs-hash:20050921195118-7ad00-9070166cbaa26e3f27f7b92382346a70f5c479a1.gz
|
| #
d437bcc4 |
| 18-Sep-2005 |
Andreas Gohr <andi@splitbrain.org> |
more efficient changelog reading for recent changes
getRecents now reads the changelog backwards in 4KB chunks instead of loading the whole file into an array and reverse-sorting it with rsort. This should be more memory efficient (and probably faster) for large change logs.
Note: the format of the array returned by getRecents changed slightly; plugins relying on it need to be adjusted. Sorry.
darcs-hash:20050918121008-7ad00-1fdba47d29b0c038c6e4e4edc1d4c93e5ba769e9.gz
|
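Reading a log backwards in fixed-size chunks, as the commit above describes for getRecents, can be sketched like this. The code is Python and only illustrates the technique; it is not the getRecents implementation, and the function name `read_lines_backwards` and the choice to skip empty lines are assumptions.

```python
import os


def read_lines_backwards(path, chunk_size=4096):
    """Yield the lines of a file from last to first.

    The file is read in chunk_size blocks starting at the end, so only
    one chunk plus one partial line is held in memory at a time.
    """
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        pos = fh.tell()
        tail = b""  # start of a line whose beginning lies in an earlier chunk
        while pos > 0:
            read = min(chunk_size, pos)
            pos -= read
            fh.seek(pos)
            chunk = fh.read(read) + tail
            lines = chunk.split(b"\n")
            tail = lines.pop(0)  # first piece may be incomplete, keep it
            for line in reversed(lines):
                if line:  # skip empty lines (assumed to be irrelevant)
                    yield line.decode("utf-8")
        if tail:
            yield tail.decode("utf-8")
```

A caller that only needs the most recent entries can stop iterating early, which is where the saving over loading and sorting the whole file comes from.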
| #
d18f28de |
| 12-Sep-2005 |
Andreas Gohr <andi@splitbrain.org> |
fixed stupid bug in indexer
There was a stupid bug in the indexer which prevented the adding of new words (only non ASCII words were added)
darcs-hash:20050912145813-7ad00-4351dbb1ab984d97322953c0ba4c9962ad887697.gz
|
| #
63773904 |
| 12-Sep-2005 |
Andreas Gohr <andi@splitbrain.org> |
added missing ACL checks for new index based searches
darcs-hash:20050912143027-7ad00-b2f3165d8db7122a453ecc63ad031af4467f691f.gz
|
| #
9684e36c |
| 07-Sep-2005 |
Andreas Gohr <andi@splitbrain.org> |
try faster rename before falling back to copy in indexer
darcs-hash:20050907210643-7ad00-a5cd36dc8b48ca445af87e9f066c7a54a98a3658.gz
|
| #
d7c3763d |
| 06-Sep-2005 |
Dave Doyle <ddoyle@canadalawbook.ca> |
indexer rename bugfix for Win32
darcs-hash:20050906214043-a62d3-65097acf0b035fd6fe9833136a15f9562e69970f.gz
|
| #
f5eb7cf0 |
| 28-Aug-2005 |
Andreas Gohr <andi@splitbrain.org> |
new fulltext search function using the index
The new search function was added but is not yet integrated into DokuWiki's interface.
darcs-hash:20050828152821-7ad00-a6e79a9dc5aaf41c547cf42dccdbc3b5bc8d303e.gz
|