History log of /dokuwiki/inc/indexer.php (Results 101 – 125 of 156)
Revision Date Author Comments
# 037b5573 15-Nov-2010 Michael Hamann <michael@content-space.de>

Indexer improvement: replace _freadline by fgets

In PHP versions newer than 4.3.0 fgets reads a whole line regardless of
its length when no length is given. Thus the loop in _freadline isn't
needed. This increases the speed significantly as _freadline was called
very often.
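
The behaviour this commit relies on can be sketched as follows. This is only an illustration: `read_index_lines` is a hypothetical helper name, not a function from inc/indexer.php, and the real indexer code may be structured differently.

```php
<?php
// Hypothetical helper for illustration; not DokuWiki's actual code.
// Reads every line of a file with fgets() and no length argument.
function read_index_lines(string $path): array
{
    $lines = [];
    $fh = fopen($path, 'r');
    if ($fh === false) {
        return $lines;
    }
    // Without a length argument, fgets() returns the complete line
    // (including its trailing newline), so no inner loop is needed
    // to stitch partial chunks together.
    while (($line = fgets($fh)) !== false) {
        $lines[] = $line;
    }
    fclose($fh);
    return $lines;
}
```

Each call returns one full line however long it is, which is why a chunk-joining wrapper such as _freadline becomes redundant.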


# 06af2d03 15-Nov-2010 Michael Hamann <michael@content-space.de>

Indexer speed improvement: joined array vs. single lines

From my experience with a benchmark of the indexer it is faster to first
join the array of all index entries and then write them back together
instead of writing every single entry. This might increase memory usage,
but I couldn't see a significant increase and this function is also only
used for the small index files, not for the large pagewords index.
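
The two approaches compared in this commit might look roughly like the sketch below. Both function names are hypothetical and the one-entry-per-line layout is an assumption; the actual save routine in inc/indexer.php may differ.

```php
<?php
// Hypothetical helpers for illustration; not DokuWiki's actual functions.

// Variant A: one fwrite() call per index entry.
function save_index_per_entry(string $path, array $entries): bool
{
    $fh = fopen($path, 'w');
    if ($fh === false) {
        return false;
    }
    foreach ($entries as $entry) {
        fwrite($fh, $entry . "\n");
    }
    fclose($fh);
    return true;
}

// Variant B: join all entries in memory first, then write once.
function save_index_joined(string $path, array $entries): bool
{
    $data = $entries ? implode("\n", $entries) . "\n" : '';
    return file_put_contents($path, $data) !== false;
}
```

Both variants produce the same file. Variant B holds the whole joined string in memory, which matches the trade-off the commit message describes for small index files.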


# 5bcab0c4 15-Nov-2010 Tom N Harris <tnharris@whoopdedo.org>

tokenizer was returning prematurely


# 4e1bf408 14-Nov-2010 Tom N Harris <tnharris@whoopdedo.org>

Refactor tokenizer to avoid splitting multiple times


# 4b9792c6 14-Nov-2010 Tom N Harris <tnharris@whoopdedo.org>

Measure length of multi-character Asian words


# 3a1a171b 14-Nov-2010 Tom N Harris <tnharris@whoopdedo.org>

Remove unused idx_touchIndex function


# 74afac00 18-Oct-2010 Andreas Gohr <andi@splitbrain.org>

removed deprecated index update function


# 80423ab6 16-Jun-2010 Adrian Lang <lang@cosmocode.de>

Perform quick search in title as well


# 22952965 23-Mar-2010 YoBoY <yoboy.leguesh@gmail.com>

Limiting use of readdir in the idx_indexLengths function (v2).

Every search on the wiki uses this function. Scanning the index directory each time is time-consuming and causes a constant series of disk accesses.
A normal search now calls file_exists one or more times instead of reading the whole directory with readdir.
A wildcard search now uses a lengths.idx file containing all the word lengths used in the wiki. The file is generated if the new configuration parameter $conf[readdircache] is not 0; the parameter is set to a time in seconds. A new function, idx_listIndexLengths, handles this part.


# 16905344 31-Jan-2010 Andreas Gohr <andi@splitbrain.org>

first attempt to centralize all include loading

Classes are loaded through PHP5's class autoloader, all other
includes are just loaded by default. This skips a lot of
require_once calls.

Parser and Plugin stuff isn't handled by the class loader yet.


# fcd3bb7c 31-Jan-2010 Andreas Gohr <andi@splitbrain.org>

fixed file header


# c66972f2 04-Nov-2009 Adrian Lang <lang@cosmocode.de>

Emit less E_NOTICEs and E_STRICTs

Changes of behaviour are:
* Allow the user name, title & description “0”
* Default to Port 443 if using HTTPS
* Set $INFO['isadmin'] and $INFO['ismanager'] to “false” even if no user is
logged in
* Do not pass empty fragment field in the event data for event
ACTION_SHOW_REDIRECT
* Handle chunked encoding in HTTPClient

darcs-hash:20091104100115-e4919-5cf6397d4a457e3f98a8ca49fbdab03f2147721d.gz


# db959ae3 20-Oct-2009 Andreas Gohr <andi@splitbrain.org>

Coding Standard Cleanup

Ignore-this: 259cb5773c3144c6c706d87298dcf674

darcs-hash:20091020212338-7ad00-6bf1c5c403491f136a1c02af5ecd9f84d7227107.gz


# d3fb3219 19-Jan-2009 Andreas Gohr <andi@splitbrain.org>

Changed minimum word length for fulltext index to 2

darcs-hash:20090119190920-7ad00-5409285ea5c44379fec906d08f5ccb710eac5b6d.gz


# c5418046 18-Jan-2009 Andreas Gohr <andi@splitbrain.org>

fixed indexer which was broken by miscalculation in previous optimization

darcs-hash:20090118200357-7ad00-2d3a8dcb57ef5d19efe65fd4af8c26af261aef06.gz


# dd35e9c9 26-Dec-2008 Andreas Gohr <andi@splitbrain.org>

minor optimizations in the fulltext indexing methods

darcs-hash:20081226183403-7ad00-1a4d08ab0f674eb3dcda131dd49ddaeb27129ad6.gz


# fa8adffe 13-Dec-2008 Andreas Gohr <andi@splitbrain.org>

removed some illogical path setups

darcs-hash:20081213090400-7ad00-4e21cd75978bb07513f32f5d750658e8d777c59e.gz


# 33815ce2 07-Dec-2008 Chris Smith <chris.eureka@jalakai.co.uk>

Change search index min. token length to a define (IDX_MINWORDLENGTH)

Currently the min. token length is 3 (note, this doesn't apply to numeric tokens).
The value set in inc/indexer.php can be overridden by defining IDX_MINWORDLENGTH
elsewhere (e.g. conf/local.protected.php).

darcs-hash:20081207161129-f07c6-6432947fe5d74666409d1e00222eaa489374c32f.gz
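
An override of the kind this commit describes could look like the fragment below. It is a sketch under assumptions: the value 2 is an arbitrary example, and it presumes that inc/indexer.php only defines the constant when it is not already defined.

```php
<?php
// conf/local.protected.php (location suggested by the commit message)
// Assumption: inc/indexer.php defines IDX_MINWORDLENGTH only if it is not
// already defined, so this value wins when this file is loaded first.
// The value 2 is an arbitrary example, not a recommendation.
define('IDX_MINWORDLENGTH', 2);
```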


# 60c15d7d 15-Feb-2008 Andreas Gohr <andi@splitbrain.org>

better highlighting for phrase searches FS#1193

This patch makes the highlighting of phrases in search snippets and on
the pages themselves much better.

Now a regexp gets passed to the ?s

darcs-hash:20080215174653-7ad00-cd2d6f7d408db7b7dd3cb9974c3eb27f3a9baeac.gz


# b6344591 12-Oct-2007 Tom N Harris <tnharris@whoopdedo.org>

Reduce memory requirement for indexer

darcs-hash:20071012000327-6942e-bdef26ce258dea0229ad8b8dbbc7c089dea880ad.gz


# 103c256a 30-Sep-2007 Chris Smith <chris@jalakai.co.uk>

add page_exists function (inc/pageutils.php)

bool page_exists($id, $rev

darcs-hash:20070930021040-d26fc-e3847bfdd20a36154685262eca94211cfd461e83.gz


# 544a3ce3 01-Oct-2007 Tom N Harris <tnharris@whoopdedo.org>

Remove extraneous print statement

darcs-hash:20071001192639-6942e-f7abb7a91f0b3d9c42267df233815debbdd5ad58.gz


# 00976812 30-Sep-2007 Andreas Gohr <andi@splitbrain.org>

don't use realpath() anymore (FS#1261 and others)

The use of realpath() to clean up relative file names caused some
trouble in certain setups relying on symlinks or having restrictive
file structure setups.

This patch replaces all realpath() calls with a PHP only replacement
which should solve those problems.

darcs-hash:20070930184250-7ad00-512ff04c95f57fc9eaf104f80372237a3c94286f.gz


# a0c5c349 19-Sep-2007 Tom N Harris <tnharris@whoopdedo.org>

Remove obsolete words from search index

Creates another index file 'pagewords.idx' for the words in each page.
Words that are deleted from a page can then be removed from the word index.
The indexer version is incremented to force rebuilding of the index.
Also, a minor flaw in the regexp for Asian words is fixed.

darcs-hash:20070919194244-6942e-2e08157dcf4fdf166b35b36a0faf8a3dfb415ad9.gz


# c9db30f9 09-Aug-2007 Andreas Gohr <andi@splitbrain.org>

spelling fix FS#1220

darcs-hash:20070809212154-7ad00-bde57d95f9b61840f1cdac4d60f89bcd0ae83c4a.gz

