<?php

use dokuwiki\Extension\CLIPlugin;
use dokuwiki\plugin\aichat\Chunk;
use dokuwiki\plugin\aichat\ModelFactory;
use dokuwiki\Search\Indexer;
use splitbrain\phpcli\Colors;
use splitbrain\phpcli\Options;
use splitbrain\phpcli\TableFormatter;

/**
 * DokuWiki Plugin aichat (CLI Component)
 *
 * @license GPL 2 http://www.gnu.org/licenses/gpl-2.0.html
 * @author  Andreas Gohr <gohr@cosmocode.de>
 */
class cli_plugin_aichat extends CLIPlugin
{
    /** @var helper_plugin_aichat */
    protected $helper;

    /** @inheritdoc */
    public function __construct($autocatch = true)
    {
        parent::__construct($autocatch);
        $this->helper = plugin_load('helper', 'aichat');
        $this->helper->setLogger($this);
        $this->loadConfig();
    }

    /** @inheritDoc */
    protected function setup(Options $options)
    {
        $options->useCompactHelp();

        $options->setHelp(
            'Manage and query the AI chatbot data. Please note that calls to your LLM provider will be made. ' .
            'This may incur costs.'
        );

        $options->registerCommand(
            'embed',
            'Create embeddings for all pages. This skips pages that already have embeddings'
        );
        $options->registerOption(
            'clear',
            'Clear all existing embeddings before creating new ones',
            'c',
            false,
            'embed'
        );

        $options->registerCommand('maintenance', 'Run storage maintenance. Refer to the documentation for details.');

        $options->registerCommand('similar', 'Search for similar pages');
        $options->registerArgument('query', 'Look up chunks similar to this query', true, 'similar');

        $options->registerCommand('ask', 'Ask a question');
        $options->registerArgument('question', 'The question to ask', true, 'ask');

        $options->registerCommand('chat', 'Start an interactive chat session');

        $options->registerCommand('models', 'List available models');

        $options->registerCommand('info', 'Get Info about the vector storage and other stats');

        $options->registerCommand('split', 'Split a page into chunks (for debugging)');
        $options->registerArgument('page', 'The page to split', true, 'split');

        $options->registerCommand('page', 'Check if chunks for a given page are available (for debugging)');
        $options->registerArgument('page', 'The page to check', true, 'page');
        $options->registerOption('dump', 'Dump the chunks', 'd', false, 'page');

        $options->registerCommand('tsv', 'Create TSV files for visualizing at http://projector.tensorflow.org/' .
            ' Not supported on all storages.');
        $options->registerArgument('vector.tsv', 'The vector file', false, 'tsv');
        $options->registerArgument('meta.tsv', 'The meta file', false, 'tsv');
    }

    /** @inheritDoc */
    protected function main(Options $options)
    {
        if ($this->loglevel['debug']['enabled']) {
            $this->helper->factory->setDebug(true);
        }

        ini_set('memory_limit', -1);
        switch ($options->getCmd()) {
            case 'embed':
                $this->createEmbeddings($options->getOpt('clear'));
                break;
            case 'maintenance':
                $this->runMaintenance();
                break;
            case 'similar':
                $this->similar($options->getArgs()[0]);
                break;
            case 'ask':
                $this->ask($options->getArgs()[0]);
                break;
            case 'chat':
                $this->chat();
                break;
            case 'models':
                $this->models();
                break;
            case 'split':
                $this->split($options->getArgs()[0]);
                break;
            case 'page':
                $this->page($options->getArgs()[0], $options->getOpt('dump'));
                break;
            case 'info':
                $this->showinfo();
                break;
            case 'tsv':
                $args = $options->getArgs();
                $vector = $args[0] ?? 'vector.tsv';
                $meta = $args[1] ?? 'meta.tsv';
                $this->tsv($vector, $meta);
                break;
            default:
                echo $options->help();
        }
    }

    /**
     * @return void
     */
    protected function showinfo()
    {
        $stats = [
            'chat model' => $this->getConf('chatmodel'),
            'embed model' => $this->getConf('embedmodel'),
        ];
        $stats = array_merge(
            $stats,
            array_map('dformat', $this->helper->getRunData()),
            $this->helper->getStorage()->statistics()
        );
        $this->printTable($stats);
    }

    /**
     * Print key value data as tabular data
     *
     * @param array $data
     * @param int $level
     * @return void
     */
    protected function printTable($data, $level = 0)
    {
        $tf = new TableFormatter($this->colors);
        foreach ($data as $key => $value) {
            if (is_array($value)) {
                echo $tf->format(
                    [$level * 2, 20, '*'],
                    ['', $key, ''],
                    [Colors::C_LIGHTBLUE, Colors::C_LIGHTBLUE, Colors::C_LIGHTBLUE]
                );
                $this->printTable($value, $level + 1);
            } else {
                echo $tf->format(
                    [$level * 2, 20, '*'],
                    ['', $key, $value],
                    [Colors::C_LIGHTBLUE, Colors::C_LIGHTBLUE, Colors::C_LIGHTGRAY]
                );
            }
        }
    }

    /**
     * Check chunk availability for a given page
     *
     * @param string $page
     * @return void
     */
    protected function page($page, $dump = false)
    {
        $indexer = new Indexer();
        $pages = $indexer->getPages();
        $pos = array_search(cleanID($page), $pages);

        if ($pos === false) {
            $this->error('Page not found');
            return;
        }

        $storage = $this->helper->getStorage();
        $chunks = $storage->getPageChunks($page, $pos * 100);
        if ($chunks) {
            $this->success('Found ' . count($chunks) . ' chunks');
            if ($dump) {
                echo json_encode($chunks, JSON_PRETTY_PRINT);
            }
        } else {
            $this->error('No chunks found');
        }
    }

    /**
     * Split the given page into chunks and print them
     *
     * @param string $page
     * @return void
     * @throws Exception
     */
    protected function split($page)
    {
        $text = rawWiki($page);
        $chunks = $this->helper->getEmbeddings()->splitIntoChunks($text);
        foreach ($chunks as $chunk) {
            echo $chunk;
            echo "\n";
            $this->colors->ptln('--------------------------------', Colors::C_LIGHTPURPLE);
        }
        $this->success('Split into ' . count($chunks) . ' chunks');
    }

    /**
     * Interactive Chat Session
     *
     * @return void
     * @throws Exception
     */
    protected function chat()
    {
        $history = [];
        while ($q = $this->readLine('Your Question')) {
            $this->helper->getChatModel()->resetUsageStats();
            $this->helper->getRephraseModel()->resetUsageStats();
            $this->helper->getEmbeddingModel()->resetUsageStats();
            $result = $this->helper->askChatQuestion($q, $history);
            $this->colors->ptln("Interpretation: {$result['question']}", Colors::C_LIGHTPURPLE);
            $history[] = [$result['question'], $result['answer']];
            $this->printAnswer($result);
        }
    }

    /**
     * Print information about the available models
     *
     * @return void
     */
    protected function models()
    {
        $result = (new ModelFactory($this->conf))->getModels();

        $td = new TableFormatter($this->colors);
        $cols = [30, 20, 20, '*'];
        echo "==== Chat Models ====\n\n";
        echo $td->format(
            $cols,
            ['Model', 'Token Limits', 'Price USD/M', 'Description'],
            [Colors::C_LIGHTBLUE, Colors::C_LIGHTBLUE, Colors::C_LIGHTBLUE, Colors::C_LIGHTBLUE]
        );
        foreach ($result['chat'] as $name => $info) {
            echo $td->format(
                $cols,
                [
                    $name,
                    sprintf(" In: %7d\nOut: %7d", $info['inputTokens'], $info['outputTokens']),
                    sprintf(" In: %.2f\nOut: %.2f", $info['inputTokenPrice'], $info['outputTokenPrice']),
                    $info['description'] . "\n"
                ],
                [
                    $info['instance'] ? Colors::C_LIGHTGREEN : Colors::C_LIGHTRED,
                ]
            );
        }

        $cols = [30, 10, 10, 10, '*'];
        echo "==== Embedding Models ====\n\n";
        echo $td->format(
            $cols,
            ['Model', 'Token Limits', 'Price USD/M', 'Dimensions', 'Description'],
            [Colors::C_LIGHTBLUE, Colors::C_LIGHTBLUE, Colors::C_LIGHTBLUE, Colors::C_LIGHTBLUE, Colors::C_LIGHTBLUE]
        );
        foreach ($result['embedding'] as $name => $info) {
            echo $td->format(
                $cols,
                [
                    $name,
                    sprintf("%7d", $info['inputTokens']),
                    sprintf("%.2f", $info['inputTokenPrice']),
                    $info['dimensions'],
                    $info['description'] . "\n"
                ],
                [
                    $info['instance'] ? Colors::C_LIGHTGREEN : Colors::C_LIGHTRED,
                ]
            );
        }

        $this->colors->ptln('Current prices may differ', Colors::C_RED);
    }

    /**
     * Handle a single, standalone question
     *
     * @param string $query
     * @return void
     * @throws Exception
     */
    protected function ask($query)
    {
        $result = $this->helper->askQuestion($query);
        $this->printAnswer($result);
    }

    /**
     * Get the pages that are similar to the query
     *
     * @param string $query
     * @return void
     */
    protected function similar($query)
    {
        $langlimit = $this->helper->getLanguageLimit();
        if ($langlimit) {
            $this->info('Limiting results to {lang}', ['lang' => $langlimit]);
        }

        $sources = $this->helper->getEmbeddings()->getSimilarChunks($query, $langlimit);
        $this->printSources($sources);
    }

    /**
     * Run the maintenance tasks
     *
     * @return void
     */
    protected function runMaintenance()
    {
        $start = time();
        $this->helper->getStorage()->runMaintenance();
        $this->notice('Peak memory used: {memory}', ['memory' => filesize_h(memory_get_peak_usage(true))]);
        $this->notice('Spent time: {time}min', ['time' => round((time() - $start) / 60, 2)]);

        $data = $this->helper->getRunData();
        $data['maintenance ran at'] = time();
        $this->helper->setRunData($data);
    }

    /**
     * Recreate chunks and embeddings for all pages
     *
     * @param bool $clear Clear all existing embeddings before creating new ones
     * @return void
     */
    protected function createEmbeddings($clear)
    {
        [$skipRE, $matchRE] = $this->getRegexps();

        $start = time();
        $this->helper->getEmbeddings()->createNewIndex($skipRE, $matchRE, $clear);
        $this->notice('Peak memory used: {memory}', ['memory' => filesize_h(memory_get_peak_usage(true))]);
        $this->notice('Spent time: {time}min', ['time' => round((time() - $start) / 60, 2)]);

        $data = $this->helper->getRunData();
        $data['embed ran at'] = time();
        $this->helper->setRunData($data);
    }

    /**
     * Dump TSV files for debugging
     *
     * @param string $vector The vector file
     * @param string $meta The meta file
     * @return void
     */
    protected function tsv($vector, $meta)
    {
        $storage = $this->helper->getStorage();
        $storage->dumpTSV($vector, $meta);
        $this->success('written to ' . $vector . ' and ' . $meta);
    }

    /**
     * Print the given detailed answer in a nice way
     *
     * @param array $answer
     * @return void
     */
    protected function printAnswer($answer)
    {
        $this->colors->ptln($answer['answer'], Colors::C_LIGHTCYAN);
        echo "\n";
        $this->printSources($answer['sources']);
        echo "\n";
        $this->printUsage();
    }

    /**
     * Print the given sources
     *
     * @param Chunk[] $sources
     * @return void
     */
    protected function printSources($sources)
    {
        foreach ($sources as $source) {
            /** @var Chunk $source */
            $this->colors->ptln(
                "\t" . $source->getPage() . ' ' . $source->getId() . ' (' . $source->getScore() . ')',
                Colors::C_LIGHTBLUE
            );
        }
    }

    /**
     * Print the usage statistics for OpenAI
     *
     * @return void
     */
    protected function printUsage()
    {
        $chat = $this->helper->getChatModel()->getUsageStats();
        $rephrase = $this->helper->getRephraseModel()->getUsageStats();
        $embed = $this->helper->getEmbeddingModel()->getUsageStats();

        // Sum each statistic across the chat, rephrase and embedding models
        $this->info(
            'Made {requests} requests in {time}s to models. Used {tokens} tokens for about ${cost}.',
            [
                'requests' => $chat['requests'] + $rephrase['requests'] + $embed['requests'],
                'time' => $chat['time'] + $rephrase['time'] + $embed['time'],
                'tokens' => $chat['tokens'] + $rephrase['tokens'] + $embed['tokens'],
                'cost' => $chat['cost'] + $rephrase['cost'] + $embed['cost'],
            ]
        );
    }

    /**
     * Interactively ask for a value from the user
     *
     * @param string $prompt
     * @return string
     */
    protected function readLine($prompt)
    {
        $value = '';

        while ($value === '') {
            echo $prompt;
            echo ': ';

            $fh = fopen('php://stdin', 'r');
            $value = trim(fgets($fh));
            fclose($fh);
        }

        return $value;
    }

    /**
     * Read the skip and match regex from the config
     *
     * Ensures the regular expressions are valid
     *
     * @return string[] [$skipRE, $matchRE]
     */
    protected function getRegexps()
    {
        $skip = $this->getConf('skipRegex');
        $skipRE = '';
        $match = $this->getConf('matchRegex');
        $matchRE = '';

        if ($skip) {
            $skipRE = '/' . $skip . '/';
            if (@preg_match($skipRE, '') === false) {
                $this->error(preg_last_error_msg());
                $this->error('Invalid regular expression in $conf[\'skipRegex\']. Ignored.');
                $skipRE = '';
            } else {
                $this->success('Skipping pages matching ' . $skipRE);
            }
        }

        if ($match) {
            $matchRE = '/' . $match . '/';
            if (@preg_match($matchRE, '') === false) {
                $this->error(preg_last_error_msg());
                $this->error('Invalid regular expression in $conf[\'matchRegex\']. Ignored.');
                $matchRE = '';
            } else {
                $this->success('Only indexing pages matching ' . $matchRE);
            }
        }
        return [$skipRE, $matchRE];
    }
}