<html>
<head>
<title>How do &quot;fetchers&quot; work?</title>
<link rel="stylesheet" type="text/css" media="all" title="Default" href="css/help.css"/>
<style type="text/css">
div.note {
  margin: 0.5em 0;
}

div.class {
  margin: 0.5em 0 0.5em 2em;
}

div.interface {
  margin: 1em 0 0.5em 0;
  padding: 2px 5px;
  background-color: #f0f0f0;
}

span.interface_name {
  font-weight: bold;
}

span.method_name {
  font-weight: bold;
}
</style>
</head>
<body>

<h1>How do &quot;fetchers&quot; work?</h1>
<p>
A &quot;fetcher&quot; is a simple object responsible for delivering external files to the script.
The default fetcher supplied with html2ps/pdf retrieves HTML, images and CSS from remote sites over HTTP.
If you write your own fetcher, you must implement a 'get_data' function that returns the contents of the requested file and,
usually, a 'get_base_url' function that returns the URL used as the base for resolving relative URLs in the most recently fetched HTML file.
</p>

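<p>
The two functions named above can be illustrated with a rough, language-neutral sketch written in Python. It is not the html2ps API:
the class name <code>CannedFetcher</code>, its constructor and the sample URLs are assumptions made for illustration, and the
responses are canned so that the sketch runs without network access. Only the names 'get_data' and 'get_base_url' come from the text above.
</p>

```python
from urllib.parse import urljoin


class CannedFetcher:
    """Hypothetical fetcher that serves canned responses instead of using HTTP."""

    def __init__(self, pages):
        self.pages = pages      # maps absolute URL to file contents
        self.last_url = None    # URL of the most recently fetched file

    def get_data(self, url):
        # Return the contents of the requested file and remember its URL.
        self.last_url = url
        return self.pages[url]

    def get_base_url(self):
        # Base for resolving relative URLs found in the file fetched last.
        return self.last_url


fetcher = CannedFetcher({
    "http://test.com/docs/index.html": "page markup",
    "http://test.com/docs/img/logo.png": b"PNG bytes",
})

html = fetcher.get_data("http://test.com/docs/index.html")
# A relative reference found in the page is resolved against the base URL.
image_url = urljoin(fetcher.get_base_url(), "img/logo.png")
image = fetcher.get_data(image_url)
```

<p>
In this sketch the relative reference <code>img/logo.png</code> resolves to <code>http://test.com/docs/img/logo.png</code>,
which is why the base URL has to match the location the HTML file is considered to come from.
</p>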
<p>
The image below illustrates a simple html2ps session that uses the default fetcher to convert an HTML file from a hypothetical test.com site.
</p>

<img src="uml/Simple_fetcher_session.PNG"/>

<p>
If your pages are stored on the local system or generated dynamically and kept in memory, you do not need HTTP to fetch them.
In this case you should use a custom fetcher, and the session will look similar to the image below. Note that this fetcher processes <em>all</em> requests
and returns valid content for each of them; this distinguishes it from the <em>very simple</em> fetcher supplied with html2ps, which <em>always
returns</em> the same in-memory string whatever the request is. The internals of a fully featured fetcher depend heavily on your system architecture,
so such a fetcher will most likely never be included in the html2ps distribution.
</p>

<img src="uml/Custom_fetcher_session.PNG"/>

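<p>
As one possible shape for such a custom fetcher, the Python sketch below maps every requested URL to a file under a local directory.
It is an illustration under stated assumptions, not html2ps code: the class <code>LocalFileFetcher</code>, its constructor arguments
and the rule "use only the path part of the URL" are all hypothetical.
</p>

```python
import os
import tempfile
from urllib.parse import urlparse


class LocalFileFetcher:
    """Hypothetical fetcher that maps every requested URL to a file under root_dir."""

    def __init__(self, root_dir, base_url):
        self.root_dir = root_dir
        self.base_url = base_url

    def get_data(self, url):
        # Use only the path part of the URL; scheme and host are ignored.
        relative_path = urlparse(url).path.lstrip("/")
        full_path = os.path.join(self.root_dir, relative_path)
        with open(full_path, "rb") as handle:
            return handle.read()

    def get_base_url(self):
        return self.base_url


# Demo with a throwaway directory holding one page and one stylesheet.
root = tempfile.mkdtemp()
with open(os.path.join(root, "index.html"), "wb") as handle:
    handle.write(b"page markup")
with open(os.path.join(root, "style.css"), "wb") as handle:
    handle.write(b"body { margin: 0 }")

fetcher = LocalFileFetcher(root, "http://test.com/")
page = fetcher.get_data("http://test.com/index.html")
css = fetcher.get_data("http://test.com/style.css")
```

<p>
The point of the sketch is that the page and the stylesheet are both served from local storage, each with its own content.
A real implementation would also need to reject paths that escape the root directory, which this sketch does not do.
</p>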
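<p>
The image below illustrates why images and external stylesheets are not rendered when you use a <em>too simple</em> fetcher object.
Because such a fetcher answers every request with the same string, requests for images and stylesheets never receive the actual files.
</p>

<img src="uml/Simple_custom_fetcher_session.PNG"/>
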
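<p>
The effect can be reproduced with a small Python sketch. The class <code>MemoryStringFetcher</code> is a hypothetical stand-in for a
fetcher that ignores the requested URL; it is not the class shipped with html2ps.
</p>

```python
class MemoryStringFetcher:
    """Hypothetical stand-in for a fetcher that ignores the requested URL."""

    def __init__(self, content, base_url):
        self.content = content
        self.base_url = base_url

    def get_data(self, url):
        # The url argument is never inspected: every request gets the same string.
        return self.content

    def get_base_url(self):
        return self.base_url


fetcher = MemoryStringFetcher("page markup", "http://test.com/")
page = fetcher.get_data("http://test.com/index.html")
image = fetcher.get_data("http://test.com/logo.png")
stylesheet = fetcher.get_data("http://test.com/style.css")
```

<p>
Here <code>image</code> and <code>stylesheet</code> both hold the page markup rather than image or CSS data, so a renderer receiving them has nothing usable to draw or apply.
</p>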
<p>
Sometimes you need to fetch files from different places; for example, the HTML code is generated locally, while images and CSS files must be fetched over
HTTP. In this case you need to use several fetchers at once, as illustrated below. Note that you then need to implement a 'get_base_url'
function that returns the correct URL, so that the script can resolve relative URLs contained in the HTML code.
</p>

<img src="uml/Multiple_fetcher_session.PNG"/>

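<p>
One way to combine several fetchers is to ask them in turn until one of them delivers the file. The Python sketch below shows that idea only;
how html2ps itself dispatches between fetchers is not described here, so the chaining rule, the "return None to decline" convention and
the class names <code>GeneratedPageFetcher</code>, <code>CannedRemoteFetcher</code> and <code>FetcherChain</code> are assumptions.
The remote fetcher serves canned data so that the sketch runs offline.
</p>

```python
from urllib.parse import urljoin


class GeneratedPageFetcher:
    """Hypothetical fetcher for the one page that is generated locally."""

    def __init__(self, page_url, content):
        self.page_url = page_url
        self.content = content

    def get_data(self, url):
        # Decline (return None) anything that is not the generated page.
        return self.content if url == self.page_url else None

    def get_base_url(self):
        # Relative URLs in the generated page resolve against this URL.
        return self.page_url


class CannedRemoteFetcher:
    """Stand-in for an HTTP fetcher; serves from a dict so the sketch runs offline."""

    def __init__(self, responses):
        self.responses = responses

    def get_data(self, url):
        return self.responses.get(url)


class FetcherChain:
    """Asks each fetcher in turn and returns the first answer that is not None."""

    def __init__(self, fetchers):
        self.fetchers = fetchers

    def get_data(self, url):
        for fetcher in self.fetchers:
            data = fetcher.get_data(url)
            if data is not None:
                return data
        raise LookupError("no fetcher could deliver " + url)


page_url = "http://test.com/report/index.html"
page_fetcher = GeneratedPageFetcher(page_url, "generated page markup")
remote_fetcher = CannedRemoteFetcher({"http://test.com/report/logo.png": b"PNG bytes"})
chain = FetcherChain([page_fetcher, remote_fetcher])

page = chain.get_data(page_url)
# The base URL comes from the fetcher that produced the HTML.
logo_url = urljoin(page_fetcher.get_base_url(), "logo.png")
logo = chain.get_data(logo_url)
```

<p>
If <code>get_base_url</code> returned a URL unrelated to where the images live, <code>logo_url</code> would point at a location the remote fetcher cannot serve, and the image would be missing from the output.
</p>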
</body>
</html>