Image Caching in PL/SQL Applications
Posted by Tyler Muth on February 4, 2008
‘Apollo, we have a solution’. I’ve been working with PL/SQL based web applications for many years now, and one of the more annoying aspects of mod_plsql is that when you serve images out of the database, your browser never seems to cache them. I’ve played around with the Cache-Control, Expires, and Last-Modified headers, but nothing seems to work. Every time I reload the page, the browser reloads all images coming from mod_plsql, while it gets images that come from the file system from cache. This results in the page feeling slow to the end user when it has a lot of images that could have been cached.
One of the projects I’m currently working on is going to use a lot of images stored in the database. Faced with the caching issue, I was considering using Java to write the images out to the file system so Apache could serve them in a way that would be cached. Before I went down that path, I decided to do some tracing with Firebug and Wireshark. A few things to note about these tools. When tracing image caching with Firebug > Net > Images, a light gray bar indicates a cached image and a dark gray bar indicates a non-cached image. When using wireshark, right-click on a relevant captured packet, such as a Get of an image, and select “Follow TCP Stream”. This will show you the packet stream in order.
I started by tracing a static page served by Apache that contained some images served from the file system. What I discovered was that the browser was making the same “Get” requests every time I reloaded a page, whether I pressed shift+refresh or simply refresh, returning a non-cached or cached page respectively. One caveat to this was that the “If-Modified-Since” and “If-None-Match” headers were only there on cached pages, but I’ll get into this in a bit. So, the browser was requesting the image (or file) every time, which meant it was not caching the image. So, something was happening server-side that I was missing.
After a little more tracing, then some reading of the HTTP spec, I figured out how the request-response process worked. The following example is an imaginary conversation between Firefox and Apache:
Firefox: Get logo.gif
Apache: HTTP 200 “OK”, ETag: ABC123, Content-Length: 35, contents of gif file
Firefox: Get logo.gif, If-None-Match: ABC123
Apache: HTTP 304 “Not Modified”, ETag: ABC123, Content-Length: 35, headers closed, no file returned
On the first request for logo.gif, Apache returned an ETag header as well as the contents of the file. On the second request, Firefox sent the value of ETag as the If-None-Match header. Apache compared the values and simply returned a 304 status, not the image since the values were the same.
What is ETag? It’s simply a unique identifier for a file that should change any time the file changes. It’s corresponding request header is “If-None-Match”. The same comparison can be made between the “Last-Modified” response header and the “If-Modified-Since” request header. The HTTP spec considers ETag a “strong” attribute and “Last-Modified” a weak attribute, so I will focus on ETag from here on out.
So, all this time we were expecting Apache or mod_plsql to cache our images, we (the PL/SQL programmers) were actually the ones responsible for implementing this caching. Fortunately it’s very simple, especially now that you understand the problem.
To make this work, you need to do 3 things:
- Make the If-None-Match CGI environment variable available to mod_plsq or the embedded gateway.
- Send the ETag header back with any image request.
- When an image is requested, compare the ETag and If-None-Match headers and either return a 304 Not Modified header or return the image.
If you’re using the Embedded PL/SQL Gateway, execute the following code as SYS to add 2 CGI environment variables:
begin DBMS_EPG.SET_DAD_ATTRIBUTE ( dad_name => 'APEX', attr_name => 'cgi-environment-list', attr_value => 'HTTP_IF_NONE_MATCH'); DBMS_EPG.SET_DAD_ATTRIBUTE ( dad_name => 'APEX', attr_name => 'cgi-environment-list', attr_value => 'IF_MODIFIED_SINCE'); end; /
If you’re using Oracle HTTP Server (Apache + mod_plsql), add the following lines to your Database Access Descriptor (DAD), then bounce OHS to apply your changes:
OK, now for the code:
create or replace procedure image_test( p_name in varchar2) is l_mime varchar2(255); l_length number; l_file_name varchar2(2000); l_id varchar2(32); l_created_on date; l_file_etag varchar2(100); l_header_etag varchar2(100); l_file_blob blob; begin for c1 in (select id,name,filename,title,mime_type, doc_size,created_on,blob_content from my_files a where a.filename = p_name) loop l_id := c1.id; l_created_on := c1.created_on; l_file_name := c1.filename; l_mime := c1.mime_type; l_length := c1.doc_size; l_file_blob := c1.blob_content; end loop; --c1 l_header_etag := owa_util.get_cgi_env('HTTP_IF_NONE_MATCH'); l_file_etag := l_id||to_char(l_created_on,'JHH24MISS'); if l_file_etag = l_header_etag then owa_util.status_line( NSTATUS => 304, CREASON => 'Not Modified', BCLOSE_HEADER => true); return; else owa_util.mime_header(nvl(l_mime,'application/octet'), FALSE ); htp.p('Date: ' ||to_char(l_created_on,'Dy, DD Mon RRRR hh24:mi:ss')||' GMT'); htp.p('Content-length: ' || l_length); htp.p('Content-Disposition: filename="' || l_file_name || '"'); htp.p('ETag: ' || l_file_etag); htp.p('Expires: ' ||to_char(l_created_on+1,'Dy, DD Mon RRRR hh24:mi:ss')||' GMT'); htp.p('Cache-Control: max-age=300'); owa_util.http_header_close; wpg_docload.download_file(l_file_blob); end if; end image_test; /
Lines 27-35 are the important ones where we decide whether the image / file has changed and simply return a 304 status code, NOT the image itself. Note that I’m constructing the ETag header by concatenating the primary key with last-updated / created_on date (formatted as a Julian date with time). This means that if the image or file is updated, the date will change, thus changing the ETag. This, in turn will allow your browser to download the updated version.