Tyler Muth’s Blog

Technology with a focus on Oracle, Application Express and Linux

Posts Tagged ‘Oracle PL/SQL Images Caching’

Image Caching in PL/SQL Applications

Posted by Tyler Muth on February 4, 2008

‘Apollo, we have a solution’. I’ve been working with PL/SQL based web applications for many years now, and one of the more annoying aspects of mod_plsql is that when you serve images out of the database, your browser never seems to cache them. I’ve played around with the Cache-Control, Expires, and Last-Modified headers, but nothing seems to work. Every time I reload the page, the browser reloads all images coming from mod_plsql, while it gets images that come from the file system from cache. This results in the page feeling slow to the end user when it has a lot of images that could have been cached.

One of the projects I’m currently working on is going to use a lot of images stored in the database. Faced with the caching issue, I was considering using Java to write the images out to the file system so Apache could serve them in a way that would be cached. Before I went down that path, I decided to do some tracing with Firebug and Wireshark. A few things to note about these tools. When tracing image caching with Firebug > Net > Images, a light gray bar indicates a cached image and a dark gray bar indicates a non-cached image. When using wireshark, right-click on a relevant captured packet, such as a Get of an image, and select “Follow TCP Stream”. This will show you the packet stream in order.

I started by tracing a static page served by Apache that contained some images served from the file system. What I discovered was that the browser was making the same “Get” requests every time I reloaded a page, whether I pressed shift+refresh or simply refresh, returning a non-cached or cached page respectively. One caveat to this was that the “If-Modified-Since” and “If-None-Match” headers were only there on cached pages, but I’ll get into this in a bit. So, the browser was requesting the image (or file) every time, which meant it was not caching the image. So, something was happening server-side that I was missing.

After a little more tracing, then some reading of the HTTP spec, I figured out how the request-response process worked. The following example is an imaginary conversation between Firefox and Apache:

Firefox: Get logo.gif
Apache: HTTP 200 “OK”, ETag: ABC123, Content-Length: 35, contents of gif file

Firefox: Get logo.gif, If-None-Match: ABC123
Apache: HTTP 304 “Not Modified”, ETag: ABC123, Content-Length: 35, headers closed, no file returned

On the first request for logo.gif, Apache returned an ETag header as well as the contents of the file. On the second request, Firefox sent the value of ETag as the If-None-Match header. Apache compared the values and simply returned a 304 status, not the image since the values were the same.

What is ETag? It’s simply a unique identifier for a file that should change any time the file changes. It’s corresponding request header is “If-None-Match”. The same comparison can be made between the “Last-Modified” response header and the “If-Modified-Since” request header. The HTTP spec considers ETag a “strong” attribute and “Last-Modified” a weak attribute, so I will focus on ETag from here on out.

So, all this time we were expecting Apache or mod_plsql to cache our images, we (the PL/SQL programmers) were actually the ones responsible for implementing this caching. Fortunately it’s very simple, especially now that you understand the problem.

To make this work, you need to do 3 things:

  1. Make the If-None-Match CGI environment variable available to mod_plsq or the embedded gateway.
  2. Send the ETag header back with any image request.
  3. When an image is requested, compare the ETag and If-None-Match headers and either return a 304 Not Modified header or return the image.

If you’re using the Embedded PL/SQL Gateway, execute the following code as SYS to add 2 CGI environment variables:

   dad_name    => 'APEX',
   attr_name   => 'cgi-environment-list',
   attr_value  => 'HTTP_IF_NONE_MATCH');

   dad_name    => 'APEX',
   attr_name   => 'cgi-environment-list',
   attr_value  => 'IF_MODIFIED_SINCE');

If you’re using Oracle HTTP Server (Apache + mod_plsql), add the following lines to your Database Access Descriptor (DAD), then bounce OHS to apply your changes:
PlsqlCGIEnvironmentList HTTP_IF_NONE_MATCH

OK, now for the code:

create or replace procedure image_test(
    p_name      in varchar2)
    l_mime          varchar2(255);
    l_length        number;
    l_file_name     varchar2(2000);
    l_id            varchar2(32);
    l_created_on    date;
    l_file_etag     varchar2(100);
    l_header_etag   varchar2(100);
    l_file_blob     blob;

    for c1 in (select id,name,filename,title,mime_type,
                 from my_files a
                where a.filename = p_name)
        l_id            := c1.id;
        l_created_on    := c1.created_on;
        l_file_name     := c1.filename;
        l_mime          := c1.mime_type;
        l_length        := c1.doc_size;
        l_file_blob     := c1.blob_content;
    end loop; --c1

    l_header_etag   := owa_util.get_cgi_env('HTTP_IF_NONE_MATCH');
    l_file_etag     := l_id||to_char(l_created_on,'JHH24MISS');

    if l_file_etag = l_header_etag then
            NSTATUS         => 304,
            CREASON         => 'Not Modified',
            BCLOSE_HEADER   => true);
        owa_util.mime_header(nvl(l_mime,'application/octet'), FALSE );

        htp.p('Date: ' ||to_char(l_created_on,'Dy, DD Mon RRRR hh24:mi:ss')||' GMT');
        htp.p('Content-length: ' || l_length);
        htp.p('Content-Disposition: filename="' || l_file_name || '"');
        htp.p('ETag: ' || l_file_etag);
        htp.p('Expires: ' ||to_char(l_created_on+1,'Dy, DD Mon RRRR hh24:mi:ss')||' GMT');
        htp.p('Cache-Control: max-age=300');
    end if;
end image_test;

Lines 27-35 are the important ones where we decide whether the image / file has changed and simply return a 304 status code, NOT the image itself. Note that I’m constructing the ETag header by concatenating the primary key with last-updated / created_on date (formatted as a Julian date with time). This means that if the image or file is updated, the date will change, thus changing the ETag. This, in turn will allow your browser to download the updated version.

Posted in APEX, Oracle | Tagged: | 26 Comments »