Tyler Muth’s Blog

Technology with a focus on Oracle, Application Express and Linux

Image Caching in PL/SQL Applications

Posted by Tyler Muth on February 4, 2008

‘Apollo, we have a solution’. I’ve been working with PL/SQL based web applications for many years now, and one of the more annoying aspects of mod_plsql is that when you serve images out of the database, your browser never seems to cache them. I’ve played around with the Cache-Control, Expires, and Last-Modified headers, but nothing seems to work. Every time I reload the page, the browser reloads all images coming from mod_plsql, while it gets images that come from the file system from cache. This results in the page feeling slow to the end user when it has a lot of images that could have been cached.

One of the projects I’m currently working on is going to use a lot of images stored in the database. Faced with the caching issue, I was considering using Java to write the images out to the file system so Apache could serve them in a way that would be cached. Before I went down that path, I decided to do some tracing with Firebug and Wireshark. A few things to note about these tools. When tracing image caching with Firebug > Net > Images, a light gray bar indicates a cached image and a dark gray bar indicates a non-cached image. When using wireshark, right-click on a relevant captured packet, such as a Get of an image, and select “Follow TCP Stream”. This will show you the packet stream in order.

I started by tracing a static page served by Apache that contained some images served from the file system. What I discovered was that the browser was making the same “Get” requests every time I reloaded a page, whether I pressed shift+refresh or simply refresh, returning a non-cached or cached page respectively. One caveat to this was that the “If-Modified-Since” and “If-None-Match” headers were only there on cached pages, but I’ll get into this in a bit. So, the browser was requesting the image (or file) every time, which meant it was not caching the image. So, something was happening server-side that I was missing.

After a little more tracing, then some reading of the HTTP spec, I figured out how the request-response process worked. The following example is an imaginary conversation between Firefox and Apache:

Firefox: Get logo.gif
Apache: HTTP 200 “OK”, ETag: ABC123, Content-Length: 35, contents of gif file

Firefox: Get logo.gif, If-None-Match: ABC123
Apache: HTTP 304 “Not Modified”, ETag: ABC123, Content-Length: 35, headers closed, no file returned

On the first request for logo.gif, Apache returned an ETag header as well as the contents of the file. On the second request, Firefox sent the value of ETag as the If-None-Match header. Apache compared the values and simply returned a 304 status, not the image since the values were the same.

What is ETag? It’s simply a unique identifier for a file that should change any time the file changes. It’s corresponding request header is “If-None-Match”. The same comparison can be made between the “Last-Modified” response header and the “If-Modified-Since” request header. The HTTP spec considers ETag a “strong” attribute and “Last-Modified” a weak attribute, so I will focus on ETag from here on out.

So, all this time we were expecting Apache or mod_plsql to cache our images, we (the PL/SQL programmers) were actually the ones responsible for implementing this caching. Fortunately it’s very simple, especially now that you understand the problem.

To make this work, you need to do 3 things:

  1. Make the If-None-Match CGI environment variable available to mod_plsq or the embedded gateway.
  2. Send the ETag header back with any image request.
  3. When an image is requested, compare the ETag and If-None-Match headers and either return a 304 Not Modified header or return the image.

If you’re using the Embedded PL/SQL Gateway, execute the following code as SYS to add 2 CGI environment variables:

begin
DBMS_EPG.SET_DAD_ATTRIBUTE (
   dad_name    => 'APEX',
   attr_name   => 'cgi-environment-list',
   attr_value  => 'HTTP_IF_NONE_MATCH');

DBMS_EPG.SET_DAD_ATTRIBUTE (
   dad_name    => 'APEX',
   attr_name   => 'cgi-environment-list',
   attr_value  => 'IF_MODIFIED_SINCE');
end;
/

If you’re using Oracle HTTP Server (Apache + mod_plsql), add the following lines to your Database Access Descriptor (DAD), then bounce OHS to apply your changes:
PlsqlCGIEnvironmentList HTTP_IF_NONE_MATCH
PlsqlCGIEnvironmentList IF_MODIFIED_SINCE

OK, now for the code:

create or replace procedure image_test(
    p_name      in varchar2)
is
    l_mime          varchar2(255);
    l_length        number;
    l_file_name     varchar2(2000);
    l_id            varchar2(32);
    l_created_on    date;
    l_file_etag     varchar2(100);
    l_header_etag   varchar2(100);
    l_file_blob     blob;

begin
    for c1 in (select id,name,filename,title,mime_type,
                      doc_size,created_on,blob_content
                 from my_files a
                where a.filename = p_name)
    loop
        l_id            := c1.id;
        l_created_on    := c1.created_on;
        l_file_name     := c1.filename;
        l_mime          := c1.mime_type;
        l_length        := c1.doc_size;
        l_file_blob     := c1.blob_content;
    end loop; --c1

    l_header_etag   := owa_util.get_cgi_env('HTTP_IF_NONE_MATCH');
    l_file_etag     := l_id||to_char(l_created_on,'JHH24MISS');

    if l_file_etag = l_header_etag then
        owa_util.status_line(
            NSTATUS         => 304,
            CREASON         => 'Not Modified',
            BCLOSE_HEADER   => true);
        return;
    else
        owa_util.mime_header(nvl(l_mime,'application/octet'), FALSE );

        htp.p('Date: ' ||to_char(l_created_on,'Dy, DD Mon RRRR hh24:mi:ss')||' GMT');
        htp.p('Content-length: ' || l_length);
        htp.p('Content-Disposition: filename="' || l_file_name || '"');
        htp.p('ETag: ' || l_file_etag);
        htp.p('Expires: ' ||to_char(l_created_on+1,'Dy, DD Mon RRRR hh24:mi:ss')||' GMT');
        htp.p('Cache-Control: max-age=300');
        owa_util.http_header_close;
        wpg_docload.download_file(l_file_blob);
    end if;
end image_test;
/

Lines 27-35 are the important ones where we decide whether the image / file has changed and simply return a 304 status code, NOT the image itself. Note that I’m constructing the ETag header by concatenating the primary key with last-updated / created_on date (formatted as a Julian date with time). This means that if the image or file is updated, the date will change, thus changing the ETag. This, in turn will allow your browser to download the updated version.

About these ads

26 Responses to “Image Caching in PL/SQL Applications”

  1. Ano said

    Very interesting. Presumably this fix will be rolled into the next version of Apex (the Apache/EPG setup and a modified version of the “built-in” image display procedure)?

  2. John Scott said

    Hi Tyler,

    Excellent write up, I wrote a Whitepaper on the importance of image caching back for Collaborate 07 –

    http://jes.blogs.shellprompt.net/2007/05/18/apex-delivering-pages-in-3-seconds-or-less/

    I included some benchmarks which compared an application that used image caching versus one which used the default behaviour (i.e. non cached), the difference is potentially huge. This is one very overlooked area that can greatly enhance the scalability of an application (since every image results in extra database work in an uncached app).

    Great work, as always.

    John.

  3. tylermuth said

    Ano,

    Yes, I presented this to the APEX team. I’m pretty sure it will make it into 3.1, but I can’t make any promises. In their defense, I sent this to them when they only had a few days of development time left for 3.1. They would then need to validate my findings, implement them in their code, and test them. If this enhancement doesn’t make it into 3.1 it’s not because they were ignoring it, just that I made this discovery at a really inconvenient time.

    Tyler

  4. Tyler is just being modest, this fix has been put into APEX and will be included into 3.1

    Carl

  5. Hi Tyler,

    thanks for sharing. It can make a big performance difference as John demonstrated with the performance tests in his white paper and it would/will be great to have it as default in APEX.

    One note about your implementation. Isn’t the “Expires” actually the most important part in your HTTP header response? Because it will prevent that the browser even does the HTTP GET check for a changed “ETag”, as long as the image/time isn’t expired. That will really speed it up. The “ETag” will be good if the image/time has expired, to verify if it is still valid.

    Patrick

  6. Tyler Muth said

    In my testing, even setting the “Expires” header to a valid future date did not force the browser to read it from cache. As an example, I just ran wireshark to trace a session of loading yahoo’s home page several times. Here are the request and response headers from the 2nd time I loaded the page:

    GET /us.yimg.com/i/ww/beta/y3.gif HTTP/1.1
    Accept: image/png,*/*;q=0.5
    Referer: http://www.yahoo.com/
    If-Modified-Since: Thu, 07 Dec 2006 18:57:29 GMT
    Cache-Control: max-age=0
    HTTP/1.1 304 Not Modified
    Content-Type: image/gif
    Last-Modified: Thu, 07 Dec 2006 18:57:29 GMT
    Cache-Control: max-age=298902395
    Expires: Thu, 27 Jul 2017 05:29:42 GMT
    Date: Tue, 05 Feb 2008 17:03:07 GMT

    I trimmed out some of the completely irrelevant lines, but you can see that even though yahoo set an Expires header, my browser definitely made a request to the HTTP server for this image. The same is true for Firefox 2 and IE7 on my laptop.

    Tyler

  7. Ano said

    Carl: That’s good news indeed! Do you have any tentative date for the Apex 3.1 release?

  8. Gabor Kecskemeti said

    Tyler: IE has a browser setting to control how often it checks for new versions of cached pages (Internet Options > General tab > Temporary Internet Files group > Settings). Does this have any effect on this behaviour? I don’t know if Firefox has any similar settings. In Opera you can actually set two separate values for pages and for images.

  9. John Scott said

    Great news that it’s included in 3.1, this was always one of the low-hanging fruit you could work on to make an application much more scalable, so it’ll be nice that this behaviour should now be a default.

    Tyler,

    In your example where the Expires header didn’t seem to be honored by the browser, did you check in Firefox to see if the image did indeed have an expiry header that was recognised by the browser? I’ve had a few occasions where the formatting of the date was not quite right for the browser to ‘understand’ it and therefore it did not honour the expiry header.

    John.

  10. Tyler Muth said

    John,

    Two problems with “Expires”. First, what value do you choose? For totally static images, you could choose something like sysdate+1 (or more), but the nature of images / files that are stored in the database is that they tend to change. Thus, one user could update a file, but other users would not see the change to the file for a day. You could possibly get around this by renaming the file anytime it changes, but I’m not totally convinced this is a bullet-proof solution.

    Second, even if you set “Expires”, you are relying on the end-users browser settings. Even using the default settings in IE7, Firefox 2 and 3, and Opera, I observed mixed results. Take a look at this link from yahoo where they state that “40-60% of Yahoo!’s users have an empty cache experience and ~20% of all page views are done with an empty cache”.

    My conclusion is this. Using the ETag technique will guarantee that all requests after the first one will only return an HTTP header, not the whole file. Adding the “Expires” header is a nice addition for files that you know will not change. It will keep the additional HTTP requests from happening at all… most of the time.

    Tyler

  11. Andy said

    Does anyone know how this can be applied to Oracle Portal?

    When images items are used on a secure Portal page they have no ETag and are therefore resent every time.

    (Portal version 10.1.4)

    Thanks

  12. Jan said

    This still hits the database right?
    Why not solve this on the middle tier?

  13. Tyler Muth said

    Jan,

    Yes, this will still hit the database. Do you have a suggestion for how it could be solved in the middle tier?

    Tyler

  14. Phil Winfield said

    Tyler, I spent a bit of time looking into improving performance for a client but it seems that if Oracle Application server is serving up the pages, images and other page components can be cached already. Is it correct to say this is specific to the HTTP server that ships with 10g or does OAS not really do this?

    Phil

  15. Tyler Muth said

    Phil, I would need more details to comment, but this situation would apply to Oracle Application Server when you are serving images out of the database using mod_plsql with your own custom code.

  16. Greg Jarmiolowski said

    This addresses per user/browser cache. What about caching on the webserver file system with owa_cache? Is that already in the APEX plumbing?

  17. Loga said

    Excellent valuable solution for my searched query :)
    Loga.

  18. renu said

    Could any one reply how to implement into oracle apex

  19. Tyler Muth said

    Renu,

    I was doing this research for an APEX application. You could call the “image_test” procedure from an APEX report, or when someone clicks on a link, or using htp.p in a PL/SQL region. On a related note, APEX 3.1 actually implements this technique, so the images you upload through the APEX UI use this technique to handle downloads. One more reason to upgrade to 3.1.

  20. naineumNaice said

    kxgkwzzqxsxuglsgwell, hi admin adn people nice forum indeed. how’s life? hope it’s introduce branch ;)

  21. stewstryker said

    John,

    Expires really got implemented in Apex 3.1? Because we’re running 3.1.1 and I just ran the YSlow plugin on my new app’s page with an Interactive Report and it complained about no caching.

    Here’s the log from YSlow as I thought you’d like to see all the IR images not w/o expires. If the .png filenames look familiar it’s because I copied the tabs from the development environment.

    Finally, I noticed that they include Etags (which YSlow recommends) but it says they’re not being used?

    I’ve trimmed the body after the first one, for brevity.


    These components do not have a far future Expires or cache-control: max-age header:

    * [HTTP headers] (no expires) https://oracle-www.ourserver.edu/i/themes/theme_13/theme_3_1.css
    *
    ParamsHeadersPost
    Response Headers
    Date:Sat, 11 Apr 2009 20:15:01 GMT
    Server:Oracle-Application-Server-10g
    Last-Modified:Thu, 21 Aug 2008 20:57:24 GMT
    Etag:"104572-71f3-48add6b4"
    Accept-Ranges:bytes
    Content-Length:29171
    Connection:close
    Content-Type:text/css
    X-Pad:avoid browser bug
    Loading...
    ...
    * [HTTP headers] (no expires) .../i/css/apex_3_1.css
    * [HTTP headers] (no expires) .../htmldb/harpo/wwv_flow_file_mgr.get_file?p_security_group_id=65488007[snip]
    * [HTTP headers] (no expires) .../i/javascript/apex_ns_3_1.js
    * [HTTP headers] (no expires) .../i/javascript/apex_3_1.js
    * [HTTP headers] (no expires) .../i/javascript/apex_get_3_1.js
    * [HTTP headers] (no expires) .../i/javascript/apex_builder.js
    * [HTTP headers] (no expires) .../images/Volt-logo.gif
    * [HTTP headers] (no expires) .../images/tabOffL.gif
    * [HTTP headers] (no expires) .../images/tabOffR.gif
    * [HTTP headers] (no expires) .../images/tabOnL.gif
    * [HTTP headers] (no expires) .../images/tabOnR.gif
    * [HTTP headers] (no expires) .../i/htmldb/builder/finder_w_dropdown.gif
    * [HTTP headers] (no expires) .../i/ws/action_dropdown.gif
    * [HTTP headers] (no expires) .../i/ws/edit_col_32.gif
    * [HTTP headers] (no expires) .../i/ws/filter_32.gif
    * [HTTP headers] (no expires) .../i/ws/sort_32.gif
    * [HTTP headers] (no expires) .../i/ws/break_col_32x24.gif
    * [HTTP headers] (no expires) .../i/ws/highlight_32x24.gif
    * [HTTP headers] (no expires) .../i/ws/calc_32x24.gif
    * [HTTP headers] (no expires) .../i/ws/sum_32x24.gif
    * [HTTP headers] (no expires) .../i/ws/chart_32x24.gif
    * [HTTP headers] (no expires) .../i/ws/disk_32.gif
    * [HTTP headers] (no expires) .../i/ws/reset_32x24.gif
    * [HTTP headers] (no expires) .../i/ws/help_32x24.gif
    * [HTTP headers] (no expires) .../i/ws/download_32x24.gif
    * [HTTP headers] (no expires) .../i/jtfunexe.gif
    * [HTTP headers] (no expires) .../i/ws/dup.gif
    * [HTTP headers] (no expires) .../i/ws/ddown.gif
    * [HTTP headers] (no expires) .../i/ws/trash.gif
    * [HTTP headers] (no expires) .../i/ws/break_col.gif
    * [HTTP headers] (no expires) .../i/ws/info_20x18.gif
    * [HTTP headers] (no expires) .../i/ws/calc.gif
    * [HTTP headers] (no expires) .../i/ws/ajax-loader.gif
    * [HTTP headers] (no expires) https://www.dartmouth.edu/~alfund/assets/jpg/banner-dcf.jpg
    * [HTTP headers] (no expires) .../images/tabOnM.gif
    * [HTTP headers] (no expires) .../i/htmldb/misc/tabOnM.png
    * [HTTP headers] (no expires) .../i/htmldb/builder/builder_find_bg.png
    * [HTTP headers] (no expires) .../i/htmldb/builder/builder_find_top.png
    * [HTTP headers] (no expires) .../i/htmldb/builder/builder_find_bottom.png
    * [HTTP headers] (no expires) .../i/ws/report_bg.gif

  22. Tyler Muth said

    Stew,

    Expires headers in APEX will have no impact on images served by Apache from the file system, as is the case with anything under /i/… You need to enable this in Apache. Here’s the doc:

    http://httpd.apache.org/docs/1.3/mod/mod_expires.html

    Tyler

  23. forexbroker…

    […]Image Caching in PL/SQL Applications « Tyler Muth’s Blog[…]…

  24. Nora said

    You don’t have to hit the gym for two and three hours each day to lose weight, but it does help to squeeze in 30 minutes of physical activity each day. In fact, I always suggest to those overweight vegetarians to keep to a max of 4 pounds a week so you won’t suffer loose skin post weight loss.
    Now, the average healthy amount people are supposed to
    lose is 2 pounds per week.

  25. McAfee said

    An outstanding share! I have just forwarded this onto a friend
    who had been doing a little research on this.
    And he in fact ordered me lunch simply because I stumbled upon it for him…
    lol. So let me reword this…. Thanks for the meal!! But yeah, thanx for spending the time to discuss this matter here on your web site.

  26. When someone writes an post he/she keeps the idea of a user in his/her mind that how a user can be aware
    of it. Therefore that’s why this paragraph is perfect.
    Thanks!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

Join 93 other followers

%d bloggers like this: