Image Caching in PL/SQL Applications
Posted by Tyler Muth on February 4, 2008
‘Apollo, we have a solution’. I’ve been working with PL/SQL based web applications for many years now, and one of the more annoying aspects of mod_plsql is that when you serve images out of the database, your browser never seems to cache them. I’ve played around with the Cache-Control, Expires, and Last-Modified headers, but nothing seems to work. Every time I reload the page, the browser reloads all images coming from mod_plsql, while it gets images that come from the file system from cache. This results in the page feeling slow to the end user when it has a lot of images that could have been cached.
One of the projects I’m currently working on is going to use a lot of images stored in the database. Faced with the caching issue, I was considering using Java to write the images out to the file system so Apache could serve them in a way that would be cached. Before I went down that path, I decided to do some tracing with Firebug and Wireshark. A few things to note about these tools. When tracing image caching with Firebug > Net > Images, a light gray bar indicates a cached image and a dark gray bar indicates a non-cached image. When using wireshark, right-click on a relevant captured packet, such as a Get of an image, and select “Follow TCP Stream”. This will show you the packet stream in order.
I started by tracing a static page served by Apache that contained some images served from the file system. What I discovered was that the browser was making the same “Get” requests every time I reloaded a page, whether I pressed shift+refresh or simply refresh, returning a non-cached or cached page respectively. One caveat to this was that the “If-Modified-Since” and “If-None-Match” headers were only there on cached pages, but I’ll get into this in a bit. So, the browser was requesting the image (or file) every time, which meant it was not caching the image. So, something was happening server-side that I was missing.
After a little more tracing, then some reading of the HTTP spec, I figured out how the request-response process worked. The following example is an imaginary conversation between Firefox and Apache:
Firefox: Get logo.gif
Apache: HTTP 200 “OK”, ETag: ABC123, Content-Length: 35, contents of gif file
Firefox: Get logo.gif, If-None-Match: ABC123
Apache: HTTP 304 “Not Modified”, ETag: ABC123, Content-Length: 35, headers closed, no file returned
On the first request for logo.gif, Apache returned an ETag header as well as the contents of the file. On the second request, Firefox sent the value of ETag as the If-None-Match header. Apache compared the values and simply returned a 304 status, not the image since the values were the same.
What is ETag? It’s simply a unique identifier for a file that should change any time the file changes. It’s corresponding request header is “If-None-Match”. The same comparison can be made between the “Last-Modified” response header and the “If-Modified-Since” request header. The HTTP spec considers ETag a “strong” attribute and “Last-Modified” a weak attribute, so I will focus on ETag from here on out.
So, all this time we were expecting Apache or mod_plsql to cache our images, we (the PL/SQL programmers) were actually the ones responsible for implementing this caching. Fortunately it’s very simple, especially now that you understand the problem.
To make this work, you need to do 3 things:
- Make the If-None-Match CGI environment variable available to mod_plsq or the embedded gateway.
- Send the ETag header back with any image request.
- When an image is requested, compare the ETag and If-None-Match headers and either return a 304 Not Modified header or return the image.
If you’re using the Embedded PL/SQL Gateway, execute the following code as SYS to add 2 CGI environment variables:
begin DBMS_EPG.SET_DAD_ATTRIBUTE ( dad_name => 'APEX', attr_name => 'cgi-environment-list', attr_value => 'HTTP_IF_NONE_MATCH'); DBMS_EPG.SET_DAD_ATTRIBUTE ( dad_name => 'APEX', attr_name => 'cgi-environment-list', attr_value => 'IF_MODIFIED_SINCE'); end; /
If you’re using Oracle HTTP Server (Apache + mod_plsql), add the following lines to your Database Access Descriptor (DAD), then bounce OHS to apply your changes:
PlsqlCGIEnvironmentList HTTP_IF_NONE_MATCH
PlsqlCGIEnvironmentList IF_MODIFIED_SINCE
OK, now for the code:
create or replace procedure image_test(
p_name in varchar2)
is
l_mime varchar2(255);
l_length number;
l_file_name varchar2(2000);
l_id varchar2(32);
l_created_on date;
l_file_etag varchar2(100);
l_header_etag varchar2(100);
l_file_blob blob;
begin
for c1 in (select id,name,filename,title,mime_type,
doc_size,created_on,blob_content
from my_files a
where a.filename = p_name)
loop
l_id := c1.id;
l_created_on := c1.created_on;
l_file_name := c1.filename;
l_mime := c1.mime_type;
l_length := c1.doc_size;
l_file_blob := c1.blob_content;
end loop; --c1
l_header_etag := owa_util.get_cgi_env('HTTP_IF_NONE_MATCH');
l_file_etag := l_id||to_char(l_created_on,'JHH24MISS');
if l_file_etag = l_header_etag then
owa_util.status_line(
NSTATUS => 304,
CREASON => 'Not Modified',
BCLOSE_HEADER => true);
return;
else
owa_util.mime_header(nvl(l_mime,'application/octet'), FALSE );
htp.p('Date: ' ||to_char(l_created_on,'Dy, DD Mon RRRR hh24:mi:ss')||' GMT');
htp.p('Content-length: ' || l_length);
htp.p('Content-Disposition: filename="' || l_file_name || '"');
htp.p('ETag: ' || l_file_etag);
htp.p('Expires: ' ||to_char(l_created_on+1,'Dy, DD Mon RRRR hh24:mi:ss')||' GMT');
htp.p('Cache-Control: max-age=300');
owa_util.http_header_close;
wpg_docload.download_file(l_file_blob);
end if;
end image_test;
/
Lines 27-35 are the important ones where we decide whether the image / file has changed and simply return a 304 status code, NOT the image itself. Note that I’m constructing the ETag header by concatenating the primary key with last-updated / created_on date (formatted as a Julian date with time). This means that if the image or file is updated, the date will change, thus changing the ETag. This, in turn will allow your browser to download the updated version.
Ano said
Very interesting. Presumably this fix will be rolled into the next version of Apex (the Apache/EPG setup and a modified version of the “built-in” image display procedure)?
John Scott said
Hi Tyler,
Excellent write up, I wrote a Whitepaper on the importance of image caching back for Collaborate 07 -
http://jes.blogs.shellprompt.net/2007/05/18/apex-delivering-pages-in-3-seconds-or-less/
I included some benchmarks which compared an application that used image caching versus one which used the default behaviour (i.e. non cached), the difference is potentially huge. This is one very overlooked area that can greatly enhance the scalability of an application (since every image results in extra database work in an uncached app).
Great work, as always.
John.
tylermuth said
Ano,
Yes, I presented this to the APEX team. I’m pretty sure it will make it into 3.1, but I can’t make any promises. In their defense, I sent this to them when they only had a few days of development time left for 3.1. They would then need to validate my findings, implement them in their code, and test them. If this enhancement doesn’t make it into 3.1 it’s not because they were ignoring it, just that I made this discovery at a really inconvenient time.
Tyler
Carl Backstrom said
Tyler is just being modest, this fix has been put into APEX and will be included into 3.1
Carl
Patrick Wolf said
Hi Tyler,
thanks for sharing. It can make a big performance difference as John demonstrated with the performance tests in his white paper and it would/will be great to have it as default in APEX.
One note about your implementation. Isn’t the “Expires” actually the most important part in your HTTP header response? Because it will prevent that the browser even does the HTTP GET check for a changed “ETag”, as long as the image/time isn’t expired. That will really speed it up. The “ETag” will be good if the image/time has expired, to verify if it is still valid.
Patrick
Tyler Muth said
In my testing, even setting the “Expires” header to a valid future date did not force the browser to read it from cache. As an example, I just ran wireshark to trace a session of loading yahoo’s home page several times. Here are the request and response headers from the 2nd time I loaded the page:
I trimmed out some of the completely irrelevant lines, but you can see that even though yahoo set an Expires header, my browser definitely made a request to the HTTP server for this image. The same is true for Firefox 2 and IE7 on my laptop.
Tyler
Ano said
Carl: That’s good news indeed! Do you have any tentative date for the Apex 3.1 release?
Gabor Kecskemeti said
Tyler: IE has a browser setting to control how often it checks for new versions of cached pages (Internet Options > General tab > Temporary Internet Files group > Settings). Does this have any effect on this behaviour? I don’t know if Firefox has any similar settings. In Opera you can actually set two separate values for pages and for images.
John Scott said
Great news that it’s included in 3.1, this was always one of the low-hanging fruit you could work on to make an application much more scalable, so it’ll be nice that this behaviour should now be a default.
Tyler,
In your example where the Expires header didn’t seem to be honored by the browser, did you check in Firefox to see if the image did indeed have an expiry header that was recognised by the browser? I’ve had a few occasions where the formatting of the date was not quite right for the browser to ‘understand’ it and therefore it did not honour the expiry header.
John.
Tyler Muth said
John,
Two problems with “Expires”. First, what value do you choose? For totally static images, you could choose something like sysdate+1 (or more), but the nature of images / files that are stored in the database is that they tend to change. Thus, one user could update a file, but other users would not see the change to the file for a day. You could possibly get around this by renaming the file anytime it changes, but I’m not totally convinced this is a bullet-proof solution.
Second, even if you set “Expires”, you are relying on the end-users browser settings. Even using the default settings in IE7, Firefox 2 and 3, and Opera, I observed mixed results. Take a look at this link from yahoo where they state that “40-60% of Yahoo!’s users have an empty cache experience and ~20% of all page views are done with an empty cache”.
My conclusion is this. Using the ETag technique will guarantee that all requests after the first one will only return an HTTP header, not the whole file. Adding the “Expires” header is a nice addition for files that you know will not change. It will keep the additional HTTP requests from happening at all… most of the time.
Tyler
Andy said
Does anyone know how this can be applied to Oracle Portal?
When images items are used on a secure Portal page they have no ETag and are therefore resent every time.
(Portal version 10.1.4)
Thanks
Jan said
This still hits the database right?
Why not solve this on the middle tier?
Tyler Muth said
Jan,
Yes, this will still hit the database. Do you have a suggestion for how it could be solved in the middle tier?
Tyler
Phil Winfield said
Tyler, I spent a bit of time looking into improving performance for a client but it seems that if Oracle Application server is serving up the pages, images and other page components can be cached already. Is it correct to say this is specific to the HTTP server that ships with 10g or does OAS not really do this?
Phil
Tyler Muth said
Phil, I would need more details to comment, but this situation would apply to Oracle Application Server when you are serving images out of the database using mod_plsql with your own custom code.
Greg Jarmiolowski said
This addresses per user/browser cache. What about caching on the webserver file system with owa_cache? Is that already in the APEX plumbing?
Loga said
Excellent valuable solution for my searched query
Loga.
renu said
Could any one reply how to implement into oracle apex
Tyler Muth said
Renu,
I was doing this research for an APEX application. You could call the “image_test” procedure from an APEX report, or when someone clicks on a link, or using htp.p in a PL/SQL region. On a related note, APEX 3.1 actually implements this technique, so the images you upload through the APEX UI use this technique to handle downloads. One more reason to upgrade to 3.1.
naineumNaice said
kxgkwzzqxsxuglsgwell, hi admin adn people nice forum indeed. how’s life? hope it’s introduce branch
stewstryker said
John,
Expires really got implemented in Apex 3.1? Because we’re running 3.1.1 and I just ran the YSlow plugin on my new app’s page with an Interactive Report and it complained about no caching.
Here’s the log from YSlow as I thought you’d like to see all the IR images not w/o expires. If the .png filenames look familiar it’s because I copied the tabs from the development environment.
Finally, I noticed that they include Etags (which YSlow recommends) but it says they’re not being used?
I’ve trimmed the body after the first one, for brevity.
These components do not have a far future Expires or cache-control: max-age header:
* [HTTP headers] (no expires) https://oracle-www.ourserver.edu/i/themes/theme_13/theme_3_1.css
*
ParamsHeadersPost
Response Headers
Date:Sat, 11 Apr 2009 20:15:01 GMT
Server:Oracle-Application-Server-10g
Last-Modified:Thu, 21 Aug 2008 20:57:24 GMT
Etag:"104572-71f3-48add6b4"
Accept-Ranges:bytes
Content-Length:29171
Connection:close
Content-Type:text/css
X-Pad:avoid browser bug
Loading...
...
* [HTTP headers] (no expires) .../i/css/apex_3_1.css
* [HTTP headers] (no expires) .../htmldb/harpo/wwv_flow_file_mgr.get_file?p_security_group_id=65488007[snip]
* [HTTP headers] (no expires) .../i/javascript/apex_ns_3_1.js
* [HTTP headers] (no expires) .../i/javascript/apex_3_1.js
* [HTTP headers] (no expires) .../i/javascript/apex_get_3_1.js
* [HTTP headers] (no expires) .../i/javascript/apex_builder.js
* [HTTP headers] (no expires) .../images/Volt-logo.gif
* [HTTP headers] (no expires) .../images/tabOffL.gif
* [HTTP headers] (no expires) .../images/tabOffR.gif
* [HTTP headers] (no expires) .../images/tabOnL.gif
* [HTTP headers] (no expires) .../images/tabOnR.gif
* [HTTP headers] (no expires) .../i/htmldb/builder/finder_w_dropdown.gif
* [HTTP headers] (no expires) .../i/ws/action_dropdown.gif
* [HTTP headers] (no expires) .../i/ws/edit_col_32.gif
* [HTTP headers] (no expires) .../i/ws/filter_32.gif
* [HTTP headers] (no expires) .../i/ws/sort_32.gif
* [HTTP headers] (no expires) .../i/ws/break_col_32x24.gif
* [HTTP headers] (no expires) .../i/ws/highlight_32x24.gif
* [HTTP headers] (no expires) .../i/ws/calc_32x24.gif
* [HTTP headers] (no expires) .../i/ws/sum_32x24.gif
* [HTTP headers] (no expires) .../i/ws/chart_32x24.gif
* [HTTP headers] (no expires) .../i/ws/disk_32.gif
* [HTTP headers] (no expires) .../i/ws/reset_32x24.gif
* [HTTP headers] (no expires) .../i/ws/help_32x24.gif
* [HTTP headers] (no expires) .../i/ws/download_32x24.gif
* [HTTP headers] (no expires) .../i/jtfunexe.gif
* [HTTP headers] (no expires) .../i/ws/dup.gif
* [HTTP headers] (no expires) .../i/ws/ddown.gif
* [HTTP headers] (no expires) .../i/ws/trash.gif
* [HTTP headers] (no expires) .../i/ws/break_col.gif
* [HTTP headers] (no expires) .../i/ws/info_20x18.gif
* [HTTP headers] (no expires) .../i/ws/calc.gif
* [HTTP headers] (no expires) .../i/ws/ajax-loader.gif
* [HTTP headers] (no expires) https://www.dartmouth.edu/~alfund/assets/jpg/banner-dcf.jpg
* [HTTP headers] (no expires) .../images/tabOnM.gif
* [HTTP headers] (no expires) .../i/htmldb/misc/tabOnM.png
* [HTTP headers] (no expires) .../i/htmldb/builder/builder_find_bg.png
* [HTTP headers] (no expires) .../i/htmldb/builder/builder_find_top.png
* [HTTP headers] (no expires) .../i/htmldb/builder/builder_find_bottom.png
* [HTTP headers] (no expires) .../i/ws/report_bg.gif
Tyler Muth said
Stew,
Expires headers in APEX will have no impact on images served by Apache from the file system, as is the case with anything under /i/… You need to enable this in Apache. Here’s the doc:
http://httpd.apache.org/docs/1.3/mod/mod_expires.html
Tyler
forexbroker said
forexbroker…
[...]Image Caching in PL/SQL Applications « Tyler Muth’s Blog[...]…