Tyler Muth’s Blog

Technology with a focus on Oracle, Application Express and Linux

Exadata Intelligent Storage

Posted by Tyler Muth on June 15, 2010

One of the key differentiators of Exadata is the concept that processing occurs in the storage. For example, if you issue a query such as:

select first_name, last_name
from employees
where department = 'Accounting'
order by last_name

The storage cell will actually process the “where” predicate. This means the database doesn’t have to do this work, but more importantly it means we don’t have to ship the entire employees table over the network from the storage to the database like we do with traditional storage. The storage cell will filter the rows AND columns down to just what we asked for, then pass those results back to the database. The database then does the remaining processing, such as sorts and aggregates.
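To make the difference concrete, here is a toy Python sketch of the two data paths. It is not Exadata code: the Employee class, the function names, the salary column and the sample rows are all made up for illustration, and "values shipped" is only a crude stand-in for bytes on the network.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    first_name: str
    last_name: str
    department: str
    salary: int  # hypothetical extra column that the query never asks for


def traditional_scan(table):
    """Traditional storage: every row and every column crosses the network."""
    return [(e.first_name, e.last_name, e.department, e.salary) for e in table]


def cell_scan(table, department):
    """Toy 'storage cell': apply the WHERE predicate and keep only the
    requested columns before anything is shipped."""
    return [(e.first_name, e.last_name) for e in table if e.department == department]


# Hypothetical sample data
table = [
    Employee("Ann", "Zimmer", "Accounting", 50000),
    Employee("Bob", "Young", "Sales", 60000),
    Employee("Cy", "Adams", "Accounting", 55000),
    Employee("Di", "Moore", "Engineering", 70000),
]

# Traditional path: ship everything, then the database filters and projects.
shipped_all = traditional_scan(table)
db_filtered = [(first, last) for (first, last, dept, _salary) in shipped_all
               if dept == "Accounting"]

# Offloaded path: the storage cell has already filtered rows and columns.
shipped_filtered = cell_scan(table, "Accounting")

# In both cases the database still does the sort.
result_traditional = sorted(db_filtered, key=lambda row: row[1])
result_offloaded = sorted(shipped_filtered, key=lambda row: row[1])

# Count the individual column values that crossed the "network".
values_traditional = sum(len(row) for row in shipped_all)
values_offloaded = sum(len(row) for row in shipped_filtered)

print(result_offloaded)
print(values_traditional, "values shipped vs", values_offloaded)
```

Both paths return the same sorted result; the only thing that changes is how much data had to leave the storage to get there.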

There are a lot more details to the concept of Intelligent Storage that I’ll cover in subsequent posts, but I wanted to start with it as a broad concept. I was “dancing” around the concept in a presentation I was working on and I’d like to thank Tom Kyte for pointing out the importance of this concept.

What does Intelligent Storage mean for performance? It not only gives us dramatically better baseline performance, but it also means that as our capacity to store data grows, our ability to process that data grows with it. In contrast, with a traditional SAN or NAS solution, adding more storage only increases capacity; it does not increase performance. Sure, you have to add enough spindles to spread the load, but at a certain point adding more spindles will do nothing to improve your performance. The bottleneck will simply shift from the storage to the network.
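The "bottleneck shifts to the network" point can be sketched as a one-line model: the database can only consume a scan as fast as the slower of the disks and the link. The function name and all the throughput numbers below are assumptions chosen to make the arithmetic easy, not measurements of any real SAN.

```python
def traditional_scan_rate(spindles, mb_per_s_per_spindle, network_mb_per_s):
    """Rate (MB/s) at which the database receives data in the traditional
    model: capped by whichever is slower, the disks or the network."""
    return min(spindles * mb_per_s_per_spindle, network_mb_per_s)


# Hypothetical figures: 100 MB/s per spindle, a 2000 MB/s storage network.
rates = {n: traditional_scan_rate(n, 100, 2000) for n in (10, 20, 40, 80)}

for spindles, rate in rates.items():
    print(f"{spindles:3d} spindles -> {rate} MB/s")
```

With these made-up numbers, going from 10 to 20 spindles doubles the scan rate, but going from 20 to 80 changes nothing because the link is already full.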

Let’s use the example of scanning a 10 TB table, say the employees table from the query above. When we query this table in a traditional storage model, we’ll have to ship all 10 TB across the wire. Yes, the database will likely cache portions of this table in the buffer cache, but you simply don’t have enough RAM to cache 10 TB. Now when we run the same query against Exadata, we only need to ship the rows and columns we asked for across the network. Now imagine that table grows to 100 TB and we need to add storage to handle this growth. In a traditional model, as our table grows our query will get steadily slower. With Exadata, adding storage also adds processing power, which allows us to maintain consistent performance even as our table continues to grow.
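A back-of-envelope version of the 10 TB versus 100 TB comparison might look like the following. Every number here is an assumption picked for round arithmetic (network speed, per-cell scan rate, cell counts, and a query that returns 1% of the data); the function names are hypothetical, and the model ignores caching, indexes and everything else a real system does.

```python
MB_PER_TB = 1_000_000  # decimal terabytes


def traditional_scan_seconds(table_tb, network_mb_per_s):
    """Traditional model: the whole table has to cross the network."""
    return table_tb * MB_PER_TB / network_mb_per_s


def offloaded_scan_seconds(table_tb, cells, cell_mb_per_s, selectivity,
                           network_mb_per_s):
    """Offloaded model: cells scan in parallel and ship only the matching
    fraction. Assumes scanning and shipping overlap, so the slower one
    determines the elapsed time."""
    scan = table_tb * MB_PER_TB / (cells * cell_mb_per_s)
    ship = table_tb * MB_PER_TB * selectivity / network_mb_per_s
    return max(scan, ship)


NETWORK = 2000  # MB/s, assumed
CELL = 1000     # MB/s scanned per storage cell, assumed

# Assume the number of cells grows with the data: 5 cells at 10 TB, 50 at 100 TB.
trad_10 = traditional_scan_seconds(10, NETWORK)
trad_100 = traditional_scan_seconds(100, NETWORK)
off_10 = offloaded_scan_seconds(10, 5, CELL, 0.01, NETWORK)
off_100 = offloaded_scan_seconds(100, 50, CELL, 0.01, NETWORK)

print(f"traditional: {trad_10:.0f} s at 10 TB, {trad_100:.0f} s at 100 TB")
print(f"offloaded:   {off_10:.0f} s at 10 TB, {off_100:.0f} s at 100 TB")
```

Under these assumptions the traditional scan time grows tenfold with the table, while the offloaded scan time stays flat because the scanning capacity grew along with the data. The filtered results still grow with the table in this model, so a much less selective query would eventually put pressure on the network too.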


3 Responses to “Exadata Intelligent Storage”

  1. […] Exadata Intelligent Storage […]

  2. Dilip Chhetri said

    Claim: “When we query this table in a traditional storage model, we’ll have to ship all 10 TB across the wire”

    I think there is a fundamental flaw here in the assumption that the database server is dumb (or perhaps the author picked the wrong example). Let’s say I implement my database using a traditional file system (e.g. ext2 over an iSCSI LUN), and my directory structure is something like,
    /mydb/departments/accounting/employees.txt

    So, for the above-mentioned DB query, I will have to read (i.e. send over the network) only the “employees.txt” file, which I think would be much smaller than the 10 TB database.

  3. […] in the face of a very differently behaving database. Tyler Muth has an interesting post which considers the Exadata feature which handles WHERE clause processing within intelligent storage. But best of […]
