Exadata Intelligent Storage
Posted by Tyler Muth on June 15, 2010
One of the key differentiators of Exadata is the concept that processing occurs in the storage. For example, if you issue a query such as:
select first_name, last_name from employees where department = 'Accounting' order by last_name
The storage cell will actually process the “where” predicate. This means the database doesn’t have to do this work, but more importantly it means we don’t have to ship the entire employees table over the network from the storage to the database like we do with traditional storage. The storage cell filters the rows AND columns down to just what we asked for, then passes those results back to the database. The database then does the remaining processing, such as sorts and aggregates.
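To make the division of work concrete, here is a minimal Python sketch of the idea. It is only a toy model, not how Exadata is implemented: the in-memory EMPLOYEES list, the "notes" column, and the function names are all hypothetical stand-ins. It compares what crosses the "network" when storage ships everything versus when storage filters rows and columns first.

```python
from typing import Dict, List

Row = Dict[str, str]

# Hypothetical, tiny stand-in for the employees table. The wide "notes"
# column represents all the columns the query never asked for.
EMPLOYEES: List[Row] = [
    {"first_name": "Ann", "last_name": "Young", "department": "Accounting", "notes": "x" * 100},
    {"first_name": "Bob", "last_name": "Adams", "department": "Sales", "notes": "x" * 100},
    {"first_name": "Cid", "last_name": "Baker", "department": "Accounting", "notes": "x" * 100},
]

WANTED = ["first_name", "last_name"]


def traditional_storage_read(table: List[Row]) -> List[Row]:
    """Traditional storage: every row and every column is shipped."""
    return [dict(row) for row in table]


def storage_cell_scan(table: List[Row], columns: List[str], department: str) -> List[Row]:
    """Intelligent storage: apply the where predicate and keep only the
    requested columns before anything is shipped."""
    return [
        {col: row[col] for col in columns}
        for row in table
        if row["department"] == department
    ]


def database_sort(rows: List[Row]) -> List[Row]:
    """The sort still happens in the database in both models."""
    return sorted(rows, key=lambda r: r["last_name"])


def shipped_bytes(rows: List[Row]) -> int:
    """Rough size of what crossed the network (sum of value lengths)."""
    return sum(len(value) for row in rows for value in row.values())


# Traditional model: ship everything, then filter and project in the database.
shipped_all = traditional_storage_read(EMPLOYEES)
traditional_result = database_sort(
    [{col: r[col] for col in WANTED} for r in shipped_all if r["department"] == "Accounting"]
)

# Offloaded model: storage filters and projects, the database only sorts.
shipped_filtered = storage_cell_scan(EMPLOYEES, WANTED, "Accounting")
offload_result = database_sort(shipped_filtered)

print("same answer:", traditional_result == offload_result)
print("bytes shipped, traditional:", shipped_bytes(shipped_all))
print("bytes shipped, offloaded:  ", shipped_bytes(shipped_filtered))
```

Both paths return the same rows in the same order. The only difference is how much data had to travel before the database could start its part of the work.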
There are a lot more details to the concept of Intelligent Storage that I’ll cover in subsequent posts, but I wanted to start with it as a broad concept. I was “dancing” around the concept in a presentation I was working on and I’d like to thank Tom Kyte for pointing out the importance of this concept.
What does Intelligent Storage mean for performance? It not only gives us dramatically better baseline performance, but it also means that as our capacity to store data grows, our ability to process that data grows with it. In contrast, with a traditional SAN or NAS solution, adding more storage only increases capacity; it does not increase performance. Sure, you have to add enough spindles to spread the load, but at a certain point adding more spindles will do nothing to improve your performance. The bottleneck will simply shift from the storage to the network.
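The bottleneck shift can be sketched as a simple "slowest link wins" model. The numbers below (100 MB/s per spindle, a 4,000 MB/s network path) are made up for illustration and are not measurements of any real system.

```python
def scan_throughput_mb_s(spindles: int, per_spindle_mb_s: int, network_mb_s: int) -> int:
    """Throughput the database sees when all scanned data must cross the
    network: the slower of the disks combined and the network path."""
    return min(spindles * per_spindle_mb_s, network_mb_s)


# Hypothetical figures, chosen only to show the shape of the curve.
PER_SPINDLE_MB_S = 100
NETWORK_MB_S = 4000

for spindles in (10, 20, 40, 80, 160):
    throughput = scan_throughput_mb_s(spindles, PER_SPINDLE_MB_S, NETWORK_MB_S)
    limit = "network" if spindles * PER_SPINDLE_MB_S >= NETWORK_MB_S else "storage"
    print(f"{spindles:4d} spindles -> {throughput:5d} MB/s (limited by {limit})")
```

Under these assumptions throughput climbs until 40 spindles and then goes flat at the network limit, no matter how many more disks are added.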
Let’s use an example of scanning a 10 TB table with a query like the earlier one against the employees table. When we query this table in a traditional storage model, we’ll have to ship all 10 TB across the wire. Yes, the database will likely cache portions of this table in the buffer cache, but you simply don’t have enough RAM to cache 10 TB. Now when we run the same query against Exadata, we only need to ship the rows and columns we asked for across the network. Now imagine that table grows to 100 TB and we need to add storage to handle this growth. In a traditional model, as our table grows our query will get progressively slower. With Exadata, adding storage also adds processing power, which allows us to maintain consistent performance even as our table continues to grow.
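A back-of-the-envelope version of this 10 TB versus 100 TB comparison might look like the sketch below. Every figure in it is an assumption picked for round numbers: a 4 GB/s network path, storage cells that each scan 1 GB/s, one cell added per TB of data, and a query that returns 1% of the bytes scanned. It ignores caching, indexes, and compression entirely.

```python
GB_PER_TB = 1000


def traditional_scan_seconds(table_gb: float, network_gb_s: float) -> float:
    """Traditional model: the whole table crosses the network."""
    return table_gb / network_gb_s


def offloaded_scan_seconds(
    table_gb: float,
    cells: int,
    cell_scan_gb_s: float,
    network_gb_s: float,
    fraction_returned: float,
) -> float:
    """Offloaded model: cells scan their share in parallel and only the
    matching rows and columns cross the network. The slower of the two
    stages sets the elapsed time."""
    storage_time = table_gb / (cells * cell_scan_gb_s)
    network_time = table_gb * fraction_returned / network_gb_s
    return max(storage_time, network_time)


# Hypothetical parameters (not Exadata specifications).
NETWORK_GB_S = 4.0
CELL_SCAN_GB_S = 1.0
FRACTION_RETURNED = 0.01

for table_tb in (10, 100):
    table_gb = table_tb * GB_PER_TB
    cells = table_tb  # assumption: one storage cell added per TB of data
    traditional = traditional_scan_seconds(table_gb, NETWORK_GB_S)
    offloaded = offloaded_scan_seconds(
        table_gb, cells, CELL_SCAN_GB_S, NETWORK_GB_S, FRACTION_RETURNED
    )
    print(f"{table_tb:3d} TB: traditional {traditional:8.0f} s, offloaded {offloaded:6.0f} s")
```

With these made-up inputs the traditional scan takes ten times longer when the table grows tenfold, while the offloaded scan time stays flat because the number of cells grew with the data. The model also shows the limit of the argument: if the query returned a large enough fraction of the table, network_time would become the larger term and the network would be the bottleneck again.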