IBM announced on April 3rd a new version of DB2. I tend to be stuck a few versions back due to what WebSphere Commerce certifies with, but I’m constantly learning about and drooling over new versions. I think many DBAs end up stuck a version or two back for longer than they would like. I haven’t actually gotten to work with DB2 10.1 yet. And along comes DB2 10.5. I’m excited to see what more I can learn about it at the conference and beyond, but I’ve got some good tidbits already.
As with 10.1, the messaging from IBM is pretty heavy on Big Data. I get that – it’s a hot area, and I’m keeping up on a general level. But I focus my career and my blog on e-commerce and OLTP databases right now, so I’m always looking for the tidbits that can apply to e-commerce databases.
DB2 10.5 Features for E-Commerce
The details I’m excited about in 10.5 that I’ve learned so far are related to PureScale. I’ve been trying to talk clients into PureScale for at least a year now. I think it’s a great product. But most of my clients think the technology is too young and untested for them. I think we’re getting past that now.
Remember that PureScale is IBM’s answer to active-active. It’s a cluster with shared disk. Essentially a direct competitor to Oracle RAC. But the performance scalability exceeds Oracle’s. PureScale was introduced as a separate code base with DB2 9.8, and was merged into the standard code base with 10.1. When introduced, PureScale was only available on AIX on very specific IBM hardware. Over time, those hardware restrictions have loosened, and Linux support has been added.
DB2 10.5 removes some of the hardware restrictions for PureScale – now IBM will support any x86 servers. There are still restrictions for switches and adapters and such. I’m looking forward to the day when I can get PureScale up and running in Amazon EC2 to play with it.
DB2 10.5 will support HADR with PureScale. So you could now have a completely separate cluster in another data center as a warm standby. I like this idea, because my clients who would be willing to pay for active-active also tend to want true DR. I also hear that TSA will automate fail-over in this scenario – I want to see that in action!
DB2 10.5 PureScale will support online additions of cluster members and application of maintenance. Dang, didn’t realize they didn’t already, but this is good to know. I can’t imagine selling a client on PureScale and then telling them I have to take the whole cluster down to apply a Fix Pack.
DB2 10.5 PureScale will now support on-disk encryption. Not a feature I’ve dealt with much.
Simplification of Packaging
IBM is also simplifying packaging of DB2. They’re now offering essentially four paid options, as I understand it:
- Workgroup Edition
- Advanced Workgroup Edition
- Enterprise Edition
- Advanced Enterprise Edition
My understanding is that features like BLU Acceleration and Data Compression will no longer be available as add-on features, but only as a part of the “Advanced” Editions. That’s actually supposed to be starting on April 23rd. Personally, I kind of like the simplification, rather than having all these add-on features – an advantage over Oracle’s confusing (to me) array of options.
Also, some features aren’t currently compatible with others, so the choices generally look like this:
DPF (Data Partitioning Feature)
Even if my focus is e-commerce, I cannot help but mention the cool analytics features. The push of IBM’s messaging here is BLU Acceleration. The official documents say they’re speeding up analytics workloads by between 8 and 25 times. And some of the folks in on the beta are reporting even more. I think BNSF was saying they saw a 4 billion row join go from 10 minutes in Teradata to less than a second in DB2 10.5. IBM seems to really be going after NoSQL here. I’m not a fan of NoSQL because it’s usually not ACID and usually not appropriate for most of the e-commerce workload. “BLU Acceleration” seems to include compression ratios of 90-95%, columnar processing, data skipping, and neat tricks like Single Instruction, Multiple Data (SIMD). With all these cool features, it seems pretty groundbreaking for the analytics workload to me.
My understanding is that BLU Acceleration can be enabled on a table-by-table basis. So for mixed workloads where reporting data resides in different tables, this could be a huge advantage.
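To make the "data skipping" idea a little more concrete, here is a toy sketch in Python. It is not IBM's implementation and says nothing about how BLU works internally; it only illustrates the general technique of keeping a min/max summary per block of a column so that a range query can skip blocks that cannot contain a match. The names Block, build_blocks, and count_in_range, and the block size of 1000, are all made up for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """One chunk of a single column plus its min/max summary (hypothetical layout)."""
    values: List[int]
    lo: int
    hi: int


def build_blocks(column: List[int], block_size: int) -> List[Block]:
    """Split a column into fixed-size blocks and record min/max for each."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def count_in_range(blocks: List[Block], low: int, high: int) -> Tuple[int, int]:
    """Count values in [low, high]; return (matches, blocks actually scanned)."""
    matches = 0
    scanned = 0
    for block in blocks:
        # Skip the block entirely if its min/max range cannot overlap the predicate.
        if block.hi < low or block.lo > high:
            continue
        scanned += 1
        matches += sum(1 for v in block.values if low <= v <= high)
    return matches, scanned


# Made-up data: 10,000 sequential order ids stored in blocks of 1,000.
order_ids = list(range(1, 10001))
blocks = build_blocks(order_ids, 1000)
matches, scanned = count_in_range(blocks, 2500, 2600)
print(f"{matches} matches, scanned {scanned} of {len(blocks)} blocks")
```

In this made-up case only one of the ten blocks has to be read. Skipping pays off most when the data is roughly ordered on the filtered column; on randomly distributed values nearly every block would overlap the range and little would be skipped.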
DB2 10.5 Other Features
I really look forward to the IDUG conference in Orlando this year where I can really flesh out any other features that are not a part of the initial messaging. There was so much more detail available there for the 10.1 release.
When is DB2 10.5 Going to be Available?
IBM is targeting general availability (GA) of DB2 10.5 for mid-June.
I think this is the current draft of the 10.5 Info Center, and maybe even a copy you can play with: http://www.ibm.com/developerworks/data/db2preview/index.html
Register for tech talk on DB2 BLU: http://bit.ly/tt2013may8
Thank you for the good explanation of the 10.5 new features.
May I ask where you found the info (and maybe share it) about the incredible results from BNSF?
I tried the Technology Preview, but got some strange results for a simple select statement emulating a full scan on a column table (in intra_parallel mode, 5 times slower than on a row-based adaptive compressed table).
Best regards, Dmitry
Kent Collins from BNSF spoke at the day-long announcement of DB2 10.5 from the Silicon Valley labs. I’m not finding a replay of that, but I’ll ask around to see if there is one. I see that he also has a session scheduled at IDUG – bet it would be a good one, if you’re going: http://www.idug.org/e/in/eid=21&req=info&s=1503&all=1
Alright, I asked around, and you can get to the video where I heard BNSF’s results here:
using password “almaden2013.”
I think it’s in the “IBM BigData – Session 01” there, somewhere after the first hour. Lots of interesting stuff in those videos.
One of the most interesting points was that pureScale on Linux isn’t bound to System x anymore. Actually, I didn’t find any information about that, nothing official (including the Kepler Information Center) that gives me the confidence to tell our customers that they can finally use pureScale with their desired hardware.
Can you tell me where you got this information?
I bet it will be discussed at this event: bit.ly/tt2013june.
I believe it was included somewhere in these videos (a lot to wade through): http://www.livestream.com/private/ibmbigdata?t=1364247362224&t=1366038120604&t=1370869318940
The information was directly from various reputable sources at IBM, but the video link above is the closest I have to something I can currently point you to. The technology preview for 10.5 may also include information on it: http://www.ibm.com/developerworks/data/db2preview/index.html.
I wouldn’t promise anything based on my word alone – it could be that IBM decides to move it out of this release. I’ve seen at least one feature that has both been described as a part of this release and as a part of a “future” release.
Oh, and I noticed that in the Info Center too – it is specifically not in the technology preview Info Center.
I asked on a tech talk today, and the answer was “near future”. I suspect it just didn’t make it into 10.5, and hopefully will be in the next release or even in a fixpack.
Thanks for remembering me, Ember. I think I noticed your question – I was logged in, too.
I just want to clarify that the exceptional results for BNSF and the many other IBM client database setups used in the preliminary BLU evaluation were not caused by shifting to NoSQL, as you may have initially suspected, but rather by these six major “big things” implemented in the BLU Accelerator module:
- Deep HW Instruction Exploitation (SIMD)
- Optimal Memory Caching
Speaking of NoSQL – IBM did include additional support for NoSQL JSON data stores in DB2 10.5, but these changes shouldn’t play any role in the performance results produced.