Solving big data problems often requires developing new computing approaches and new technologies. But sometimes, the newer technologies and techniques create more problems that didn’t previously exist. One up-and-coming big data analytics vendor that’s found a happy medium balancing new tech and proven techniques is Ocient.
Ocient was founded in 2016 by a group of technologists led by Chris Gladwin, who was the founder of object storage vendor Cleversafe, which IBM bought in 2015 for $1.4 billion. Back then, big data lakes built on distributed file systems like HDFS and object storage systems like S3 were considered cutting edge. Similarly, many companies were told that the best way to scale big data workloads was to separate the compute and storage layers, which allowed them to scale independently.
Data was so big, we were told, that one had to centralize the data, preferably in the cloud, and bring compute to the storage. The storage media underlying HDFS or S3, which most cloud data warehouses like Snowflake and Redshift are designed to use, was invariably spinning disk, which even today is the cheapest form of online storage.
But Gladwin and his team had a different take on the situation. They saw the spinning disk that AWS was investing so heavily in as an impediment to progress. One could run big SQL analytics jobs by sending data across NICs to the storage layer, but it wouldn’t necessarily be the fastest or the cheapest way.
Instead, Ocient developed its own analytics database with a new architecture that’s designed around NVMe drives. And instead of separating compute and storage, Ocient’s design brought the two back together. These two architectural design points allowed Ocient to deliver big performance gains on some of the toughest big data challenges, according to George Kondiles, Ocient’s co-founder and chief architect.
NVMe drives hold substantial performance advantages over spinning disks (ALPAL-images/Shutterstock)
“When data get very large, in the petabytes and above, with tens to hundreds of trillions of records, what we see, what we believe to be true, is that the abstraction layer that exists between the storage and the compute is a substantial impediment to realizing huge performance gains on the queries, comparatively speaking, on that data,” Kondiles said. “We work very closely with eliminating all these abstraction layers so that we’re able to just basically talk directly to the data, read directly from the data in situ, do as much of the analysis as we can with these really wider pipes right off the boxes on these really large data sets.”
A typical NVMe drive can read data at speeds up to 3,000 MB per second and 200,000 IOPS with direct connections to the PCIe bus. A 10,000 RPM spinning disk, on the other hand, can read data at speeds up to 250 MB per second and deliver maybe 160 IOPS. When compute and storage are disconnected, as is the fashion, there’s additional network latency.
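As a rough illustration of the gap those figures imply, the arithmetic can be sketched in a few lines of Python. The throughput and IOPS numbers are the ones quoted above; the 1 PB data set, the 100-drive layout, and the assumption of perfect parallelism are hypothetical simplifications, not measurements.

```python
# Device figures quoted in the article: sequential read (MB/s) and IOPS.
NVME = {"mb_per_s": 3_000, "iops": 200_000}
HDD = {"mb_per_s": 250, "iops": 160}


def ratio(metric: str) -> float:
    """How many times larger the NVMe figure is than the HDD figure."""
    return NVME[metric] / HDD[metric]


def scan_hours(dataset_tb: float, mb_per_s: float, drives: int = 1) -> float:
    """Hours needed to read dataset_tb terabytes sequentially.

    Assumes perfect parallelism across `drives` devices and
    1 TB = 1,000,000 MB (both hypothetical simplifications).
    """
    total_mb = dataset_tb * 1_000_000
    return total_mb / (mb_per_s * drives) / 3600


if __name__ == "__main__":
    print(f"throughput ratio: {ratio('mb_per_s'):.0f}x")
    print(f"IOPS ratio:       {ratio('iops'):.0f}x")
    # Hypothetical layout: 1 PB (1,000 TB) spread over 100 drives of each type.
    print(f"NVMe scan: {scan_hours(1_000, NVME['mb_per_s'], 100):.2f} h")
    print(f"HDD scan:  {scan_hours(1_000, HDD['mb_per_s'], 100):.2f} h")
```

On these numbers, sequential throughput differs by 12x and random IOPS by 1,250x, so the effective speedup for a given workload depends on how random its access pattern is.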
Ocient’s focus on using NVMe drives gave it a big performance boost over data lakes, which invariably use spinning disk. While NVMe drives are more expensive than spinning disk, they can access data 30x faster or more, which gives them a big performance advantage. For certain types of always-on big data workloads, the speedup that Ocient’s approach delivers is well worth any extra costs that may result from having to buy lots of NVMe drives and run them on-prem.
Back in 2016, few analytics database vendors were developing databases with NVMe in mind. Ocient sensed an opportunity. “We were all in on this NVMe drive concept very early on based on just us noticing that the existing database software that was out there wasn’t necessarily capitalizing on the kind of relatively novel capabilities that the drives have,” Kondiles told BigDATAwire in a recent interview. “And that was why we leaned in on it.”
That’s not to say that data lakes running on object storage don’t have their place. Companies that can’t predict what their analytical needs are going to be will benefit from the additional elasticity that the separation of compute and storage brings. But for certain types of always-on OLAP workloads, the kind that involve tens of petabytes of data and hundreds of trillions of records, the overhead incurred by accessing HDDs over the network in a data lake environment is just too much.
“The sorts of problems that we’re targeting and trying to solve, we see some really substantial performance improvements, cost improvements…exactly for these sorts of data that the data lake approach doesn’t necessarily always have the best results,” Kondiles said. “In some scenarios, there’s a lot of value to be had by keeping them separate. And in others, there’s a lot of value to not necessarily try to force a square peg in a round hole.”
Ocient caters to companies with some of the largest big data requirements, such as telcos, ad tech firms, governmental agencies, financial services, and enterprises with large-scale observability workloads. Many of Ocient’s customers run their Ocient clusters on-prem, although there’s nothing to prevent the Ocient software from being run in the cloud, which some customers do.
Co-locating compute and storage reduces costs, but brings ancillary benefits too, Kondiles said. “We were focusing primarily on performance and cost effectiveness,” he said. “But it’s also space reduction and energy reduction, because you’re taking what was a bunch of storage nodes and a bunch of compute nodes and you combine them together into a single set of nodes, and the result is lower data center footprint and lower power usage.”
Ocient’s analytics database is built on the relational model and uses standard ANSI SQL to access data. On top of that, it adds time-series and geospatial components, which invariably are important in the kind of big IoT- and sensor-generated data sets that Ocient customers want to crunch. It also includes some machine learning primitives that allow customers to run predictive analytic functions.
But Ocient’s database isn’t your garden-variety SQL store. For instance, the company has built erasure coding directly into its query engine, which allows it to minimize the amount of duplicate data that customers store while retaining the capability to do a full recovery in the event of drive losses. That’s an example of Ocient borrowing ideas from object store vendors.
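The storage-saving idea behind erasure coding can be illustrated with its simplest form, single XOR parity. This is a minimal sketch and not Ocient's scheme, which the article does not describe; production systems typically use codes that survive several simultaneous drive losses. The function names and the four-shard layout are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(shards: List[bytes]) -> bytes:
    """Compute one parity shard over equal-length data shards."""
    return reduce(xor_bytes, shards)


def recover(shards: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing shard (marked None) from the survivors."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost shard")
    repaired = list(shards)
    if missing:
        survivors = [s for s in shards if s is not None]
        # XOR of the parity with every surviving shard yields the lost one.
        repaired[missing[0]] = reduce(xor_bytes, survivors, parity)
    return repaired


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]  # hypothetical data shards
    parity = encode(data)
    damaged = [data[0], None, data[2], data[3]]  # simulate one lost drive
    print(recover(damaged, parity) == data)
```

In this layout, four data shards plus one parity shard consume 1.25x the raw capacity, whereas a full mirror that also survives one loss consumes 2x.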
Here’s another area where Ocient zigs while the rest of the industry zags: secondary indexes.
“It’s something that a lot of the bigger names kind of moved away from just because of the perceived complexity of managing a schema and the types of queries and whatever else,” Kondiles said. “And what we found is that, especially at these scales, the secondary indexes can be critical for achieving reasonable either execution times or costs for the system simply because the data volume is so high.”
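Why a secondary index matters at high data volumes can be shown with a toy comparison between a full scan and an index lookup. This is a minimal in-memory sketch under stated assumptions: the rows, column names, and helper functions are hypothetical, and it says nothing about how Ocient actually implements its indexes.

```python
from collections import defaultdict

# Hypothetical records; a real system would hold these on disk.
ROWS = [
    {"id": 1, "device": "tower-17", "bytes": 512},
    {"id": 2, "device": "tower-03", "bytes": 2048},
    {"id": 3, "device": "tower-17", "bytes": 128},
    {"id": 4, "device": "tower-42", "bytes": 4096},
]


def build_index(rows, column):
    """Map each value of a non-key column to the positions of matching rows."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[column]].append(pos)
    return index


def full_scan(rows, column, value):
    """Touch every row: cost grows with the total number of rows."""
    return [row for row in rows if row[column] == value]


def index_lookup(rows, index, value):
    """Touch only matching rows: cost grows with the size of the result."""
    return [rows[pos] for pos in index.get(value, [])]


if __name__ == "__main__":
    device_index = build_index(ROWS, "device")
    hits = index_lookup(ROWS, device_index, "tower-17")
    print([row["id"] for row in hits])
    print(hits == full_scan(ROWS, "device", "tower-17"))
```

The trade-off is that the index must be built and maintained as data arrives, which is the management complexity the quote refers to.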
Ocient has taken a pragmatic approach to how it develops its software. It supports newer technologies, such as NVMe and erasure coding, while simultaneously adopting older architectures, like secondary indexes and co-located compute and storage, when it makes sense to do so.
The approach seems to be resonating. The company said last week that bookings over the first five months of 2025 were nearly triple the rate of last year. In April, the Chicago-based company announced the closure of its $42.1 million Series B round, bringing the company’s total venture funding to $132 million.
The company is now in expansion mode and looking to grow revenues. As part of that drive, last week Ocient brought in John Morris to be its new CEO, replacing Gladwin in the corner office. Morris and Gladwin worked together previously at Cleversafe, where Morris was brought in as CEO prior to the IBM acquisition.
“I couldn’t be more thrilled to welcome John as he takes the helm as CEO,” said Gladwin, who’s now Ocient’s executive chairman. “His operational and strategic leadership come at a pivotal time for Ocient, much like when he joined Cleversafe and helped drive tripled revenues, which resulted in the company’s $1.4B acquisition and 10x returns for investors.”
Will history repeat itself? Only time will tell.
Related Items:
Hyperscale Analytics Growing Faster Than Expected, Ocient Says
Ocient Report Chronicles the Rise of Hyperscale Data
The Network is the New Storage Bottleneck

