DUG Technology: Exascale Flash Storage

Seismic analysis is a high performance computing (HPC) discipline that pieces together what lies under the surface of the earth from nothing more than the reflection of sound. Producing useful 3D analyses requires petabytes (PB) of data and thousands of powerful computers. Not even major oil companies possess all of the computational resources necessary to conduct this analysis in-house, so they turn to companies like DUG Technology to tease out details from their mountains of data.

DUG refers to this capability as HPC-as-a-service (HPCaaS): specialized, full-stack exascale computation available on demand. Traditionally, DUG’s compute-as-a-service technology was available only to specific customers, such as major oil and gas companies. As the market took notice of its capabilities, DUG expanded its offering to other industry verticals that use this same service to tackle a diverse set of extreme computational needs.

DUG decided to bring the same “bring-nothing-but-your-data” ease of service to businesses outside of the energy sector. DUG knew that it could serve these new industry verticals economically because of the specialized DUG McCloud service for HPC. VAST Data Universal Storage, powered by Intel® technologies, undergirds DUG McCloud and enabled DUG to successfully break into new verticals, including academia, astrophysics, medicine and genomics, wildfire modeling, and COVID-19 research. However, getting to this point required a sea change in how DUG dealt with its storage.

For its first decade of operation, DUG had been deploying and managing HDD-based storage to deliver the scale and cost economy that its seismic workloads required. During that time, DUG thoroughly optimized its applications to make use of the capabilities, and avoid the limits, of its Lustre HDD-based infrastructure. Here, DUG had to make many compromises. For example, when Lustre file system clients would hit peak throughput for a given workflow, other users sharing the same file system would suffer. From a resilience perspective, although DUG designed its software to protect against HDD failures, the need to swap out failed drives on a weekly basis was a constant thorn in DUG’s side.

Finally, while DUG’s applications were well optimized for Lustre and HDD storage, the new applications that DUG was evolving to support all handled storage input/output (I/O) differently. Storage versatility and multitenancy became vitally important to DUG; any new solution would need to support a broad set of requirements and to support them at exascale. DUG also needed storage that could handle the multiplicity of throughput requirements for different applications. DUG looked to solid state drive (SSD)-based storage to provide higher performance and reliability. However, moving to SSDs on Lustre would have been prohibitively expensive, and affordability was paramount for DUG.

In order to build a resilient and adaptive storage environment that enabled expansion into new markets, DUG required a new approach to storage.

Immersion-cooled servers at a DUG data center.

Solution: VAST Data Universal Storage
DUG chose VAST Data Universal Storage to expand its business and support the needs of a wide diversity of new markets and customers. The Universal Storage offering combines the speed and scale of a parallel file system with a new level of flash affordability and multitenancy to deliver a complete technological leap forward for DUG. VAST Data’s disaggregated shared everything (DASE) architecture also provides consistent performance by isolating non-optimized I/O so as not to impact other tenants. With the DASE approach, VAST Data eliminates the concurrency challenges of parallel storage to deliver high performance for specific workloads that does not come at the expense of other workloads.

Beyond significantly improving the customer performance experience, VAST Data provides a combination of reliability, management, and support that is not otherwise found with legacy HPC storage technologies. VAST Data’s DASE architecture supplies exascale scalability, enabling DUG to grow to tens of petabytes of flash storage with no single point of failure in an architecture that can quickly recover from failure. The reliability of the DASE architecture comes “for free”: it is a direct result of VAST Data’s data-protection efficiency and the architecture’s statelessness. Beyond resilience, VAST Data Universal Storage also simplifies the deployment and management experience for DUG by providing an integrated scale-out appliance that consistently pushes out new features that are automatically applied while the system is online, so there’s no downtime for DUG.

Overview of VAST Data Universal Storage with Intel Storage Technologies
VAST Data Universal Storage provides a single, global namespace so that each application has access to all of the associated data for that workload. The VAST Data solution combines all-flash drive performance, massive scalability, the economics of archive storage, and the simplicity of plug-and-play network-attached storage (NAS) connectivity.

Intel® SSDs provide the hardware basis for the cost-efficiency and reliability of VAST Data Universal Storage. Intel’s pairing of vertical floating-gate technology and complementary metal-oxide-semiconductor (CMOS) under-array architecture delivers the highest areal density (gigabytes of storage per square millimeter) in the industry for the same bits per cell.1 This means that Intel® QLC 3D NAND SSDs provide not only greater areal density than previous-generation triple-level cell (TLC) media, but greater areal density and higher reliability than competing quad-level cell (QLC) designs based on charge-trap technology.1 The architectural innovations from Intel enable the VAST Data solution to economically store all data on flash drives. The cost effectiveness and high reliability of Intel QLC 3D NAND SSDs provide the foundation for VAST Data’s architecture to reduce costs by up to 85 percent compared to HDDs, providing a dollar-per-gigabyte (GB) cost similar to that of HDD-based systems over 10 years.2,3

Intel® Optane™ SSDs further accelerate write performance for workloads running on VAST Data Universal Storage. Crucially, Intel Optane SSDs buffer writes to storage, which enables full QLC erase-block writes. The low latency, high endurance, and high 4K random-write performance of Intel Optane SSDs help ensure that long-term and short-term data are not co-located in large QLC blocks. Intel Optane SSDs shield Intel QLC 3D NAND SSDs from inefficient write behavior, which is one reason VAST Data can offer a 10-year SSD endurance guarantee while also delivering the economic benefit of cost-effective QLC NAND.2,3
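
The write-shaping idea described above can be illustrated with a minimal Python sketch. It is not VAST Data’s implementation: the class name WriteBuffer, the "short" and "long" lifetime labels, and the 16-byte block size are all hypothetical, chosen only to show small writes being staged in a fast tier and released as full blocks that never mix short-lived and long-lived data.

```python
from collections import defaultdict

# Illustrative size only; real QLC erase blocks are far larger (assumption).
ERASE_BLOCK_BYTES = 16


class WriteBuffer:
    """Hypothetical stand-in for a low-latency staging tier in front of QLC flash."""

    def __init__(self, block_bytes=ERASE_BLOCK_BYTES):
        self.block_bytes = block_bytes
        # Pending bytes, kept separately per expected data lifetime.
        self.staged = defaultdict(bytearray)
        # (lifetime, block) pairs that would be written to QLC as full blocks.
        self.flushed = []

    def write(self, data: bytes, lifetime: str) -> None:
        """Absorb a small write; emit a block only once a full one has accumulated."""
        buf = self.staged[lifetime]
        buf.extend(data)
        while len(buf) >= self.block_bytes:
            block = bytes(buf[:self.block_bytes])
            del buf[:self.block_bytes]
            self.flushed.append((lifetime, block))


buffer = WriteBuffer()
for _ in range(5):
    buffer.write(b"tmp!", "short")  # small, short-lived writes
    buffer.write(b"keep", "long")   # small, long-lived writes
```

In this sketch, every block handed to the capacity tier is exactly one block in size and holds data from a single lifetime class; the leftover bytes stay in the staging tier until enough arrive to fill another block.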

Logical diagram of the VAST Data Universal Storage solution.

Storage capacity, cost, and capability are only part of the VAST Data Universal Storage story, however. The VAST Data solution is also quite sophisticated in the implementation of new algorithms that pioneer all-new levels of data-reduction and data-protection efficiency.4 VAST Data Universal Storage brings all of these architectural aspects together with 2nd Gen Intel® Xeon® Scalable processors to implement a new class of global algorithms in a DASE cluster.4 These processors provide the computation power underlying VAST Data Universal Storage and vital acceleration libraries. The storage performance development kit (SPDK) serves as an accelerant for VAST Data Universal Storage to deliver low-latency access from every CPU to every QLC and Intel Optane SSD. The SPDK thereby eliminates the need for complex and volatile cache-coherency operations that can otherwise inhibit scale in legacy shared-nothing storage architectures.

VAST Data Universal Storage interconnects CPUs with NVM Express (NVMe) devices using the NVMe over fabrics (NVMe-oF) protocol to provide distributed scale with the performance and latency of direct-attached storage (DAS).5 NVMe-oF runs over standard Ethernet or InfiniBand networks to enable the disaggregation of resources and a shared-everything architecture over commodity data center fabrics. The VAST Data connection exposes the system via ubiquitous protocols such as network file system (NFS), server message block (SMB), and an Amazon S3–compatible API, so that applications that consume universal storage do not require specialized adapters, formats, or protocols.

VAST Data Changed How DUG Handles Data
DUG has been fully in production with VAST Data since December 2019 at DUG’s data centers in Houston, Texas, and Perth, Australia, with plans for further expansion. In fact, DUG plans to double its compute capabilities in Houston and more than double those capabilities in Perth during 2020 and 2021. Fortunately, the VAST Data solution becomes more reliable, not less so, as it grows.

DUG’s data-storage needs have always been large. Seismic processing projects arrive at DUG with more than 1 PB of data, and that data expands 6–8x in the course of processing. During a single seismic-processing project, DUG will copy and write that data up to 50 times, and DUG typically has more than 100 projects running at any given time. VAST Data Universal Storage is well suited to this type of data growth, and it helps DUG ensure that competing applications all experience performance fairness on a shared HPC computing resource.
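
To make the scale of a single project concrete, the figures above can be turned into a rough back-of-the-envelope calculation. The Python sketch below is illustrative only: the function name project_storage_estimate is hypothetical, and it assumes that the "up to 50 times" figure applies to the original input volume, which the figures above do not spell out.

```python
def project_storage_estimate(input_pb, expansion=(6, 8), rewrite_passes=50):
    """Rough per-project estimate from the figures quoted above.

    input_pb       -- size of the data as it arrives, in petabytes
    expansion      -- (low, high) expansion factors during processing
    rewrite_passes -- assumed upper bound on how often the input is copied/written
    """
    low, high = expansion
    return {
        # Range of peak on-disk footprint during processing.
        "peak_footprint_pb": (input_pb * low, input_pb * high),
        # Upper bound on cumulative write traffic, under the stated assumption.
        "max_cumulative_writes_pb": input_pb * rewrite_passes,
    }


estimate = project_storage_estimate(1.0)
```

Under these assumptions, a project arriving with 1 PB of data would peak at roughly 6 PB to 8 PB on disk and could generate up to about 50 PB of cumulative write traffic over its life.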

VAST Data’s data reduction is another attraction because it lowers DUG’s storage costs. Even with seismic data, which is notoriously difficult to reduce, VAST Data’s data-reduction capabilities can save significant amounts of money. DUG sees even greater savings on its other workloads, thanks to VAST Data’s new similarity-based approach to global data compression.
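
The general idea behind similarity-based reduction, compressing a block relative to a similar block that is already stored, can be sketched in a few lines of Python. This is not VAST Data’s algorithm: it uses zlib’s preset-dictionary feature as a simple stand-in, the function names are hypothetical, and it skips the hard part of finding similar blocks by assuming the reference block is already known.

```python
import random
import zlib


def compress_standalone(block: bytes) -> bytes:
    """Compress a block on its own, with no knowledge of other stored data."""
    return zlib.compress(block, 9)


def compress_against(block: bytes, reference: bytes) -> bytes:
    """Compress a block using a similar, already-stored block as a preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=reference)
    return compressor.compress(block) + compressor.flush()


def decompress_against(payload: bytes, reference: bytes) -> bytes:
    """Recover the original block; the same reference block must be supplied."""
    decompressor = zlib.decompressobj(zdict=reference)
    return decompressor.decompress(payload) + decompressor.flush()


# A 4 KB block of pseudo-random bytes: essentially incompressible on its own.
rng = random.Random(0)
reference_block = bytes(rng.getrandbits(8) for _ in range(4096))

# A second block that differs from the first by only a few bytes.
edited = bytearray(reference_block)
edited[100:104] = b"EDIT"
similar_block = bytes(edited)

standalone = compress_standalone(similar_block)
relative = compress_against(similar_block, reference_block)
```

Here the second block barely shrinks when compressed alone, but becomes much smaller when encoded relative to its near-duplicate. Data that looks incompressible in isolation can still be reduced when similar data exists elsewhere in the same namespace.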

Another advantage for DUG is that VAST Data remotely manages the storage for DUG 24/7. This is the first time that DUG has benefited from having a vendor provide remote appliance management for its storage. DUG experiences zero downtime for updates, and its IT admins can feel confident knowing that VAST Data is closely monitoring the performance and availability of their environment. Because of this, DUG can expand storage capacity without having to grow its storage team.

Storage as a Strategic Asset
DUG’s successful move into new markets was made possible by VAST Data Universal Storage, powered by Intel technologies. The VAST Data storage solution provided DUG with the capacity, performance, and reliability to get rid of HDDs, move beyond complex HPC file-storage technology, and provide a leadership-class experience for customers within and beyond the oil and gas industry. An all-silicon storage offering provides the consistency and diversity of high performance that makes it possible for DUG to efficiently build out its multitenant cloud environment for its next wave of growth. The storage, reliability, and ease of management afforded by VAST Data have turned storage into a strategic asset for DUG and have enabled it to better achieve its broader business goals.

About DUG Technology
With more than 17 years of experience and data centers in Perth, Houston, London, and Kuala Lumpur, DUG Technology is at the forefront of HPC. It combines innovative hardware and software solutions that enable clients to make use of large and complex datasets. DUG Technology’s industry experience and strong grounding in applied physics have equipped it to provide state-of-the-art HPCaaS delivered either direct-to-client or via its DUG McCloud platform.

Learn More
Read the VAST Data exascale NAS white paper.

Explore Related Products and Solutions

Intel® Xeon® Scalable Processors

Drive actionable insight, count on hardware-based security, and deploy dynamic service delivery with Intel® Xeon® Scalable processors.

Intel® Optane™ SSDs

Intel® Optane™ technology is the first major memory and storage breakthrough in 25 years.

Intel® SSDs

Intel® SSDs for the data center are optimized for performance, reliability, and endurance.


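Intel® technologies’ features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Actual performance may vary depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer for more information. // Software and workloads used in performance tests were optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating the products you are considering purchasing, including the performance of a product when combined with other products. For more complete information, visit www.intel.cn/benchmarks. // Performance results are based on testing as of the dates specified in the configurations and may not reflect all publicly available security updates. See the configuration disclosure for details. No product or component can be absolutely secure. // The cost-reduction scenarios described are intended only as examples of how a given Intel®-based product, in specific circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction. // Intel does not control or audit third-party benchmark data or the websites referenced in this document. You should visit the referenced websites and confirm whether the referenced data are accurate. // In some test cases, results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and are provided for informational purposes only. Any differences in your system hardware, software, or configuration may affect your actual performance.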


1. Comparing areal density, Intel measured data on a 512 GB Intel 3D NAND SSD and on representative competitors based on 2017 IEEE International Solid-State Circuits Conference papers citing Samsung Electronics and Western Digital/Toshiba die sizes for a 64-stacked 3D NAND component. Source: ISSCC 2018; H. Maejima; ISSCC 2019 C. Siau.
2. VAST Data. “Redefining Storage Economics.” https://vastdata.com/economics/.
3. VAST Data. “Zero Compromises Guarantee.” April 2019. https://vastdata.com/wp-content/uploads/2019/06/VAST_Data-Zero-Compromises-Guarantee.pdf.
4. VAST Data. “Universal Storage: Innovation to Break Decades of Tradeoffs.” February 2020. https://vastdata.com/wp-content/uploads/2019/04/VAST-Data-Overview.pdf.
5. VAST Data. “Architecture.” https://vastdata.com/architecture/.