Convergence Without Compromise

Hyperconverged Infrastructure (HCI) gets a lot of attention these days, and rightly so. With HCI we’ve seen a move toward an easy-to-use, pay-as-you-grow approach to the datacenter that was previously missing. I started my career with complex storage arrays that required you to purchase all your capacity up front. While expanding those arrays was possible, we were often buying all the storage we’d need for 3-5 years even though much of it wouldn’t be consumed for years.

While HCI certainly made things easier, it was far from perfect. Mixing storage and compute into a single server meant maintenance operations had to account for both available compute resources and available storage capacity to accommodate the storage taken offline. At times we would actually sacrifice our data protection scheme in order to take nodes offline, hoping there were no additional failures within the cluster at the same time. Not ideal when we’re talking about production storage.

Get Down with the DVX

Datrium and the DVX platform aim to address these problems in an interesting way. Datrium separates storage and compute nodes much like a traditional two-tier system, but utilizes SSDs inside each host to act as a read cache. By moving the cache into the host, performance increases with every host we add. This decoupling of cache from the storage layer means we’re not queuing up reads at a storage array that is trying to satisfy the requests of all connected hosts over the same switches. While this sounds very similar to technologies we’ve seen before (Infinio and PernixData come to mind), the differentiator is the storage awareness.

The Datrium DVX solution utilizes its own storage nodes for the persistent storage piece. With the caching and storage layers fully aware of each other, Datrium is able to offer end-to-end encryption from the hypervisor down to the persistent storage while still taking advantage of deduplication and compression. Often, encrypting data at the storage array level means giving up these data efficiencies, but not in the case of Datrium. We get an additional level of data security without having to make any compromises.

No Knobs, No Problems

HCI vendors have really pushed the configuration abilities within their systems. Customers can choose which data is deduplicated and compressed, whether or not it should be encrypted, how many copies of their data should be kept, and whether erasure coding is a better choice than traditional RAID, just to name a few. This is where Datrium separates itself from its HCI competitors. By disaggregating compute nodes from the persistent storage layer, Datrium’s DVX system manages to deliver performance and features without penalty. Once again, no compromises.

Erasure coding, dedupe and compression, double-device failure protection, data encryption: every one of these features is always on and doesn’t require any separate licensing or configuration. The advantage here isn’t just in administrative overhead, but also in performance. Datrium’s performance numbers are published with every one of these features enabled. No tricks. No gimmicks. What you see is what you get, unlike many competitors who hide behind unrealistic configurations with many of these features disabled.

3 Tiers, 1 Solution

Datrium aims to bring together a Tier 1 HCI-like solution, scale-out backup storage, and cloud-based DR all in the same system. With integrated snapshots that leverage VMware snapshots as well as VSS integration, Datrium can perform crash-consistent and application-consistent snapshots of virtual machines right on the box. This, of course, is table stakes when it comes to modern storage arrays. The differentiator is that Datrium is able to do this at the VM level despite presenting NFS to the virtual hosts. Now we’re not just backing up all the VMs that live in a LUN or volume; we’re able to get as granular as the virtual disk itself. No VVOLs required.

Adding another level of visibility into the mix, Datrium reports latency at the individual virtual machine level instead of at the storage array. Traditional storage array vendors talk about their ultra-low latency, but the reported latency is what the array sees; it doesn’t account for the latency imposed by the virtual hosts and switching infrastructure. With each component in the virtual infrastructure having its own queues, varying utilization, and available bandwidth, the latency a virtual machine experiences is much greater than what the array reports. Datrium offers this full visibility at the individual virtual machine level so you know how your environment is actually performing. Dr. Traylor from The Math Citadel has an excellent overview of queuing theory, Little’s Law, and the math behind it.
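As a rough back-of-the-envelope illustration of why those queues matter, Little’s Law (L = λW, so W = L / λ) lets you estimate the wait each hop adds from its queue depth and throughput. The sketch below is my own, with made-up queue depths and IOPS figures, not anything Datrium presented:

# Little's Law: L = lambda * W, so W = L / lambda.
# Queue depths and IOPS below are hypothetical, for illustration only.

def avg_wait_ms(outstanding_ios, iops):
    """Average time an I/O spends at a hop, in milliseconds."""
    return outstanding_ios / iops * 1000.0

# Every hop between the guest and the array keeps its own queue.
hops = [
    ("guest vSCSI adapter",  4, 20000),   # (name, avg outstanding I/Os, IOPS)
    ("host storage stack",  16, 20000),
    ("switch port buffers",  8, 20000),
    ("array front end",     32, 20000),
]

total_ms = 0.0
for name, qdepth, iops in hops:
    wait = avg_wait_ms(qdepth, iops)
    total_ms += wait
    print(f"{name}: ~{wait:.2f} ms")

print(f"Latency the VM actually sees: ~{total_ms:.2f} ms")
# An array-side counter would only ever report its own slice of that total.

Even with identical throughput at every hop, the end-to-end number the VM experiences is the sum of all those waits, which is exactly the view the array alone can’t give you.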

The cloud-based integrations also allow for an additional level of data availability. Instead of requiring separate backup software, Datrium allows replication of your data to a DVX running in the cloud. Now we have an offsite copy of your data ready to be restored in the event of VM corruption or deletion. Replication is also dedupe-aware, meaning data that is already present isn’t sent to the cloud again, which helps minimize bandwidth requirements and speeds up replication.
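For a sense of what “dedupe-aware” means in practice, here is a minimal, generic sketch of the idea, not Datrium’s actual implementation: fingerprint each block, check which fingerprints the target already holds, and ship only the missing blocks.

# Generic sketch of dedupe-aware replication, not Datrium's implementation:
# only blocks whose fingerprints are missing at the target get sent.
import hashlib

def fingerprint(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()

def replicate(blocks, target_fingerprints, send):
    """Ship only the blocks the replication target doesn't already hold."""
    sent = 0
    for block in blocks:
        fp = fingerprint(block)
        if fp not in target_fingerprints:
            send(block)                   # data actually crosses the wire
            target_fingerprints.add(fp)   # target now has this block
            sent += 1
    return sent

# Toy example: three blocks, one of which the cloud copy already holds.
blocks = [b"block-A", b"block-B", b"block-A"]
cloud = {fingerprint(b"block-A")}
print(replicate(blocks, cloud, send=lambda b: None))  # prints 1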

Cloudy Skies Ahead

While I am very reluctant to trust one solution with both my primary and backup data, in certain situations I can see the advantages. Integration with AWS that allows virtual machines to be restored from the cloud-based DVX means your DR site can now be in AWS. Datrium has lowered the barrier to the cloud for a lot of customers with the features they’ve included in the DVX platform.

Datrium continues to make a good product even better. The additional features in version 4.0 of DVX make it a great fit not only for SMB customers but for enterprises as well. A feature-rich, no-knobs approach to enterprise storage with backup and DR capabilities all rolled into one. Datrium is definitely worth a look.

________________________________________

Disclaimer: During Storage Field Day 15, my expenses (flight, hotel, transportation) were paid for by Gestalt IT. I am under no obligation by Gestalt IT or Datrium to write about any of the presented content nor am I compensated for such writing.


The Challenge of Scale

Working in the SMB space for the majority of my career meant rarely worrying about hitting scale limits in the hardware and software I was responsible for. A few years ago, the idea of managing a data footprint of 20-30TB was huge for me. I didn’t have the data storage requirements, I didn’t have the number of virtual machines, I didn’t face struggles of scale. As I moved into the enterprise, that scale went up massively; 20-30TB quickly became multiple petabytes. The struggles you face at the enterprise level are much different.

While listening to James Cowling from Dropbox present on their “Magic Pocket” storage system, I heard something that really put their scale into perspective: a 30-petabyte storage system was referred to as a “toy system.” As Dropbox explored moving users’ data out of Amazon and into their own datacenters, they needed a storage system that could meet their ever-increasing demands, and storage software capable of managing 30PB was far easier to come by than software capable of managing 500PB. In building a homegrown solution to hold all the file content for its users, Dropbox faced a challenge few others have had to face. With that much data hosted in AWS, there was no off-the-shelf product capable of managing this scale.

While the move from AWS to on-premises sounds simple, issues like scale are just the tip of the iceberg. Dropbox didn’t just need to write a massively scalable filesystem, work hand-in-hand with hardware vendors to find the right design, determine the best way to migrate data into their datacenters, ensure data integrity, and validate every aspect of the process; they also needed the time to do all of this right the first time. When your job is content storage and collaboration, “losing” data isn’t an option. Confidence in the solution, and management granting the autonomy to “reset the clock” if and when bugs were found, was the only way this move was going to be successful.

And what prompted the decision to move out of AWS’s S3 storage? Cost, to the tune of nearly $75 million saved in operating expenses over the two years since getting out of AWS. Storage is cheap and getting cheaper, but storage at scale is an expensive endeavor. While the cost savings are significant, the performance gain was significant as well: Dropbox saw a dramatic performance increase by bringing data into its own datacenters and onto its new storage system. This is just a reminder that the real cost of “cloud” is often much higher than companies expect.

Back to the issue of scale: storage wasn’t the only issue they faced. Now with over 1 exabyte of storage, growing at a rate of nearly 10PB per month, Dropbox also faced an issue of bandwidth. Dropbox sees around 2Tb of data moving in and out of its datacenters per second. PER SECOND. With that kind of demand, minimizing traffic and chatter inside the network is important as well. Events such as disk, switch, or power failures shouldn’t create additional rebuild traffic that impacts disk and network performance. The Dropbox datacenter monitoring solution is just as advanced as the storage system, capable of analyzing the impact of any such failure and triggering rebuilds and redistribution only when necessary. There is a balance of network versus disk cost when it comes to how and where to rebuild that data.

Designing a highly available, redundant, always-on infrastructure looks different depending on your scale. Application-level redundancy and storage-level redundancy, combined with a robust monitoring solution, are just a few of the techniques Dropbox has utilized to ensure application and data availability. The Dropbox approach may not be common, but it was necessary for long-term success. Sometimes the only way to reach your goals is to think outside the box.

________________________________________

Disclaimer: During Storage Field Day 15, my expenses (flight, hotel, transportation) were paid for by Gestalt IT. Dropbox provided each delegate with a small gift (sticker, notepad, coffee), but I am under no obligation to write about any of the presented content nor am I compensated for such writing.


Pure Storage – Enterprise Ready, Pure and Simple

Disruption! Disk is dead! Flash forever!

When Pure Storage first came into the public eye they were loud, they were bold, and everyone took notice. Whether or not you liked their marketing campaigns or their even louder marketing team, Pure marked the beginning of the flash revolution. They weren’t the first to do flash; they were just the first to convince us that all-flash was right for our datacenters. If IOPS and low latency mattered, Pure was the vendor you needed.

With a recent IPO and a new hardware platform (FlashArray//m), there has been a lot going on at Pure Storage. While the announcement of their latest product was over six months ago (June 2015), my expectations for Pure are always high. The Tech Field Day delegates at Storage Field Day 8 got a chance to hear what Pure Storage has been doing, where they’re focused, and hopefully what was still to come. This time around, however, I was left a little disappointed.

When a company touts disruption as loudly as Pure does, you expect big things. Believe me, it’s not that what they’ve built isn’t impressive. A brand-new 3U dual-controller array built to eliminate single points of failure, maximize performance, and display an orange logo as bright and loud as their messaging has always been is impressive. But that’s old news. The announcement feels like it was forever ago, and we’re curious where Pure goes from here. What’s next for Pure?

Sadly, we don’t know. Roadmap and futures were off-limits for this newly public company. The feeling I get is that Pure is focusing on refinement, whether in their products or just in their messaging. Customers want visibility into their arrays. They want non-disruptive upgrades. They want health monitoring and alerting. They want that enterprise feeling companies like NetApp and EMC provide. Pure Storage’s focus right now is the enterprise and everything that Pure1 provides.

Pure1 is SaaS-based management and support for your Pure Storage arrays. Gone are the days of setting up management servers in each of your datacenters to manage and collect metrics from all your arrays. Pure1 allows you to log in to a web interface and view statistics and alerts for all your arrays from a single portal. That’s one less server you’re forced to manage, update, and maintain just to hold on to your historical data. Pure Storage arrays phone home all the metrics that matter (every 30 seconds) so you have that single interface. With the collected data, new features can be delivered to all customers faster, without the need to update your software.

Pure1 is also leveraged to open tickets on your behalf; Pure Storage said that roughly 70% of customer tickets are opened proactively by Pure1. Pure has worked at eliminating the “noise” of these alerts as well, focusing on root-cause alerts rather than the flood of alerts that may come from a cable or drive being pulled. These proactive tickets include bug fixes for issues that may come up in your environment based on your current configuration. I can recall a few times in my career where I’ve run into a software bug on my storage array that the vendor was aware of but never told me I was susceptible to. This is the kind of information all customers want, but not all vendors provide.

Pure Storage has taken a “non-disruptive everything” approach to its software releases as well. While legacy vendors have left me reluctant to upgrade my arrays in the past, Pure Storage touted 91% customer adoption of its software within eight months of release. Pure’s customers trust its development enough to perform upgrades in the middle of the day while still serving production workloads. That is confidence, and it shows the capabilities of these arrays and Pure’s engineers. Non-disruptive, no matter the operation.

Is the marketing machine that was Pure Storage dead? My hope and feeling is no. Pure is finding out who they are as $PSTG. This is a period of refinement and maturity for the six-year-old company. We’ll have to wait and see what they do next, and I’m sure we’ll all take notice.

__________

Watch all the videos from Pure Storage at Storage Field Day 8 here.

Disclaimer: During Storage Field Day 8, my expenses (flight, hotel, etc) were paid for by Tech Field Day. I am under no obligation to write about any of the presented content nor am I compensated by any of the presenting companies for such writing.

Storage Field Day Here I Come!

Storage has been a component of my job for most of my IT career. It’s something I’ve enjoyed, but not something I’ve had the time to focus on. Coming from smaller organizations, I’ve been responsible for almost everything in the environment, which rarely gave me the opportunity to become an expert in any one technology.

A few years ago the company I worked for was going through a storage refresh, and I was tasked with evaluating our existing storage platform and determining our needs going forward. I spent time with nearly every major storage vendor there was, going into depth on every aspect I could in order to determine the “best choice.” In the end I gained an understanding of storage that I never had before, and it became a passion of mine.

All that being said, I am both honored and humbled to be selected as a delegate for Storage Field Day 8! The Tech Field Day events have been something I’ve watched over the last couple of years, and I have become a huge fan. These events give viewers a chance to learn about the latest technologies and ask the presenters questions. They’re about getting past the marketing and into the details, and they’re a great opportunity to educate yourself on the different products being presented.

Don’t miss all the presentations for Storage Field Day 8 on October 21-23. I am particularly interested in hearing more about what Coho Data and Cohesity are doing, but I’m looking forward to all the presentations.
