Data Domain Boost over WAN is here!

Last week we made Avamar 7.1 Generally Available to the public. For more information see here. One of the features we introduced was support for Data Domain Boost over WAN. Naturally, I wanted to try it out by backing up my new MacBook (that I am still learning) from my EMC work office to my home lab over an SSL VPN tunnel.

The first thing I did was upgrade the lab to Avamar 7.1 and seed the first backup to the Data Domain over the LAN. The total size of the seed backup was 195 GB. I stumbled a little getting the first seed backup, as I didn’t realise that the backup stops when the MacBook goes to sleep. Once I figured out how to use my MacBook 🙂 I had my seed.

I kicked off the backup from work and after a few cycles it averaged 53 minutes. What’s really cool is it only sent 123MB as illustrated by the Activity Monitor.

In throughput terms that’s 216 GB/hour. Not too bad for a full recovery point.
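To put the 123 MB in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the numbers quoted above (195 GB protected, 123 MB sent, 53 minutes) and assumes decimal units (1 GB = 1000 MB). The function names are my own, purely for illustration.

```python
def reduction_ratio(protected_gb: float, sent_mb: float) -> float:
    """Protected data divided by data actually sent (assumes 1 GB = 1000 MB)."""
    return protected_gb * 1000 / sent_mb


def average_wire_rate_mb_s(sent_mb: float, minutes: float) -> float:
    """Average rate of data sent over the link, in MB per second."""
    return sent_mb / (minutes * 60)


if __name__ == "__main__":
    ratio = reduction_ratio(195, 123)
    rate = average_wire_rate_mb_s(123, 53)
    print(f"Roughly {ratio:.0f}:1 protected-to-sent")
    print(f"About {rate * 8:.2f} Mbit/s averaged over the backup window")
```

In other words, the link itself was barely loaded; almost all of the elapsed time went elsewhere, not on moving data.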


Naturally, my colleagues had a bunch of questions for me.

What is the latency to your home lab?

At first glance I thought it was reasonable. However, when I started looking into it I realised it was quite poor, with a moderate amount of packet loss. With 64-byte packets the round trip time averaged 163 ms with 2% packet loss. That’s not healthy. Why so bad? It turns out that even though the distance between the EMC office and my home is only ~25 km, the IP traffic is routed via the US (Massachusetts specifically) and back. This is far from optimal, but it shows that even under poor conditions DD Boost over WAN is rock solid.

So this had me thinking. How do I perform a WAN test and keep the test in country? How about over my mobile’s 4G network? I ran the same backup again, this time tethered via USB cable to my mobile phone network.

Network quality was better at 109ms with no packet loss. This time around the backup took 36 minutes (325 GB/hour).

The next question was how does DD Boost over WAN compare to the Avamar client-side protocol which writes to Avamar Data Store nodes?

After reconfiguring the Avamar client and seeding the first backup to Avamar storage, I ran the subsequent backup over the 4G network to the Avamar storage. This time around the backup took 17 minutes (688 GB/hour), roughly halving the time relative to DD Boost over WAN.
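For anyone wanting to reproduce the GB/hour figures for the two 4G runs, they are simply the 195 GB seed size divided by the elapsed time. A minimal Python sketch (the function name is mine, just for illustration):

```python
def effective_throughput_gb_per_hour(protected_gb: float, minutes: float) -> float:
    """Protected data size divided by elapsed time, expressed per hour."""
    return protected_gb / minutes * 60


if __name__ == "__main__":
    runs = (("DD Boost over 4G", 36), ("Avamar storage over 4G", 17))
    for label, minutes in runs:
        rate = effective_throughput_gb_per_hour(195, minutes)
        print(f"{label}: {rate:.0f} GB/hour")
```

Keep in mind this is an effective rate for the whole recovery point, not the amount of data that actually crossed the network.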

I expected this to be the case. One of the differences between the Avamar algorithm and DD Boost over WAN is the degree of client-side caching.

The Avamar algorithm maintains two levels of caching on the client: the file metadata hash cache and the file data hash cache. The file metadata hash cache is used to avoid exchanges with the Avamar storage when the file being backed up has not changed. If the file has changed, the file blocks get hashed and these hashes are compared to the local data hash cache. If the local data hash cache returns a hit, we avoid an exchange with the Avamar storage. If we get a data hash cache miss, we must ask the Avamar storage whether the hash is present at the other end.
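The decision flow above can be sketched as a toy model in Python. To be clear, this is not Avamar's implementation: the class, the use of SHA-1, the in-memory sets and the one-round-trip-per-lookup counting are all assumptions I have made to illustrate the logic.

```python
import hashlib


class TwoLevelCacheClient:
    """Toy model of a backup client with two client-side hash caches."""

    def __init__(self):
        self.server_hashes = set()    # stand-in for the remote storage
        self.file_meta_cache = set()  # hashes of file metadata already backed up
        self.data_hash_cache = set()  # data hashes known to be on the server
        self.round_trips = 0          # exchanges with the server so far

    def backup_file(self, path, mtime, size, chunks):
        meta = f"{path}|{mtime}|{size}".encode()
        meta_hash = hashlib.sha1(meta).hexdigest()
        if meta_hash in self.file_meta_cache:
            return  # file unchanged: no exchange with the server at all

        for chunk in chunks:
            data_hash = hashlib.sha1(chunk).hexdigest()
            if data_hash in self.data_hash_cache:
                continue  # local hit: server is known to have this data
            self.round_trips += 1  # cache miss: ask the server
            if data_hash not in self.server_hashes:
                self.server_hashes.add(data_hash)  # server lacks it: send chunk
            self.data_hash_cache.add(data_hash)

        self.file_meta_cache.add(meta_hash)


if __name__ == "__main__":
    client = TwoLevelCacheClient()
    client.backup_file("notes.txt", 1, 2, [b"x", b"y"])  # first backup
    client.backup_file("notes.txt", 1, 2, [b"x", b"y"])  # unchanged file
    client.backup_file("notes.txt", 2, 2, [b"x", b"z"])  # one chunk changed
    print(client.round_trips)
```

In this model the unchanged file costs nothing, and the modified file only costs an exchange for the chunk the client has never seen before.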

In the case of DD Boost over WAN with Avamar software, we only have one level of client-side caching at our disposal: the file metadata hash cache. In the event a file’s contents have changed between backups, DD Boost over WAN relies on more exchanges with the Data Domain appliance to determine whether a file data hash is present at the other end. As such, where the round trip time is elevated, we should expect backups to take longer in the DD Boost over WAN use case compared to the Avamar algorithm.
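To get a feel for why round trip time matters here, consider a deliberately crude model in Python where every hash lookup that cannot be answered locally waits for one round trip. The lookup count below is hypothetical (I did not measure it), and real protocols will likely batch or pipeline requests, which the assumed `batch_size` parameter tries to reflect.

```python
import math


def lookup_overhead_seconds(num_lookups: int, rtt_ms: float, batch_size: int = 1) -> float:
    """Time spent waiting on the network if lookups are sent in serial batches."""
    round_trips = math.ceil(num_lookups / batch_size)
    return round_trips * rtt_ms / 1000


if __name__ == "__main__":
    lookups = 1000  # hypothetical number of remote hash lookups
    for rtt_ms in (1, 109, 163):  # LAN-like guess, 4G, SSL VPN
        wait = lookup_overhead_seconds(lookups, rtt_ms)
        print(f"RTT {rtt_ms} ms: {wait:.1f} s waiting on lookups")
```

The point is only the shape of the relationship: the waiting time scales with both the number of remote lookups and the round trip time, so a cache that removes lookups pays off most on a slow link.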

I also compared DD Boost over LAN backups to the Avamar algorithm. As expected, they both took the same time (11 minutes), as the benefits of multiple levels of caching only materialise when the round trip time is elevated. The time taken is essentially how long it takes to traverse the file system and hash the file metadata on my MacBook. Had the MacBook not been a bottleneck, I would expect DD Boost to be more efficient on the client in LAN use cases, because the client performs less work for the same outcome.

It’s always wise to know the differences and plan accordingly. Of course, your conditions and results will vary. Hope this helps. Peter..