Recently, I had the pleasure of visiting Trade Me, the company responsible for generating the largest share of web traffic (and, most likely, of shipped parcels) in my country.
If you’re not from New Zealand, you might be asking, “Who is Trade Me?” Glad you asked. Founded in 1999, Trade Me is the largest Internet auction and online classifieds website operating in New Zealand for New Zealanders. In December 2011, Trade Me Group Ltd (“Trade Me”) was publicly listed on New Zealand’s securities exchange (where I work). Trade Me also operates many sister websites, including Find Someone, Travelbug, and Holiday Houses.
As of December 2017, Trade Me’s flagship website, trademe.co.nz, was the fifth most visited website in New Zealand, behind Google (both google.com and google.co.nz), YouTube and Facebook, according to Alexa Internet. Each month, Trade Me collectively generates an average of over 600TB of web traffic originating in New Zealand.
Trade Me has over five million active users, with an average of well over three quarters of a million people signing in each and every day. In a country with a humble population of around 4.7 million, you know that Trade Me’s small team must be busy.
Last month, Trade Me served over 334.6 million unique page views, averaging more than 80 unique image objects per page (based on user defaults when I logged in to my account). Meeting users’ expectations for responsiveness and ease of use takes an astounding amount of data. How do you manage the privacy, performance and efficiency constraints of storing and serving up billions of images when you are not a Fortune 500 tech company? Let’s find out.
Meet the Team
The Trade Me Engineering team is small but strong and focused: 10 talented, light-hearted individuals working in an upbeat, fast-paced office environment.
In the photo above, you’ll meet the Trade Me team, true data visionaries. From left to right we have:
- Evan Fraser (Performance Engineer)
- Chris Pearman (System Engineer)
- Joseph Wilkie (System Engineer)
- Sam Speight (System Engineer)
- Julius Dabre (Storage Engineer)
- Jason Keating (Engineering Lead)
Like many organisations, Trade Me has faced rapid growth and new ways of implementing technologies over the years. Systems and services become hard to manage and maintain, and what was once a perfectly good solution becomes a painful legacy. Why does this happen?
Growing complexity creates inefficiencies: the volume of ingested unstructured data increases while the speed of data retrieval decreases.
Trade Me had proactively identified that the infrastructure of the past was not sustainable and would soon not be able to keep up with the company’s data growth. It identified three significant pain points:
- Growing complexity of managing very large repositories of images
- Decreasing speed of effective retrieval across hundreds of millions of images
- Increasing volume of unstructured data over time, intensified by the substantial growth and success of the business and the ongoing requirement of data retention
Understanding that the traditional method of file-based retrieval of unstructured data was about to exceed its storage and performance limits, Trade Me started looking at ways to alleviate their growing pains.
Storage Engineer Julius Dabre first looked at restructuring the traditional method of file storage and increasing the storage available to accommodate the growing number of files.
While this was initially successful, it soon created retrieval performance issues. The caching servers had a hit rate of around 80%, and if a connection to them failed, requests would fall back to the underlying storage, which served images directly.
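The fallback behaviour described above can be sketched as a simple cache-aside pattern. This is an illustrative Python sketch, not Trade Me’s actual code; the cache and origin interfaces are hypothetical stand-ins:

```python
# Illustrative cache-aside fallback: try the caching tier first, and
# fall back to the underlying storage on a miss or a cache failure.
# All names here are hypothetical.

class ImageCache:
    """Toy in-memory cache standing in for a caching server tier."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)  # returns None on a miss

    def put(self, key, data):
        self._store[key] = data


def fetch_image(key, cache, origin):
    """Serve an image from cache when possible; fall back to origin storage."""
    try:
        cached = cache.get(key)
    except Exception:
        cached = None  # caching tier unreachable: degrade to origin
    if cached is not None:
        return cached
    data = origin[key]    # origin here is just a dict of image bytes
    cache.put(key, data)  # warm the cache for the next request
    return data
```

The key property is graceful degradation: a cache miss or an unreachable caching tier still results in the image being served, just more slowly, from the underlying storage.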
Trade Me had a critical decision to make: Move forward with the current solution and look for relief from additional caching servers or look for a future-proofed solution to seamlessly migrate and manage the explosion of images and associated data.
Julius first considered greater use of international third-party caching servers, but this would mean all images were served from outside New Zealand, and failure to reach the caching servers would ultimately render the site non-functional.
Using the public cloud and leveraging a hyperscaler like AWS or Microsoft Azure was very attractive. And it would have been all too simple, except for one catch: New Zealand has no public cloud hyperscaler. The closest hyperscale provider is located in Australia. If 80% of your customers were located in New Zealand, would you be willing to bet your business on a cable?
This led Julius to look at on-premises Object Storage for Trade Me’s existing private cloud. He invited multiple storage vendors in for solution discussions and evaluated each solution against Trade Me’s requirements, including its technology, support model and licensing.
According to Julius, “While each of the presented solutions had merit, NetApp StorageGRID stood out from the competition and enabled Trade Me to reutilize existing infrastructure, reducing costs with a future option of using a public cloud hyperscaler without tedious migration or application refactoring.”
It was the right call. For customers with very large sets of unstructured data, StorageGRID® Webscale enables successful business outcomes aligned with thoughtful strategy: reduced costs, greater agility and shorter time to deliver.
The initial implementation at Trade Me was on StorageGRID Webscale 10.3 and held a modest 5 million objects. This soon grew to hundreds of millions of objects as Trade Me seamlessly migrated more environments to StorageGRID Webscale buckets.
With the release of StorageGRID Webscale 11.0, Julius says that Trade Me will look to upgrade to take advantage of Cloud Mirror, which allows objects to be replicated in native format into a target S3 bucket in Trade Me’s private cloud or, just as easily, in the public cloud. That means Trade Me can apply cloud resources directly to the replicated data.
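Conceptually, this kind of mirroring copies each newly written object, byte for byte and under the same key, into a target bucket. Here is a toy Python sketch with plain dicts standing in for buckets; in practice the replication is a platform feature, not application code:

```python
# Toy illustration of native-format object mirroring: every object
# written to the source bucket is also copied, unchanged and under the
# same key, to a target bucket. Buckets are modeled as plain dicts.

def put_with_mirror(source_bucket, target_bucket, key, data):
    """Write an object and mirror it, byte for byte, to the target bucket."""
    source_bucket[key] = data
    target_bucket[key] = data  # same key, same bytes, no transformation
    return key
```

Because the mirrored copy keeps its native format, tools pointed at the target bucket (cloud analytics services, for example) can read the objects directly, with no conversion step.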
Trade Me is also benefiting from StorageGRID Webscale’s metadata tagging, rich sets of APIs, and SDK support and integration that empower automation and self-service. Head on over to NetApp.io for StorageGRID Webscale PowerShell examples.
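Because StorageGRID Webscale exposes an S3-compatible API, metadata tagging can happen at upload time as user-defined metadata on each object. A minimal sketch: the function below simply assembles the keyword arguments an S3 PutObject call would take. The bucket name, key and metadata values are hypothetical, and the commented boto3 call at the end assumes an S3-compatible endpoint:

```python
# Sketch of attaching user-defined metadata to an S3 object upload.
# All names and values here are hypothetical examples.

def build_put_request(bucket, key, body, metadata):
    """Assemble keyword arguments for an S3 PutObject call.

    S3 transmits user metadata as x-amz-meta-* headers; boto3 accepts
    it as a plain string-to-string dict under the 'Metadata' parameter.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "Metadata": metadata,
    }

params = build_put_request(
    bucket="listing-images",
    key="listings/12345/photo-1.jpg",
    body=b"...image bytes...",
    metadata={"listing-id": "12345", "source": "user-upload"},
)

# Against an S3-compatible endpoint this would be sent as, for example:
#   s3 = boto3.client("s3", endpoint_url="https://grid.example.internal")
#   s3.put_object(**params)
```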
Finally, Julius sums up StorageGRID Webscale as “fully featured, enterprise-hardened technology that is stable at scale.” He adds, “I have seen no issues with replication. Event notifications are detailed and precise. And performance, elasticity and functional real-world use are all exceeding expectations.”