The days of small data are over. It wasn’t long ago that even the largest enterprise could store all its data on-premises. But today, there are even some small businesses that would find that prospect laughable.

That’s because we’re in the middle of something that’s not so much a data explosion as a Data Big Bang. Data is no longer created and used only by humans or computers. Nowadays, data is generated by your watch, your toaster, your house, and your car, not to mention by businesses to power their products and services.

As a result, the way we think about data has changed from something to be stored to something to be leveraged. The first generation of object storage focused on cost-effectively storing cold data for archive or backup. Now, however, organizations demand that their data storage solution ingest and securely store massive amounts of data while keeping that data accessible to power Tier 1 applications whenever required.

Data comes to the driveway

Almost every industry is being both revolutionized and challenged by the need to effectively store and access all this new data. One of the most interesting examples is perhaps the rise of the autonomous vehicle.

For more than 100 years, the amount of data a car generated was exactly zero. While increasingly filled with cutting-edge engineering and technology, cars have been the ultimate in closed, islanded systems.

But in just the past few years, cars have become some of the largest producers and users of data on the planet, or off it. According to NASA, the Hubble Space Telescope generates 844 GB of data per month, or about 10.1 TB per year. Meanwhile, a single test autonomous vehicle can generate up to 70 TB of data per day thanks to its lidar, radar, video cameras, and hundreds of sensors.

Let that sink in. Soon it will take more data to go on a Doritos snack run than it takes to unlock the mysteries of the universe.
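
For readers who want to check the comparison, here is a minimal Python sketch of the arithmetic. It uses only the two figures quoted above; the decimal unit conversion (1 TB = 1000 GB) and the variable names are assumptions made for illustration.

```python
# Figures quoted in the article above.
HUBBLE_GB_PER_MONTH = 844   # attributed to NASA in the text
VEHICLE_TB_PER_DAY = 70     # upper bound for a single test vehicle

# Assumption: decimal storage units, so 1 TB = 1000 GB.
GB_PER_TB = 1000

# Hubble's yearly output, converted from GB per month to TB per year.
hubble_tb_per_year = HUBBLE_GB_PER_MONTH * 12 / GB_PER_TB

# How many Hubble-years of data one vehicle produces in a single day.
hubble_years_per_vehicle_day = VEHICLE_TB_PER_DAY / hubble_tb_per_year

# How many hours of driving it takes to match one Hubble-year.
hours_to_match_hubble_year = hubble_tb_per_year / VEHICLE_TB_PER_DAY * 24

print(f"Hubble: {hubble_tb_per_year:.3f} TB per year")
print(f"One vehicle-day equals {hubble_years_per_vehicle_day:.1f} Hubble-years")
print(f"Hours to match a Hubble-year: {hours_to_match_hubble_year:.1f}")
```

Under these assumptions, one test vehicle at its peak rate produces a year of Hubble data in roughly three and a half hours.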

Putting data storage into overdrive

The challenges that autonomous vehicles will present for data storage include:

  • Scale: One test vehicle currently produces up to 70 TB of data per day, and there are approximately 1 billion cars in the world. Convert all those vehicles to autonomous vehicles and that’s 70 zettabytes a day, or 25.55 yottabytes a year. A storage solution that can’t massively scale over the long term simply isn’t a solution.
  • Policy-based management of data: All that data will be used constantly by multiple vehicle systems to check for upcoming obstacles, analyze traffic patterns, conserve energy, and parallel park. It will also be used by the car manufacturer to improve systems, by mapping apps to improve directions, and by traffic management systems to improve traffic flow. Tracking where this data is kept and how it is used will be vital. Enabling the manufacturer to keep the most important data most readily accessible for analysis is a matter of safety (and risk reduction).
  • Reliability: With self-driving cars, lives are on the line. A vehicle needs to have reliable access to the right data at the right moment so it knows what to do at all times. Being able to analyze and update control data and algorithms for vehicles in the field based on aggregated data lakes of sensor data will accelerate innovation and improve safety.
  • Tiered data: While the cloud will play an important role, the fact is that cloud storage is very economical as cold storage, for data you don’t need to access. An auto manufacturer might not need access to most of its recorded data right up until the microsecond it does. The ability to turn cold data hot and back to cold again will be key.
  • Compliance: From safety to liability to privacy to law enforcement, autonomous vehicles are going to represent an enormous compliance and cybersecurity issue that will be closely scrutinized by regulators, consumers, marketers, and businesses, not to mention hackers, cyberterrorists, and state-sponsored attackers.
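
The scale figures in the first bullet follow directly from the per-vehicle number. The sketch below reproduces them, assuming decimal SI units (1 TB = 10^12 bytes, 1 ZB = 10^21 bytes, 1 YB = 10^24 bytes), a 365-day year, and every car generating data at the test vehicle's peak rate. The function name is hypothetical, chosen for illustration.

```python
# Decimal SI units, in bytes (an assumption; the article does not specify).
TB = 10**12
ZB = 10**21
YB = 10**24

# Figures quoted in the article.
VEHICLE_TB_PER_DAY = 70          # upper bound for one test vehicle
CARS_WORLDWIDE = 1_000_000_000   # approximate global car count


def fleet_bytes_per_day(cars: int = CARS_WORLDWIDE,
                        tb_per_car_per_day: int = VEHICLE_TB_PER_DAY) -> int:
    """Total bytes per day if every car generated data at the given rate."""
    return cars * tb_per_car_per_day * TB


daily_bytes = fleet_bytes_per_day()
daily_zb = daily_bytes / ZB          # zettabytes per day
yearly_yb = daily_bytes * 365 / YB   # yottabytes per 365-day year

print(f"Fleet output: {daily_zb:.0f} ZB per day, {yearly_yb:.2f} YB per year")
```

Passing a smaller fleet size or per-car rate to fleet_bytes_per_day shows how sensitive the total is to those two inputs; even at 1 percent of the quoted per-car rate, the fleet would still generate 0.7 ZB a day.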

And this is just one use case. We are seeing a similar challenge across industries like media, financial services, life sciences, healthcare, and retail. From digital animation to securities risk analysis to gene sequencing to IoT-enabled manufacturing and warehouses, it doesn’t matter what business you’re in: you’re in the data business or you will become a dinosaur.

NetApp® StorageGRID® was recently named by IDC as a leader in its MarketScape for object-based storage thanks to our technology’s ability to power use cases like autonomous vehicles, artificial intelligence, IoT, and more. Together with NetApp FabricPool technology, it can help you solve even your toughest unstructured data management challenges and fuel the future of your organization. Learn more about StorageGRID on netapp.com.

Duncan Moore

Duncan has spent the last 19 years at NetApp building solutions to customer problems in backup and recovery, disaster recovery, object storage, and storage security. He joined NetApp in 1999 to work on SnapMirror, NetApp's first (and still industry-leading) replication product, and has since been involved with many other industry firsts, including unified primary/secondary storage deduplication, near-line storage appliances, and integrations with leading data protection software vendors.

Duncan’s current focus is in the area of Object Storage, where he leads teams building NetApp’s StorageGRID enterprise object storage system.

Duncan holds a BS in Computer Science from California State University at San Jose as well as an MBA from the University of Kansas. He is based at the NetApp Technology Center in Research Triangle Park, North Carolina.