Clouds of Glass: The Future of Cloud Data Storage


Special thanks to Dr Paulo Costa, Principal Researcher in the Systems and Networking Group of the Microsoft Research Lab in Cambridge and Honorary Lecturer with the Department of Computing of Imperial College London, who very kindly helped Jen Stirrup and Elizabeth Hannah with this post.

Project Silica: Clouds of Glass

Our technology and data are rapidly moving towards being entirely cloud-based, and Microsoft is at the heart of this development, constantly working to keep pace with it. Growth in compute, networking and storage is slowing, but the volume of data that we need to store keeps growing. Many of the core technologies that data centres use were designed before the cloud was invented, and the design choices made then do not account for how rapidly and dramatically data infrastructure has evolved. There is scope to create cloud-specific technologies that no longer have to work around antiquated design and outdated features.

Optics for the Cloud is a research programme that aims to advance and enable the adoption of optical technologies in cloud computing. There are opportunities to change existing, or create new, storage, network, compute, memory, and specialized accelerator technologies. The research also covers the full vertical stack, from optical device fabrication, to optical system and sub-system design, to co-design and integration with the rest of the data centre infrastructure. There is also scope to create entirely new applications that are enabled by optical technologies.

Microsoft Research Cambridge are currently building a team to explore Optics for the Cloud. The research team will take a holistic view of the needs of the cloud, and ask the question – how can Optics for the Cloud change the cloud? The goal is to underpin the next generation of the cloud by inventing optical end-to-end systems across storage, network and compute.

Storage

In-cloud data storage is deployed in tiers based on the workload. There are usually at least three or four tiers, and the most important workload characteristics that affect the number of tiers are:

  • Frequency of data access
  • Permissible latency for providing data after the data has been requested
  • The size of the individual data units

Tiers are given names which draw parallels with temperature. For example, a Hot tier will have a high access frequency and sub-millisecond latency; a Medium tier will have a medium access frequency and a latency of tens of milliseconds; and a Cold tier will have a low access frequency and a latency of anywhere between minutes and hours.
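To make the temperature analogy concrete, here is a minimal Python sketch that picks a tier from a workload's permissible latency. The Tier class, the pick_tier function and the exact thresholds are illustrative assumptions loosely based on the ranges above; they are not taken from any real cloud service, and a real placement policy would also weigh access frequency and data unit size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    worst_latency_s: float  # assumed worst-case time to serve a request, in seconds


# Hypothetical thresholds, ordered from hottest to coldest.
TIERS = [
    Tier("Hot", 1e-3),            # sub-millisecond
    Tier("Medium", 0.1),          # tens of milliseconds
    Tier("Cold", 12 * 3600.0),    # minutes to hours
]


def pick_tier(permissible_latency_s: float) -> str:
    """Return the coldest tier that still answers within the permissible latency.

    Falls back to the Hot tier when no tier is fast enough.
    """
    chosen = TIERS[0].name
    for tier in TIERS:
        if tier.worst_latency_s <= permissible_latency_s:
            chosen = tier.name
    return chosen
```

For example, under these assumed thresholds a workload that can wait a full day for its data lands in the Cold tier, while one that needs an answer within half a millisecond stays in the Hot tier.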

Because cloud computing is expanding so rapidly, there is more data than ever before, and with it a greater need for storage. The density of existing storage media is now approaching its fundamental physical limits. In the Cold tier, the limited storage lifetime of tape (around 3-5 years) is also becoming a problem, as data needs to be regularly transferred from old media to new.
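The cost of a short media lifetime is easy to underestimate, so here is a rough back-of-the-envelope sketch in Python. The migrations_needed function and the 100-year retention period are hypothetical choices made for illustration; only the 3-5 year lifetime comes from the text above.

```python
import math


def migrations_needed(retention_years: float, media_lifetime_years: float) -> int:
    """Count how often data must be copied to fresh media during the retention period.

    The initial write is not counted; every later medium is one migration.
    """
    if retention_years <= 0 or media_lifetime_years <= 0:
        raise ValueError("both durations must be positive")
    media_used = math.ceil(retention_years / media_lifetime_years)
    return max(0, media_used - 1)


# Hypothetical archive that must be kept for 100 years.
best_case = migrations_needed(100, 5)   # media replaced every 5 years
worst_case = migrations_needed(100, 3)  # media replaced every 3 years
```

Under these assumptions the archive would be rewritten 19 times in the best case and 33 times in the worst case, which is the overhead that a much longer-lived medium would avoid.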

To keep up with the growing demand for archival storage, new approaches need to be investigated. These approaches should provide greatly extended data lifetimes and reduce the cost per byte. In Standard and Cold tiers, HDD-based systems are sometimes limited by the number of operations that can be performed per second, so it is also useful to look at approaches to storage that could improve this.

Network

Network requirements are expected to increase greatly in the next few years, driven by emerging workloads such as large-scale machine learning.

Optics for the Cloud aims to re-invent the network that will underpin the cloud by 2025. Optical technology should be able to meet these demands. Optics are already used to interconnect racks and data centres across the globe, but Optics for the Cloud aims to improve this technology further by innovating across the whole stack, from new network architectures to new transceivers and optical fibers, to deliver better performance at a lower cost. Use will also be extended to new areas, such as optical switches and board interconnects, to take advantage of their low and predictable latency, very high bandwidth, and low cost.

This will require a cross-disciplinary approach, drawing on people from fields such as physics, software, hardware and networking. If this is achieved, it could revolutionize cloud infrastructure: performance could become faster, more reliable and more uniform.

Compute

The rapid expansion of data centres is creating new research opportunities. Techniques that were previously underdeveloped, because they are only relevant at large scales, have become more realistic and, in some cases, more attractive options.

Optics for the Cloud is looking to encourage cross-disciplinary research between computer scientists and optics researchers. There are exciting opportunities in this field to explore the relationships between computing and physics.

Project Silica

It’s expected that the data stored in the cloud will be measured in zettabytes by next year. This incredible scale requires a complete re-think of how large-scale storage systems are used, and of how the technologies underpinning these systems are developed.

Project Silica is developing storage technology that is designed and built for the cloud. It is the first ever technology created for this purpose. The developers behind the project will leverage recent discoveries in ultrafast laser optics to store data in quartz glass using femtosecond lasers. This will enable an entirely new type of storage system which has never been explored before, and will encourage a complete rethink of traditional storage systems and storage system design.

Data written into glass with femtosecond lasers has the potential to last for up to a million years. This means that there would no longer be any need to regularly move data to new media in order to preserve it. The technology could therefore be extremely useful for archivists and museums, as it can hold information for centuries, or even thousands of years if needed.

The data is stored in the glass using five dimensions. These include the traditional three spatial dimensions (length, width and height), plus two further optical ones: the orientation of the written structure's axis, and the strength of its birefringence, an optical property which describes how the material's refractive index depends on the polarization of light. The use of five dimensions means that each ‘spot’ on the glass can store three different ‘bits’ of information. In 2013, researchers at the University of Southampton tested glass which they found could store up to 360 terabytes, but there have been developments since, and there is potential for greater storage.
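One way to see how a single spot could carry three bits is to count states: three bits need eight distinguishable states per spot. The Python sketch below assumes, purely for illustration, that these come from four axis orientations combined with two birefringence strengths. The specific angles, the level names and the encode_spot and decode_spot helpers are hypothetical and do not describe the actual encoding used by the researchers.

```python
from typing import Tuple

# Assumed alphabet: 4 orientations x 2 birefringence strengths = 8 states = 3 bits.
ORIENTATIONS_DEG = (0, 45, 90, 135)
STRENGTH_LEVELS = ("low", "high")


def encode_spot(bits: int) -> Tuple[int, str]:
    """Map a 3-bit value (0..7) to an (orientation, strength) pair for one spot."""
    if not 0 <= bits < 8:
        raise ValueError("a spot holds exactly three bits (0..7)")
    orientation = ORIENTATIONS_DEG[bits & 0b11]  # low two bits select the angle
    strength = STRENGTH_LEVELS[bits >> 2]        # high bit selects the strength
    return orientation, strength


def decode_spot(orientation: int, strength: str) -> int:
    """Recover the 3-bit value from the measured orientation and strength."""
    high = STRENGTH_LEVELS.index(strength)
    low = ORIENTATIONS_DEG.index(orientation)
    return (high << 2) | low
```

The position of the spot in the three spatial dimensions then plays the role of an address, while the two optical properties carry the value stored at that address.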

A femtosecond laser is used to write the structures into the glass. This is extremely fast: reading and writing to the structures is faster than reading or writing to a Blu-ray disc. Femtosecond lasers are very expensive, so glass storage is unlikely to replace hard drives in the near future, but it could prove an extremely effective way to handle the enormous amounts of data stored in the cloud.
