When it comes to choosing the right database management system (DBMS), developers and data analysts today face a wide range of choices, and many of them lie beyond the traditional on-premise options. There are best-of-breed DBMS products with freemium offerings that let developers get started on a laptop or work directly within a cloud platform, not to mention the vast number of options available from the open source communities. I really think we’re living in a renaissance age, with an embarrassment of riches.
Recently, much of the growth and attention has been focused on cloud-based database services. After all, cloud database services promise fewer headaches: it is easier to get started, develop, version, and maintain, with the added benefit of pay-as-you-go pricing. You just need to remember to shut the service down when you aren’t using it, unless automatic shutdown is a built-in option (as it is in Oracle’s autonomous DB in the cloud, for example).
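To make the pay-as-you-go point concrete, here is a minimal back-of-the-envelope sketch in Python. The hourly rate and the working schedule are numbers I made up for illustration, not any vendor's actual pricing, and real bills often include storage or network charges that continue while an instance is stopped.

```python
# Back-of-the-envelope pay-as-you-go comparison.
# All figures are hypothetical and chosen only to illustrate the arithmetic.

HOURS_PER_MONTH = 730   # roughly 8,760 hours per year divided by 12
HOURLY_RATE = 0.50      # assumed compute price per hour (made up)


def monthly_cost(hourly_rate, hours_running):
    """Compute-only cost for the hours an instance is actually running."""
    return hourly_rate * hours_running


# Instance left running all month.
always_on = monthly_cost(HOURLY_RATE, HOURS_PER_MONTH)

# Instance shut down outside an assumed 8-hour day, 22 working days a month.
work_hours_only = monthly_cost(HOURLY_RATE, 8 * 22)

savings = always_on - work_hours_only

print(f"Always on:       {always_on:.2f}")
print(f"Work hours only: {work_hours_only:.2f}")
print(f"Savings:         {savings:.2f}")
```

Under these assumed numbers, forgetting to shut the instance down costs roughly four times as much as running it only during working hours.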
So how can developers or data analysts decide which approach (cloud, your own data center, laptop) and which product is right for their analytics or application use case? Looking at this from a user perspective, should this even be a choice up front?
Most importantly, beyond finding ways to further lower the cost of operation and usage, improve performance, add more specialized data engines for new data types, and scale compute engines and data pipelines massively across multiple public and private clouds, what’s next?
Are there unforeseen and unique application scenarios, ones we have yet to address, that arise when data is distributed across multiple private and public clouds? How do we further simplify the use of the rapidly growing set of data processing technologies? How will the role of database administrators (DBAs) evolve in the next few years?
The Advantages of the Cloud — the Basics: Scalability, Ease of Use, and Cost Savings
With the cloud, one can theoretically “scale out” for better performance, assuming the application and platform stack you’re using supports this and you’ve designed your application around a cloud-native architecture. For others, the more compelling advantage is ease of use. When you migrate to the cloud, database management tasks can be available as a service for an additional fee (look out, DBAs! :-)).
So from a developer’s perspective, and for usage and trials, it becomes a no-brainer: it’s much easier to do. From a cost perspective, the only thing you need to worry about is paying for the software stack and the underlying infrastructure-as-a-service (IaaS) resources. If you don’t want to pay a cloud provider for computing, networking, or storage resources, then you can work on your laptop or choose to operate your own data centers.
I don’t think the DBA’s job will go away. I do see DBAs taking on much more important and business-relevant roles within IT. (But that is the subject of another blog.)
One Step beyond the Basics: Cloud Flexibility Enables Faster Development & Encourages Experimentation
There is one more advantage that has really accelerated cloud adoption: the availability of the entire application and integration platform stack in the cloud. This saves upfront hardware costs for experimental projects, making it very easy for users to try new things, toy around with new technologies, and start new development.
Ultimately, they may change their minds about how they want the final production-ready architecture to look. But they can do so without wasting time procuring and setting up hardware and software.
DBaaS Opportunities and Emerging Trends
When Amazon Web Services (AWS) began, it started by providing massive computing resources from an infrastructure-as-a-service (IaaS) standpoint. As other vendors followed, the first approach (and the easiest thing to do) was to take existing software and put it into the cloud. But very quickly, DBMS providers realized this was an opportunity to rethink the architecture of their products. Database systems can be faster, easier to run, and cheaper if the software is re-architected for cloud operation rather than on-premise operation.
It is obvious that DBaaS adoption up to now has come mostly from migrating existing DBMS workloads. Yes, offering databases in the cloud with portability for popular open source or best-of-breed options such as MySQL, Cassandra, and MongoDB has been a no-brainer for both users and public cloud providers. I call this the GEN-1 cloud DB. I’m also eagerly anticipating a much faster convergence of the multitude of data processing techniques within a re-factored DBaaS, such as a DBaaS that can also query and process data stored in the cloud object store.
I look forward to the development of an improved, unified data computing framework that extends directly from DBaaS to query data in shared storage such as a cloud object store. This is already happening in some data-warehouse-as-a-service offerings, which I refer to as the GEN-2 cloud DB. This generation promises to strike the right balance between value, performance, and ease of access.
The Promise of Hybrid Cloud Environments and the Apps We Have Yet to Invent
One of the hypotheses I’ve been testing is the possibility of a new class of application scenarios that can naturally take advantage of the distributed nature of data and workload between on-premise and cloud. I see ample evidence in both enterprise and consumer companies dealing with distributed data everywhere.
I see first-hand examples of many of these application scenarios, ranging from marketing automation, service and support, and customer and sales applications to business-to-business or business-to-network collaboration.
For example, Google Photos was originally built to sync our photos to the cloud automatically, but it rapidly evolved to use machine learning and AI to find and display matching photos when you search for a keyword. It also creates montages of photos, set to background music, that you can share as a video or as a memory/timeline. All of this is possible because the service understands where the data is and what it will be used for.
Opportunities for Security, Data Privacy, and Data Anonymization
Security offers another area of opportunity for DBaaS. Increased data privacy regulations in the European Union are bringing even more pressure to bear on companies that handle consumer data. The EU’s General Data Protection Regulation (GDPR), which goes into effect May 25th, 2018, guarantees new levels of data privacy for customers by requiring companies that use any type of customer information to inform customers (explaining why, where, and for what their information is used) and to obtain their explicit consent.
This is good news for consumers, but it does pose some challenges for companies that want to gain business insights from data. Innovative companies are responding by investing heavily in a technology called “data anonymization.” This technology enables businesses to completely anonymize the data itself so that you can’t tell to whom or where the data is attributed.
“You get the insights you want, but without the granularity.”
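As a rough illustration of that trade-off, here is a minimal Python sketch of one common approach: drop direct identifiers, generalize ages into bands, report only group-level aggregates, and suppress groups that are too small. The records, field names, and the `min_group_size` threshold are all hypothetical, and this is not meant as a complete or regulation-compliant anonymization method.

```python
from collections import defaultdict

# Hypothetical customer records; every name, field, and value is made up.
RECORDS = [
    {"name": "Ana", "email": "ana@example.com", "age": 34, "region": "North", "spend": 120.0},
    {"name": "Ben", "email": "ben@example.com", "age": 37, "region": "North", "spend": 80.0},
    {"name": "Cy", "email": "cy@example.com", "age": 31, "region": "North", "spend": 100.0},
    {"name": "Dee", "email": "dee@example.com", "age": 52, "region": "South", "spend": 200.0},
]


def age_band(age, width=10):
    """Generalize an exact age into a coarse band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymized_summary(records, min_group_size=3):
    """Aggregate spend by (age band, region) and suppress small groups.

    Names and emails are never copied into the output, and any group with
    fewer than `min_group_size` customers is dropped, because a very small
    group could point back to an individual.
    """
    groups = defaultdict(list)
    for record in records:
        key = (age_band(record["age"]), record["region"])
        groups[key].append(record["spend"])

    summary = {}
    for key, spends in groups.items():
        if len(spends) < min_group_size:
            continue  # too few customers to report safely
        summary[key] = {
            "customers": len(spends),
            "avg_spend": sum(spends) / len(spends),
        }
    return summary


SUMMARY = anonymized_summary(RECORDS)
print(SUMMARY)
```

With this sample data, the three North customers in their thirties are reported as one group with an average spend, while the single South customer is suppressed entirely: the insight survives, the granularity does not.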
Eventually, I expect data privacy and security regulation to become global. (For further information, here’s a webinar that dives deeper into security, data privacy, and anonymization.)
Cloud-Based Databases Have a Role to Play in Digital Transformation
For many companies, when digital transformation becomes the goal of the C-suite, adoption of DBaaS naturally follows. Digital transformation in its simplest terms is about the digitization of assets, relationships, and engagements in the context of bringing together customers, partners, and employees. Decisions about them, and about the company, must be highly data-driven. To this end, databases become a critical component in facilitating the successful digital transformation of a company.