
Databases Are the Weak Point in Big Data Projects
By Karen A. Frenkel
Math Is Out of Date
“Modern” database algorithms are still based on 1970s technology, creating a need for updated math to deal with the scale and performance of big data.
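To make the "1970s technology" claim concrete, here is a back-of-the-envelope calculation. It is my own illustration, not the article's: the slide does not name a structure, so the B-tree index, which most relational engines still use, stands in as the example, with an assumed page and row size.

```python
# Illustration only: a classic B-tree rewrites whole pages even when a
# single small row changes, one reason 1970s-era structures strain at
# big-data scale.
PAGE_SIZE = 16 * 1024   # 16 KB leaf page, a common default (e.g., InnoDB)
ROW_SIZE = 100          # assumed 100-byte row

# Worst case: one dirty row forces the full page to be written back.
write_amplification = PAGE_SIZE / ROW_SIZE
print(f"Bytes written per byte changed: {write_amplification:.0f}x")
```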
MySQL Architecture Old
MySQL was built in 1995, when the fastest Intel processor was the Pentium Pro. It needs an updated architecture to take advantage of modern hardware.
Images Depend on Configuration
Database deployment images are configuration-dependent, causing an explosion of packages. Businesses need databases that observe and adapt to any physical, virtual or cloud instance without an installation package for each platform.
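One way such adaptation could work, sketched below as an assumption rather than any vendor's actual behavior, is for the engine to probe its host at startup and derive its own settings instead of shipping a pre-tuned image per platform.

```python
# Minimal sketch: size the database from the host it lands on.
import os

def detect_resources():
    cpus = os.cpu_count() or 1
    # Total physical memory via POSIX sysconf; assumes a Linux-like host.
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return cpus, ram_bytes

def startup_config():
    cpus, ram = detect_resources()
    return {
        "worker_threads": cpus * 2,           # heuristic, not a recommendation
        "buffer_pool_bytes": int(ram * 0.6),  # give the cache ~60% of RAM
    }

print(startup_config())
```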
Inflexible Systems
Inflexible systems keep DBAs on a time-consuming tuning treadmill, creating a need for databases that self-optimize online as data workloads ebb and flow.
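A rough sketch of what online self-optimization might look like, with invented names and thresholds rather than any product's actual logic, is a feedback loop that resizes a cache from its observed hit rate.

```python
# Minimal sketch: grow or shrink a cache as the workload ebbs and flows,
# instead of waiting for a DBA to re-tune the instance by hand.
class SelfTuningCache:
    def __init__(self, size_mb=256, low=0.80, high=0.95):
        self.size_mb = size_mb
        self.low, self.high = low, high
        self.hits = self.misses = 0

    def record(self, hit: bool):
        self.hits += hit
        self.misses += not hit

    def retune(self):
        total = self.hits + self.misses
        if not total:
            return
        hit_rate = self.hits / total
        if hit_rate < self.low:           # cache too small for this workload
            self.size_mb = int(self.size_mb * 1.5)
        elif hit_rate > self.high:        # cache larger than it needs to be
            self.size_mb = max(64, int(self.size_mb * 0.9))
        self.hits = self.misses = 0       # start a fresh observation window
```

In use, the engine would call record() on every lookup and retune() on a timer, so tuning tracks the workload instead of a maintenance window.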
Databases Not Concurrent
Databases typically handle only a single type of workload. They should instead host multiple workloads (ingest, transactions, analytics) concurrently without destroying performance.
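One simple way to host mixed workloads without letting them trample each other, shown here purely as my own illustration, is to give each workload class its own bounded pool of workers so a heavy scan cannot starve transactional traffic.

```python
# Minimal sketch: ingest, OLTP and analytics share one engine but never
# share worker threads.
from concurrent.futures import ThreadPoolExecutor

POOLS = {
    "ingest": ThreadPoolExecutor(max_workers=4, thread_name_prefix="ingest"),
    "oltp": ThreadPoolExecutor(max_workers=8, thread_name_prefix="oltp"),
    "analytics": ThreadPoolExecutor(max_workers=2, thread_name_prefix="olap"),
}

def submit(workload: str, fn, *args):
    """Route a task to the pool reserved for its workload class."""
    return POOLS[workload].submit(fn, *args)

# Usage: submit("analytics", run_report, query) runs alongside
# submit("oltp", apply_transaction, txn) without competing for threads.
```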
Multiple Algorithms Needed
A fixed algorithm choice allows only a single set of behaviors regardless of workload. Today’s databases must offer multiple algorithms that can be switched on the fly based on workload requirements.
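The sketch below shows what on-the-fly switching could look like in the simplest terms. The names and thresholds are assumptions for illustration, not a documented feature of any engine: the idea is to pick a write-optimized or read-optimized path per table from the observed read/write ratio and re-evaluate as the workload shifts.

```python
# Minimal sketch: choose a storage/access algorithm from recent workload stats.
def choose_algorithm(reads: int, writes: int) -> str:
    total = reads + writes
    if total == 0:
        return "btree"                  # default read-friendly structure
    write_fraction = writes / total
    if write_fraction > 0.7:
        return "log_structured"         # absorb heavy ingest efficiently
    if write_fraction < 0.2:
        return "btree"                  # favor point lookups and scans
    return "hybrid"                     # mixed workload

print(choose_algorithm(reads=1_000, writes=9_000))   # -> log_structured
```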
Compression Needed
Compression is needed to maximize storage capacity and minimize costs, but current architectures perform it in the I/O path, sacrificing performance and scale. Architectures that offload compression overhead to separate CPU threads are needed so the impact on performance is minimal.
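Here is a minimal sketch of that offloading idea, using a background thread pool so the foreground path is not charged for compression CPU time. It is an illustration of the technique, not any product's implementation.

```python
# Minimal sketch: compress pages on background threads, not in the query path.
import zlib
from concurrent.futures import ThreadPoolExecutor

compressor = ThreadPoolExecutor(max_workers=2, thread_name_prefix="compress")

def write_page(page: bytes, sink):
    """Foreground path: hand the page off and return immediately."""
    future = compressor.submit(zlib.compress, page, 6)
    future.add_done_callback(lambda f: sink.append(f.result()))

pages_on_disk = []
write_page(b"row data " * 1000, pages_on_disk)
compressor.shutdown(wait=True)          # flush outstanding work at close
print(len(b"row data " * 1000), "->", len(pages_on_disk[0]), "bytes")
```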
No Predictable Performance at Scale
Due to algorithm limitations, databases eventually run off the “performance cliff” despite the best possible configurations. It’s time for databases to deliver predictable performance at scale and allow for orderly capacity planning.
Inadequate Scaling
Database scaling hits a brick wall. Scaling to billions of records should be the norm, with state-of-the-art algorithms that reduce IOPS through intelligent caching, eliminating unneeded reads and writes.
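The sketch below shows the basic mechanism behind "reduce IOPS through intelligent caching": an LRU cache in front of the storage layer, so repeated reads of hot pages never reach the disk. It is my own simplified illustration.

```python
# Minimal sketch: an LRU page cache that counts the disk reads it avoids.
from collections import OrderedDict

class PageCache:
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.disk_reads = 0                  # IOPS actually issued

    def read(self, page_id, read_from_disk):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # hit: no I/O at all
            return self.pages[page_id]
        self.disk_reads += 1                 # miss: one real read
        page = read_from_disk(page_id)
        self.pages[page_id] = page
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict the coldest page
        return page
```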
Unpredictable Data Performance
Multiple issues coalescing in a single infrastructure cause unpredictable data performance. With a more streamlined infrastructure that eliminates production, backup and geo-location silos, companies can avoid platform fragmentation.
ETL Process Required
The ETL process slows business analytics and reduces the depth of insights due to trade-offs between data size and processing time. Next-generation databases need to run analytics in place, against the same production platform and data set, without ETLs.
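The contrast is easiest to see in a toy example. In the sketch below, SQLite stands in for the production store purely for illustration; the point is that the analytic query runs against the live data set directly, with no extract, transform or load step.

```python
# Minimal sketch: in-place analytics against the production data set.
import sqlite3

db = sqlite3.connect(":memory:")            # stand-in for the production store
db.execute("CREATE TABLE orders (region TEXT, amount REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?)",
               [("east", 120.0), ("west", 75.5), ("east", 42.0)])

# Analytics run where the data already lives: no ETL pipeline, no copy.
for region, total in db.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region"):
    print(region, total)
```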
Antiquated Data Processing Methods
Databases use antiquated techniques to process data, creating fragile environments. Today’s businesses need databases that separate memory and disk structures from each other to ensure database integrity.
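One reading of "separate memory and disk structures" is the write-ahead pattern sketched below: the in-memory working structure is decoupled from an append-only on-disk log, so a crash mid-update never leaves the durable structure half-written. This interpretation and the code are my own illustration, not the article's design.

```python
# Minimal sketch: memory structure rebuilt from an append-only durable log.
import json, os

class TinyStore:
    def __init__(self, path="store.log"):
        self.path = path
        self.mem = {}                        # in-memory structure
        if os.path.exists(path):             # rebuild memory from the log
            with open(path) as f:
                for line in f:
                    rec = json.loads(line)
                    self.mem[rec["k"]] = rec["v"]

    def put(self, key, value):
        # Durable structure first: append one complete record, then apply
        # the change in memory. The on-disk log is never edited in place.
        with open(self.path, "a") as f:
            f.write(json.dumps({"k": key, "v": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.mem[key] = value
```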