September 19, 2017
Sponsored content: Data traditionally was used in a very transactional way; now it’s at the center of every decision, says Hortonworks' Shaun Bierweiler.
The data storage landscape is continually changing, and there are a few shifts driving that evolution.
One of those is culture, Shaun Bierweiler, vice president of public sector for Hortonworks, says in an interview with EdScoop’s sister media site, FedScoop Radio. “We like to say that every [public] agency is a data agency, and that stems from the evolution and the significance that data has taken in the lives and in the missions of our customers.” The same goes for schools and higher education institutions.
In the era of traditional data warehouses, Bierweiler explains, data was used in a very transactional way. But now it’s at the center of every decision, he says.
To start with, the structure of big data has evolved. “Previously, you knew what was going in and what was coming out. Today, now you have data from an infinite number of sources. You have images, you have videos, you have data encrypted within those items,” Bierweiler says in the interview. “The data itself has become very much more complex in terms of structure.”
The volume is, perhaps, the biggest change.
“[Public] agencies are drowning in data because there’s so much of it,” he says. “You have to be able to store it, you have to be able to process it. You have to be able to extrapolate the value from that data. And so that’s become much more complicated and complex.”
Finally, to top it all off, expectations for the use of that data have changed drastically, Bierweiler explains.
“Not only do you have more data that has more information that varies much more greatly, but now users expect to do more with it. And they not only expect to do valuable things with their data, but they expect to extrapolate information and sharing data from other users’ data. What used to be very traditionally stove-piped and siloed now is a mesh of data that’s expected to be shared.”
With such an array of data types, sizes and uses, Bierweiler advocates for enterprise open source platforms to address users’ many needs.
“If you look at a traditional proprietary technology, the lifecycle for them tends to be much longer, and the development cycle even longer,” Bierweiler says. “When you get a new release of a proprietary solution, it’s often with very old or antiquated solutions and it’s solving the problems that existed when the technology’s development model started.”
You’re also locked in to the vendor’s roadmap, he says.
An enterprise open source platform like Hortonworks harnesses “the development model of community — people that aren’t paid by Hortonworks. What you get then is a very open solution that not only solves what people are trying to address today, but problems they foresee for tomorrow,” Bierweiler says. And because there aren’t barriers or proprietary interfaces, “it lends itself to a true best-of-breed solution.”
“Consider everything as possible,” he recommends to agencies and offices considering open source. “It’s often difficult to make that cultural shift from something that you’ve always done and you convince yourself that that’s the only way. Technology has come a very long way and there are creative ways to do things better, cheaper, faster, smarter. So oftentimes, the biggest challenge we have is not a technical hurdle — it’s a cultural shift.”
See more about how Hortonworks’ open source solutions can help you manage your data.
This podcast and article were produced by Scoop News Group for, and sponsored by, Hortonworks.