Smart Enterprise Magazine

Volume 8, Number 1, 2014


for advanced analytics," Jaffe says. "Big Data technology today is now being applied to more grown-up problems, not only social media analysis. And a lot of cool Big Data projects are being funded by offloading data workloads from more expensive legacy platforms."

Stock Up on Data Skill Sets

The Big Data projects that the business requests will require IT to take an inventory of its staff and skills. Not only are new technologies involved (Hadoop and NoSQL databases, for instance), but the talent needed to make sense of the data also isn't typically on hand in most IT departments.

"There is a shortage of really highly skilled Hadoop talent," Syncsort's Jaffe says. "The growth rate of the platform is outpacing the growth rate of the skills. Everyone is interested in learning Hadoop."

Hadoop is an open source software framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models, according to the Apache Software Foundation. But knowing Hadoop isn't the only skill needed, especially since some companies might not choose that route for their data architecture.

"The history of technology tells us the tools will change over time, but the ability to think about things outside of the normal realm of IT and the ability to experiment with this data running across 100 servers simultaneously is critical," comScore's Brown says. "One of the things we have done is developed a Data Quality Engineer, which involves a very important focus. We need people to look at the data, determine whether it makes sense, does it trend and is it doing what we want it to do."

Joe Young, lead of the Enterprise Architecture and Strategy Group responsible for the EIM and data governance practice at Lexmark International, agrees that skills and roles come into play when analyzing large sets of data.
And he says those skills aren't typically found in the average IT department.

"Many companies don't have the IT organization today that supports the type of analysis needed to pull real value from data. Those are data scientist skills, statistical analysis and predictive analysis skills, which blend the art and science of data analysis," Young says. "Funding for those types of environments doesn't come easy, and getting people to leverage those really compelling data sets is a challenge."

Add Technology to the Mix

First, consider the database. There is no single answer; the choice will depend on the budget, in-house talent and scope of the Big Data project. If Apache Hadoop or a variant of it is used, IT leaders will want to get in-house staff trained on the open source framework. But while Hadoop is the Big Data industry darling at the moment, database experts argue that the schema-less appeal of the framework and other NoSQL offerings won't necessarily be enough to take on Big Data long term.

"Companies are using Hadoop as a phase one, interim step. They need to be thinking of the next wave, such as an open source project from Google called word2vec, which is in its very infant stages," says Ravi Rajagopal, Vice President of Cloud Solutions and Service Providers at CA Technologies.

Like newer database tools, SQL technologies are also capable of collecting unstructured data. But depending on the framework, unstructured data (such as comments from social networks like Twitter or Facebook) might not be easily pulled into existing database structures. Companies need to consider their current database infrastructure and how they can capitalize on existing investments, while also equipping the environment to handle all the data sets required for Big Data projects.

It's likely that most organizations today already have a database infrastructure in place.
But as CA Technologies CTO Michelsen points out, many organizations in recent years did not have, nor did they want to buy, the required disk space to store the data.

"The biggest challenge that existing enterprises have with Big Data is that they currently throw it all away," he says. "We have to think in a completely different fashion. We have to forward-invest in the storage of data whose value we will not know until we have it."

Most enterprise companies probably also have multiple, disparate databases all collecting data pertinent to the business. The goal of Big Data is to be able to capture meaning and value at high speeds (velocity) from large data sets (volume) across different types of sources (variety), Rajagopal says. That means most companies have to shift from a mindset of stripping data of its unique attributes to preserving those attributes to derive business value.

"IT organizations need to think, 'Let's come up with a format that is preserving the data integrity and makes sense to structure it in such a way that we can take advantage of the attributes within the data sets,'" Rajagopal explains. "Veracity is a database term that means maintaining the integrity or preserving the truth of the data. Now IT needs to make sure all the multiple dimensions and characteristics are preserved."

And that won't be easy, says Mike Skubisz, Vice President of Product Management and Strategy at Deep IS, a maker of Big Data database technology.

"When we see companies struggling with Big Data projects, it's about the sheer size and volume and complexity. That tends to be the daunting task," Skubisz says. "When you look at the sources of this data, they are coming from different organizations in the business that didn't have a common vision of what they wanted to build. There are technological discrepancies as well as the political and cultural hurdles."

Sid Kumar, Global Head of Customer Lifecycle Solutions at CA Technologies, says his organization enables customers to deliver greater business value and maintain a competitive advantage by adopting innovation quickly while protecting their investments in existing software deployments. Kumar explains that this industry-leading approach, which combines technology lifecycle planning with best practices, lets customers take advantage of cutting-edge technologies such as Big Data in a gradual, phased manner while minimizing the cost, risk and time typically associated with adopting a disruptive technology.

SMARTENTERPRISEMAG.COM Big Data, Big Innovation
