A recent article in MIT’s Technology Review says that today’s “big data,” that is, enormous data sets, is forcing researchers to find new techniques for knowledge discovery and data mining. Twenty years ago, big data and the motivation to process it were of interest mainly to the scientific community, according to Usama Fayyad, executive chair of the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining (and former chief data officer at Yahoo). The article cites Fayyad speaking ahead of the ACM’s 17th annual conference on Knowledge Discovery and Data Mining.
The Internet, of course, “changed everything.” Businesses started amassing huge volumes of data about their customers (and their behavior) as they shifted more of their business online. As the power of data mining became clear, Fayyad is cited as saying, so did the economic motivation to invest in the field.
Internet giants, the article says, profit from the information they collect about their users and from the insights they glean from it into how to grow their business. Today’s data, however, comes in network rather than tabular form; most often, it arrives as gigantic graphs.
Dealing with graphs of that size and scope, and applying modern analytic tools to them, calls for better algorithms and other innovations. Whereas 20 years ago scientists had their data sets in structured form, today’s data is “an unstructured mess.”
And, above the cloud…
The Charleston Gazette says West Virginia’s Office of Technology has been in talks to start using cloud computing to deliver subscription-based software, document management and data storage services to state government agencies.
The move could save the state millions of dollars and increase worker productivity, state government leaders predict. Better still, using cloud computing services would not lead to job cuts, “although many state workers would have to sign up for training to learn new software and data management systems.”
As the state considers the move, Kyle Shaffer, the state chief technology officer, is quoted as saying that companies and organizations that buy cloud services “must be careful that they don’t lose control of their data when they terminate cloud service agreements or transfer the information to another cloud service provider.”
On precautions, he adds: “Cloud customers must be careful to include data ownership, security and audit requirements when negotiating contracts for cloud services.”
The federal government and other states have already adopted cloud computing services.