IBM Big Data Pilot

By Susan Nunziata  |  Posted 08-17-2010

This summer, the department began piloting IBM's "Big Data" analytics technology, which mines large amounts of unstructured Web data. The analysis is based on factors such as business relevancy, government policies, market needs and trends. In the pilot phase, the tools unearthed hidden business opportunities that likely would not have emerged under the old triage system, notes Houghteling. In addition, the analytics tools condensed the triage process down to a seven- to 10-day period. "We do this by identifying key words, using specific phrasing dedicated to key words, and by identifying specific documents," Houghteling says. "You can get these types of analytics tools to spit out a scored or ranked list of potential partners. That's attractive: to have a group of potential partners ranked based on how they match to our specs."

For example, a team of researchers at NC State is investigating new strains of Salmonella for use in vaccines. With IBM Big Data analytics technology, it took less than a week for the university to analyze 1.4 million Web pages, including opinion blogs, social networks and documents. The analytics technology sorted through a wide variety of information and analyzed the contents in real time to find relevant details, ultimately identifying potential investors and partners.
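The "scored or ranked list of potential partners" that Houghteling describes can be pictured with a minimal Python sketch. This is an illustration only, not IBM's method or any part of the BigInsights tooling: the function names, the sample pages and the simple count-the-keywords scoring rule are all assumptions made for the example.

```python
import re
from collections import Counter


def score_document(text, keywords):
    """Count whole-word, case-insensitive keyword hits in one page of text."""
    words = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(words[keyword.lower()] for keyword in keywords)


def rank_partners(pages, keywords):
    """Rank candidates by total keyword hits across their pages.

    pages maps a candidate name to a list of page texts.
    Returns (name, score) pairs, highest score first; ties sort by name.
    """
    scores = {
        name: sum(score_document(text, keywords) for text in texts)
        for name, texts in pages.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample data, invented for illustration.
pages = {
    "Acme Biologics": [
        "Acme develops Salmonella vaccine candidates.",
        "New vaccine trial announced.",
    ],
    "Beta Ventures": ["Beta invests in software startups."],
    "Gamma Health": ["Gamma studies vaccine delivery."],
}
keywords = ["salmonella", "vaccine"]

ranked = rank_partners(pages, keywords)
for name, score in ranked:
    print(f"{name}: {score}")
```

A production system working across 1.4 million pages would distribute this work and use far richer text analytics than raw keyword counts, but the output has the same shape: a list of candidates ordered by how closely they match the stated criteria.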

The pilots at NC State were conducted in collaboration with the university's College of Management Bioscience Management Group and its Center for Management Studies.

As part of this project, NC State is using:

  • IBM BigSheets, part of IBM's BigInsights portfolio, a software engine that helps users extract insights from very large data sets quickly and easily
  • IBM LanguageWare, a text analytics tool created by IBM's Dublin Software Lab in Ireland for the purpose of harnessing the wealth of unstructured data contained in text documents, Web site content and enterprise applications
  • IBM Cognos Content Analytics, analytics software that gives organizations the tools to access and analyze large volumes of unstructured content

These three components ran on the IBM Distribution of Apache Hadoop. The analytics solution interfaced with the university's TechTracS database, developed by Knowledge Sharing Systems. The proprietary database supports innovation management functions and is used by the technology transfer offices of many universities to manage their IP portfolios. "It has its base modules that we have customized for our use at NC State," says Houghteling. "We license use of several seats of the TechTracS software and, in partnership with Knowledge Sharing Systems, we pay for process improvements particular to our processes and practices." The relational TechTracS database is used for all the agreement tracking, invention disclosure tracking, compliance and patent management for the university. It is also from that database that the department launches all of its marketing activities to potential partners.



 
