Here's something for anyone who suffered through the grammar-school exercise of diagramming sentences and wondered, "Where will all this ever get me?"
The text analytics tools from Attensity Corp. (Palo Alto, CA) work by diagramming sentences, just as you did back in fifth grade. That's right, the software deconstructs text into subjects, verbs, adjectives, and prepositional phrases and sorts them into a relational database.
Text analytics represents a leading-edge effort to slot unstructured data -- the written comments, emails, and repair notes that form a large part of the warranty reporting record -- into matrices that can be easily searched and mined for predictive analysis. Attensity has attracted customers as diverse as Whirlpool, Hill Air Force Base, and the Federal Bureau of Investigation. In addition to Attensity, SPSS Inc. (Chicago), ClearForest Corp. (Waltham, MA), and Teragram Corp. (Cambridge, MA) are active in this space.
"The information is available, but there's been no solution to get into a data warehouse and run analytics on it," says Michelle de Haaff, vice president of products and marketing at Attensity.
The Attensity system is based largely on linguistics, and essentially does away with human-generated pre-definition, the time-consuming process by which linguists and knowledge engineers pre-determine text parameters for text-oriented databases. Attensity's proprietary Exhaustive Extraction process creates these parameters automatically by recognizing and processing the key elements of grammar. Historical data, not the before-the-fact suppositions of knowledge engineers, determines how the matrix is populated.
For example, the Attensity system takes the note, "Black mold growing on foundation in high humidity," recognizes the elements of grammar, and breaks it into categorical components: Issue: Black Mold; Location: Foundation; Cause: Humidity.
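To make the idea of turning a free-text note into a database row concrete, here is a minimal Python sketch. It is not Attensity's Exhaustive Extraction process, which is proprietary and grammar-based; it assumes a single hard-coded sentence pattern, and the `warranty_notes` table, the `extract_fields` and `store` functions, and the column names are all hypothetical.

```python
import re
import sqlite3

# Hypothetical pattern for notes shaped like "<issue> growing on <location> in <cause>".
# A real linguistic parser would handle far more sentence structures than this.
NOTE_PATTERN = re.compile(
    r"^(?P<issue>.+?) growing on (?P<location>.+?) in (?P<cause>.+)$",
    re.IGNORECASE,
)

def extract_fields(note):
    """Return a dict with issue, location, and cause, or None if the note does not match."""
    match = NOTE_PATTERN.match(note.strip().rstrip("."))
    if match is None:
        return None
    return {name: value.strip().title() for name, value in match.groupdict().items()}

def store(conn, fields):
    """Insert one extracted record into the hypothetical warranty_notes table."""
    conn.execute(
        "INSERT INTO warranty_notes (issue, location, cause) "
        "VALUES (:issue, :location, :cause)",
        fields,
    )

# An in-memory SQLite database stands in for the relational store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warranty_notes (issue TEXT, location TEXT, cause TEXT)")

fields = extract_fields("Black mold growing on foundation in high humidity")
store(conn, fields)
rows = conn.execute("SELECT issue, location, cause FROM warranty_notes").fetchall()
```

This toy version keeps the modifier and records the cause as "High Humidity," whereas the article's example normalizes it to "Humidity." Once notes are stored as rows, they can be searched and counted with ordinary SQL.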
If there are no pre-definitions for mold, the Attensity system creates them. The result is a system that can identify problems that knowledge engineers might never define or foresee. The system can also be geared to recognize anomalies before they become statistically significant.
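The idea of creating a category the first time it shows up in the data, rather than defining it in advance, can be sketched in a few lines of Python. This is an illustration only, not Attensity's method: the `tally_issues` function and the sample records are hypothetical, and it simply counts issue labels that have already been extracted.

```python
from collections import Counter

def tally_issues(records, known_issues):
    """Count issues and collect any that were not defined in advance."""
    counts = Counter()
    new_issues = []
    for record in records:
        issue = record["issue"]
        if issue not in known_issues:
            # No pre-definition exists, so create the category on first sight.
            known_issues.add(issue)
            new_issues.append(issue)
        counts[issue] += 1
    return counts, new_issues

# Illustrative sample data, not taken from any real warranty record.
known = {"Cracked Seal", "Loose Wiring"}
records = [
    {"issue": "Cracked Seal"},
    {"issue": "Black Mold"},
    {"issue": "Black Mold"},
]
counts, new_issues = tally_issues(records, known)
```

Here `new_issues` surfaces "Black Mold" after its first appearance, so a reviewer could look at it long before the count is large enough to stand out statistically.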