Mastering New Challenges in Text Analytics
White Paper Published By: SPSS
This paper briefly defines text analytics, describes various approaches to text analytics, and then focuses on the natural language processing techniques used by text analytics solutions.
Published: Jun 30, 2009
Type: White Paper
Length: 24 pages
Technical report
Mastering New Challenges in Text Analytics
Making unstructured data ready for predictive analytics
Table of contents
Introduction
What is text analytics and how is it used?
Approaches to understanding text
The SPSS text analytics process
Applying text analytics at the enterprise level
Conclusion
SPSS products for text analytics
About SPSS Inc.
Appendix A: An explanation of some text analytics terms
Appendix B: Algorithms used for assigning equivalence classes
Appendix C: Examples of Text Link Analysis
Additional reading on text analytics
SPSS is a registered trademark and the other SPSS products named are trademarks of SPSS Inc. All other names are trademarks of their respective owners. © 2008 SPSS Inc. All rights reserved. MCTWP-0408

Introduction
It's no secret that the world has seen an explosion of information in the past 15 years, an explosion that experts predict will continue as the millions of people who use online resources continue to expand their usage, and the millions of people who do not yet have access to such resources gain it. Similarly, information stored as text in both business and government organizations has grown exponentially.
To name just a few examples:
- Opinion surveys are increasingly conducted online and results shared in real time
- The boom in software applications supporting sales, customer service, or call center operations has led to massive amounts of text stored electronically in these applications' notes fields
- Technology analysts at IDC estimate that 62 billion e-mails are sent every day
- Searchable Web sites generate enough information every day to fill millions of books
- Web logs (blogs) and wikis, created by individuals and groups for personal and professional purposes, are increasing exponentially: as of this writing, there may be more than 100 million blogs, with a new one created every second
Such a vast expansion of the scale of global information exchange would have been almost unimaginable 40 years ago, when most business and government communications, as well as news reports and advertising, were paper-based.
Yet it was 40 years ago that visionary researchers began to seek ways to enrich the knowledge of those working in medicine and other sciences, in government agencies, and in business by using computer technologies to uncover previously unknown connections in large collections of textual documents. They created the discipline known as computational linguistics, which is now practiced at numerous universities and public and private research centers worldwide. Computational linguists initially focused their efforts on finding ways to categorize and explore concepts found in books, scholarly journals, legal briefs, patent applications, newspapers, reports, and other paper-based records that could be converted to digital formats. More recently, their efforts have expanded to include ways to "mine" the vast amount of textual information that is published digitally: online editions of newspapers, academic journals, and conference proceedings, for example. In addition, there is a wealth of content that originates in digital form, such as Web sites, blogs, wikis, e-mails, and instant messaging (IM), as well as text embedded in forms, surveys, and scientific, government, or corporate databases.