SAP Data Services Text Data Processing enables you to perform natural language processing and extraction processing on unstructured text. This capability was introduced in Data Services 4.0 and has been enhanced in subsequent releases. Text Data Processing now extracts information from binary documents such as Word and PDF, offers richer entity extraction in 31 languages, and can be pushed down to execute directly in Hadoop. In the latest Data Services 4.2 release, the Entity Extraction transform adds language identification, predefined entity type support for Dutch and Portuguese, and sentiment analysis extraction for Simplified Chinese.