User manual BUSINESS OBJECTS THINGFINDER SDK 4.3 GETTING STARTED GUIDE


[...] BusinessObjects ThingFinder™ SDK Getting Started Guide

BusinessObjects ThingFinder™ SDK 4.3

Copyright

© 2008 Business Objects, an SAP company. Business Objects owns the following U.S. patents, which may cover products that are offered and licensed by Business Objects: 5,295,243; 5,339,390; 5,555,403; 5,590,250; 5,619,632; 5,632,009; 5,857,205; 5,880,742; 5,883,635; 6,085,202; 6,108,698; 6,247,008; 6,289,352; 6,300,957; 6,377,259; 6,490,593; 6,578,027; 6,581,068; 6,628,312; 6,654,761; 6,768,986; 6,772,409; 6,831,668; 6,882,998; 6,892,189; 6,901,555; 7,089,238; 7,107,266; 7,139,766; 7,178,099; 7,181,435; 7,181,440; 7,194,465; 7,222,130; 7,299,419; 7,320,122 and 7,356,779. Business Objects and its logos, BusinessObjects, Business Objects Crystal Vision, Business Process On Demand, BusinessQuery, Cartesis, Crystal Analysis, Crystal Applications, Crystal Decisions, Crystal Enterprise, Crystal Insider, Crystal Reports, Crystal Vision, Desktop Intelligence, Inxight and its logos, LinguistX, Star Tree, Table Lens, ThingFinder, Timewall, Let There Be Light, Metify, NSite, Rapid Marts, RapidMarts, the Spectrum Design, Web Intelligence, Workmail and Xcelsius are trademarks or registered trademarks in the United States and/or other countries of Business Objects and/or affiliated companies.

[...]

For information about using the tf.langid-config file, see "Language and Encoding Settings" on page 24.

Byte Order Marks
In Unicode, the scalar value "0xfeff" is the "zero-width no-break space" (ZWNBSP) character. Under a little-endian serialization, its bytes are swapped and read as "0xfffe", which is not a legal Unicode character. This character is designated as a byte order mark (BOM) only when it occurs at the very start of a Unicode input stream, such as a stream encoded in UTF-8, UTF-16, UCS-2 or UCS-4. When encountered at any other location, it is the ZWNBSP character.

The BOM serves two purposes. First, the BOM may serve as a signature for Unicode streams. Second, the BOM indicates the serialization of the Unicode input. In both cases, ThingFinder handles the BOM in a straightforward way. When a BOM is detected, it is used to ascertain the serialization of the input as little-endian or big-endian. Then, the BOM is stripped from the input and not processed any further. Input that doesn't include a byte order mark is assumed to have the byte order of the current machine.

Note: ThingFinder serializes the output using the native endian architecture (little-endian or big-endian) of the host machine.
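The detect-then-strip behavior described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not ThingFinder's implementation: the function name split_bom and its return shape are hypothetical, and the BOM constants come from Python's standard codecs module.

```python
import codecs
import sys

# Check the longest signatures first: the UTF-32 little-endian BOM
# (FF FE 00 00) begins with the UTF-16 little-endian BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32", "big"),
    (codecs.BOM_UTF32_LE, "utf-32", "little"),
    (codecs.BOM_UTF8, "utf-8", None),  # UTF-8 has no byte order; signature only
    (codecs.BOM_UTF16_BE, "utf-16", "big"),
    (codecs.BOM_UTF16_LE, "utf-16", "little"),
]


def split_bom(data: bytes):
    """Return (family, byte_order, payload) for a raw input buffer.

    Hypothetical helper. If a BOM is found at the very start, it is used
    to determine the serialization and is stripped from the payload.
    Without a BOM, the family is unknown (None) and the byte order is
    assumed to be that of the current machine.
    """
    for bom, family, order in _BOMS:
        if data.startswith(bom):
            return family, order, data[len(bom):]
    return None, sys.byteorder, data


family, order, payload = split_bom(codecs.BOM_UTF16_LE + "hi".encode("utf-16-le"))
```

One known limitation of this kind of check: UTF-16 little-endian input whose first character after the BOM is U+0000 is indistinguishable from a UTF-32 little-endian BOM, so a production detector would need additional evidence.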


Language Guide and Reference

Language Module Overview Document Properties

2

File Formats
ThingFinder processes text in HTML or plain-text format. Text in other formats should be converted before processing, using a separate conversion product.


Chapter 3: Configuring ThingFinder


This chapter describes the configurable features of ThingFinder. Information is presented in the following sections:

Language and Encoding Settings
Text Processing
Entity Type Weights
Sub-entities
Custom Extraction Rules
Post-processing Configuration

Language and Encoding Settings
This section describes the configurable language and encoding settings. 

Detecting Language and Encoding
ThingFinder can automatically determine the language and encoding of input documents. To do this, ThingFinder uses a matrix of encoding-language pairs, listed in the tf.langid-config file, during the language and encoding identification process. For example:
<encodings-languages-covered>
    <list key = "cp_1252">
        <item key = "english" />
        <item key = "french" />
        <item key = "german" />
        <item key = "spanish" />
    </list>
    <list key = "cp_1256">
        <item key = "arabic" />
    </list>
    <list key = "iso_8859_6">
        <item key = "arabic" />
    </list>
    <list key = "utf_8">
        <item key = "english" />
        <item key = "french" />
        <item key = "german" />
        <item key = "spanish" />
        <item key = "arabic" />
    </list>
</encodings-languages-covered>


This list should include, for each encoding, all languages that could possibly occur in the input text. Encoding-language pairs not listed here are not considered during detection. For instance, if this list only includes "cp_1252", then, regardless of what the input encoding is, it will always be identified as "cp_1252". The Unicode encodings UCS-2, UCS-4 and UTF-16 are not included by default because there are very few documents in these encodings. If you are processing documents in these encodings, you should add them to the tf.langid-config file, located in the lx-3/lang directory. Open the tf.langid-config file in a text editor, and add lines to it, using the format shown above.
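To check which encoding-language pairs a configuration covers before running detection, the fragment can be read with a general-purpose XML parser. The sketch below uses Python's standard xml.etree.ElementTree module and assumes the fragment shown above is available as a standalone string; the real tf.langid-config file may wrap it in additional elements. The helper names load_pairs and is_candidate are hypothetical and are not part of the ThingFinder SDK.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example fragment from this section.
SAMPLE = """
<encodings-languages-covered>
    <list key = "cp_1252">
        <item key = "english" />
        <item key = "french" />
    </list>
    <list key = "cp_1256">
        <item key = "arabic" />
    </list>
    <list key = "utf_8">
        <item key = "english" />
        <item key = "arabic" />
    </list>
</encodings-languages-covered>
"""


def load_pairs(xml_text):
    """Map each encoding key to the set of language keys listed under it."""
    root = ET.fromstring(xml_text.strip())
    return {
        lst.get("key"): {item.get("key") for item in lst.findall("item")}
        for lst in root.findall("list")
    }


def is_candidate(pairs, encoding, language):
    """True only if the encoding-language pair is listed in the matrix."""
    return language in pairs.get(encoding, set())


pairs = load_pairs(SAMPLE)
arabic_1256 = is_candidate(pairs, "cp_1256", "arabic")  # pair is listed
arabic_1252 = is_candidate(pairs, "cp_1252", "arabic")  # pair is not listed
```

A check like this makes it easy to spot a pair that is missing from the matrix and would therefore never be considered during detection.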

Configuring the Names of Languages and Encodings
You can configure variant names for languages and encodings. The tf.language-encoding-config configuration file, located in the lx-3/lang directory, contains the standard language and encoding names in the <list> tag. Each of these has a corresponding list of accepted variant names in the <item key> tag.

[...]

The cgv utility enables you to display the linguistic analysis of a specific sentence, including, but not limited to, the CGUL STEM, POS, NP, TE, and CL marker information for the input data. The cgv utility accepts input either directly from the console or from a file. Either way, the cgv utility's output displays the analysis results of the entire input. The cgv utility is found in the same place as tfdemo (.\lx3\[platform]).

[...]
