Endeca_15_DataIndexingAPIGuide

Endeca® Navigation Platform Data Indexing API Guide


Copyright and Disclaimer

Product specifications are subject to change without notice and do not represent a commitment on the part of Endeca Technologies, Inc. The software described in this document is furnished under a license agreement. The software may not be reverse assembled and may be used or copied only in accordance with the terms of the license agreement. It is against the law to copy the software on any medium except as specifically allowed in the license agreement.

No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, for any purpose without the express written permission of Endeca Technologies, Inc.

Copyright © 2003-2005 Endeca Technologies, Inc. All rights reserved. Printed in USA.

Corda PopChart® and Corda Builder™ Copyright 1996-2005 Corda Technologies, Inc.

Outside In® SearchML © 1992-2005 Stellent Chicago, Inc. All rights reserved.

Rosette® Globalization Platform Portions Copyright © Basis Technology Corp. 2003-2005. All rights reserved.

Teragram Language Identification Software Portions Copyright © 1997-2005 Teragram Corporation. All rights reserved.

Trademarks

Don't Stop At Search, Endeca, Endeca InFront, Endeca Navigation Engine, Guided Navigation, and ProFind are registered trademarks, and Endeca Data Foundry and Endeca Latitude are trademarks of Endeca Technologies, Inc.

Basis Technology and Rosette are trademarks of Basis Technology Corp.

All other trademarks or registered trademarks contained herein are the property of their respective owners.

Endeca Data Indexing API Guide • August 2005


Contents

Preface
    Contacting Endeca Standard Customer Support

Chapter 1 Overview of the Data Indexing API
    About the Data Indexing API
    Data Indexing API Components
        WSDL File for the Data Indexing API
    Overview of Data Indexing Implementation Process

Chapter 2 System Setup
    Installing the Data Indexing API
    Starting and Stopping the Web Service
    Changing the Web Service Permissions
        Web Service Role
        Web Service User
    Creating the Update Pipelines
        Creating the Partial Update Pipeline
        Creating the Record Adapter
        Creating the Record Manipulator
        Creating the Update Adapter
        Creating the Dimension Components
    Provisioning the System

Chapter 3 Writing Java Client Programs
    Java Client Requirements
    Using the Java WSDP Software
        Creating a Client Configuration File
        Generating Client Stubs with the wscompile Tool
        Modifying the Stub Source File
    Using Apache Axis
        Generating Client Stubs with WSDL2Java
        Generating Client Stubs with an Ant Task
    Writing the Java Client Application
        Sample Java Application Program
        Invoking the Data Indexing Web Service
        Location of the Source Records
        Format of the Source Records
            Note on Formats of Input and Output Record Files
        Creating Records
        Queueing the Records
        Starting the Update
        Monitoring the Update Progress
        Catching Data Indexing Exceptions
        Clearing the Update Records

Chapter 4 Writing .NET Client Programs
    .NET Client Requirements
    Creating the DataIndexingService Library
        Producing the Client Stub Class
        Building the DataIndexingService Library
    Writing the .NET Client Application
        Adding Reference Libraries
        Sample .NET Client Application
            Connecting to the Data Indexing Web Service
            Starting the Baseline Update
            Monitoring and System Error Methods
        Catching Exceptions

Chapter 5 Endeca Data Indexing API Reference
    DataIndexing Interface
        Methods
            addContent(String handle, Record[] records)
            clearContent(String[] handles)
            getSystemStatus()
            startBaselineUpdate()
            startPartialUpdate()
            stopBaselineUpdate()
    PVal Class
        Constructors
            PVal(String name, String value)
            PVal()
        Methods
            getName()
            getValue()
            setName(String name)
            setValue(String value)
    Record Class
        Constructors
            Record(PVal[] values)
            Record()
        Methods
            getValues()
            setValues(PVal[] pval)
    Status Class
        Methods
            getSystemErrors()
            getSystemState()
    SystemError Class
        Methods
            getComponent()
            getErrorMsg()
            getRecordSpec()
            getSeverity()
    DIException Class
        Methods
            getMessage()
    DIInvalidOperation Class
        Methods
            getMessage()
    DIInvalidParameter Class
        Methods
            getMessage()
    DISystemOperation Class
        Methods
            getMessage()

Appendix A Sample Java Client Code
    Client.java Example
    Client2.java Example

Index


Preface

The Endeca® Navigation Platform is the foundation for building applications based on Endeca Navigation Engine® technology. With the Endeca Navigation Platform, you can build solutions that allow your users to quickly, precisely, and easily search and navigate through large data sets, avoiding all the traditional problems associated with information overload and finding information online. Endeca applications generate precise, relevant results with sub-second response times, even across very large data sets.

The Endeca Navigation Platform allows you to build Guided Navigation® functionality into your Web applications. The Endeca Guided Navigation solution puts the results of all search, navigation, and analytic queries in an organized context that shows users precisely how to refine and explore further. This helps solve the problems associated with information overload by guiding users as they quickly, precisely, and easily navigate through large data sets. The Endeca Navigation Platform is based on technology that makes it possible to scale to very large data sources and user loads while running on low-cost hardware.


About This Guide

This guide describes the classes and methods of the Endeca Data Indexing API, and how to use them to implement baseline and partial updates for your Endeca system.

Who Should Use This Guide

This guide is intended for developers who are building applications using the Endeca Navigation Platform.

Symbols and Conventions

The Endeca documentation set uses the following symbols and conventions:

IMPORTANT: Text marked as important requires special attention.

Note: Notes provide related information, recommendations, and suggestions.

1. Numbered lists, when the order of the items is important.

a. Alphabetical lists, when the order of secondary items is important.

• Bulleted lists, when the order of the items is unimportant.

Data Indexing API Guide Endeca Confidential


Italic text represents variables you should substitute a value for, such as:

C:\RootDirectory\MyDirectory\MyFile

Italic text may also indicate new terms that appear in the Endeca Glossary.

Courier text indicates code snippets or commands that you should enter exactly as they are written in the documentation.

Endeca Documentation Set

Note: In addition to the documentation deliverables listed below, you can find useful information, including the Endeca Performance Tuning Guide, in the knowledge base on the Endeca Customer Support site at https://customers.endeca.com.

The Endeca documentation set consists of the following:

• Endeca Installation Guide for UNIX and Endeca Installation Guide for Windows describe how to install Endeca software.

• Endeca Migration Guide provides information on migrating from previous versions of Endeca software.

• Endeca Concepts Guide introduces the critical concepts you should understand before learning how to build an Endeca application. The information in this guide is the foundation upon which all the other Endeca documentation depends.


• Endeca Developer's Guide for Java, Endeca Developer's Guide for COM, and Endeca Developer's Guide for .NET provide an overview of the Endeca development process as well as procedures and code snippets for all non-advanced Endeca development tasks.

• Endeca Advanced Features Guide provides procedures for implementing advanced Endeca features such as the Content Acquisition System and partial updates.

• Endeca Administrator's Guide for UNIX and Endeca Administrator's Guide for Windows provide information on using Endeca's administrative and logging tools to configure and manage your Endeca implementation, and create logging reports.

• Endeca Tools Guide provides information on configuring and administering Endeca tools, including the Endeca Manager, Endeca Developer Studio, and Endeca Web Studio.

• Endeca Developer Studio Help provides online information for developing data pipelines using the Endeca Developer Studio.

• Endeca Web Studio Help provides online information for the administrative tasks, as well as search and merchandising configuration, that you can do using Endeca Web Studio.

• Endeca Javadocs provide online access to class and method descriptions for the Java version of the Presentation and Logging APIs.

• Endeca API Guide for COM, Endeca API Guide for Perl, and Endeca API Guide for .NET provide class and method descriptions for the COM, Perl, and .NET versions of the Presentation and Logging APIs. The Endeca API Guide for .NET is in an online format.

• Endeca Security Guide for Java and Endeca Security Guide for .NET and COM describe how to implement user authentication and how to structure your data to limit access to only those users with the correct permissions. The Java version of this guide also provides information on using SSL certificates and encryption to secure your Endeca application.

• Endeca Performance Tuning Guide provides guidelines on monitoring and tuning the performance of the Endeca Navigation Engine. It also contains tips on resolving associated operational issues.

• Endeca Content Adapter Developer's Guide describes the Content Adapter Development Kit (CADK), a framework that provides developers with a flexible and simple mechanism to extract data from a data source and load it into Forge. The CADK is only available from Endeca customer support.

• Endeca Data Indexing API Guide provides class and method descriptions of the Data Indexing API and describes how to use the API to move source data to the Forge directory and run updates.

• Endeca Forge API Guide for Perl provides online information for the class and method descriptions of the Perl Manipulator component. You can use a Perl manipulator within a data pipeline to perform record manipulation.


• Endeca XML Reference provides detailed, online reference information for the XML files used in a Data Foundry pipeline.

• Endeca Glossary defines terms used in the Endeca Navigation Platform documentation set.

• Release Announcement describes the major new features and changes for the release.

• Release Notes detail the changes specific to the release, including bug fixes and new features.

• Endeca Third-Party Software Usage and Licenses provides copyright, license agreement, and/or disclaimer of warranty information for the third-party software packages that Endeca incorporates.

Contacting Endeca Standard Customer Support

You can contact Endeca Standard Customer Support through the online Endeca Support Center (https://customers.endeca.com).

The Endeca Support Center provides registered users with important information regarding Endeca software, implementation questions, product and solution help, training and professional services consultation as well as overall news and updates from Endeca.


Chapter 1

Overview of the Data Indexing API

This chapter provides an overview of the Data Indexing API, a framework that provides developers with a flexible mechanism to move data from a data source to the Forge incoming directory and to stop and start updates programmatically.

The chapter contains the following sections:

• About the Data Indexing API

• Data Indexing API Components

• Overview of Data Indexing Implementation Process

IMPORTANT: This document assumes that you are already familiar with Endeca components and terminology as discussed in the Endeca Concepts Guide and are comfortable programming in languages that can access Web services, such as Java and C#.


About the Data Indexing API

The Endeca Navigation Engine platform was designed from the beginning to support rapid application development and easy integration. To that end, the platform is based on open standards. End-user applications are easily built and integrated around multiple well-defined APIs, such as the Data Indexing API.

Among the open standards that Endeca supports are XML-based Web services standards, including Simple Object Access Protocol (SOAP) and Web Services Description Language (WSDL). Endeca’s Web services support simplifies system-to-system integration, enabling customers to build distributed applications that can be shared between and within enterprises and that remain easy to maintain as business processes change.

The Data Indexing API allows users to invoke the Endeca Data Indexing Web service to programmatically modify the content of an Endeca system, without going through the overhead of a baseline update each time a change is made. Because this Web service is automatically installed with the Endeca Manager, users are spared the trouble of setting up the service.


Among the major tasks that can be accomplished via the Data Indexing API are the following:

• Adding source data records to the system so that they can be processed by Forge and uploaded to the Navigation Engine.

• Starting a partial update with the newly-added records. The new records are automatically loaded in the Navigation Engine by the update process.

• Deleting records from the Navigation Engine via a partial update.

• Modifying records in the Navigation Engine via a partial update.

• Starting a baseline update, using the existing source records in the Forge incoming directory.

• Monitoring the progress of a baseline or partial update, including the retrieval of detailed system error information if the update fails.

• Retrieving system status from the Endeca Manager, such as whether the system is idle or is processing an update.

Because it is defined by a WSDL file, the Data Indexing API is language-agnostic. That is, it can be used with any programming language that has Web services support.

The API thus lets software developers choose their preferred development environment (Java, Visual Studio .NET, and so on) in which to write applications that consume the data update functions as a Web service.


Samples of writing client applications in the Java and C# languages are provided in Chapter 3 (“Writing Java Client Programs”) and Chapter 4 (“Writing .NET Client Programs”).

Data Indexing API Components

When you install the Endeca software, the following components make up the Data Indexing API:

• Endeca Data Indexing Web Service. This service, which is installed as part of the Endeca Manager, runs automatically under the Endeca Manager and does not need to be configured.

• DataIndexing.wsdl file, which defines the Data Indexing API. See the next section for more details.

• This guide, which describes the API and provides instructions for writing programs to add or modify data content for the system.

The tomcat-users.xml file lets you change the default user permissions for the Endeca Data Indexing Web service. See Chapter 2 for details on changing this file.

In addition, you use these tools to set up the implementation components:

• Endeca Developer Studio allows you to create and modify the baseline and partial update pipelines that are used to perform updates.


• Endeca Web Studio lets you provision the Endeca Manager with the components that are utilized by Data Indexing applications.

WSDL File for the Data Indexing API

To create any kind of application that consumes the Data Indexing Web service, you need the Data Indexing WSDL file, which describes the API. The file is named DataIndexing.wsdl and is located as follows:

• On UNIX: $ENDECA_ROOT/lib/services

• On Windows: %ENDECA_ROOT%\lib\services
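As a minimal sketch, a build helper could resolve the WSDL location from the installation root before handing it to a stub generator. The class and method names below are illustrative and are not part of the Endeca software; only the lib/services directory and the DataIndexing.wsdl file name come from this guide.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative helper (not part of the Endeca software): builds the path
// to DataIndexing.wsdl under a given installation root.
class WsdlLocator {

    // endecaRoot is the value of ENDECA_ROOT on the current machine.
    static Path wsdlPath(String endecaRoot) {
        return Paths.get(endecaRoot, "lib", "services", "DataIndexing.wsdl");
    }

    public static void main(String[] args) {
        String root = System.getenv("ENDECA_ROOT");
        if (root == null) {
            System.err.println("ENDECA_ROOT is not set");
            return;
        }
        System.out.println(wsdlPath(root));
    }
}
```

Because Paths.get joins the segments with the platform separator, the same helper covers both the UNIX and the Windows layout listed above.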

The WSDL file specifies the value types, exceptions, and available methods of a Web service in a machine-readable form. Typically, a client developer uses a tool that parses the WSDL file and generates client-side stubs (also called proxy classes) and value types. These generated files include all the code necessary to serialize and deserialize SOAP messages, which makes the SOAP layer transparent to the client developer.

The DataIndexing.wsdl file can be used with any language that has Web services support.
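To make "the SOAP layer" concrete, the following sketch shows the general shape of a SOAP 1.1 request that a generated stub might assemble for a parameterless operation such as getSystemStatus(). It is an illustration only: the target namespace urn:example:dataindexing is a placeholder, and the actual namespace, message style, and element names are whatever DataIndexing.wsdl declares.

```java
// Sketch of the kind of SOAP 1.1 request a generated stub builds on the
// caller's behalf. The target namespace passed in main is a placeholder;
// the real one is declared in DataIndexing.wsdl.
class SoapEnvelopeSketch {

    // Standard SOAP 1.1 envelope namespace.
    static final String SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Wraps an empty operation element in a SOAP envelope and body.
    static String envelopeFor(String operation, String targetNamespace) {
        return "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_ENV_NS + "\">"
             + "<soapenv:Body>"
             + "<ns:" + operation + " xmlns:ns=\"" + targetNamespace + "\"/>"
             + "</soapenv:Body>"
             + "</soapenv:Envelope>";
    }

    public static void main(String[] args) {
        System.out.println(envelopeFor("getSystemStatus", "urn:example:dataindexing"));
    }
}
```

In practice you would not write this by hand; the generated stubs produce and parse such messages, which is why the client code only deals with ordinary method calls and value objects.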


Overview of Data Indexing Implementation Process

To use the Data Indexing API, you need to follow these steps:

1. Install the Endeca Navigation Platform, making sure that you select the “Endeca Manager and Web Studio” feature.

2. Use Developer Studio to create the project for your Endeca implementation, including baseline and partial update pipelines. When you finish, send the instance configuration to the Endeca Manager.

3. Use Web Studio to provision the system with the location and configurations of your implementation.

4. Use Web Studio to run a baseline update. This ensures that the baseline pipeline and the provisioned components have been set up successfully. Alternatively, you can wait until after Step 5 to run the baseline update with the client application.

5. Write the client-side code, using the Data Indexing classes and methods.

6. Invoke the API to queue records that will be added to the data content of the system or otherwise modified.

7. Invoke the API to start a partial update, using the queued records.

8. Invoke the API to query the system for the status of the update, including any failed records.
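Steps 6 through 8 can be sketched as follows. The PVal, Record, and DataIndexing types here are small stand-ins for the classes that a stub generator would produce from DataIndexing.wsdl; the method names follow the API reference in this guide, but the return types, the state strings "UPDATING" and "IDLE", the handle "partial_data", and the property names are assumptions made for the sake of a runnable example. The in-memory fake replaces the real Web service.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for generated classes; return types are assumptions.
class PVal {
    private final String name;
    private final String value;
    PVal(String name, String value) { this.name = name; this.value = value; }
    String getName() { return name; }
    String getValue() { return value; }
}

class Record {
    private final PVal[] values;
    Record(PVal[] values) { this.values = values; }
    PVal[] getValues() { return values; }
}

interface DataIndexing {
    void addContent(String handle, Record[] records);
    void startPartialUpdate();
    String getSystemState(); // simplified; the guide exposes state through a Status object
}

// In-memory fake so the sketch runs without an Endeca installation.
class FakeIndexingService implements DataIndexing {
    final List<Record> queued = new ArrayList<>();
    private int pollsUntilIdle = 0;

    public void addContent(String handle, Record[] records) {
        for (Record r : records) queued.add(r);
    }
    public void startPartialUpdate() { pollsUntilIdle = 2; }
    public String getSystemState() {
        return (pollsUntilIdle-- > 0) ? "UPDATING" : "IDLE";
    }
}

class PartialUpdateWorkflow {
    // Queue the records, start the update, then poll until the system is idle.
    // Returns the number of status checks that reported a non-idle state.
    static int run(DataIndexing service, String handle, Record[] records, long pauseMillis) {
        service.addContent(handle, records);
        service.startPartialUpdate();
        int polls = 0;
        while (!"IDLE".equals(service.getSystemState())) {
            polls++;
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return polls;
    }

    public static void main(String[] args) {
        Record[] records = {
            new Record(new PVal[] { new PVal("ProductId", "1001"), new PVal("Price", "19.99") })
        };
        int polls = run(new FakeIndexingService(), "partial_data", records, 10);
        System.out.println("Update finished after " + polls + " busy status checks");
    }
}
```

A real client would also inspect the system errors reported with the status once the update ends; that part is omitted here because its data shapes are described in the API reference chapter.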


Chapter 2

System Setup

This chapter describes how to set up your Endeca implementation so that you can write programs that utilize the Data Indexing API methods. It contains the following sections:

• Installing the Data Indexing API

• Starting and Stopping the Web Service

• Changing the Web Service Permissions

• Creating the Update Pipelines

• Provisioning the System

Installing the Data Indexing API

The Data Indexing API (including the Endeca Data Indexing Web Service) is automatically installed as part of the Endeca Manager package. The DataIndexing.wsdl file is installed in the $ENDECA_ROOT/lib/services directory (%ENDECA_ROOT%\lib\services on Windows).

After installation, you do not need to configure the Web service in order for it to start up.


Starting and Stopping the Web Service

When you start the Endeca Manager, it automatically starts the Data Indexing Web service. Therefore, this service will always be running whenever the Endeca Manager is running. Likewise, when you shut down the Endeca Manager, it automatically shuts down the Web service.

In other words, you cannot stop or start the Data Indexing Web service programmatically or from Web Studio.

Changing the Web Service Permissions

You set access to the Data Indexing Web service with the tomcat-users.xml file, located in the $ENDECA_CONF/conf directory (%ENDECA_CONF%\conf on Windows).

Web Service Role

The Web service uses the ewebservices role to determine which users have access to it. This role is defined in the tomcat-users.xml file as follows:

<!-- ewebservices : Controls access to webservices -->
<role rolename="ewebservices"/>

Do not change the name of this role because the Endeca Manager expects this name.

Web Service User

By default, the tomcat-users.xml file assigns the ewebservices role to the webservices user, with a password of webservices:

<!-- webservices : User is permitted to access webservices -->
<user username="webservices" password="webservices" roles="ewebservices"/>

It is highly recommended that you change the password for the sake of security.

When you create an instance of the Web service in your application, you set the username and password as properties on the object.

Creating the Update Pipelines

If you will be using your Data Indexing application to run both baseline and partial updates, you must ensure that the Endeca Developer Studio project has two pipelines: a baseline pipeline and a partial update pipeline.

Creating the Partial Update Pipeline

Developer Studio allows you to create both pipelines in the same project. If your project has only the baseline pipeline, Developer Studio will open the project with an empty Partial Pipeline Diagram. Use this diagram to create the partial update pipeline.


An example of a partial update pipeline (as shown by Developer Studio’s Partial Pipeline Diagram) is as follows:

[Figure: partial update pipeline diagram]

In the example, the pipeline components are as follows:

• LoadUpdateData is a record adapter.

• PropDimMapper is a property mapper.

• UpdateManipulator is a record manipulator.

• UpdateAdapter is an update adapter.

• Dimensions and TypeDimension are dimension adapters.

• DimensionServer is a dimension server.

The following sections provide more details on creating these components. See the “Implementing Partial Updates” section in the Endeca Advanced Features Guide for more information on creating partial update pipelines.

Creating the Record Adapter

When you create the record adapter, the General tab of the Record Adapter editor should have these settings:

• Direction – Must be Input.

• Format – Must be XML. Regardless of the format of the source records, they are transformed into an XML format by the Data Indexing API.

• URL – Enter an input URL as a path with the filename being a pattern. For example, a URL pattern of ../incoming/updates/partial_data_*.xml means that Forge will read any file in its updates directory whose name begins with “partial_data_” and has the xml suffix. Each file that matches the pattern will be read in sequence.

Note that the DataIndexing.addContent() method, as used in the client application, will use a handle that maps to the file path specified by this URL.

• Multi File – Check this box to specify that Forge can read data from more than one input file and that the input URL is to be interpreted as a pattern.

You can leave the other tabs (Sources, Record Index, and so on) in their default state.
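To check which file names a pattern such as partial_data_*.xml would select, the pattern can be approximated with a Java glob. This is only an illustration of the naming rule described above, under the assumption that the wildcard behaves like a standard glob; it says nothing about how Forge itself evaluates the URL. The class name and sample file names are made up for the example.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Approximates the record adapter's file-name pattern with a Java glob.
// Illustration only; this is not how Forge is implemented.
class UpdateFilePattern {

    private static final PathMatcher MATCHER =
        FileSystems.getDefault().getPathMatcher("glob:partial_data_*.xml");

    // Returns true if the bare file name fits the partial_data_*.xml pattern.
    static boolean matches(String fileName) {
        return MATCHER.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        String[] names = { "partial_data_001.xml", "partial_data_001.txt", "baseline_data.xml" };
        for (String name : names) {
            System.out.println(name + " -> " + matches(name));
        }
    }
}
```

A check like this is useful when choosing the names of the update files that the client application queues, because a file whose name falls outside the pattern would not be picked up by the record adapter.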

Creating the Record Manipulator

The “Implementing Partial Updates” chapter in the Endeca Advanced Features Guide has full details on how to create this component.

In particular, you must pay close attention to the UPDATE_RECORD expressions in the record manipulator of your partial update pipeline. The record keys (properties) that these expressions expect must match the keys of the records to be added or modified by a partial update operation. For example, to delete records, an expression may be looking for a record key named “Remove” with a value of 1. Any record to be deleted, therefore, must have this key with a value of 1.
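The deletion case can be sketched as follows. PVal and Record are stand-ins for the classes generated from DataIndexing.wsdl (constructor and accessor names follow the API reference in this guide). The "Remove" key with a value of 1 is the example given above; "ProductId" is a hypothetical record key, and the helper class is not part of the API.

```java
// Stand-ins for the generated PVal and Record classes.
class PVal {
    private final String name;
    private final String value;
    PVal(String name, String value) { this.name = name; this.value = value; }
    String getName() { return name; }
    String getValue() { return value; }
}

class Record {
    private final PVal[] values;
    Record(PVal[] values) { this.values = values; }
    PVal[] getValues() { return values; }
}

class DeleteRecordSketch {

    // Builds a record carrying the key that a "Remove"-based UPDATE_RECORD
    // expression looks for, plus a key identifying the record to delete.
    static Record deletionFor(String specKey, String specValue) {
        return new Record(new PVal[] {
            new PVal(specKey, specValue),
            new PVal("Remove", "1")
        });
    }

    // Client-side sanity check mirroring what the expression expects.
    static boolean isMarkedForRemoval(Record record) {
        for (PVal pval : record.getValues()) {
            if ("Remove".equals(pval.getName()) && "1".equals(pval.getValue())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Record record = deletionFor("ProductId", "1001");
        System.out.println("Marked for removal: " + isMarkedForRemoval(record));
    }
}
```

The key name and value must be adjusted to whatever your own record manipulator expressions test for; a mismatch means the pipeline will not treat the record as a deletion.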


Creating the Update Adapter

The update adapter is the component that writes out partial update files that will be loaded into a running Navigation Engine. The Update Adapter editor must have at least these settings:

• Output URL (General tab) – Enter the directory to which Forge writes the partial update files and processed records. This will typically be the Dgraph input directory, such as:

../partition0/dgraph_input/updates/

• Output prefix (General tab) – Enter the filename prefix for the Forge output files. Use the same prefix as in the indexer adapter for the baseline pipeline.

• Filter unknown properties (General tab) – Set this so it matches the Filter Unknown Properties setting in the indexer adapter of the baseline pipeline.

• Record source (Sources tab) – Enter the name of the record manipulator.

• Dimension sources (Sources tab) – Enter the name of the dimension server. You need a dimension source if you are updating dimensions.

• Enable Agraph support (Agraph tab) – Set this so it matches the Agraph tab settings in the indexer adapter of the baseline update pipeline.

[Figure: an example of an update adapter for the partial update pipeline.]

Creating the Dimension Components

For information on creating dimension adapters and dimension servers for the partial update pipeline, see the “Implementing Partial Updates” chapter in the Endeca Advanced Features Guide.

Provisioning the System

Use Web Studio’s Provisioning System page to provision the Endeca Manager with the Endeca resources that will be used to perform update operations on your Endeca implementation.

Make sure that the “Incoming Directory” field of the Hosts section points to the location of the source data for baseline updates, not for partial updates. The location of the source data for your partial updates will be specified in the Data Indexing client application.

Refer to the Endeca Tools Guide for full details on provisioning your system.

Chapter 3

Writing Java Client Programs

This chapter describes how you write, compile, and build a program, using the methods in the Data Indexing API. It contains the following sections:

• Java Client Requirements

• Using the Java WSDP Software

• Using Apache Axis

• Writing the Java Client Application

Java Client Requirements

This chapter describes how to write a Java application that consumes the Endeca Data Indexing Web service. This chapter assumes that you are writing the application using one of two sets of Java Web services tools:

• Java WSDP. For starting information, see the section “Using the Java WSDP Software” on page 30.

• Apache Axis. For starting information, see the section “Using Apache Axis” on page 35.

After you have read one of these sections, you can continue with the section “Writing the Java Client Application” on page 36. That section describes a generic Java sample program that should apply to both Java Web services environments.

It is up to you to select the IDE with which you will develop the application. For example, you can use the Eclipse Platform available from the www.eclipse.org Web site.

Using the Java WSDP Software

If you will be using the Java WSDP software, download the Java Web Services Developer Pack (Java WSDP), Version 1.4 or later. This integrated toolkit allows Java developers to build and test XML applications, Web services, and Web applications with the latest Web services technologies and standards implementations.

Note: It is recommended that you create an environment variable that refers to the directory in which the Java WSDP is installed. Throughout this chapter, we will use JWSDP_HOME as the name of the variable.

Make sure the JWSDP_HOME/jwsdp-shared/bin directory has been added to the PATH variable. This allows you to run Java WSDP programs without having to specify their complete path.

Creating a Client Configuration File

You must create an XML configuration file that will be the primary source of information for the wscompile tool. You can name the file as you wish, so long as it has the .xml extension.

Note that the JWSDP_HOME/jwsdp-shared/bin directory contains a client-config.xml template file.

The following example of a client configuration file (named config.xml) will work with the Data Indexing API WSDL file:

    <?xml version="1.0" encoding="UTF-8" ?>
    <configuration xmlns="http://java.sun.com/xml/ns/jax-rpc/ri/config">
        <wsdl location="DataIndexing.wsdl" packageName="endeca"/>
    </configuration>

The configuration file begins with the standard XML prolog, using UTF-8 for the encoding.

All the elements must be within the main configuration element. The meanings of the configuration subelements are as follows:

Element/Attribute    Purpose
xmlns                Sets the namespace of the configuration file. You should use the java.sun.com namespace that is shown in the example.
wsdl                 Defines the attributes of the WSDL file.
location             The location of the DataIndexing.wsdl file. You can specify a path or a URL.
packageName          The name of the package that will contain the generated stub classes.

Other configuration elements are described in the JWSDP documentation in the JWSDP_HOME/jaxrpc/docs directory. However, the above elements are all you will need to create a Data Indexing Services client.

Generating Client Stubs with the wscompile Tool

The wscompile tool can generate stubs, ties, serializers, and other files used in JAX-RPC clients and services. For the Data Indexing API, the tool uses the above configuration file as input and generates .class and .java files based on the WSDL definitions.

The wscompile tool is named wscompile.sh on UNIX and wscompile.bat on Windows, and is located in the JWSDP_HOME/jaxrpc/bin directory.

You can display a usage list with the -help option:

    wscompile -help

Of the various options, the two that you will use are the following:

Option         Purpose
-gen:client    Generates client artifacts, such as stubs.
-keep          Keeps the source files (.java files) for the stubs and class files. This is necessary because you will modify one of the generated source files. If you do not use this option, only .class files will be produced.

The following is an example of using this tool with the sample config.xml file as input:

    wscompile -gen:client -keep config.xml

The tool will create a directory with the name that was specified with the packageName attribute in the client configuration file. The directory will contain a number of files, including the following:

• The Stub class, named DataIndexing_Stub.class, representing the Web service proxy and implementing the DataIndexing interface.

• The Service implementation class, named DataIndexingService_Impl.class.

• SOAP serializers and deserializers for every method in the interface.

The next step is to modify the Stub generated source file.

Modifying the Stub Source File

Because multirefs are turned off in Axis (which is shipped with the product), you will need to manually edit the DataIndexing_Stub.java file to fix a problem that will occur if an exception is thrown.

To modify the Stub source file:

1. Use an editor to open the DataIndexing_Stub.java file.

2. Search for the _readBodyFaultElement method.

3. Add the following call before the switch statement:

    deserializationContext.pushEncodingStyle("http://schemas.xmlsoap.org/soap/encoding/");

The resulting code should look like this:

    Object faultInfo = null;
    int opcode = state.getRequest().getOperationCode();
    deserializationContext.pushEncodingStyle(
        "http://schemas.xmlsoap.org/soap/encoding/");
    switch (opcode) {
    ...

4. Add the following call to the default block of the switch statement:

    deserializationContext.popEncodingStyle();

The resulting code should look like this:

    default:
        deserializationContext.popEncodingStyle();
        return super._readBodyFaultElement(bodyReader,
            deserializationContext, state);
    ...

5. Recompile the DataIndexing_Stub.java file.

It is important to keep in mind that the above procedure modifies a generated file. This means that if you run wscompile again, it will overwrite the Stub source file unless you first move it elsewhere. If the file is overwritten, you will need to repeat the procedure.

Using Apache Axis

To create a client using Axis, you must download and install the Apache Axis Java distribution. Make sure that you put the JAR files into your classpath. If you choose to use ant, install the axis-ant.jar appropriately by placing it in <ant_home>/lib.

Generating Client Stubs with WSDL2Java

The Axis WSDL2Java tool can generate stubs based on the Data Indexing WSDL file as input, as shown in the following example of running the tool:

    java org.apache.axis.wsdl.WSDL2Java -p endeca DataIndexing.wsdl

The -p option maps all namespaces in the WSDL file to the same Java package name.

The tool will create a directory with the name that was specified with the -p option.

Generating Client Stubs with an Ant Task

You can also use the axis-wsdl2java Ant task to create Java classes from the Data Indexing WSDL file. To do so, you must build an Ant script that defines this task. Refer to the Axis and Ant documentation for details.

Writing the Java Client Application

After you have generated the client-side class files, you can write the client application, using the Data Indexing API classes and methods described in Chapter 5, “Endeca Data Indexing API Reference”.

Sample Java Application Program

The complete source code for the sample program used in this chapter is in Appendix A of this guide. The program adds new records to the system and then calls the Endeca Manager to start a partial update.

It is assumed that the Endeca Manager is running and has been provisioned with Web Studio. After the update is begun, the application checks the system status and displays a message when the update finishes.

Invoking the Data Indexing Web Service

Your application must connect to the Data Indexing Web service. The sample program instantiates a Web service object (DataIndexing object) by calling the initService private method:

    DataIndexing server = initService();

The initService method is defined as follows:

    private static DataIndexing initService() throws Exception
    {
        DataIndexingService_Impl locator = new DataIndexingService_Impl();
        DataIndexing server = locator.getDataIndexing();

        //set the address of our service
        ((Stub)server)._setProperty(Stub.ENDPOINT_ADDRESS_PROPERTY,
            "http://localhost:8888/services/DataIndexing");
        //set the user name and password for our service
        //uses tomcat container authentication
        ((Stub)server)._setProperty(Stub.USERNAME_PROPERTY, "webservices");
        ((Stub)server)._setProperty(Stub.PASSWORD_PROPERTY, "K07YZ17MP1945Q");

        return server;
    }

The javax.xml.rpc.Stub interface provides a property mechanism for the dynamic configuration of a stub instance for authentication purposes. The properties named by the static constants Stub.USERNAME_PROPERTY and Stub.PASSWORD_PROPERTY are set to the Web service’s username and password that are defined in the tomcat-users.xml file.

The location of the Web service is set with the Stub.ENDPOINT_ADDRESS_PROPERTY constant. The location begins with the machine name (which can be localhost for a local machine) and port on which the Endeca Manager is running (default port is 8888), plus the /services/DataIndexing directory.
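As a small illustration of the endpoint format described in this section, the following sketch assembles the service address from a host name and a port. The class and method names (EndpointBuilder, buildEndpoint) are hypothetical and are not part of the Data Indexing API; only the /services/DataIndexing path and the default port of 8888 are taken from this guide.

```java
import java.net.URI;

// Hypothetical helper, not part of the generated stubs or the Data Indexing API.
class EndpointBuilder {
    // Default Endeca Manager port and service path, as stated in this guide.
    static final int DEFAULT_PORT = 8888;
    static final String SERVICE_PATH = "/services/DataIndexing";

    // Builds the address string that would be passed to
    // Stub.ENDPOINT_ADDRESS_PROPERTY. URI.create() rejects malformed input
    // with an IllegalArgumentException.
    static String buildEndpoint(String host, int port) {
        return URI.create("http://" + host + ":" + port + SERVICE_PATH).toString();
    }

    public static void main(String[] args) {
        System.out.println(buildEndpoint("localhost", DEFAULT_PORT));
    }
}
```

Keeping the host and port in one place like this makes it easier to point the same client at a different Endeca Manager without editing the initService method.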

Note that the locator class may be different depending on which Web services technology you are using. The above sample client uses the DataIndexingService_Impl locator class that was produced by the Java WSDP wscompile tool. In contrast, an Axis client might use the DataIndexingServiceLocator class produced by the Axis WSDL2Java tool.

Location of the Source Records

The directory where the source data resides is specified by a private String constant:

    private static final String c_strPartialUpdateDataDir =
        "C:\\Projects\\partial_updates_data\\";

The name of the file that contains the source records, prepended with its directory path, is also specified by a private String constant:

    private static final String c_strAddDataInput =
        c_strPartialUpdateDataDir+"mexico.txt";

The c_strAddDataInput value will then be passed to the addContentHelper private method to read in the file.

Format of the Source Records

The sample program expects a Delimited format for the source records to be added. The records are in a text file named mexico.txt (which contains information about airports in Mexico).

The file begins with a header row of property names (i.e., keys), with each property being delimited by the pipe (|) character:

AirportCode|CityOrAirportName|Country|...|

Each source record is delimited by pipe/sharp (|#) characters, with its property values delimited by the pipe character:

|#ACA|ACAPULCO ALVRZ INTL|Mexico|...|
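The delimited layout described above can be parsed with a few lines of Java. The following is only a sketch, not the parsing code of the sample program (which is listed in Appendix A): the class name DelimitedRecordParser and its methods are hypothetical, and the second sample record is illustrative data. It assumes that the sequence |# never occurs inside a property value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for the pipe-delimited format; not the Appendix A code.
class DelimitedRecordParser {

    // Splits the content on "|#". The first chunk is the header row of keys,
    // and every later chunk is one record.
    static List<Map<String, String>> parse(String content) {
        String[] chunks = content.split("\\|#");
        String[] keys = splitFields(chunks[0]);
        List<Map<String, String>> records = new ArrayList<>();
        for (int i = 1; i < chunks.length; i++) {
            if (chunks[i].trim().isEmpty()) {
                continue; // ignore empty chunks
            }
            String[] vals = splitFields(chunks[i]);
            Map<String, String> rec = new LinkedHashMap<>();
            for (int j = 0; j < keys.length && j < vals.length; j++) {
                rec.put(keys[j], vals[j]);
            }
            records.add(rec);
        }
        return records;
    }

    // Drops the trailing pipe of a row and splits the rest on the pipe
    // character. The -1 limit keeps empty values in place.
    static String[] splitFields(String chunk) {
        String s = chunk.trim();
        if (s.endsWith("|")) {
            s = s.substring(0, s.length() - 1);
        }
        return s.split("\\|", -1);
    }

    public static void main(String[] args) {
        // Illustrative input with a three-property header and two records.
        String content = "AirportCode|CityOrAirportName|Country|\n"
                + "|#ACA|ACAPULCO ALVRZ INTL|Mexico|\n"
                + "|#ISJ|ISLA MUJERES|Mexico|";
        for (Map<String, String> rec : parse(content)) {
            System.out.println(rec);
        }
    }
}
```

Each parsed record is an ordered map from property name to value, which corresponds to the strKeys and strVals arrays that the sample program uses when it builds PVal objects.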

Keep in mind that the source record format must conform to what is expected by the UPDATE_RECORD expressions in the partial update pipeline’s record manipulator. In our sample project, the record manipulator:

• Deletes records that have a Remove key with a value of 1.

• Updates records that have an Update key with a value of 1.

• Adds records that do not have Remove or Update keys.
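A client can mirror these three rules to check its source records before queueing them. The sketch below is an assumption-laden illustration: the class UpdateActionSketch and its classify method are hypothetical, the actual decision is made by the UPDATE_RECORD expressions in Forge, and the guide does not say what happens when a Remove or Update key has a value other than 1 (the sketch treats such a record as an add).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical client-side mirror of the sample record manipulator's rules.
class UpdateActionSketch {
    enum Action { DELETE, UPDATE, ADD }

    static Action classify(Map<String, String> record) {
        if ("1".equals(record.get("Remove"))) {
            return Action.DELETE;   // Remove key with a value of 1
        }
        if ("1".equals(record.get("Update"))) {
            return Action.UPDATE;   // Update key with a value of 1
        }
        // Assumption: anything else is handled as a new record.
        return Action.ADD;
    }

    public static void main(String[] args) {
        Map<String, String> rec = new HashMap<>();
        rec.put("AirportCode", "ACA");
        rec.put("Remove", "1");
        System.out.println(classify(rec));
    }
}
```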

The addContentHelper private method reads in and parses the content of the source record text file. The method uses a java.io.BufferedReader wrapped around a java.io.FileReader to read in the stream of characters from the file:

    BufferedReader in = new BufferedReader(new FileReader(strInputfile));
    String line;
    int iCtr = 0;
    String[] strKeys=null;
    while ((line = in.readLine()) != null) {
    ...

The BufferedReader.readLine() method actually reads in the file content as one line, into the line String variable. This variable will then be parsed for the property values for each record.

Note on Formats of Input and Output Record Files

You should keep in mind that there is a difference between the format of a client program’s input file (which contains the source data) and the resulting output file produced by the Data Indexing API.

The input file in our sample client is a simple text file (.txt extension) that contains delimited records. That is, delimiter characters separate each record from the next and also separate the property keys and values. However, you can use other input formats, such as JDBC records or XML files. In all cases, it is up to your application to read and parse the source records.

When the records are transferred via the addContent() method, the resulting output file will always be in an XML format, regardless of the format of the input file. You do not have to worry about this XML transformation because it is done automatically by the Data Indexing API. The resulting XML output is also why you must specify XML as the format of the record adapter in the partial update pipeline, as explained in “Creating the Update Adapter” on page 25.

Creating Records

The heart of the application is the creation of the Record objects that will be added to the system by the partial update process. Each Record object will consist of an array of multiple PVal objects. In turn, each PVal object will consist of a property name (such as “AirportCode”) and its corresponding value (such as “ACA”).

Each Record object is created as follows:

    PVal[] keyvals = new PVal[strKeys.length];
    for (int j=0; j<strKeys.length; j++) {
        keyvals[j] = new PVal();
        keyvals[j].setName(strKeys[j]);
        keyvals[j].setValue(strVals[j]);
    }
    ...
    Record r = new Record();
    r.setValues(keyvals);
    Record[] recs = {r};

The strKeys variable is a String array that contains the names of the properties (keys), while strVals is a String array that contains the property values.

The record-building procedure in the sample program is as follows:

1. Create an array (named keyvals) of PVal objects:

PVal[] keyvals = new PVal[strKeys.length];

The size of the array is the number of elements in the strKeys variable (that is, the number of properties).

2. Start a For loop that will execute once for each property:

for (int j=0; j<strKeys.length; j++)

3. Construct an empty PVal object:

keyvals[j] = new PVal();

4. Use the PVal.setName() method to set the name of the property (such as “CityOrAirportName”) in the PVal object:

keyvals[j].setName(strKeys[j]);

and the PVal.setValue() method to set the value of that property (such as “ISLA MUJERES”):

keyvals[j].setValue(strVals[j]);

5. Continue the For loop to construct the remaining PVal objects.

6. After the For loop finishes, construct an empty Record object (named r):

Record r = new Record();

7. Use the Record.setValues() method to set the array of PVal objects in the Record object:

r.setValues(keyvals);

8. At this point, you have a populated Record object. However, you cannot queue an individual record; you must send an array of records. Therefore, you build the array of records (named recs) as follows:

Record[] recs = {r};

Note that the above sample queues a Record array consisting of only one record. The reason is to concentrate on showing how to build Record and PVal objects. However, a more efficient way is illustrated by the Client2.java program (see “Appendix A”), which inserts all the records into the recs array.
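The batching idea can be sketched as follows. This is not the Client2.java code from Appendix A. To keep the example self-contained, PVal and Record are minimal stand-ins for the generated stub classes; their getter methods are assumptions, and BatchRecordBuilder is a hypothetical helper. The sketch also assumes that every row has at least as many values as there are keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal stand-ins for the generated stub classes (getters are assumed).
class PVal {
    private String name;
    private String value;
    void setName(String n) { name = n; }
    void setValue(String v) { value = v; }
    String getName() { return name; }
    String getValue() { return value; }
}

class Record {
    private PVal[] values;
    void setValues(PVal[] v) { values = v; }
    PVal[] getValues() { return values; }
}

// Hypothetical helper that builds one Record per row and returns them
// together, so that a single addContent() call could queue all of them.
class BatchRecordBuilder {
    static Record[] buildRecords(String[] keys, List<String[]> rows) {
        List<Record> out = new ArrayList<>();
        for (String[] vals : rows) {
            PVal[] keyvals = new PVal[keys.length];
            for (int j = 0; j < keys.length; j++) {
                keyvals[j] = new PVal();
                keyvals[j].setName(keys[j]);
                keyvals[j].setValue(vals[j]);
            }
            Record r = new Record();
            r.setValues(keyvals);
            out.add(r);
        }
        return out.toArray(new Record[0]);
    }

    public static void main(String[] args) {
        String[] keys = {"AirportCode", "CityOrAirportName"};
        List<String[]> rows = Arrays.asList(
                new String[] {"ACA", "ACAPULCO ALVRZ INTL"},
                new String[] {"ISJ", "ISLA MUJERES"});
        System.out.println(buildRecords(keys, rows).length + " records built");
    }
}
```

With the real stub classes, the returned array would be passed to the addContent() method in place of the one-element recs array.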

Queueing the Records

You use the DataIndexing.addContent() method to queue the source records to the Endeca Manager for an update, as in the example from the sample program:

    server.addContent(strHandler, recs);

The method takes two parameters:

• An output file handle (strHandler in the example).

• An array of Record objects (recs in the example). The previous section describes how to create this array.

The file handle value must map to the file path that is specified in the URL field in the record adapter of the partial pipeline. For example, the record adapter in the sample project used by this application has a URL field setting that includes the pattern partial_data_*.xml.

The partial_data_*.xml part of the file path means that any file handle that begins with partial_data_ will map to this record adapter. The sample application defines these three file handles:

    private static final String c_strAddHandler = "partial_data_add_data";
    private static final String c_strDelHandler = "partial_data_del_data";
    private static final String c_strModHandler = "partial_data_mod_data";

All three will map to the partial_data_*.xml URL path of the record adapter.
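If you want to verify in your client that a file handle will map to the record adapter, you can test it against the adapter's file pattern. The following sketch uses the standard java.nio glob matcher. It assumes that the file written for a handle is named with the handle plus the .xml extension, which this guide implies but does not state; HandlePatternCheck and mapsToAdapter are hypothetical names.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical check of a file handle against a record adapter file pattern.
class HandlePatternCheck {
    static boolean mapsToAdapter(String handle, String globPattern) {
        PathMatcher matcher =
                FileSystems.getDefault().getPathMatcher("glob:" + globPattern);
        // Assumption: the output file name is the handle plus ".xml".
        return matcher.matches(Paths.get(handle + ".xml"));
    }

    public static void main(String[] args) {
        String pattern = "partial_data_*.xml";
        System.out.println(mapsToAdapter("partial_data_add_data", pattern));
        System.out.println(mapsToAdapter("baseline_add_data", pattern));
    }
}
```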

Starting the Update

Use the appropriate method to start the update:

• DataIndexing.startBaselineUpdate() will start a baseline update.

• DataIndexing.startPartialUpdate() will start a partial update.

Neither method takes any arguments.

To start the update, the sample program calls a private method (named doUpdate), which starts a baseline or partial update according to its string argument:

    doUpdate(server, "baseline");
    ...
    private static void doUpdate(DataIndexing server, String strUpdate) throws Exception
    {
        System.err.println("Starting "+strUpdate+" update");
        if (strUpdate.equals("baseline"))
            server.startBaselineUpdate();
        else
            server.startPartialUpdate();
    }

After a baseline update begins, it can be stopped with the DataIndexing.stopBaselineUpdate() method. Partial updates, however, have no corresponding stop method.

Monitoring the Update Progress

The following methods allow you to monitor the progress of the update and to detect system errors:

• DataIndexing.getStatus() retrieves the system status (as a Status object) from the Endeca Manager. The returned information includes the status of the update operation and any error messages relating to data that is being updated in the system.

• Status.getSystemState() returns a string that represents the state of the system (that is, what the system is currently doing):

− “UPDATING” means a baseline or partial update is in progress.

− “IDLE” means that the system is not performing an update operation. The system, however, may be performing other types of operations, such as searches.

− “SYSTEM_ERROR” means the system is in an error state caused by an update operation.

• Status.getSystemErrors() returns an array of SystemError objects.

• SystemError class methods, such as getErrorMsg(), allow you to get the information in a SystemError object. This information includes the name of the Endeca component that reported the error (such as “FORGE”), the error message, the identifier of the record that was in error, and the severity level of the error (which is “ERROR”, “WARNING”, or “FATAL”).

Keep in mind that a system status of “IDLE” does not mean that all records were successfully updated. That is, SystemError objects may be returned even if the system status is “IDLE” and not “SYSTEM_ERROR”. This happens when an overall error does not occur, but some records failed. For example, if a partial update operation successfully adds 48 out of 50 records but fails to add two records, the Status.getSystemState() method would return a status of “IDLE” but Status.getSystemErrors() would return two SystemError objects.
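The distinction can be captured in a small decision function that looks at both the system state and the number of SystemError objects. This is a sketch only: UpdateOutcomeSketch, its Outcome values, and the evaluate method are hypothetical names, and the three state strings are the ones listed in this section.

```java
// Hypothetical helper that combines the system state with the error count.
class UpdateOutcomeSketch {
    enum Outcome { STILL_RUNNING, FAILED, COMPLETED_WITH_RECORD_ERRORS, SUCCESS }

    static Outcome evaluate(String systemState, int errorCount) {
        if ("UPDATING".equals(systemState)) {
            return Outcome.STILL_RUNNING;
        }
        if ("SYSTEM_ERROR".equals(systemState)) {
            return Outcome.FAILED;
        }
        if ("IDLE".equals(systemState)) {
            // IDLE alone is not enough: individual records may have failed.
            return errorCount == 0
                    ? Outcome.SUCCESS
                    : Outcome.COMPLETED_WITH_RECORD_ERRORS;
        }
        throw new IllegalArgumentException("Unknown system state: " + systemState);
    }

    public static void main(String[] args) {
        // The 48-out-of-50 example from the text: IDLE with two errors.
        System.out.println(evaluate("IDLE", 2));
    }
}
```

In a real client, the two arguments would come from Status.getSystemState() and the length of the array returned by Status.getSystemErrors().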

The sample program uses a while statement to monitor the update:

    while (true) {
        System.err.println("Checking status...");
        status = server.getStatus();
        if (!"UPDATING".equals(status.getSystemState()))
            break;
        System.err.println("Status is "+status.getSystemState());
        Thread.sleep(3000);
    }

The two important monitoring actions of the while statement are:

1. The system status is retrieved with the DataIndexing.getStatus() method.

2. The system state is retrieved (as a string) with the Status.getSystemState() method and compared to the string “UPDATING”. If the two strings are equal, the loop continues and the system state is displayed; if they are not equal, the control flow breaks to the next statement.

The while statement breaks when the system state is either “IDLE” (which signifies the completion of a successful update) or “SYSTEM_ERROR” (which means the update was unsuccessful).
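The sample loop runs until the state changes, however long that takes. A variation that gives up after a time limit is sketched below. The UpdatePoller class and its waitForCompletion method are hypothetical, and the state is supplied through a java.util.function.Supplier so that the example runs without the generated stubs. A real client would supply the result of server.getStatus().getSystemState() and handle any exception thrown by that call inside the supplier.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

// Hypothetical polling helper with a time limit; not part of the sample program.
class UpdatePoller {
    // Polls until the state is no longer "UPDATING" and returns the final state.
    // Throws IllegalStateException if the time limit passes first.
    static String waitForCompletion(Supplier<String> stateSource,
                                    long pollMillis, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        String state = stateSource.get();
        while ("UPDATING".equals(state)) {
            if (System.currentTimeMillis() >= deadline) {
                throw new IllegalStateException(
                        "Update still running after " + timeoutMillis + " ms");
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("Interrupted while polling", e);
            }
            state = stateSource.get();
        }
        return state;
    }

    public static void main(String[] args) {
        // Simulated sequence of states standing in for calls to getStatus().
        Iterator<String> states =
                Arrays.asList("UPDATING", "UPDATING", "IDLE").iterator();
        System.out.println(waitForCompletion(states::next, 10, 5000));
    }
}
```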

However, because a system status of “IDLE” does not mean that all records were successfully updated, the program first prints out the update status and then prints out the contents of SystemError objects, if any exist:

    System.err.println("Update status: " + status.getSystemState());
    SystemError[] errors = status.getSystemErrors();
    for (int i=0; i<errors.length; i++) {
        System.err.println("Error msg: "+errors[i].getErrorMsg()
            +", component: "+errors[i].getComponent()
            +", record spec: "+errors[i].getRecordSpec()
            +", severity: "+errors[i].getSeverity());
    }

If an error did occur, the Status.getSystemErrors() method will get the complete error state and the SystemError methods will get the error details, such as the component that reported the error. If there are no SystemError objects, the for loop will not be executed.

By using these methods, you can quickly ascertain what caused the update to fail.

Catching Data Indexing Exceptions

An exception represents a more serious problem than the SYSTEM_ERROR status or SystemError objects. Therefore, the Data Indexing API provides four Data Indexing exception classes:

• DIException represents all exceptions thrown by Data Indexing related classes. This class extends the Java Exception class (java.lang.Exception). Note that the DISystemException and DIInvalidOperation exceptions inherit from DIException.

• DIInvalidParameter represents exceptions thrown because of parameters that were not valid for a Data Indexing method.

• DIInvalidOperation represents exceptions thrown from operations that were not valid for the state the system was in. For example, calling the addContent() method when a partial update is in progress will throw this exception. Typically, users can programmatically recover from this exception (by waiting for the update to finish, for example).

• DISystemException represents exceptions that are more serious than DIInvalidOperation exceptions. Typically, you cannot recover programmatically from this exception, but instead must use some manual intervention, such as reprovisioning the system.

Each class has a getMessage() method that retrieves a String object that describes the exception.

See Chapter 5, “Endeca Data Indexing API Reference”, to find out which exceptions are thrown by the Data Indexing API methods.

The sample program uses a try block in its main procedure and four catch clauses for the exceptions:

    try {
        DataIndexing server = initService();
        ...
        doClearContent(server);
    }
    catch (DIInvalidOperation de) {
        System.err.println("DIInvalidOperation caught.");
        System.err.println(de.getMessage());
    }
    catch (DIInvalidParameter de) {
        System.err.println("DIInvalidParameter caught.");
        System.err.println(de.getMessage());
    }
    catch (DISystemException de) {
        System.err.println("DISystemException caught.");
        System.err.println(de.getMessage());
    }
    catch (Exception e) {
        System.err.println("Exception caught.");
        System.err.println(e.getMessage());
    }

When the JVM confronts this sequence of multiple catch clauses, it searches for the appropriate clause from the sequence's top to its bottom. If it finds a match, that clause executes.

Note that the final catch clause is for a standard Java Exception object. This clause should catch exceptions that are not Data Indexing specific exceptions.
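Because the clauses are searched from top to bottom, a clause for a subclass must come before a clause for its superclass. The sketch below demonstrates the ordering with stand-in exception classes that reproduce only the inheritance described in this chapter (DIInvalidOperation and DISystemException extend DIException). The stand-ins and the CatchOrderSketch class are assumptions made for illustration and are not the generated API classes.

```java
// Hypothetical demonstration of catch clause ordering with stand-in classes.
class CatchOrderSketch {
    static class DIException extends Exception {
        DIException(String msg) { super(msg); }
    }
    static class DIInvalidOperation extends DIException {
        DIInvalidOperation(String msg) { super(msg); }
    }
    static class DISystemException extends DIException {
        DISystemException(String msg) { super(msg); }
    }

    // Returns a label that shows which catch clause handled the exception.
    static String handle(Exception thrown) {
        try {
            throw thrown;
        } catch (DIInvalidOperation e) {        // most specific clauses first
            return "recoverable: " + e.getMessage();
        } catch (DISystemException e) {
            return "manual intervention: " + e.getMessage();
        } catch (DIException e) {               // any other Data Indexing exception
            return "data indexing: " + e.getMessage();
        } catch (Exception e) {                 // everything else
            return "other: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(new DIInvalidOperation("update in progress")));
        System.out.println(handle(new RuntimeException("network down")));
    }
}
```

If the DIException clause were placed first, the compiler would reject the later subclass clauses as unreachable.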

Although the sample program just prints out the exception’s message, you can code your application to examine the exception and attempt to programmatically recover from the error. As mentioned above, you might be able to recover from a DIInvalidOperation exception, such as attempting to start an update while one is already in progress.
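One possible recovery strategy is to wait and retry the operation a limited number of times. The following sketch shows the idea. The DIInvalidOperation class here is a stand-in for the generated exception class, and RetrySketch, Operation, and runWithRetry are hypothetical names; the number of attempts and the wait time are arbitrary choices.

```java
// Hypothetical retry helper; DIInvalidOperation is a stand-in class.
class RetrySketch {
    static class DIInvalidOperation extends Exception {
        DIInvalidOperation(String msg) { super(msg); }
    }

    // The operation to attempt, for example a call to addContent().
    interface Operation {
        void run() throws DIInvalidOperation;
    }

    // Runs the operation until it succeeds or maxAttempts is reached.
    // Returns the number of attempts used, or -1 if every attempt failed.
    static int runWithRetry(Operation op, int maxAttempts, long waitMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                op.run();
                return attempt;
            } catch (DIInvalidOperation e) {
                if (attempt == maxAttempts) {
                    break; // no point in waiting after the last attempt
                }
                try {
                    Thread.sleep(waitMillis); // give the update time to finish
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated operation that fails twice and then succeeds.
        int used = runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new DIInvalidOperation("update in progress");
            }
        }, 5, 10);
        System.out.println("Attempts used: " + used);
    }
}
```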

Clearing the Update Records

After an update ends successfully, it is recommended that your application clear the records that were previously enqueued by DataIndexing.addContent() methods.

Use the DataIndexing.clearContent() method to clear the records. The record files are deleted locally (in the Endeca Manager's update directory) as well as in the Forge directory.

The sample application clears the records as follows:

    server.clearContent(new String[] {c_strAddHandler,
        c_strDelHandler, c_strModHandler});

Because the DataIndexing.clearContent() method takes a String array as an argument, you can clear multiple file handles, even those that were not used. The sample program, for example, clears all three file handles, even though only the c_strAddHandler was used.

Chapter 4

Writing .NET Client Programs

This chapter describes how you write, compile, and build a program, using the C# language with the methods in the Data Indexing API. It contains the following sections:

• .NET Client Requirements

• Creating the DataIndexingService Library

• Writing the .NET Client Application

.NET Client Requirements

This chapter describes how to write a .NET application that consumes the Endeca Data Indexing Web service.

The chapter assumes that you will be writing your .NET application in C# using Microsoft’s Visual Studio .NET development tool. You must also install the Microsoft .NET Framework and the Framework SDK.

Creating the DataIndexingService Library

Your first task is to create a DataIndexingService.dll file that you will reference in your application project. The creation of this DLL file involves two steps:

1. Produce a client proxy stub class.

2. Compile the class into the DLL file.

Producing the Client Stub Class

The Microsoft .NET Framework SDK includes the Web Services Description Language Tool (wsdl.exe) that can generate code for XML Web service clients from WSDL contract files (.wsdl files).

You can display a usage list with the /? option:

wsdl /?

The tool’s /language option (abbreviated as /l) lets you specify the language of the generated client stub:

Language Option    Purpose
/l:CS              Generates a C# client stub. This is the default.
/l:VB              Generates a Visual Basic client stub.
/l:JS              Generates a JScript client stub.

The following is an example of using this tool with the DataIndexing.wsdl file as input:

    C:\NetClient> wsdl /language:CS DataIndexing.wsdl
    Microsoft (R) Web Services Description Language Utility
    [Microsoft (R) .NET Framework, Version 1.0.3705.0]
    Copyright (C) Microsoft Corporation 1998-2001. All rights reserved.

    Writing file 'C:\NetClient\DataIndexingService.cs'.
    C:\NetClient>

The above example will produce a client stub proxy class named DataIndexingService.cs in the C# language. The next step is to compile the class to produce a library.

Building the DataIndexingService Library

Use the Microsoft C# compiler (csc.exe) to build the DataIndexingService DLL library from the C# source code in the DataIndexingService.cs file.

The compiler’s /target:library option (abbreviated as /t:library) builds a DLL library file. You can use the /out option to specify the exact location where the DLL should be stored.

The following is an example of using the compiler with the DataIndexingService.cs file as input:

    C:\NetClient>csc /t:library DataIndexingService.cs
    Microsoft (R) Visual C# .NET Compiler version 7.00.9951
    for Microsoft (R) .NET Framework version 1.0.3705
    Copyright (C) Microsoft Corporation 2001. All rights reserved.

    C:\NetClient>

The example will generate the DataIndexingService.dll library in the current directory. You can then include this library in your Data Indexing API client application.

Writing the .NET Client Application

The application example in these sections assumes that you are using the Visual C# .NET development tool to write and build your client application. Because this Visual C# project is fairly simple, it uses the Console Application template.

Adding Reference Libraries

Besides the DataIndexingService.dll library, you will typically need to add some extra libraries to your project in order for it to compile.

To add the libraries to your .NET project:

1. Use Visual C# .NET to open your project.

2. From the Project menu, select Add Reference. This command opens the Add Reference dialog where you can add library references to your project.

3. From the .NET tab of the Add Reference dialog, select the System.Web.dll, System.Web.Services.dll, and System.Web.RegularExpressions.dll, and click OK.

4. Use the Add Reference dialog to add the DataIndexingService.dll library to the project.

Sample .NET Client Application

The C# code for the sample application is as follows.

using System;

namespace ConsoleApplication1
{
    /// <summary>
    /// Class to start a baseline update.
    /// </summary>
    class Class1
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main(string[] args)
        {
            DataIndexingService di=new DataIndexingService();
            di.Url="http://localhost:8888/services/DataIndexing";
            di.PreAuthenticate=true;
            di.Credentials=new System.Net.NetworkCredential("webservices",
                "K07YZ17");

            try
            {
                di.startBaselineUpdate();
                Status s=di.getStatus();
                System.Console.WriteLine(s.systemState);
            }
            catch (System.Web.Services.Protocols.SoapException e)
            {
                System.Console.WriteLine(e.Detail.InnerXml);
            }
            System.Console.ReadLine();
        }
    }
}


Connecting to the Data Indexing Web Service

Your application must connect to the Data Indexing Web service using code similar to the following example:

    DataIndexingService di = new DataIndexingService();
    di.Url = "http://localhost:8888/services/DataIndexing";
    di.PreAuthenticate = true;
    di.Credentials = new System.Net.NetworkCredential("webservices", "K07YZ17");
    ...

In the DataIndexingService.dll library, the DataIndexingService class is a descendant of the Microsoft .NET Framework System.Web.Services.Protocols.WebClientProtocol class. This means that the DataIndexingService class inherits the WebClientProtocol class members.

Three WebClientProtocol properties are set in the Web service instantiation for identification and authentication purposes:

• The Url property sets the base URL of the XML Web service the client is requesting. The base URL begins with the machine name (which can be localhost for a local machine) and port on which the Endeca Manager is running (default port is 8888), plus the /services/DataIndexing directory.

• The PreAuthenticate property is set to true to enable pre-authentication, which means that the client must be authenticated (and subsequently authorized) in order to access the XML Web service. If the client cannot be authenticated (for example, if the password for the Web service is incorrect), no service methods can be executed.

• The Credentials property sets the security credentials for Web service client authentication. Because the Endeca Manager uses a password-based authentication scheme, the credentials are set with an instantiation of the System.Net.NetworkCredential class. The username and password in the credentials must match those set in the tomcat-users.xml file for the Data Indexing Web service user.

For details on these properties, see the Microsoft .NET Framework documentation.
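The way the base URL is composed does not depend on the client language. As a minimal sketch in Java (the language of the other sample client in this guide), the URL can be assembled from the machine name and port as shown below. The class name EndpointUrlSketch and the endpoint() helper are hypothetical and are not part of the Data Indexing API.

```java
import java.net.URI;

// Illustrative helper only; EndpointUrlSketch and endpoint() are hypothetical names.
class EndpointUrlSketch {
    // Default Endeca Manager port and service path, as stated in this guide.
    static final int DEFAULT_PORT = 8888;
    static final String SERVICE_PATH = "/services/DataIndexing";

    // Builds the base URL from the machine name and port.
    // URI.create throws IllegalArgumentException if the result is not a valid URI.
    static String endpoint(String host, int port) {
        return URI.create("http://" + host + ":" + port + SERVICE_PATH).toString();
    }

    public static void main(String[] args) {
        System.out.println(endpoint("localhost", DEFAULT_PORT));
    }
}
```

Keeping the port and path in one place makes it easier to point the same client at an Endeca Manager that runs on a different machine or a non-default port.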

Starting the Baseline Update

The sample program starts a baseline update. Unlike the partial update started by the Java sample program, you do not need to enqueue records for the update. Instead, the Endeca Manager will use the source data from the incoming directory that you provisioned in Web Studio.

The sample program starts the update with the DataIndexingService.startBaselineUpdate() method:

    di.startBaselineUpdate();

After a baseline update begins, it can be stopped with the DataIndexingService.stopBaselineUpdate() method.


Monitoring and System Error Methods

The following methods allow you to monitor the progress of the update and to detect system errors:

• DataIndexingService.getStatus() retrieves the system status (as a Status object) from the Endeca Manager.

• Status.systemState returns a string that represents the state of the system:

− “UPDATING” means a baseline or partial update is in progress.

− “IDLE” means that the system is not performing an update operation. The system, however, may be performing other types of operations.

− “SYSTEM_ERROR” means the system is in an error state caused by an update operation.

• Status.systemErrors returns an array of SystemError objects.

• Four SystemError class properties, such as SystemError.errorMsg, return the information in a SystemError object. The information includes the name of the Endeca component that reported the error, the error message, the identifier of the record that was in error, and the severity level of the error.

Note that SystemError objects may be returned even if the system status is “IDLE” and not “SYSTEM_ERROR”. This happens when an overall update error does not occur, but some records failed. For example, a partial update operation may successfully add 48 out of 50 records but fail to add two records. In this case, the status will be “IDLE” but two SystemError objects will be returned.

The sample program uses the following code to get the system status and print it to the console:

    Status s = di.getStatus();
    System.Console.WriteLine(s.systemState);

You could add a while loop that checks the system state and breaks when either “IDLE” or “SYSTEM_ERROR” is returned. If one or more SystemError objects are returned (which, as noted above, can happen even with an “IDLE” status), the Status.systemErrors property will get the complete error state and the SystemError properties will get the error details.

Catching Exceptions

In a .NET client application that is calling XML Web service methods over SOAP, use the .NET Framework System.Web.Services.Protocols.SoapException class to catch and handle exceptions. When the client accesses a method over SOAP, the exception is caught on the server and wrapped inside a new SoapException.

The SoapException object will contain a Data Indexing API exception if one was caught on the server. These exceptions (DIException, DIInvalidParameter, DIInvalidOperation, and DISystemException) are described in Chapter 5, “Endeca Data Indexing API Reference”.


The sample program uses a try block and a catch clause for the exceptions:

    try {
        di.startBaselineUpdate();
        ...
    }
    catch (System.Web.Services.Protocols.SoapException e)
    {
        System.Console.WriteLine(e.Detail.InnerXml);
    }

The SoapException.Detail property gets a .NET Framework System.Xml.XmlNode object that represents the application-specific error information detail. The XmlNode.InnerXml property gets the markup representing only the child nodes of this node.

By examining the XML fields, you can find out error information such as the name of the exception and the error message returned by the Endeca Manager. For example, if you start a partial update before a baseline update has been run, the XML looks similar to this:

    <ns1:DISystemException xsi:type="ns1:DISystemException"
        xmlns:ns1="urn:com.endeca.service.dataindexing"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <message xsi:type="xsd:string">Cannot 'start update' in state 'Needs Baseline Update'.</message>
    </ns1:DISystemException>
    <ns2:exceptionName xmlns:ns2="http://xml.apache.org/axis/">com.endeca.service.dataindexing.DISystemException</ns2:exceptionName>
    <ns3:hostname xmlns:ns3="http://xml.apache.org/axis/">doc-004</ns3:hostname>


In this example, a DISystemException has been thrown with the error message:

Cannot ‘start update’ in state ‘Needs Baseline Update’.

The host name of the machine on which the Endeca Manager is running (doc-004) is also given.
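The fields of such a detail fragment can also be read programmatically instead of being printed as raw markup. The following is a minimal, language-neutral illustration written in Java with the standard DOM parser; it is a sketch under stated assumptions, not the method used by the sample program. The class name FaultDetailSketch, the parse() helper, and the wrapping <detail> element are hypothetical. The fragment has several top-level elements, so the sketch wraps it in a single root before parsing.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only; FaultDetailSketch and its members are hypothetical names.
class FaultDetailSketch {
    static final String AXIS_NS = "http://xml.apache.org/axis/";

    // The fragment shown in this section. Attribute values are written with
    // single quotes only so that the XML fits in a Java string literal.
    static final String SAMPLE =
        "<ns1:DISystemException xsi:type='ns1:DISystemException'"
        + " xmlns:ns1='urn:com.endeca.service.dataindexing'"
        + " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>"
        + "<message xsi:type='xsd:string'>Cannot 'start update' in state"
        + " 'Needs Baseline Update'.</message></ns1:DISystemException>"
        + "<ns2:exceptionName xmlns:ns2='http://xml.apache.org/axis/'>"
        + "com.endeca.service.dataindexing.DISystemException</ns2:exceptionName>"
        + "<ns3:hostname xmlns:ns3='http://xml.apache.org/axis/'>doc-004</ns3:hostname>";

    // Returns {exceptionName, message, hostname}; an entry is null when absent.
    static String[] parse(String detailInnerXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Wrap the fragment so that the parser sees exactly one root element.
            String wrapped = "<detail>" + detailInnerXml + "</detail>";
            Document doc = builder.parse(new InputSource(new StringReader(wrapped)));
            return new String[] {
                firstText(doc.getElementsByTagNameNS(AXIS_NS, "exceptionName")),
                firstText(doc.getElementsByTagName("message")),
                firstText(doc.getElementsByTagNameNS(AXIS_NS, "hostname"))
            };
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse fault detail", e);
        }
    }

    // Text of the first matching element, or null if there is none.
    private static String firstText(NodeList nodes) {
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) {
        String[] fields = parse(SAMPLE);
        System.out.println("Exception: " + fields[0]);
        System.out.println("Message:   " + fields[1]);
        System.out.println("Host:      " + fields[2]);
    }
}
```

Matching on the namespace URI rather than on the ns2 and ns3 prefixes is deliberate, because prefixes are chosen by the server and may differ between responses.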

By using other SoapException class members, you can display more details, including a stack trace. For example:

    try {
        di.startBaselineUpdate();
        ...
    }
    catch (System.Web.Services.Protocols.SoapException e)
    {
        Console.WriteLine("SoapException occurred at " + DateTime.Now);
        Console.WriteLine("Source: " + e.Source);
        Console.WriteLine("Message: " + e.Message);
        Console.WriteLine("Code: " + e.Code);
        Console.WriteLine("Actor: " + e.Actor);
        Console.WriteLine("Detail: " + e.Detail.InnerXml);
        Console.WriteLine("StackTrace:");
        Console.WriteLine(e.StackTrace);

        if (e.InnerException != null)
        {
            Console.WriteLine("InnerException:");
            Console.WriteLine(e.InnerException.ToString());
        }
    }

See the Microsoft .NET Framework documentation for details on these properties.


Chapter 5

Endeca Data Indexing API Reference

This chapter describes the Endeca Data Indexing API objects and methods.

In general, the syntax descriptions in this chapter follow Java conventions. However, the exact syntax of a class member depends on the output of the WSDL tool that you are using. For example, a Java WSDL tool will generate a public interface named DataIndexing while the Microsoft .NET tool produces the DataIndexingService public class.

Likewise, the Java tool outputs a Status.getSystemState() method, while the Microsoft .NET tool will produce a Status.systemState property. Therefore, be sure to check the client stub classes that are generated by your WSDL tool for the exact syntax of the Data Indexing API class members.


DataIndexing Interface

The DataIndexing interface provides for adding data content to the Endeca implementation, retrieving status, and running partial and baseline updates on the data.

Methods

addContent(String handle, Record[] records)

Queues source records for a partial update operation. The records will be written out in XML format in a file specified by the handle parameter. When the update runs, the source records will be transformed into Endeca records by Forge.

Parameters:

• handle – The handle to the record adapter that will handle the data, as specified in the partial update pipeline. The value must map to the file path specified by the URL in the adapter (relative to Forge’s incoming directory).

• records – An array of one or more records (Record objects) to be passed into the system.

Throws:

• DIInvalidParameter if the handle or records parameter is null or otherwise invalid.

• DIInvalidOperation if the Endeca Manager is currently performing an update.


• DISystemException if the output XML file could not be written or if some other error occurred.

clearContent(String[] handles)

Clears records that have been enqueued by one or more addContent() methods. The record files are deleted locally (in the Endeca Manager's update directory) as well as in the Forge directory.

Parameters:

• handles – The handles whose data needs to be cleared. Developers typically call this method after a successful update and before they start adding more content.

Throws:

• DIInvalidParameter if the handles parameter is null or otherwise invalid.

• DIInvalidOperation if the Endeca Manager is currently performing an update.

• DISystemException if some of the files could not be deleted or if some other error occurred.

getStatus()

Gets the system status from the Endeca Manager. Includes status of the update operation and any error messages relating to data updated in the system.


Returns:

Status—A Status object, indicating the state of the system (whether it is in the middle of an update) and a collection of errors reported by the components on individual records in the system.

Throws:

• DISystemException if an error occurred in trying to obtain the status from the Endeca Manager.

startBaselineUpdate()

Starts a baseline update. The call returns as soon as the update begins.

Throws:

• DIInvalidOperation if the Endeca Manager is currently performing an update.

• DISystemException if the baseline update cannot be started or if some other error occurred.

startPartialUpdate()

Starts a partial update. The call returns as soon as the update begins. Note that there is no method to stop a partial update, because partial updates are typically smaller and faster than baseline updates and because the Navigation Engine cannot be asked to terminate a partial update.


Throws:

• DIInvalidOperation if the Endeca Manager is currently performing an update.

• DISystemException if the partial update cannot be started or if some other error occurred.

stopBaselineUpdate()

Stops a running baseline update. The call returns as soon as the update stops.

Throws:

• DIInvalidOperation if the Endeca Manager is not currently performing an update.

• DISystemException if the baseline update cannot be stopped or if some other error occurred.

PVal Class

A key/value pair, a collection of which constitutes a Record object. The key is the name of an Endeca property or dimension, and the value is the String value of that key.


Constructors

PVal(String name, String value)

Constructs a PVal object from a name and a value.

Parameters:

• name – The name of the property or dimension.

• value – The value of the property or dimension.

PVal()

Constructs a PVal object with no data. Use the setName() and setValue() methods to add a name and value.

Methods

getName()

Gets the name of this PVal object.

Returns:

String—The name of this PVal object.

getValue()

Gets the value of this PVal object.

Returns:

String—The value of this PVal object.


setName(String name)

Sets the name of this PVal object.

Parameters:

• name – The name assigned to this PVal object.

setValue(String value)

Sets the value of this PVal object.

Parameters:

• value – The value assigned to this PVal object.

Record Class

A source record, which is a collection of data in the form of PVal objects (key/value pairs). The record is added to the Endeca system with the DataIndexing.addContent() method.

Constructors

Record(PVal[] values)

Constructs a new Record object from an array of PVal objects.


Parameters:

• values – A collection of PVal objects.

Record()

Constructs a new Record object containing no data.

Methods

getValues()

Gets the entire collection of PVal objects from this record.

Returns:

PVal[]—The array of PVal objects.

setValues(PVal[] pval)

Adds a collection of PVal objects to this record.

Parameters:

• pval – The collection of PVal objects to be added to the record.
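The PVal and Record members described above combine as in the following minimal sketch. Because the real classes are generated by your WSDL tool, the sketch declares small stand-in classes with the same documented constructors and methods so that it can run on its own; the RecordSketch class, the fromArrays() helper, and the property names P_Name and P_Region are hypothetical.

```java
// Illustrative only. PVal and Record below are hypothetical stand-ins that mirror
// the constructors and methods documented in this chapter; a real client uses the
// classes generated from the WSDL file instead.
class RecordSketch {
    static class PVal {
        private String name;
        private String value;
        PVal() {}
        PVal(String name, String value) { this.name = name; this.value = value; }
        String getName() { return name; }
        String getValue() { return value; }
        void setName(String name) { this.name = name; }
        void setValue(String value) { this.value = value; }
    }

    static class Record {
        private PVal[] values;
        Record() {}
        Record(PVal[] values) { this.values = values; }
        PVal[] getValues() { return values; }
        // Assumption: modeled as a plain setter that replaces the current array.
        void setValues(PVal[] pval) { this.values = pval; }
    }

    // Pairs each key with the value at the same position and wraps them in a Record.
    static Record fromArrays(String[] keys, String[] vals) {
        if (keys.length != vals.length) {
            throw new IllegalArgumentException("number of vals not equal to number of keys");
        }
        PVal[] pvals = new PVal[keys.length];
        for (int i = 0; i < keys.length; i++) {
            pvals[i] = new PVal(keys[i], vals[i]);
        }
        return new Record(pvals);
    }

    public static void main(String[] args) {
        Record r = fromArrays(new String[] {"P_Name", "P_Region"},
                              new String[] {"Oaxaca", "Mexico"});
        for (PVal p : r.getValues()) {
            System.out.println(p.getName() + " = " + p.getValue());
        }
    }
}
```

An array of records built this way is what a client would pass as the records argument of addContent().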

Status Class

An object that represents the state of the system, including failed records and associated error messages.


Methods

getSystemErrors()

Gets all the error messages.

Returns:

SystemError[]—The array of SystemError objects.

getSystemState()

Gets the state of the Endeca system, which is represented by one of the following string messages:

• IDLE – The system is not performing an update operation and none of the components are in error.

• UPDATING – The system is performing a baseline or partial update.

• SYSTEM_ERROR – The system is in an error state caused by an update operation. This state typically means that the system was not configured or provisioned correctly and the update failed completely (that is, none of the update records were propagated through the system). This is a severe state, and usually requires manual intervention to fix the problem (for example, using Web Studio’s Configuration page or Provisioning System page).

Note that an update can finish successfully even though some records may not have been added to the system. In this case, the system status upon completion of the update will be IDLE (not SYSTEM_ERROR). However, there will be a SystemError object for each failed record.

Returns:

String—The status of the system, as indicated by one of the above messages.
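One way to wait for an update to finish is to poll the state until the system leaves UPDATING. The following Java sketch assumes a hypothetical state source (a Supplier<String> standing in for a call such as getStatus().getSystemState() on the generated stubs) and the hypothetical names PollSketch and waitUntilNotUpdating(). The poll limit is an added safeguard in this sketch, not something the API requires.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

// Illustrative only; PollSketch and waitUntilNotUpdating() are hypothetical names.
class PollSketch {
    // Polls the state source until it reports something other than UPDATING,
    // then returns that state (IDLE or SYSTEM_ERROR in this guide's terms).
    static String waitUntilNotUpdating(Supplier<String> stateSource,
                                       long sleepMillis, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            String state = stateSource.get();
            if (!"UPDATING".equals(state)) {
                return state;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while polling", e);
            }
        }
        throw new IllegalStateException("Still UPDATING after " + maxPolls + " polls");
    }

    public static void main(String[] args) {
        // Simulated sequence of states; a real client would query the service instead.
        Iterator<String> fake = Arrays.asList("UPDATING", "UPDATING", "IDLE").iterator();
        System.out.println("Final state: " + waitUntilNotUpdating(fake::next, 10L, 10));
    }
}
```

After the loop returns, the client should still inspect the SystemError objects, because an IDLE result does not guarantee that every record was added.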

SystemError Class

An object that represents a system error. Typically, the error involves a record that failed to be added to the system. However, the system error may be caused by a condition that was not related to a failed Record object.

SystemError objects may be returned even if the system status is IDLE and not SYSTEM_ERROR. This happens when an overall error does not occur, but some records failed. For example, if 98 out of 100 records succeeded in being added, but two failed (because those records already existed), the Status.getSystemState() method would return IDLE but Status.getSystemErrors() would return two SystemError objects.

Methods

getComponent()

Gets the name of the Endeca component that reported the error.


Returns:

String—The name of the Endeca component, which is one of the following:

• MANAGER (for the Endeca Manager)

• FORGE

• DGIDX

• DGRAPH (for the Endeca Navigation Engine)

getErrorMsg()

Gets the error message.

Returns:

String—The error message, as returned from the Endeca Manager. If the error is due to a failed Record object, the message may include the reason why the record could not be added or updated.

getRecordSpec()

Gets the record specifier of the record that was not added.

Returns:

String—The record specifier of the Record object that was not added. Returns null if the SystemError is not related to a particular record.


getSeverity()

Gets the severity level of the error.

Returns:

String—The severity level, which is one of the following:

• ERROR

• WARNING

• FATAL
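Because an IDLE state can still come with SystemError objects, a client usually needs to look at both the state and the errors before reporting the outcome of an update. The following Java sketch shows one possible interpretation. The SystemError class here is a hypothetical stand-in for the generated stub (including its constructor), and ErrorSummarySketch, countFailedRecords(), and describe() are hypothetical names; the error messages in main() are made up for illustration.

```java
// Illustrative only; all names in this class are hypothetical.
class ErrorSummarySketch {
    // Stand-in for the generated SystemError class, exposing the four documented getters.
    static class SystemError {
        private final String component, errorMsg, recordSpec, severity;
        SystemError(String component, String errorMsg, String recordSpec, String severity) {
            this.component = component;
            this.errorMsg = errorMsg;
            this.recordSpec = recordSpec;
            this.severity = severity;
        }
        String getComponent() { return component; }
        String getErrorMsg() { return errorMsg; }
        String getRecordSpec() { return recordSpec; }
        String getSeverity() { return severity; }
    }

    // Counts errors tied to a record; getRecordSpec() is null for unrelated errors.
    static int countFailedRecords(SystemError[] errors) {
        int failed = 0;
        for (SystemError e : errors) {
            if (e.getRecordSpec() != null) {
                failed++;
            }
        }
        return failed;
    }

    // Combines the system state with the per-record errors into one summary line.
    static String describe(String systemState, SystemError[] errors) {
        if ("UPDATING".equals(systemState)) return "update in progress";
        if ("SYSTEM_ERROR".equals(systemState)) return "update failed";
        int failed = countFailedRecords(errors);
        return failed == 0
            ? "update completed"
            : "update completed, " + failed + " record(s) failed";
    }

    public static void main(String[] args) {
        SystemError[] errors = {
            new SystemError("FORGE", "record already exists", "1001", "WARNING"),
            new SystemError("FORGE", "record already exists", "1002", "WARNING")
        };
        System.out.println(describe("IDLE", errors));
        for (SystemError e : errors) {
            System.out.println(e.getSeverity() + " from " + e.getComponent()
                + " for record " + e.getRecordSpec() + ": " + e.getErrorMsg());
        }
    }
}
```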

DIException Class

The DIException class is the base class for all exceptions thrown by the Data Indexing classes. The DIInvalidOperation, DIInvalidParameter, and DISystemException classes all inherit from this class.

Methods

getMessage()

Gets the error message of the exception.

Returns:

String—The error message.


DIInvalidOperation Class

The DIInvalidOperation exception is thrown for operations that are not valid for the current state of the system. For example, calling the addContent() method when a partial update is in progress will throw this exception. Typically, users can programmatically recover from this exception (by waiting for the update to finish, for example).
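Recovering by waiting can be as simple as retrying the rejected call after a pause. The following Java sketch illustrates the idea under stated assumptions: DIInvalidOperation is a hypothetical stand-in for the generated exception class (modeled as unchecked only to keep the sketch short), and RetrySketch, Operation, and runWithRetry() are hypothetical names. The attempt limit and wait time are arbitrary example values.

```java
// Illustrative only; all names in this class are hypothetical.
class RetrySketch {
    // Stand-in for the generated DIInvalidOperation class.
    static class DIInvalidOperation extends RuntimeException {
        DIInvalidOperation(String message) { super(message); }
    }

    // Stand-in for any Data Indexing call, such as addContent() or startPartialUpdate().
    interface Operation { void run(); }

    // Runs the operation, waiting and retrying while it is rejected as invalid
    // for the current state. Returns the number of attempts that were needed.
    static int runWithRetry(Operation op, int maxAttempts, long waitMillis) {
        for (int attempt = 1; ; attempt++) {
            try {
                op.run();
                return attempt;
            } catch (DIInvalidOperation e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and let the caller decide what to do
                }
                pause(waitMillis);
            }
        }
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        int attempts = runWithRetry(() -> {
            // Simulated service that is busy with an update for the first two calls.
            if (++calls[0] < 3) {
                throw new DIInvalidOperation("update in progress");
            }
        }, 5, 10L);
        System.out.println("Succeeded after " + attempts + " attempt(s)");
    }
}
```

Only DIInvalidOperation is retried in this sketch; other Data Indexing exceptions are left to propagate because they are not described as recoverable by waiting.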

Methods

getMessage()

Gets the error message of the exception.

Returns:

String—The error message.

DIInvalidParameter Class

The DIInvalidParameter exception is thrown for parameters that were not valid for a method.

Methods

getMessage()

Gets the error message of the exception.


Returns:

String—The error message.

DISystemException Class

DISystemException indicates a much more serious problem than DIInvalidOperation. Typically, users cannot programmatically recover from this exception, but instead must use manual intervention (such as reprovisioning the system).

Methods

getMessage()

Gets the error message of the exception.

Returns:

String—The error message.


Appendix A

Sample Java Client Code

This appendix contains the code for the Java client discussed in Chapter 3, “Writing Java Client Programs”.

Client.java Example

The Client.java sample code is a basic example of a partial update application. It extracts source records from a text file and constructs Data Indexing API Record objects. An array of these records is added to the system with the DataIndexing.addContent() method.

The records are used in a partial update that is performed by the Endeca Manager. The partial update is started with the DataIndexing.startPartialUpdate() method.


    package com.endeca.service.dataindexing;

    import javax.xml.rpc.Stub;
    import java.io.*;
    import endeca.*;

    // client for the Data Indexing API

    public class Client
    {
        // output file handles
        private static final String c_strAddHandler = "partial_data_add_data";
        private static final String c_strDelHandler = "partial_data_del_data";
        private static final String c_strModHandler = "partial_data_mod_data";

        // this is the directory from which we parse data to send to the API
        private static final String c_strPartialUpdateDataDir = "C:\\Projects\\partial_updates_data\\";

        // this is the file that contains data we want to add
        private static final String c_strAddDataInput = c_strPartialUpdateDataDir+"mexico.txt";

        // this is the file that contains data we want to delete
        private static final String c_strDelDataInput = c_strPartialUpdateDataDir+"deletes.txt";

        // this is the file that contains data we want to modify
        private static final String c_strModDataInput = c_strPartialUpdateDataDir+"updates.txt";

        public static void main(String [] args) {
            try {
                DataIndexing server = initService();

                doClearContent(server);

                doAddContent(server);


                Status status = server.getStatus();
                String s = status.getSystemState();
                doUpdate(server, "partial");
                waitForUpdate(server);
                doClearContent(server);
            }
            catch (DIInvalidOperation de) {
                System.err.println("DIInvalidOperation caught.");
                System.err.println(de.getMessage());
            }
            catch (DIInvalidParameter de) {
                System.err.println("DIInvalidParameter caught.");
                System.err.println(de.getMessage());
            }
            catch (DISystemException de) {
                System.err.println("DISystemException caught.");
                System.err.println(de.getMessage());
            }
            catch (Exception e) {
                System.err.println("Exception caught.");
                System.err.println(e.getMessage());
            }
        }

        private static DataIndexing initService() throws Exception
        {
            DataIndexingService_Impl locator = new DataIndexingService_Impl();
            DataIndexing server = locator.getDataIndexing();

            //set the address of our service
            ((Stub)server)._setProperty(Stub.ENDPOINT_ADDRESS_PROPERTY,
                "http://localhost:8888/services/DataIndexing");
            //set the user name and password for our service
            //uses tomcat container authentication
            ((Stub)server)._setProperty(Stub.USERNAME_PROPERTY, "webservices");
            ((Stub)server)._setProperty(Stub.PASSWORD_PROPERTY, "K07YZ17MP1945Q");

            return server;
        }


        // add content to the system
        private static void doAddContent(DataIndexing server) throws Exception
        {
            // Data to be added.
            addContentHelper(c_strAddDataInput, c_strAddHandler, "Adding record ", server);
            // Data to be deleted
            //addContentHelper(c_strDelDataInput, c_strDelHandler, "Deleting record ", server);
            // Data to be modified
            //addContentHelper(c_strModDataInput, c_strModHandler, "Modifying record ", server);
        }

        // remove content from the system
        private static void doClearContent(DataIndexing server) throws Exception
        {
            System.err.println("Clearing contents");
            server.clearContent(new String[] {c_strAddHandler, c_strDelHandler,
                c_strModHandler});
        }

        private static void doUpdate(DataIndexing server, String strUpdate) throws Exception
        {
            System.err.println("Starting "+strUpdate+" update");
            if (strUpdate.equals("baseline"))
                server.startBaselineUpdate();
            else
                server.startPartialUpdate();
        }

        private static void stopUpdate(DataIndexing server) throws Exception
        {
            System.err.println("Stopping baseline update");
            server.stopBaselineUpdate();

            // make sure we're not updating
            System.err.println("Checking status...");
            Status status = server.getStatus();
            if ("UPDATING".equals(status.getSystemState()))


                throw new Exception("Stop update unsuccessful. We're still in UPDATING state");

            System.err.println("Successfully stopped update.");
        }

        private static void waitForUpdate(DataIndexing server) throws Exception {
            Status status;
            while (true) {
                System.err.println("Checking status...");
                status = server.getStatus();
                if (!"UPDATING".equals(status.getSystemState()))
                    break;

                System.err.println("Status is "+status.getSystemState());
                Thread.sleep(3000);
            }
            System.err.println("Update status: " + status.getSystemState());
            SystemError[] errors = status.getSystemErrors();
            for (int i=0; i<errors.length; i++) {
                System.err.println("Error msg: "+errors[i].getErrorMsg()+
                    ", component: "+errors[i].getComponent()+
                    ", record spec: "+errors[i].getRecordSpec()+
                    ", severity: "+errors[i].getSeverity());
            }
            System.err.println("Completed update");
        }

        private static void addContentHelper(String strInputfile, String strHandler, String strMsg,
            DataIndexing server) throws Exception
        {
            // read data from file and create records to be added
            BufferedReader in = new BufferedReader(new FileReader(strInputfile));
            String line;
            int iCtr = 0;
            String[] strKeys=null;


            while ((line = in.readLine()) != null) {
                // first line is the keys - separator is pipe character
                String[] lines=line.split("\\|#");
                for (int i=0; i<lines.length; ++i)
                {
                    line=lines[i];
                    if (0 == iCtr) {
                        strKeys = line.split("\\|");
                        System.err.println("strKeys "+strKeys.length);
                    }
                    else {
                        // each subsequent line represents a record
                        String[] strVals = line.split("\\|");
                        System.err.println("strVals "+strVals.length);
                        if (strVals.length != strKeys.length)
                            throw new Exception("Invalid input: number of vals not equal to number of keys.");
                        PVal[] keyvals = new PVal[strKeys.length];
                        for (int j=0; j<strKeys.length; j++) {
                            keyvals[j] = new PVal();
                            keyvals[j].setName(strKeys[j]);
                            keyvals[j].setValue(strVals[j]);
                        }
                        System.err.println(strMsg+iCtr);
                        Record r = new Record();
                        r.setValues(keyvals);
                        Record[] recs = {r};
                        server.addContent(strHandler, recs);
                    }
                    iCtr++;
                }
            }
        }
    }


Client2.java Example

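The Client2.java sample code is identical to Client.java except for the addContentHelper() private method. Instead of calling DataIndexing.addContent() once per record, the method collects the records from each input line into a single array and calls DataIndexing.addContent() only once for that array. That method is listed below.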

    private static void addContentHelper(String strInputfile, String strHandler, String strMsg,
        DataIndexing server) throws Exception
    {
        // read data from file and create records to be added
        BufferedReader in = new BufferedReader(new FileReader(strInputfile));
        String line;
        String[] strKeys=null;
        while ((line = in.readLine()) != null) {
            // first "record" is the list of keys
            // record delimiter = "|#"
            // key delimiter = "|"
            String[] lines=line.split("\\|#");
            // since first "record" is the list of keys
            // adjust the size of the record array accordingly
            Record[] recs = new Record[lines.length - 1];
            for (int i=0; i<lines.length; ++i)
            {
                line=lines[i];
                if (0 == i) {
                    strKeys = line.split("\\|");
                }
                else {
                    // each subsequent line represents a record
                    String[] strVals = line.split("\\|");
                    if (strVals.length != strKeys.length)
                        throw new Exception("Invalid input: number of vals not equal to number of keys.");
                    PVal[] keyvals = new PVal[strKeys.length];
                    for (int j=0; j<strKeys.length; j++) {
                        keyvals[j] = new PVal();
                        keyvals[j].setName(strKeys[j]);
                        keyvals[j].setValue(strVals[j]);
                    }
                    System.err.println(strMsg+i);
                    Record r = new Record();
                    r.setValues(keyvals);
                    // remember that the index i is 1 ahead the index
                    // into the record array
                    recs[i-1]=r;
                }
            }
            server.addContent(strHandler, recs);
        }
    }
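The input format that both helpers expect (records separated by "|#", values separated by "|", with the first record holding the keys) can be tried out without a running Endeca Manager. The following sketch applies the same two split() calls to a made-up input line and returns plain maps instead of Record objects. The SourceFormatSketch class, the parse() helper, and the sample keys and values are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only; SourceFormatSketch and parse() are hypothetical names.
class SourceFormatSketch {
    // Splits one input line into records, using the first record as the key list.
    static List<Map<String, String>> parse(String line) {
        String[] chunks = line.split("\\|#");     // record delimiter "|#"
        String[] keys = chunks[0].split("\\|");   // key delimiter "|"
        List<Map<String, String>> records = new ArrayList<>();
        for (int i = 1; i < chunks.length; i++) {
            String[] vals = chunks[i].split("\\|");
            if (vals.length != keys.length) {
                throw new IllegalArgumentException(
                    "Invalid input: number of vals not equal to number of keys.");
            }
            Map<String, String> record = new LinkedHashMap<>();
            for (int j = 0; j < keys.length; j++) {
                record.put(keys[j], vals[j]);
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up sample line: two keys followed by two records.
        for (Map<String, String> record : parse("id|name|#1|Oaxaca|#2|Puebla")) {
            System.out.println(record);
        }
    }
}
```

Note that String.split() without a limit argument discards trailing empty strings, so a record whose last value is empty has fewer values than keys and fails the length check, both in this sketch and in the helpers above.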

