Resource Crawling (Candidate)


How can information available across multiple services be effectively queried?  

Problem

The service consumer of a single service can rely on specific service capabilities to query relevant data, but data distributed across multiple services can be difficult to consistently query.

Solution

Develop program logic to "crawl" through the services within a service inventory by invoking fetch capabilities and following links from resource to resource. Collate the retrieved data in a centralized indexing service to support queries.
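The crawl-and-collate idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the in-memory `INVENTORY` stands in for real services, `fetch` stands in for a safe fetch capability, and the word-level index is only one of many ways an indexing service might collate data.

```python
from collections import deque

# Hypothetical stand-in for a service inventory: each resource identifier
# maps to a representation holding some text and links to other resources.
INVENTORY = {
    "/customers/1": {"text": "Alice gold customer", "links": ["/orders/10"]},
    "/orders/10": {"text": "order for gold widget",
                   "links": ["/products/7", "/customers/1"]},
    "/products/7": {"text": "widget", "links": []},
}


def fetch(resource_id):
    """Stand-in for a safe, read-only fetch capability.

    Returns the representation, or None if the resource is unknown.
    """
    return INVENTORY.get(resource_id)


def crawl(seeds):
    """Breadth-first crawl from the seeds, following links between resources.

    Returns a central index of the form {term: set of resource identifiers}.
    """
    index = {}
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        resource_id = queue.popleft()
        representation = fetch(resource_id)
        if representation is None:
            continue
        # Collate: record which resources mention each term.
        for term in representation["text"].lower().split():
            index.setdefault(term, set()).add(resource_id)
        # Follow links, visiting each resource at most once.
        for link in representation["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def query(index, term):
    """Answer a query from the central index without touching the services."""
    return sorted(index.get(term.lower(), set()))
```

In this sketch a consumer calls `query(crawl(["/customers/1"]), "gold")` and receives matching resource identifiers from two different services, even though only one seed identifier was supplied.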

Application

This pattern is typically applied together with Entity Linking [SDP], Reusable Contract [SDP], and Lightweight Endpoint [SDP] in order to create a system of navigable resources.

Fetch methods that are included in the reusable contract are typically defined as "safe" so that services can guarantee that these fetches are read-only and do not have unanticipated side-effects.

Crawling occurs only over resources the indexing service considers relevant to its consumers. The crawling activity is seeded with known resource identifiers so that it can locate all relevant resources. Services may provide explicit lists of resources that should be included in the crawl to ensure adequate indexing coverage. Services may also provide information that excludes particular sets of resources from the crawl.
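How seeds, service-provided inclusion lists, and exclusions might combine into a crawl starting set can be sketched as follows. The function names, the `service_listings` structure, and the use of identifier prefixes to express exclusions are all assumptions made for illustration; the pattern itself does not prescribe a format.

```python
def is_allowed(resource_id, excluded_prefixes):
    """Return True unless the identifier falls under an excluded prefix."""
    return not any(resource_id.startswith(p) for p in excluded_prefixes)


def build_frontier(seeds, service_listings, excluded_prefixes):
    """Merge seeds with service-published listings into a crawl frontier.

    seeds:             identifiers the indexing service already knows
    service_listings:  {service name: [identifiers the service wants indexed]}
    excluded_prefixes: identifier prefixes that must not be crawled

    Duplicates are dropped and first-seen order is preserved.
    """
    candidates = list(seeds)
    for listing in service_listings.values():
        candidates.extend(listing)

    frontier = []
    seen = set()
    for resource_id in candidates:
        if resource_id not in seen and is_allowed(resource_id, excluded_prefixes):
            seen.add(resource_id)
            frontier.append(resource_id)
    return frontier
```

A crawler that follows links would typically apply the same `is_allowed` check to every link it discovers, not only to the initial frontier, so that exclusions hold for the whole crawl.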

Impacts

Service consumers are able to go to a single indexing or search service to locate information about any resource in a service inventory. Services are decoupled from query processing load and implementation details.

Indexed information may not always be current and will therefore need to be periodically refreshed. Fetches used to index resource data can place additional load on services.
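One simple way to balance freshness against the extra load is to re-fetch a resource only once its index entry exceeds a maximum age. The sketch below assumes the indexing service records a timestamp per resource; the `last_fetched` mapping and the `max_age_seconds` threshold are hypothetical names, and other refresh policies are equally possible.

```python
def due_for_refresh(last_fetched, now, max_age_seconds):
    """Return the identifiers whose index entries are at least max_age_seconds old.

    last_fetched: {resource identifier: time of last fetch, in seconds}
    now:          current time, in the same unit
    """
    return sorted(
        resource_id
        for resource_id, fetched_at in last_fetched.items()
        if now - fetched_at >= max_age_seconds
    )
```

A larger `max_age_seconds` reduces the fetch load placed on services at the cost of a staler index.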

Data with security requirements must be treated as such by the indexing service, or must be excluded from the scope of the crawl.

Crawling techniques can be used to pre-cache information that a consumer is likely to need next after processing information at the current resource. This can improve consumer latency for safe requests.
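The pre-caching idea can be sketched as a consumer-side client that, after fetching a resource, also fetches the resources it links to. The `PrefetchingClient` class and the one-hop look-ahead are assumptions for illustration; the sketch does no cache invalidation, so cached representations can become stale in the same way indexed data can.

```python
class PrefetchingClient:
    """Caches fetched representations and warms the cache one link ahead."""

    def __init__(self, fetch):
        # fetch is any callable that takes a resource identifier and returns
        # a representation with a "links" list, or None if unknown.
        self._fetch = fetch
        self.cache = {}
        self.origin_fetches = 0  # how many times a service was actually called

    def _load(self, resource_id):
        if resource_id not in self.cache:
            self.origin_fetches += 1
            self.cache[resource_id] = self._fetch(resource_id)
        return self.cache[resource_id]

    def get(self, resource_id):
        representation = self._load(resource_id)
        if representation is not None:
            # Pre-cache the resources the consumer is likely to need next.
            for link in representation["links"]:
                self._load(link)
        return representation


# Hypothetical resources used only to exercise the client.
RESOURCES = {
    "/a": {"links": ["/b"]},
    "/b": {"links": ["/c"]},
    "/c": {"links": []},
}
client = PrefetchingClient(RESOURCES.get)
client.get("/a")  # fetches /a and pre-caches /b
```

Because only safe fetches are issued ahead of time, pre-caching a resource the consumer never visits costs load on the service but does not change its state.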

Principles

Standardized Service Contract, Service Composability

Architecture

Inventory, Composition, Service

Status

Under Review

Contributors

Balasubramanian, Carlyle, Pautasso
 
Crawling is a common technique to support loosely coupled indexing of resources on the Web. Search engines in particular will periodically query resources they consider relevant and index them for inclusion in search queries.

Related Patterns in This Catalog

Entity Linking (Balasubramanian, Webber, Erl, Booth), Lightweight Endpoint (Balasubramanian, Carlyle, Pautasso), Reusable Contract (Balasubramanian, Carlyle, Pautasso)



Related Service-Oriented Computing Goals

Increased Organizational Agility, Reduced IT Burden

This page contains excerpts from:

SOA with REST: Principles, Patterns & Constraints
by Raj Balasubramanian, Benjamin Carlyle, Thomas Erl, Cesare Pautasso
(ISBN: 0137012510, Hardcover, 400+ pages)

For more information about this book, visit www.soabooks.com.
The Prentice Hall Service-Oriented Computing Series from Thomas Erl
Copyright © 2007-2011 SOA Systems Inc.