
Data Replication and Data Sharing: Integrating Heterogeneous Spatial Databases

By : Safe Software
Published : Nov 07, 2005
Length : 12
Type : White Paper
 
Overview :
Spatial data warehouses are becoming more common as government agencies, municipalities, utilities, telcos and other spatial data users start to share their data. This paper illustrates some of the issues that arise when undertaking data replication and data sharing.
Related Categories : Data Replication, Data Warehousing

 
Spatial data warehouses are becoming more common as government agencies, municipalities, utilities, telcos and other spatial data users start to share their data. Data sharing is driven by the need to maintain more accurate and up-to-date spatial databases, but at the same time reduce data acquisition and maintenance costs. In other cases, organizations may maintain identical databases at different locations in order to reduce network loads and improve response times for data users who are spread over a wide area. In these cases, data replication is used to ensure all users are working from identical, current data. This paper illustrates some of the issues that arise when undertaking data replication and data sharing.

Introduction

Location and spatial data is becoming a core part of business databases and decision-making. This growth in the use of spatial data has increased the need to share data with other organizations. Data sharing is driven by the need to maintain more accurate and up-to-date spatial databases, but at the same time reduce data acquisition and maintenance costs. In other cases, organizations may maintain identical databases at different locations in order to reduce network loads and improve response times for the data users who are spread over a wide area. In these cases, data replication is used to ensure all users are working from the most current data. Data may also be shared by linking several heterogeneous spatial databases through a common data access portal over a LAN or intranet. In all cases, the goal is to improve the accessibility of the spatial data, improve data quality and reduce the cost of maintaining the datasets involved.

The three broad approaches to sharing data are:

- Data Sharing. Data sharing is a data warehousing approach to making data available to a wider range of users. Data is acquired from several data owners and loaded into a centralized warehouse. Data can then be distributed to members of the data-sharing consortium through Web-based data viewers, or delivered in different formats to the various data users.

- Data Replication. Data replication is generally used where large numbers of data users spread over a wide geographic area require real-time access to the same data. To reduce network loads and improve data access performance, the data is replicated over several databases at different locations. The databases are synchronized on a regular basis, usually nightly.

- Distributed Data Access. In this case the data warehouse simply acts as a node for data distribution. Data is held on the data owner's server, and the data warehouse acts as a live link to the data provider's datasets across a LAN, WAN or intranet. Since the different databases may be in different formats (ESRI ArcSDE, Oracle Spatial, etc.), the spatial ETL (extract, transform, load) tool must be capable of reading all the formats to be accessed and served. Unlike data replication and data sharing, this approach does not require multiple copies of the data to be maintained.
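The distributed data access approach can be sketched in a few lines of Python. This is a minimal illustration only, not the design of any particular product: the `DataPortal` class, the `read_stub` reader, the format labels and the in-memory `OWNER_SERVERS` dictionary are all hypothetical stand-ins for live databases and format-specific readers.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Feature = Dict[str, object]                    # one spatial record
Reader = Callable[[str], Iterable[Feature]]    # reads features from a source location

# Hypothetical stand-in for the data owners' servers. In a real deployment
# these would be live databases reached over a LAN, WAN or intranet.
OWNER_SERVERS: Dict[str, List[Feature]] = {
    "city-roads": [{"id": "r1", "geom": "LINESTRING(0 0, 1 1)"}],
    "utility-valves": [{"id": "v7", "geom": "POINT(2 3)"}],
}


def read_stub(location: str) -> Iterable[Feature]:
    """Hypothetical reader: a real one would speak the source's native format."""
    return iter(OWNER_SERVERS[location])


class DataPortal:
    """Access node that stores no data, only links to the owners' sources."""

    def __init__(self) -> None:
        self._readers: Dict[str, Reader] = {}
        self._sources: Dict[str, Tuple[str, str]] = {}

    def register_reader(self, fmt: str, reader: Reader) -> None:
        self._readers[fmt] = reader

    def register_source(self, name: str, fmt: str, location: str) -> None:
        # A source can only be served if its format can be read.
        if fmt not in self._readers:
            raise ValueError(f"no reader registered for format {fmt!r}")
        self._sources[name] = (fmt, location)

    def fetch(self, name: str) -> List[Feature]:
        # Read from the owner's source on every request: no local copy is kept.
        fmt, location = self._sources[name]
        return list(self._readers[fmt](location))


portal = DataPortal()
portal.register_reader("format-a", read_stub)   # format labels are placeholders
portal.register_reader("format-b", read_stub)
portal.register_source("roads", "format-a", "city-roads")
portal.register_source("valves", "format-b", "utility-valves")
```

Because `fetch` reads from the owner's source on each call, a change made by the data owner is visible on the next request without any synchronization step. The trade-off is that every request depends on the network link and on the owner's server being available.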

This paper illustrates some of the issues that arise when undertaking data replication, data sharing or distributed data access.

The Challenge of Data Sharing and Replication

Once organizations agree to share or replicate their spatial data, they face the challenge of maintaining up-to-date datasets. Spatial data changes continuously as new infrastructure and subdivisions are built and more accurate data is collected. To maintain up-to-date databases, the various data "owners" must exchange their most current datasets with their data-sharing partners. This can be done in one of two ways:

- Complete data load. This is the most straightforward approach. The current dataset is removed and completely replaced with the new dataset. However, this approach is often impractical because of the volume of data, which may be difficult to distribute and may take a prohibitively long time to reload, leaving the database inaccessible to users for extended periods.

- Change-only updates. This approach requires smaller data volumes to be distributed, as only the records that have changed (modifications, deletions and additions) are exchanged. Change-only updates also reduce the data load time because of the smaller data volume. However, the update process is more complex than the complete data load approach.
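The difference between the two update approaches can be illustrated with a short Python sketch. It assumes, purely for illustration, that each record carries a stable feature ID and that a dataset fits in memory as a dictionary keyed by that ID. The names `ChangeSet`, `compute_changes`, `apply_changes` and `complete_load` are hypothetical and do not come from the paper or from any specific tool.

```python
from typing import Dict, NamedTuple, Set

Record = Dict[str, object]     # attributes plus geometry
Dataset = Dict[str, Record]    # records keyed by an assumed stable feature ID


class ChangeSet(NamedTuple):
    additions: Dataset
    modifications: Dataset
    deletions: Set[str]


def compute_changes(old: Dataset, new: Dataset) -> ChangeSet:
    """Compare two snapshots and keep only the records that differ."""
    additions = {fid: rec for fid, rec in new.items() if fid not in old}
    modifications = {
        fid: rec for fid, rec in new.items() if fid in old and old[fid] != rec
    }
    deletions = set(old) - set(new)
    return ChangeSet(additions, modifications, deletions)


def apply_changes(target: Dataset, changes: ChangeSet) -> None:
    """Change-only update: touch only the records named in the change set."""
    for fid in changes.deletions:
        target.pop(fid, None)
    target.update(changes.additions)
    target.update(changes.modifications)


def complete_load(target: Dataset, new: Dataset) -> None:
    """Complete data load: discard everything, then reload every record."""
    target.clear()
    target.update(new)


old_snapshot: Dataset = {
    "p1": {"geom": "POINT(0 0)", "name": "Hydrant"},
    "p2": {"geom": "POINT(1 1)", "name": "Valve"},
    "p3": {"geom": "POINT(2 2)", "name": "Meter"},
}
new_snapshot: Dataset = {
    "p1": {"geom": "POINT(0 0)", "name": "Hydrant"},   # unchanged
    "p2": {"geom": "POINT(1 2)", "name": "Valve"},     # geometry corrected
    "p4": {"geom": "POINT(5 5)", "name": "Pump"},      # newly added
}

changes = compute_changes(old_snapshot, new_snapshot)
replica = {fid: dict(rec) for fid, rec in old_snapshot.items()}
apply_changes(replica, changes)   # replica now matches new_snapshot
```

In this example only three records (one addition, one modification, one deletion) need to be exchanged, while the unchanged record is never transmitted or rewritten. The extra complexity mentioned above shows up in `compute_changes`: it relies on stable identifiers and on both parties agreeing on the previous snapshot, neither of which the complete data load needs.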