GIS-Enabling Your Oracle Data Warehouse
Using Oracle Locator and Oracle Map Viewer, it’s easy to add GIS analysis to your Oracle data warehouse.
Introduction
Two of the most common dimensions in Oracle data warehouses are time and geography. The time dimension allows us to look for trends and compare facts over different time spans, and geography tells us where our customers and suppliers are located, and where our best and worst performing markets are. Most of the information that organizations hold contain some form of geographic data in the form of postcodes, zip codes, city names, county names and so on; and, in some cases, this data is augmented with demographic data sets from suppliers such as Experian and CACI.
Most DBAs will be aware of GIS, or Geographic Information Systems, that typically extract data out of data warehouses, take the limited amount of geographic information that is stored in the warehouse, enhance it to include full latitude and longitude map reference, and display the data set in a series of maps. Such GIS systems are typically used in government departments, defense, homeland security, retail and store planning, in fact any type of organization that wishes to understand spatial relationships within their data.
Knowing the location of a transaction, customer, or retail store not only provides us with a billing or mailing address, but it helps us reveal trends around market share, population, or customer concentration, and helps us understand where our next potential market might be. Indeed, there are significant parallels between GIS and business intelligence, as highlighted in “Real-Time Data Warehousing: GIS” by Simon Terr: both GIS and BI feature the concept of “drill-down,” with GIS allowing you to “zoom” to see more detail in a map, and BI allowing you to “drill in to” lower-level transaction detail in a report. Both feature parameterization, and both provide the ability to “drill-across, invoking reports from other reports based on common information, and invoking maps from reports based on a common piece of location data.
The historical problem with GIS systems is, however, that they tend to create “stovepipes” of information. Applications built using GIS applications from vendors such as ESRI and MapInfo usually use the vendor’s own proprietary database, locking the data in and only providing access through the vendors’ own API. DBAs who’ve looked at this area might also be vaguely be aware that Oracle has for several years had a GIS capability within the Oracle RBDMS, known as the Spatial Option; but, what isn’t so widely known is that all recent versions of the Oracle RDBMS have had a basic, free, GIS ability within both the standard and enterprise edition which can be quickly used to enhance existing databases and data warehouses with GIS, spatial and location-based features. Such GIS-enabled data warehouses can then be used as the data store for third party GIS and mapping applications — removing the dependence on proprietary, “stovepipe” GIS database. Or, you can use Oracle MapViewer, a component of Oracle Application Server 10g, to easily add mapping functionality to your existing Oracle BI and warehousing applications.
GIS Within the Oracle Database
Moving your GIS data from proprietary databases to the Oracle database has several benefits. It allows you to integrate spatial or location-based data with your core business data, provides a common environment for spatial and tabular-based data, and provides standard-based access through SQL, ODBC and JDBC. It allows you to deploy your applications using a single business architecture, and lets your GIS applications take advantage of the security, scalability and resilience of the Oracle database. Best of all, all of the key GIS application vendors such as ESRI, Intergraph, Autodesk, and MapInfo natively support the Oracle database as a data source for their geographic data, using the two key GIS features in the Oracle database, Oracle Locator and Oracle Spatial.
Oracle Locator, free within both standard and enterprise editions of Oracle 9i and Oracle 10g, provides core location-based functionality needed for GIS applications. Developers can add GIS functionality to existing Oracle application since, with Locator, they can easily incorporate location information directly in their applications and services. This is possible because location data is fully integrated in the Oracle server itself. Geographic and location data are manipulated using the same semantics applied to the CHAR, DATE, or INTEGER types that are familiar to all users of SQL. Oracle Locator is usually sufficient for most departmental GIS applications, and allows you to perform queries such as “which customers are within 10 miles of this store,” and “which households would be affected by the relaying of this piping.”
Oracle Spatial is a pay-extra option for Oracle Enterprise Edition that extends the functionality of Oracle Locator, and provides a robust foundation for complex GIS applications, which require more spatial analysis and processing in Oracle Database. It includes spatial functions (including area, buffer, centroid calculations), advanced coordinate systems support, linear referencing system, and aggregate functions. Significant new capabilities in the latest 10g release address the particular high-end requirements from the public sector, defense, logistics, energy exploration, and business geographic domains such as geocoding, topology and network modeling, spatial analytics, and georaster management.
The key feature with both Oracle Locator and Spatial is that they store spatial data directly in the database, using native spatial datatypes, spatial indexes, and an open SQL interface.
Figure 1: Spatial Access through SQL.
Importantly, Oracle implements these features so that they comply with industry standards — so that you don’t go from one lock-in to another — and supports the new OGC Geographic Markup Language (GML) as well as Open Location Service interfaces. The object-relational model used for geometry storage by Oracle Spatial and Oracle Locator also conforms to the specifications associated with SQL92 representation of points, lines, and polygons.
Remember, Oracle Locator is free, so you can start using it now without having to worry about Oracle Spatial licenses; you’ll only need licenses for the latter if your GIS application depends on specific high-end spatial database features.
GIS-Enabling Your Data
So, starting with an example that uses just Oracle Locator, how do we start to take advantage of these features?
On point to note before we start is that both the Oracle Locator and Oracle Spatial features rely on latitude and longitude data being available for the records we wish to analyze. As most databases only store data using zip or post codes, some means of converting these into map references is needed (usually referred to as Geocoding). This can be accomplished in several ways; a common way of doing this is to purchase datasets, sometimes together with demographic information, from suppliers such as Navteq, Tele Atlas, Experian, CACI, or Trillium. Also, Oracle 10g Spatial provides an embedded Geocoding engine and an open geocode schema, which are currently used by Navteq and Tele Atlas, and can be accessed using Oracle Warehouse Builder.
Alternatively, there are resources on the Internet that convert zip or post codes to latitude and longitude, including the U.S. Census Bureau Gazatteer files for the USA and sites such as this free site for UK postcodes.
Moving on to the example, you can have a look at using Oracle Locator by using the sample data within the Oracle 9i and 10g OE schema. This example looks at a scenario where an organization has several warehouses and needs to locate those customers who are near a given warehouse to inform them of new promotions. To locate its customers and perform GIS analysis, it must store location data for both its customers and warehouses.
The example uses the CUSTOMERS and WAREHOUSES tables within the OE schema, which already have columns for location data.
SQL> DESC CUSTOMERS; NAME NULL? TYPE ----------------------------- -------- ---------------------- CUSTOMER_ID NOT NULL NUMBER(6) CUST_FIRST_NAME NOT NULL VARCHAR2(20) CUST_LAST_NAME NOT NULL VARCHAR2(20) CUST_ADDRESS CUST_ADDRESS_TYP PHONE_NUMBERS PHONE_LIST_TYP NLS_LANGUAGE VARCHAR2(3) NLS_TERRITORY VARCHAR2(30) CREDIT_LIMIT NUMBER(9,2) CUST_EMAIL VARCHAR2(30) ACCOUNT_MGR_ID NUMBER(6) CUST_GEO_LOCATION MDSYS.SDO_GEOMETRY SQL> DESC WAREHOUSES NAME NULL? TYPE ----------------------------- -------- ---------------------- WAREHOUSE_ID NOT NULL NUMBER(3) WAREHOUSE_SPEC SYS.XMLTYPE WAREHOUSE_NAME VARCHAR2(35) LOCATION_ID NUMBER(4) WH_GEO_LOCATION MDSYS.SDO_GEOMETRY SQL>
Taking a closer look at the CUST_GEO_LOCATION column from the CUSTOMERS table, we see that it is described by the following:
CUST_GEO_LOCATION(SDO_GTYPE, SDO_SRID, SDO_POINT(X, Y, Z), SDO_ELEM_INFO,
SDO_ORDINATES)
From this description, we can determine several characteristics of the data being stored:
- SDO_GTYPE describes the type of geometry (point, line string, or polygon)
- SDO_SRID is used to identify the coordinate system
- SDO_POINT describes the latitude and longitude
- SDO_ELEM_INFO describes how to use the numbers stored in the SDO_ORDINATES section
- SDO_ORDINATES describes the coordinate values that make up the boundary of a spatial object
Examining a single column's data (SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(6.1667, 46.2, NULL), NULL, NULL)), note that only the first three SDO_GEOMETRY attributes are mandatory.
Next, we have to update the USER_SDO_GEOM_METADATA view, which tells a feature known as the “Spatial Index” what tables are going to contain location-based data.
DELETE FROM USER_SDO_GEOM_METADATA WHERE TABLE_NAME = 'WAREHOUSES' AND COLUMN_NAME = 'WH_GEO_LOCATION' ; INSERT INTO USER_SDO_GEOM_METADATA (TABLE_NAME, COLUMN_NAME, DIMINFO, SRID) VALUES ('WAREHOUSES', 'WH_GEO_LOCATION', MDSYS.SDO_DIM_ARRAY (MDSYS.SDO_DIM_ELEMENT('LONG', -180.0, 180.0, 0.005), MDSYS.SDO_DIM_ELEMENT('LAT', -90.0, 90.0, 0.005) ), 8307); COMMIT; DELETE FROM USER_SDO_GEOM_METADATA WHERE TABLE_NAME = 'CUSTOMERS' AND COLUMN_NAME = 'CUST_GEO_LOCATION' ; INSERT INTO USER_SDO_GEOM_METADATA (TABLE_NAME, COLUMN_NAME, DIMINFO, SRID) VALUES ('CUSTOMERS', 'CUST_GEO_LOCATION', MDSYS.SDO_DIM_ARRAY (MDSYS.SDO_DIM_ELEMENT('LONG', -180.0, 180.0, 0.005), MDSYS.SDO_DIM_ELEMENT('LAT', -90.0, 90.0, 0.005) ), 8307); COMMIT;
We then create the Spatial Indexes:
DROP INDEX WAREHOUSES_SIDX; CREATE INDEX WAREHOUSES_SIDX ON WAREHOUSES(WH_GEO_LOCATION) INDEXTYPE IS MDSYS.SPATIAL_INDEX; DROP INDEX CUSTOMERS_SIDX; CREATE INDEX CUSTOMERS_SIDX ON CUSTOMERS(CUST_GEO_LOCATION) INDEXTYPE IS MDSYS.SPATIAL_INDEX;
We can now issue a simple query, for example, to show us all the customers within 20km of a particular latitude and longitude.
SELECT A.CUST_LAST_NAME FROM CUSTOMERS A WHERE SDO_WITHIN_DISTANCE(A.CUST_GEO_LOCATION, MDSYS.SDO_GEOMETRY (2001, 8307, MDSYS.SDO_POINT_TYPE (6.1667, 46.2, NULL), NULL, NULL), 'DISTANCE=20000 UNIT=METER') = 'TRUE'; CUST_LAST_NAME -------------------- SIEGEL BARKIN DREYFUSS
Or, to make things more interesting, display a list of the ten closest customers to our warehouse, where the warehouse ID is 2, and display exactly how far each customer lives from the given warehouse in metres.
The preceding simple query is interesting, but what we really want to extract from the location and attribute information is a result set that tells us how our customers might interact with our warehouses. Therefore, the following SQL query finds the names of the ten closest customers to the warehouse where WAREHOUSE_ID=2, and exactly how far customers live from the given warehouse (in metres):
SELECT /*+ORDERED*/ C.CUSTOMER_ID, C.CUST_FIRST_NAME, C.CUST_LAST_NAME, SDO_NN_DISTANCE (1) DISTANCE FROM WAREHOUSES W, CUSTOMERS C WHERE W.WAREHOUSE_ID = 2 AND SDO_NN (C.CUST_GEO_LOCATION, W.WH_GEO_LOCATION, 'SDO_NUM_RES=5', 1) = 'TRUE' ORDER BY DISTANCE; CUSTOMER_ID CUST_FIRST_NAME CUST_LAST_NAME DISTANCE ----------- -------------------- -------------------- ---------- 258 Ellen Palin 2341418.86 255 Brooke Shepherd 2353370.47 254 Bruce Bates 2356071.22 140 Claudia Kurosawa 2356686.58 169 Dheeraj Davis 2502955.93
So, given this ability for the basic Oracle RBDMS to carry out location-based analysis, and store geographic data natively within the database, how can we use it to enhance our BI and warehousing applications
Location-Enabling Your Reporting
Seeing as Oracle Locator and Oracle Spatial have an SQL interface, it is relatively easy to add basic GIS functionality to Oracle Reports, Oracle Portal and Oracle Discoverer, as each of these allow you to submit your own custom query SQL that can take advantage of Locator and Spatial constructs.
For example, in Oracle Discoverer, you can create a custom folder based on an SQL statement such as this:
SELECT /*+ORDERED*/ C.CUSTOMER_ID, C.CUST_FIRST_NAME, C.CUST_LAST_NAME, C.ACCOUNT_MGR_ID, W.WAREHOUSE_ID, SDO_NN_DISTANCE (1) DISTANCE FROM WAREHOUSES W, CUSTOMERS C WHERE SDO_NN (C.CUST_GEO_LOCATION, W.WH_GEO_LOCATION, 'SDO_NUM_RES=10', 1) = 'TRUE' ORDER BY DISTANCE;
By adding the warehouse ID and the account manager ID to the query, this lets us run it dynamically based on any warehouse or account manager. This custom folder can then be joined to any other folder, to allow us to incorporate this location-aware data into any report.
For more details, take a look at “Discoverer Administrator — How to Use Discoverer with Oracle Spatial” on OTN which looks in more detail at adding spatial queries to Discoverer.
Adding Mapping to Your Data Warehouse
Probably the most significant enhancement that you can make when using location-based data is to present data to users in the form of maps. Maps make it very easy to spot clusters of customers and transactions, to see how your market is defined, and when used in conjunction with themes and layers, to quickly visualize hot spots, areas of high and low sales and areas with untapped markets.
With your location data held within the Oracle database, you can use either one of Oracle’s third-party GIS partner mapping applications to analyze your data, or you can use the new Oracle Application Server MapViewer service in Oracle Application Server 10g. Oracle MapViewer is an Oracle Application Server Java component and JDeveloper extension, used for map rendering and viewing geospatial data managed by Oracle Spatial or Locator.
MapViewer communicates with clients via XML, and it works on the principle that anything that can issue a request and receive a response in XML can essentially become a MapViewer client. Furthermore, because MapViewer is a J2EE service, it comes with a Java API as well as a JSP tag library.
Figure 2: Oracle Map Viewer.
Like other map-rendering tools, MapViewer uses the concepts of styles (colors, markers, lines, areas, text, symbolism, and advanced styles) and themes (sometimes called layers) to create dynamic maps. The definitions of these attributes, as well as actual map definitions, are stored as XML in the database along with the location information.
Building Map Viewer maps is beyond the scope of this article; however, there are plenty of good resources on building Oracle Map Viewer maps available on the Internet, including Justin Lokitz and Madeline Alameda’s “Use Location Information in Enterprise Reporting” article and Justin’s Oracle GIS Weblog. You can also view screenshots of sample Oracle Map Viewer maps on the OTN Map Viewer Web site.
Conclusions
Virtually all customer and transaction databases, data warehouses and data marts contain location-based data. It is everywhere, and can be used to provide valuable insight into market trends, availability of new markets and opportunities, and can serve as a common data set to weave together diverse data sets.
The Oracle RDBMS can server as a secure, scalable and open repository for your location-based data, with most organisation’s needs met by the free Oracle Locator database feature and with Oracle Spatial for more demanding spatial analysis needs. It is easy to enhance your datasets with location-based data, and if you have licensed Oracle Application Server and JDeveloper, it is a straightforward process to add mapping functionality to your business intelligence applications.
For more details on Oracle Locator and Oracle Spatial, be sure to check out my Oracle Weblog and Justin Lokitz’s Oracle GIS Weblog, take a look through the OTN Locator and Spatial and Map Viewer homepages, and check out the range of GIS partners who support these Oracle technologies.
--
Mark Rittman is a Certified Oracle Professional DBA and works as a Consulting Manager at SolStonePlus, specializing in developing BI and DW applications using the Oracle Database, Application Server, Discoverer, Warehouse Builder, and Oracle OLAP. Outside of Plus Consultancy, Mark chairs the UKOUG BI and Reporting Tools SIG, and runs a Weblog dedicated to Oracle BI and Data Warehousing technology. Mark recently was awarded an OTN Community Award for contributions to the Oracle developer community, and is a regular speaker at Oracle User events in both the UK and Europe.
Mark would also like to thank Justin Lokitz for his collaboration with this article. Justin is a Senior Sales Consultant at Oracle Corporation specializing in GIS and J2EE development.
Contributors : Mark Rittman
Last modified 2006-02-08 02:54 PM