Data Management Support for Location Based Services

Data Warehousing in Relation to LBS

Representation of Geo-Related Content in Location-Based Services

Efficient Update and Advanced Querying of Geo-Related Content

Data Warehousing in Relation to LBS
It is important for the success of a business to secure increased customer loyalty and lower customer turnover rates (known as churn rates). It is therefore of interest to gain better knowledge of the factors that influence the churn rate, as this leads to better possibilities for reacting to customer behavior. For example, pricing models for mobile phone usage may be designed based on detailed knowledge about customer behavior.

Mobile services of the future are expected to know and use the customers' needs and usage patterns. Today, project participant Sonofon has a service called Mobilportalen, the customers' gateway to the mobile Internet. This service may be developed to use the customer location, the time of day, and the previous queries by the user to offer services that are useful to the customer here and now. Sonofon is already logging information about the customers' use of cell phones, e.g., locations at dial-up, call frequency, call length as well as administrative customer data, providing a solid foundation for substantial Business Intelligence analyses for both short- and long-term decisions.

Existing data models used in Business Intelligence (BI) support only relatively simple data structures that are unable to capture the complex nature of advanced data types such as geographical data and sequential web- and tele-logs. The challenges in modeling geo-related data are both the representation of data as graphs, maps, and coordinates, and the imprecision inherent in the data representation, which is due to the imprecision of positioning techniques, the variable temporal validity, and the varying level of detail used for positioning in different areas. The need for efficient processing of advanced queries must also be taken into account.

BI analyses have typically been used for traditional business analysis such as sales by product, area, price, customer type, etc. However, the mobile telephony business has huge and rapidly growing amounts of data that can support more advanced analyses. Long-term analyses are used for consolidation and enlargement of business areas. Short-term analyses are used in sales and call centers, where the information is used to offer the customer exactly the offers and services that will sustain or increase customer satisfaction. Finally, mobile service analyses can be used to customize services on the fly based on the customer's actual situation and past behavior.

The above-mentioned analyses typically require fast response times. A well-known way to achieve this is to use precomputed data. However, precomputation has so far only been used to support relatively simple queries. The complex, dynamic data types found in location-based services render the known methods for precomputation inapplicable. The project aims to develop new methods that can handle both complex and dynamic data.
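To make the idea of precomputation concrete, the following is a minimal sketch of the simple case that known methods already handle: totals are aggregated once and queries are answered by lookup. The record layout `(cell_id, hour_of_day, call_seconds)`, the sample data, and the function names are assumptions made for illustration; they are not taken from the project or from Sonofon's systems.

```python
from collections import defaultdict

# Hypothetical call-log records: (cell_id, hour_of_day, call_seconds).
CALLS = [
    ("cell_a", 9, 120),
    ("cell_a", 9, 60),
    ("cell_a", 17, 300),
    ("cell_b", 9, 45),
]


def precompute_totals(calls):
    """Aggregate the raw log once, per (cell, hour) and per cell."""
    by_cell_hour = defaultdict(int)
    by_cell = defaultdict(int)
    for cell, hour, seconds in calls:
        by_cell_hour[(cell, hour)] += seconds
        by_cell[cell] += seconds
    return dict(by_cell_hour), dict(by_cell)


BY_CELL_HOUR, BY_CELL = precompute_totals(CALLS)


def total_seconds(cell, hour=None):
    """Answer a query by lookup instead of scanning the raw log."""
    if hour is None:
        return BY_CELL.get(cell, 0)
    return BY_CELL_HOUR.get((cell, hour), 0)
```

A query such as `total_seconds("cell_a", 9)` then costs a single dictionary lookup. The sketch works only because the grouping keys are simple and static; it does not address imprecise positions or continuously changing locations, which are the cases the project targets.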

Representation of Geo-Related Content in Location-Based Services
To be able to offer new location-based services effectively and efficiently, it is important to avoid a software development strategy and software architecture where a new, monolithic, stove-pipe-like system is developed for each new service. With such systems, there is little reuse when a new service is developed.

To obtain reuse of data across location-based services, an integrated representation, or data model, of all relevant geo-referenced content will be developed in the project. Such a data model will promote reuse of content and lower-level services when new location-based services are developed.

Project participant Euman has years of experience with the Danish (and Nordic) transportation infrastructure. The Danish Road Directorate maintains more than 1000 attribute values for each position on each road in Denmark. A substantial fraction of these need to be reflected in an integrated data model. In addition to these attributes, which are closely related to the roads themselves, the data model must capture the "real content," which is much more voluminous and open-ended. For example, such content includes information about stores, e.g., their opening hours, available inventory, and current sales, and about cultural events, e.g., the artists, attendance prices, and seat availability. While most geo-related content is stationary or changes location only at discrete times, some content changes continuously. The locations of the service users are an example of the latter. Such content must also be captured in the data model.

As a further complicating factor, it turns out that it is beneficial to maintain at least two types of representations of the same geo-referenced content: a two-dimensional native space representation where coordinates are associated with the content, and a representation of the geo-referenced content in multiple one-dimensional spaces determined by the existing transportation network.
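The relationship between the two representations can be illustrated by mapping a two-dimensional coordinate to a position along a road. The sketch below assumes that a road is given as a polyline of planar `(x, y)` points and that content is attached to the closest point on it; the function name `locate_on_road` and this modeling choice are illustrative assumptions, not the project's data model.

```python
import math


def locate_on_road(point, polyline):
    """Map a 2D point to a 1D position on a road.

    Returns (offset, distance): offset is the length along the polyline
    to the spot closest to `point`, and distance is how far `point`
    lies from that spot.
    """
    px, py = point
    best_offset, best_dist = 0.0, math.inf
    travelled = 0.0  # length of the segments already passed
    for (ax, ay), (bx, by) in zip(polyline, polyline[1:]):
        dx, dy = bx - ax, by - ay
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip degenerate segments
        # Fraction along the segment of the orthogonal projection,
        # clamped so the result stays on the segment.
        t = ((px - ax) * dx + (py - ay) * dy) / (seg_len ** 2)
        t = max(0.0, min(1.0, t))
        cx, cy = ax + t * dx, ay + t * dy
        dist = math.hypot(px - cx, py - cy)
        if dist < best_dist:
            best_dist = dist
            best_offset = travelled + t * seg_len
        travelled += seg_len
    return best_offset, best_dist
```

For a road `[(0, 0), (10, 0), (10, 10)]`, a store at `(12, 5)` maps to offset 15 along the road at a distance of 2 from it. Keeping both the coordinate and the offset lets queries work either in native space or along the network.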

Yet another complication occurs because uncertainty is inherent to all geo-related content and must be taken into account. For example, user locations are sampled according to a variety of protocols. Due to the sampling, complete traces of the users' movements are unavailable; rather, the service only knows the locations of the users at discrete times. Additionally, the samples themselves are imprecise. The sample imprecision is dependent on the technology used and the circumstances under which a specific technology is used.

If a precise (but incorrect) trace is maintained for each user, queries may return suboptimal results. On the other hand, if an overly imprecise record of the positions is kept, query results will also be suboptimal. Maintaining a very accurate record of each user's trace will yield the best query results, but may also lead to poor query performance and large volumes of updates. Thus, the model must maintain a representation that is adequately precise, and it must be able to maintain content with different precisions.

The envisioned data model is conceptually simple, yet it also permits the provision of efficient services, which involves the processing of large loads of updates and complex queries.

Efficient Update and Advanced Querying of Geo-Related Content
In location-based service scenarios that involve large amounts of content, that rely on access to up-to-date information, and where continuous variables (e.g., the locations of users) are monitored, updates represent a very substantial challenge. Due to the volumes of data, the data must be assumed to be disk resident; and to obtain adequate query performance, some form of indexing must be employed. Existing hardware and software solutions for this type of scenario can accommodate relatively few updates. This presents a serious problem for location-based services (and other services that rely on the monitoring of continuous variables via some form of sensor).

Advanced queries include different types of nearest-neighbor queries, monochromatic and bichromatic reverse nearest-neighbor queries, and queries that retrieve travel plans based on arbitrary content. In location-based services, it is natural not simply to compute such queries once, but instead to activate them and then update the results when the underlying data (locations of users and other relevant content) changes.
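The query types named above can be stated precisely with a brute-force sketch. It assumes planar points and Euclidean distance, and it scans all points rather than using an index, so it only defines what the queries return and says nothing about how to compute them efficiently. The function names are illustrative.

```python
import math


def dist(a, b):
    """Euclidean distance between two planar points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest_neighbor(q, points):
    """The point closest to the query point q."""
    return min(points, key=lambda p: dist(q, p))


def mono_rnn(q, points):
    """Monochromatic reverse nearest neighbors of q.

    Returns the points that are strictly closer to q than to any
    other point of the same set.
    """
    result = []
    for p in points:
        others = [o for o in points if o != p]
        if all(dist(p, q) < dist(p, o) for o in others):
            result.append(p)
    return result


def bi_rnn(q, users, sites):
    """Bichromatic reverse nearest neighbors of q.

    Returns the users that are strictly closer to q than to every
    existing site; users and sites are two different kinds of objects.
    """
    return [u for u in users if all(dist(u, q) < dist(u, s) for s in sites)]
```

A continuous ("activated") version of such a query would have to re-evaluate, or incrementally maintain, the result whenever a relevant location changes. Rerunning the full scan on each change is the naive baseline that better techniques aim to avoid.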

Traditional indexing techniques necessitate explicit index updates when changes occur in the data. This renders the use of indices for moving objects either impractical or totally impossible. Two general approaches may be taken towards accommodating continuous change in the indexed data: (i) techniques may be applied that create fewer updates, or (ii) the existing techniques may be enhanced to support rapid, non-bulk updates. Techniques for advanced query processing that exploit indexing are largely unexplored. The use of approximation techniques in relation to both updates and queries also represents a very interesting direction.
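One familiar way to pursue approach (i) is a threshold-based reporting policy, often called dead reckoning: the server extrapolates each object's position from its last reported position and velocity, and the object reports again only when its actual position deviates from that prediction by more than a threshold. The sketch below is an illustration under these assumptions (linear motion, planar coordinates, a hypothetical `DeadReckoningReporter` class); the text does not state that the project uses this policy.

```python
import math


class DeadReckoningReporter:
    """Client-side policy that suppresses updates the server can predict."""

    def __init__(self, threshold):
        self.threshold = threshold  # maximum tolerated deviation
        self.last = None  # (t, x, y, vx, vy) most recently reported

    def predicted(self, t):
        """Position the server would assume at time t (linear motion)."""
        t0, x, y, vx, vy = self.last
        return x + vx * (t - t0), y + vy * (t - t0)

    def observe(self, t, x, y, vx, vy):
        """Return True if this sample must be sent to the server."""
        if self.last is not None:
            px, py = self.predicted(t)
            if math.hypot(x - px, y - py) <= self.threshold:
                return False  # prediction is still good enough
        self.last = (t, x, y, vx, vy)
        return True
```

The threshold trades update volume against precision: a larger value yields fewer index updates but a less precise record of each position, which is the kind of approximation mentioned above.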



   CfPC©, updated: 14-nov-05