POSTECH Electronics

Exploiting parallelism in processing large scale multi-dimensional datasets

2017-06-09

▣ Title : Exploiting parallelism in processing large scale multi-dimensional datasets

▣ Speaker : Beomseok Nam (UNIST Assistant Professor)

▣ Date & Time : Friday, March 22 (2:00 ~ 3:30pm)

▣ Place : LG Research Building, Room #101

▣ Host : Prof. Sungjoo Yoo (Tel. 2379)

▣ Abstract :

This talk will present two different ways of exploiting parallelism in processing large-scale multi-dimensional datasets. General-purpose computing on graphics processing units (GPGPU) has emerged as a cost-effective parallel computing paradigm in high-performance computing research, enabling large amounts of scientific data to be processed in parallel. A common access pattern in such scientific data analysis applications is the multi-dimensional range query, but multi-dimensional indexing trees such as R-trees are inherently ill-suited to the GPU environment because of their irregular tree traversal patterns. Traversing an irregular tree search path makes it hard to maximize the utilization of the massively parallel processing units in a GPU. In this talk, I will introduce two novel R-tree traversal algorithms for multi-dimensional indexes, which convert recursive access to hierarchical tree nodes into sequential access.
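
To illustrate the general idea of replacing recursive tree descent with a loop over a flat node array, here is a minimal CPU-side sketch in Python. It is not one of the two algorithms presented in the talk, which the abstract does not specify. The `Node` layout, the `overlaps` helper, the `range_query` function, and the sample `NODES` array are all hypothetical names chosen for this example.

```python
from typing import List, NamedTuple, Optional, Tuple

# Axis-aligned 2-D bounding box: (xmin, ymin, xmax, ymax). Assumed layout.
Box = Tuple[float, float, float, float]


class Node(NamedTuple):
    box: Box                    # bounding box covering everything below this node
    children: Tuple[int, ...]   # indices into the flat node list; empty for a leaf
    item: Optional[str]         # payload stored at a leaf, None for inner nodes


def overlaps(a: Box, b: Box) -> bool:
    """Return True when the two boxes intersect (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_query(nodes: List[Node], query: Box) -> List[str]:
    """Range query with an explicit stack instead of recursive calls."""
    results: List[str] = []
    stack = [0]  # index 0 is assumed to be the root
    while stack:
        node = nodes[stack.pop()]
        if not overlaps(node.box, query):
            continue  # prune the whole subtree
        if node.children:
            stack.extend(node.children)
        else:
            results.append(node.item)
    return results


# A tiny hand-built tree stored as one flat list (hypothetical data).
NODES = [
    Node((0, 0, 10, 10), (1, 2), None),   # root
    Node((0, 0, 4, 4), (3, 4), None),     # inner node
    Node((6, 6, 10, 10), (5,), None),     # inner node
    Node((0, 0, 1, 1), (), "a"),
    Node((3, 3, 4, 4), (), "b"),
    Node((6, 6, 7, 7), (), "c"),
]

if __name__ == "__main__":
    print(sorted(range_query(NODES, (2.5, 2.5, 6.5, 6.5))))
```

Storing the nodes in one contiguous array and visiting them through integer indices is what makes this style of traversal easier to map onto data-parallel hardware than pointer-chasing recursion. A real GPU implementation would also have to address thread divergence and memory access patterns, which this sketch does not attempt.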

The second half of this talk will discuss how to leverage cached data in a distributed cache infrastructure using task parallelism. As more servers are added to distributed and parallel systems, more memory becomes available for caching data objects. However, the cached objects are dispersed across the servers, and traditional query scheduling policies that take only load balancing into account do not effectively utilize the increased cache space. This talk will introduce and compare batch job scheduling policies that employ statistical prediction methods and probability distribution estimations derived from recent queries in order to improve both load balancing and cache hit ratio in a shared-nothing environment.
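
As a rough illustration of how a scheduler might trade cache affinity against load, the following Python sketch tracks an exponential moving average (EMA) of the query keys recently sent to each server and routes each new query to the server with the lowest combined cost. This is an assumed, simplified stand-in, not one of the policies compared in the talk. The `Server` class, the `assign` function, the one-dimensional query key, and the `alpha` and `load_weight` parameters are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str
    center: float     # EMA of recent query keys routed here (proxy for cache contents)
    queued: int = 0   # number of queries currently assigned (proxy for load)


def assign(servers: List[Server], query_key: float,
           alpha: float = 0.3, load_weight: float = 1.0) -> Server:
    """Pick the server minimizing distance-to-EMA plus a load penalty."""
    best = min(
        servers,
        key=lambda s: abs(s.center - query_key) + load_weight * s.queued,
    )
    # Update the chosen server's EMA so it drifts toward the queries it serves.
    best.center = (1 - alpha) * best.center + alpha * query_key
    best.queued += 1
    return best


if __name__ == "__main__":
    pool = [Server("s0", 10.0), Server("s1", 90.0)]
    for k in (12.0, 85.0, 11.0):
        print(k, "->", assign(pool, k).name)
```

With a small `load_weight` the policy favors sending similar queries to the same server, which should raise the cache hit ratio. Raising `load_weight` pushes it toward plain load balancing.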
