Meeting 30 May: “Distributed Algorithm for Large-Scale Generalized Matching”

May 30th, 2013

On May 30th, we have our next DBRG meeting.
Julian Mestre will talk about his recently accepted VLDB 2013 paper “A Distributed Algorithm for Large-Scale Generalized Matching”.

You can access his paper here:

http://sydney.edu.au/engineering/it/~mestre/papers/massive_matching.pdf

Meeting 23 May: “Identifying Hot and Cold Data in Main-Memory Databases”

May 22nd, 2013

Our next DBRG meeting is Thursday, 23 May, at 3pm in SIT room 459.

Paul will present the companion paper to last week’s talk, also from ICDE 2013:
“Identifying Hot and Cold Data in Main-Memory Databases”
by Justin J. Levandoski, Per-Ake Larson, and Radu Stoica

available here:

http://research.microsoft.com/apps/pubs/default.aspx?id=176690

Meeting 16 May: BW-Tree

May 16th, 2013

Our next DBRG meeting is Thursday, 16 May, at 3pm in SIT room 459.

Ian will present an interesting paper from ICDE 2013:
“The Bw-Tree: A B-Tree for New Hardware Platforms”, in Proceedings of the IEEE International Conference on Data Engineering (ICDE 2013), Brisbane, Australia, April 2013,
available here: http://research.microsoft.com/en-us/um/people/justinle/publications.html

The Bw-Tree is similar to a traditional B+-Tree indexing structure, but it is better suited to modern hardware. The structure is latch-free, meaning that threads do not block waiting to enter critical sections. It avoids in-place updates, which helps preserve processor caches. It is also designed to work well with flash as the secondary storage medium, so that slower disk storage is not needed.
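For anyone who wants a feel for the idea before the talk, here is a minimal sketch of how updates can avoid in-place modification: each change is prepended to a page as an immutable delta record, and the new head is installed with a compare-and-swap on a mapping table. All class and function names are made up for illustration. The compare-and-swap is only simulated in plain Python and is not atomic, so this shows the shape of the technique, not a working concurrent index or the authors' implementation.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class BasePage:
    """Immutable base page holding (key, value) pairs."""
    records: tuple


@dataclass(frozen=True)
class Delta:
    """Immutable delta record describing one update to a page."""
    key: str
    value: Optional[str]  # None marks a delete
    next: Union["Delta", BasePage]


class MappingTable:
    """Maps logical page ids to the head of each page's delta chain."""

    def __init__(self):
        self._slots = {}

    def get(self, pid):
        return self._slots[pid]

    def set_initial(self, pid, page):
        self._slots[pid] = page

    def compare_and_swap(self, pid, expected, new):
        # Stand-in for a hardware CAS: succeeds only if the head is still
        # the one we read. NOT atomic in Python; illustration only.
        if self._slots[pid] is expected:
            self._slots[pid] = new
            return True
        return False


def upsert(table, pid, key, value):
    """Prepend a delta instead of changing the page; retry if the CAS fails."""
    while True:
        head = table.get(pid)
        if table.compare_and_swap(pid, head, Delta(key, value, head)):
            return


def lookup(table, pid, key):
    """Walk the delta chain newest to oldest, then fall back to the base page."""
    node = table.get(pid)
    while isinstance(node, Delta):
        if node.key == key:
            return node.value
        node = node.next
    return dict(node.records).get(key)
```

A writer that loses the race simply re-reads the head and tries again, which is why no thread ever has to block on a latch.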

Meeting 18 Apr 2013

April 16th, 2013

The next DBRG meeting is on 18 April, 3pm, in room 459.
Uwe and Alan will give a trip report for last week’s ICDE 2013.

Further, Uwe will also give an overview of the following paper:
“Database engines on multicores, why parallelize when you can distribute?” by Tudor Salomie, Ionut E. Subasu, Jana Giceva, and Gustavo Alonso, Euro-Par 2011.

CU on Thursday

The Adaptive Radix Tree: ARTful Indexing for Main-Memory Database

April 4th, 2013

We have another visitor talk today at 11am in room 459:

“The Adaptive Radix Tree: ARTful Indexing for Main-Memory Database”

Viktor Leis
(TU Munich)

Abstract:
Main memory capacities have grown up to a point where most databases fit into RAM. For main-memory database systems, index structure performance is a critical bottleneck. Traditional in-memory data structures like balanced binary search trees are not efficient on modern hardware, because they do not optimally utilize on-CPU caches. Hash tables, also often used for main-memory indexes, are fast but only support point queries.
To overcome these shortcomings, we present ART, an adaptive radix tree (trie) for efficient indexing in main memory. Its lookup performance surpasses highly tuned, read-only search trees, while supporting very efficient insertions and deletions as well. At the same time, ART is very space efficient and solves the problem of excessive worst-case space consumption, which plagues most radix trees, by adaptively choosing compact and efficient data structures for internal nodes. Even though ART’s performance is comparable to hash tables, it maintains the data in sorted order, which enables additional operations like range scan and prefix lookup.
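As a rough illustration of the "adaptive" part, the sketch below is a byte-wise radix tree whose inner nodes start as a compact sorted list and switch to a 256-slot array once they hold more than a few children. This is a simplification with only two node kinds and hypothetical names (`SmallNode`, `LargeNode`, `AdaptiveTrie`); it leaves out path compression and the other optimizations of the real ART, so treat it as a toy, not as the paper's data structure.

```python
import bisect


class SmallNode:
    """Compact node: sorted parallel lists, suited to few children."""
    CAPACITY = 4

    def __init__(self):
        self.keys, self.children = [], []
        self.value, self.has_value = None, False

    def _index(self, byte):
        return bisect.bisect_left(self.keys, byte)

    def child(self, byte):
        i = self._index(byte)
        if i < len(self.keys) and self.keys[i] == byte:
            return self.children[i]
        return None

    def is_full(self):
        return len(self.keys) >= self.CAPACITY

    def add_child(self, byte, node):
        i = self._index(byte)
        self.keys.insert(i, byte)
        self.children.insert(i, node)

    def replace_child(self, byte, node):
        self.children[self._index(byte)] = node

    def items(self):
        return list(zip(self.keys, self.children))


class LargeNode:
    """Array node: one slot per possible byte value."""

    def __init__(self, small):
        self.slots = [None] * 256
        self.value, self.has_value = small.value, small.has_value
        for byte, node in small.items():
            self.slots[byte] = node

    def child(self, byte):
        return self.slots[byte]

    def is_full(self):
        return False

    def add_child(self, byte, node):
        self.slots[byte] = node

    replace_child = add_child

    def items(self):
        return [(b, c) for b, c in enumerate(self.slots) if c is not None]


class AdaptiveTrie:
    def __init__(self):
        self.root = SmallNode()

    def insert(self, key: bytes, value):
        parent, parent_byte, node = None, None, self.root
        for byte in key:
            nxt = node.child(byte)
            if nxt is None:
                if node.is_full():
                    # Grow the compact node into an array node and re-link it.
                    node = LargeNode(node)
                    if parent is None:
                        self.root = node
                    else:
                        parent.replace_child(parent_byte, node)
                nxt = SmallNode()
                node.add_child(byte, nxt)
            parent, parent_byte, node = node, byte, nxt
        node.value, node.has_value = value, True

    def _find(self, key):
        node = self.root
        for byte in key:
            node = node.child(byte)
            if node is None:
                return None
        return node

    def get(self, key, default=None):
        node = self._find(key)
        return node.value if node is not None and node.has_value else default

    def prefix_scan(self, prefix=b""):
        """Yield (key, value) pairs under a prefix in sorted key order."""
        start = self._find(prefix)
        stack = [] if start is None else [(prefix, start)]
        while stack:
            key, node = stack.pop()
            if node.has_value:
                yield key, node.value
            # Push in reverse so the smallest byte is visited first.
            for byte, child in reversed(node.items()):
                stack.append((key + bytes([byte]), child))
```

Because children are always visited in byte order, `prefix_scan` returns keys sorted, which is the property that a hash table cannot offer.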

Massively Parallel Stream Processing with Stratosphere in the Cloud

April 4th, 2013

Our next meeting is a visitor talk, this Thursday at 11am in the SIT boardroom (note the different time and location):

“Massively Parallel Stream Processing with Stratosphere in the Cloud”

Matthias Sax
(Humboldt University Berlin)

Abstract:
“Big Data” was recently characterized by Stonebraker by the 3 Vs: Volume, Velocity, and Variety. One requirement of velocity is low-latency processing. MapReduce systems, which are very popular in the big data domain, are a good solution for tackling high data volumes. However, being batch-oriented, they do not provide low latency.

In our work we want to extend the MapReduce-like Stratosphere system for stream processing. In a first step we extend the PACT programming model with sliding-window semantics (including hybrid operators, e.g., for joining batch with streaming data). Next, we want to exploit the flexibility in stream processing to support dynamic scaling (elasticity) and adaptive optimization in a compute cloud. Additionally, the current fault-tolerance mechanisms do not work in the streaming case and need to be adapted.
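For readers new to sliding windows, here is a generic sketch of a time-based sliding-window sum over a stream of `(timestamp, value)` events. It is not the PACT or Stratosphere API; the function name and the assumption that events arrive in timestamp order are ours, purely for illustration.

```python
from collections import deque


def sliding_window_sums(events, window_size):
    """For each event, yield (timestamp, sum of values within the window).

    The window for an event at time ts covers (ts - window_size, ts].
    Assumes events arrive ordered by timestamp.
    """
    window = deque()
    total = 0
    for ts, value in events:
        window.append((ts, value))
        total += value
        # Evict events that have slid out of the window.
        while window[0][0] <= ts - window_size:
            _, old_value = window.popleft()
            total -= old_value
        yield ts, total


if __name__ == "__main__":
    stream = [(1, 10), (2, 20), (4, 5), (7, 1)]
    for ts, total in sliding_window_sums(stream, window_size=3):
        print(ts, total)
```

Each result is available as soon as its event arrives, in contrast to a batch job that only produces output once the whole input has been read.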

Meeting 28 Mar 2013

March 26th, 2013

Our next meeting is this Thursday at 3pm in SIT room 459.
Michael Cahill will talk about an interesting paper from the
CIDR 2013 conference earlier this year:

“Executing Long-Running Transactions in Synchronization-Free Main Memory Database Systems”
by Henrik Mühe (TU Munich); Alfons Kemper (TU Munich); Thomas Neumann (TU Munich)

http://www.cidrdb.org/cidr2013/Papers/CIDR13_Paper49.pdf

Hope to CU on Thursday

Meeting 19 Aug 2010

August 17th, 2010

Dear everyone,
 
The next DBRG meeting will be held on Thursday, 19 Aug, around 3pm in SIT Boardroom 124. The speaker will be Uwe, and he will discuss the following paper that will appear at VLDB 2010:
MRShare: Sharing Across Multiple Queries in MapReduce

Please feel free to join and hope to see you all.

Meeting 21 April 2010

April 20th, 2010

The next DBRG meeting is on Wednesday, 21 April, at 10am in room 459.
Uwe will present a paper from the upcoming SIGMOD 2010 on benchmarking cloud infrastructures, which investigates the performance, scalability, and costs of several big cloud infrastructures, including Amazon’s S3, Google App Engine, and Microsoft’s Azure. I think it will be quite an interesting read, and I would expect quite a controversial discussion about it at SIGMOD.

Please feel free to join

Meeting 30 October

October 29th, 2009

The next DBRG meeting is on Friday, 30 October, at 11am in room 459.
Akon Dey will present the following CACM paper, which describes an attempt to use a declarative language (and database techniques for optimizing its processing) to describe network protocols. This paper appeared in the most recent CACM as a “research highlight”.

Please feel free to join