Call for Grand Challenge Solutions

The DEBS Grand Challenge is a series of competitions that started in 2010, in which participants from both academia and industry compete with the goal of building faster and more scalable distributed and event-based systems that solve a practical problem. Every year, the DEBS Grand Challenge participants have a chance to explore a new data set and a new problem and can compare their results based on common evaluation criteria. The winners of the challenge are announced during the conference. The 2021 DEBS Grand Challenge focuses on environmental data from a large number of geo-distributed air quality sensors. The goal of the challenge is to detect geographical areas where the air quality has changed most significantly due to lock-down measures during the first wave of SARS-CoV-2 infections.

Topic

The outbreak of COVID-19 (caused by the SARS-CoV-2 virus) has disrupted the world. In March 2020, significant parts of the world came to a standstill when governments announced lockdowns. These lockdowns had multidimensional effects and affected everyone and everything differently. One rather surprising effect of the lockdown is that it improved the air quality due to reduced mobility and traffic. The Air Quality Index (AQI) was observed to improve throughout the world [1][2] during the lockdown. In this year's Grand Challenge, we will use an air quality dataset to find which areas improved the most in terms of AQI compared to the average AQI of the previous year.

Participation

Participation in the DEBS 2021 Grand Challenge consists of three steps: (1) registration, (2) iterative solution submission, and (3) paper submission. The first step is to pre-register your submission by registering your abstract at EasyChair (https://easychair.org/my/conference?conf=debs2021) in the "Grand Challenge Track" and to send an e-mail to one of the Grand Challenge Co-Chairs (see https://2021.debs.org/organizing-committee/). Solutions to the challenge, once developed, must be submitted to the evaluation platform in order to get them benchmarked in the challenge. The evaluation platform provides detailed feedback on performance and allows updating the solution in an iterative process. A solution can be continuously improved until the challenge closing date. Evaluation results of the last submitted solution will be used for the final performance ranking. The last step is to upload a short paper (minimum 2 pages, maximum 6 pages) describing the final solution to the EasyChair system (a link will be provided at a later point in time). All papers will be reviewed by the DEBS Grand Challenge Committee to assess the merit and originality of submitted solutions. All solutions of sufficient quality will be presented during the poster session at the DEBS 2021 conference.

Dataset

DEBS Grand Challenge 2021 uses the Luftdaten dataset (https://luftdaten.info/). Luftdaten (German for "air data") is a community-driven network of sensors that measures environmental data. The Luftdaten network consists of more than 11,000 sensors installed in 70+ countries around the globe, which has generated around 10B measurements to date. We preprocessed the raw data by filtering out events with missing values, such as missing location information or missing particle measurements. There might be gaps in the data, but in general the data quality is very high. The raw data is mirrored on the challenge website for download. The data should be consumed through an API. The API key is provided through the grand challenge platform.

message Measurement {
	Timestamp timestamp = 1; 
	float latitude = 2;
	float longitude = 3;
	float p1 = 4; //Particles < 10µm (particulate matter)
	float p2 = 5; //Particles < 2.5µm (ultrafine particles)
}

The API provides the data in batches. For testing, you can choose a batch size that fits your connection speed, up to a maximum size of 20k. A batch contains two fields with lists of Measurements. The field current contains the measurements from this year, and the field lastyear contains the corresponding measurements from 365 days earlier. Keep in mind that 2020 is a leap year. The number of events in current and in lastyear each equals the batch size. The flag "last" is set to true when the last batch is sent.

message Batch {
	int64 seq_id = 1;
	bool last = 2; //Set to true when it is the last batch
	repeated Measurement current = 3;
	repeated Measurement lastyear = 4;
}
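
Below is a minimal Python sketch of how a solution might pull batches from the API and report results over gRPC. The module names (challenger_pb2, challenger_pb2_grpc), the stub name, the endpoint, and all RPC names except startBenchmark/endBenchmark (which are referenced in the evaluation criteria below) are assumptions; consult the example code on the challenge platform for the exact interface.

import grpc

# Stubs generated from the challenge .proto (e.g. via python -m grpc_tools.protoc);
# module, service, method and field names not defined in this call are assumptions.
import challenger_pb2 as ch
import challenger_pb2_grpc as api


def process_query1(batch):
    # Placeholder: build the real ResultQ1 (top 50 improved cities) here.
    return ch.ResultQ1(batch_seq_id=batch.seq_id)


def process_query2(batch):
    # Placeholder: build the real ResultQ2 (streak histogram) here.
    return ch.ResultQ2(batch_seq_id=batch.seq_id)


def run(api_key: str) -> None:
    # Endpoint host and port are illustrative only.
    with grpc.insecure_channel("challenge.msrg.in.tum.de:5023") as channel:
        stub = api.ChallengerStub(channel)

        benchmark = stub.createNewBenchmark(
            ch.BenchmarkConfiguration(token=api_key, batch_size=1000))
        stub.startBenchmark(benchmark)

        while True:
            batch = stub.nextBatch(benchmark)     # Batch with current + lastyear
            stub.resultQ1(process_query1(batch))  # results must be sent in order
            stub.resultQ2(process_query2(batch))
            if batch.last:                        # true for the final batch
                break

        stub.endBenchmark(benchmark)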

Further, we provide a dataset of zip codes and the corresponding polygons of Germany through our API.

message Point { double longitude = 1; double latitude = 2;}
message Polygon { repeated Point points = 1; }
message Location {
	string zipcode = 1;
	string city = 2;
	double qkm = 3; //area in square kilometres (qkm = Quadratkilometer)
	int32 population = 4;
	repeated Polygon polygons = 5;
}

The final evaluation will be conducted on VMs provided by the challenge organizers with the same configuration (batch size, date from and to) for every team.

Query 1

The query returns the top 50 cities in terms of air quality improvement, together with their current air quality indices. We first introduce the watermark for the time-based windows, then the Air Quality Index (AQI), then how the improvement compared to the previous year is calculated, and finally which additional data should be added to the result.
The watermark for the result at the end of a batch is the timestamp of the last measurement among the current values. The results which are returned are taken from windows sized relative to the watermark.
The AQI for each city is calculated from the average particle values over a sliding window of 24h. The windows are sized relative to the watermark for the result of the batch or, when a snapshot is taken, relative to the point in time of the snapshot. For the current window this means it covers measurements from watermark - 24h up to the watermark; for the previous year, measurements older than watermark - 365 days - 24h fall out. The average particle values over the sliding window are then mapped to the corresponding AQIp1 and AQIp2 using a lookup table and a formula; the table differs for p1 and p2 values. The AQI is then the maximum of AQIp1 and AQIp2. For further details and resources on how to calculate it, see "How is the AQI calculated?"
To compute the improvement, every 5 minutes a snapshot of the AQI of the current year and the AQI of the previous year is taken and inserted into a separate window. The average AQI for the final result is derived from windows sized relative to the watermark: for the current AQI average, the window reaches back to the time of the snapshot - 5 days; for the previous year's AQI average, to the time of the snapshot - 365 days - 5 days.
A batch can contain less than a few seconds of data or more than 5 minutes of data, depending on the batch size. The snapshots are taken independently of the batch size, every 5 minutes based on the timestamps of the current measurements, i.e. at 8:00, 8:05, 8:10, and so on. Each time, the maximum of AQIp1 and AQIp2 for both the historic and the current windows is taken from windows resized according to the snapshot time: for the current window this means snapshot time - 24h, for the last year's window snapshot time - 365 days - 24h.
Finally, when all measurements of the batch are processed, the results are summarized. Only active cities should appear in the result; the criterion for being active is that at least one current measurement has been received in the last 10 minutes. The cities are then ranked according to the improvement of the 5-day average AQI compared to the previous year (derived from a window sized relative to the last measurement). Additionally, the current AQIp1 and AQIp2 are added per city. If no result can be computed because less than 5 minutes of events have been processed, an empty list is expected; the same applies if no sensor is active.

message TopKCities {
 int32 position = 1; //begin with 1
 string city = 2; //full name of the city according the locations
 int32 averageAQIImprovement = 3; //5D average improvement compared to previous year
 int32 currentAQIP1 = 5; //current AQI for P1 values (based on 24h average)
 int32 currentAQIP2 = 6; //current AQI for P2 values (based on 24h average)
} //(all AQI values are first rounded to 3 digits, then multiplied by 1000)
message ResultQ1 {
 int64 benchmark_id = 1;
 int64 batch_seq_id = 2;  //both are used to correlate the latency
 repeated TopKCities topkimproved = 3; //top 50 improved cities
}
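
To make the window bookkeeping above concrete, here is a hedged Python sketch of the snapshot and ranking logic. It assumes a helper aqi_of(measurements) that maps a window of measurements to (AQIp1, AQIp2) (see "How is the AQI calculated?"); city assignment, the exact treatment of the final result window, and the output rounding are omitted, and the data structures are illustrative only.

from collections import defaultdict, deque

DAY = 24 * 3600
YEAR = 365 * DAY
SNAPSHOT_INTERVAL = 5 * 60        # snapshots at :00, :05, :10, ... of stream time
AQI_WINDOW = DAY                  # 24 h particle window
IMPROVEMENT_WINDOW = 5 * DAY      # 5-day window over snapshots

# Timestamps in this sketch are epoch seconds.
current = defaultdict(deque)      # city -> deque of (ts, p1, p2), this year
lastyear = defaultdict(deque)     # city -> deque of (ts, p1, p2), previous year
snapshots = defaultdict(deque)    # city -> deque of (ts, aqi_cur, aqi_prev)
last_snapshot = None              # timestamp of the most recent snapshot


def evict(window, newest_ts, length):
    # Drop entries older than newest_ts - length (sliding-window semantics).
    while window and window[0][0] < newest_ts - length:
        window.popleft()


def take_snapshots(watermark, aqi_of):
    # Emit one snapshot per elapsed 5-minute boundary up to the watermark.
    global last_snapshot
    if last_snapshot is None:
        last_snapshot = watermark - watermark % SNAPSHOT_INTERVAL
    while last_snapshot + SNAPSHOT_INTERVAL <= watermark:
        last_snapshot += SNAPSHOT_INTERVAL
        t = last_snapshot
        for city in set(current) | set(lastyear):
            if not current[city] or not lastyear[city]:
                continue  # need data in both windows to compare with last year
            evict(current[city], t, AQI_WINDOW)          # keep [t - 24h, t]
            evict(lastyear[city], t - YEAR, AQI_WINDOW)  # keep [t - 365d - 24h, t - 365d]
            aqi_cur = max(aqi_of(current[city]))         # max of (AQIp1, AQIp2)
            aqi_prev = max(aqi_of(lastyear[city]))
            snapshots[city].append((t, aqi_cur, aqi_prev))
            evict(snapshots[city], t, IMPROVEMENT_WINDOW)  # keep last 5 days of snapshots


def top_50(watermark):
    # Rank active cities by improvement of the 5-day average AQI over last year.
    ranking = []
    for city, snaps in snapshots.items():
        active = current[city] and current[city][-1][0] >= watermark - 10 * 60
        if not active or not snaps:
            continue
        avg_cur = sum(s[1] for s in snaps) / len(snaps)
        avg_prev = sum(s[2] for s in snaps) / len(snaps)
        ranking.append((avg_prev - avg_cur, city))  # positive value = improvement
    ranking.sort(reverse=True)
    return ranking[:50]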

Query 2

The result of this query is a histogram of the longest streaks of good air quality. When the histogram is shown on a dashboard, one might conclude, for example, that 4.3% of the cities have had good air quality for the last 7 days. A streak is defined as the time span in seconds since a city flipped to a "good" value according to the AQI calculation (or, in case the city is good from the beginning, since the start of the operator). In case a city has bad air quality and then goes "good" again, a new streak begins. By that definition, only cities which are currently "good" are included; all others are considered bad.
The histogram has 14 buckets of equal length between 0 and the maximum length. Since the query should output values from the start, the maximum length is initially the time span between the first measurement received and the last measurement processed. After one week of data has been processed, the maximum length is capped at 7 days, i.e. 604800 seconds.
Only cities which are active should be considered in the buckets (see "When is a region or city active?"). For input, only consider the measurements from the field current of the batch.

message TopKStreaks {
 int32 bucket_from = 1; //First bucket begins with 0
 string bucket_to = 2; //First bucket ends with maximum_length/20
 int32 percent = 3; //Percent of active cities in this bucket *1000
} //(percent is first rounded to 3 digits, then multiplied by 1000)
message ResultQ2 {
 int64 benchmark_id = 1;
 int64 batch_seq_id = 2;
 repeated TopKStreaks topkstreaks = 3; //one entry per histogram bucket
}
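
As an illustration of the bucketing step, the Python sketch below distributes the current streak lengths of the active, currently "good" cities over 14 equal-width buckets, as described above. The streak tracking itself and the exact denominator for the percentages are assumptions here and should be checked against the clarifications page.

WEEK = 7 * 24 * 3600  # 604800 seconds

def streak_histogram(streaks, span_seconds, buckets=14):
    # streaks: current streak length in seconds for every active city that is
    # currently "good"; span_seconds: time between the first measurement
    # received and the last measurement processed.
    maximum_length = min(span_seconds, WEEK)  # capped at 7 days after one week of data
    width = maximum_length / buckets
    counts = [0] * buckets
    for s in streaks:
        index = min(int(s // width), buckets - 1) if width > 0 else 0
        counts[index] += 1
    total = len(streaks)
    histogram = []
    for i, c in enumerate(counts):
        percent = (c / total * 100) if total else 0.0
        histogram.append({
            "bucket_from": int(i * width),
            "bucket_to": int((i + 1) * width),
            "percent": round(percent * 1000),  # rounded to 3 digits, then * 1000
        })
    return histogram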

Both queries Q1 and Q2 should run in parallel. Additional clarifications to queries will be provided here: https://challenge.msrg.in.tum.de/documentation

Definitions

What is a city? Over the API you can retrieve a list of Locations. A city has multiple zip codes, and each zip code has multiple polygons. A polygon is a list of points that encloses a region, and a point is a pair of longitude and latitude values. As each polygon is closed, the first point is the same as the last point. The polygons only cover Germany, since most sensors are deployed in that region. If a sensor reading does not fall within any polygon, it is discarded.
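
For illustration, a measurement can be mapped to a city with a standard point-in-polygon (ray-casting) test over the Location polygons. The Python sketch below assumes the locations have already been fetched and flattened into (city, polygons) pairs; a real solution would add a spatial index rather than scanning all polygons per measurement.

def point_in_polygon(lon, lat, points):
    # points: list of (longitude, latitude) pairs forming a closed ring.
    inside = False
    j = len(points) - 1
    for i in range(len(points)):
        xi, yi = points[i]
        xj, yj = points[j]
        # Count crossings of a horizontal ray starting at (lon, lat).
        if (yi > lat) != (yj > lat) and lon < (xj - xi) * (lat - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

def city_of(lon, lat, locations):
    # locations: iterable of (city_name, [polygon, ...]) where a polygon is a
    # list of (longitude, latitude) pairs; returns None if the reading lies
    # outside all polygons (such readings are discarded).
    for city, polygons in locations:
        if any(point_in_polygon(lon, lat, poly) for poly in polygons):
            return city
    return None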

When is a city or region active? A city is active when in the last 10 min at least one measurement was received for the current year.

What exactly is the watermark? The watermark is the maximum timestamp received in the current events. In case no current events are available (this can be the case when small batch sizes are chosen) and only lastyear events are available, the watermark is the maximum timestamp of lastyear + 365 days. Keep in mind that the timestamp consists of two fields, seconds and nanoseconds since epoch. Parts of the raw data do not contain nanoseconds, others may. When converting the timestamp in the events to a date-time data structure, also consider the nanoseconds.
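
A small Python sketch of this watermark rule, operating on the Measurement and Batch messages above; keeping the previous watermark for a completely empty batch is an assumption, not part of the specification.

DAY_NS = 24 * 3600 * 1_000_000_000

def to_nanos(ts):
    # Combine seconds and nanoseconds since epoch into a single integer so that
    # measurements with and without a nanosecond part compare correctly.
    return ts.seconds * 1_000_000_000 + ts.nanos

def watermark(batch, previous=None):
    if batch.current:
        return max(to_nanos(m.timestamp) for m in batch.current)
    if batch.lastyear:
        return max(to_nanos(m.timestamp) for m in batch.lastyear) + 365 * DAY_NS
    return previous  # assumption: an empty batch keeps the previous watermark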

How is the AQI calculated? The AQI can be calculated from multiple pollutants, such as O3 parts per billion (ppb), the density of suspended particulate matter with a diameter below 2.5 µm (PM2.5, or simply P2), the density of suspended particulate matter with a diameter below 10 µm (PM10, or simply P1), CO parts per million (ppm), SO2 ppb, and NO2 ppb. If multiple pollutant values are available, you should calculate the AQI from all pollutants and take the highest value. In this challenge, we will only use P2 and P1 to calculate the AQI, using the formula devised by the United States Environmental Protection Agency (EPA). The link below provides the lookup tables and the formula to determine the AQI.

AQI Equation and lookup tables:
https://www.airnow.gov/sites/default/files/2018-05/aqi-technical-assistance-document-may2016.pdf
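
For orientation, the EPA computation boils down to a linear interpolation between breakpoints: AQI = (I_hi - I_lo) / (C_hi - C_lo) * (C - C_lo) + I_lo for the breakpoint range containing the concentration C. The Python sketch below transcribes the 24-hour PM2.5 and PM10 breakpoints from the linked document; please verify them against the original table.

import math

PM25_BREAKPOINTS = [  # (C_lo, C_hi, I_lo, I_hi), 24 h average in µg/m³, for P2
    (0.0, 12.0, 0, 50), (12.1, 35.4, 51, 100), (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200), (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400), (350.5, 500.4, 401, 500),
]
PM10_BREAKPOINTS = [  # (C_lo, C_hi, I_lo, I_hi), 24 h average in µg/m³, for P1
    (0, 54, 0, 50), (55, 154, 51, 100), (155, 254, 101, 150),
    (255, 354, 151, 200), (355, 424, 201, 300),
    (425, 504, 301, 400), (505, 604, 401, 500),
]

def truncate(value, decimals):
    # The EPA document truncates concentrations (PM10 to integers, PM2.5 to one
    # decimal) before the lookup; rounding first absorbs floating-point noise.
    factor = 10 ** decimals
    return math.floor(round(value * factor, 6)) / factor

def aqi(concentration, breakpoints):
    # Linear interpolation within the breakpoint range containing the value.
    for c_lo, c_hi, i_lo, i_hi in breakpoints:
        if c_lo <= concentration <= c_hi:
            return (i_hi - i_lo) / (c_hi - c_lo) * (concentration - c_lo) + i_lo
    return 500.0  # above the highest breakpoint

def city_aqi(avg_p1, avg_p2):
    # AQI of a city = maximum of AQIp1 and AQIp2, both from 24 h averages.
    aqi_p1 = aqi(truncate(avg_p1, 0), PM10_BREAKPOINTS)
    aqi_p2 = aqi(truncate(avg_p2, 1), PM25_BREAKPOINTS)
    return max(aqi_p1, aqi_p2)

For example, city_aqi(18.0, 30.0) evaluates to roughly 88.6, driven by the P2 sub-index.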

Awards and Selection Process

Participants of the challenge compete for two awards: (1) the performance award and (2) the audience award. The winner of the performance award will be determined through the automated evaluation platform, according to the evaluation criteria specified below. Evaluation criteria measure the speed and correctness of submitted solutions. The winning team of the performance award will receive 1000 USD as prize money. The winner of the audience award will be determined amongst the finalists who present in the Grand Challenge session of the DEBS conference. In this session, the audience will be asked to vote for the solution with the most interesting concepts. The solution with the highest number of votes wins. The intention of the audience award is to highlight the qualities of the solutions that are not tied to performance. Specifically, the audience and challenge participants are encouraged to consider not only the functional correctness but also the practicability of the solution.

Regarding the practicability of a solution, we want to encourage participants to push beyond solving a problem correctly only in the functional sense. Instead, the aim is to provide a reusable and extensible solution that is of value beyond this year's Grand Challenge alone. In particular, we will reward those solutions that adhere to a list of non-functional requirements (see below). These requirements are driven by industry use cases and scenarios, so that we make sure solutions are applicable in practical settings.

Thus, every submission will be evaluated for how it addresses our non-functional requirements in its design and implementation while assuming that all submitted solutions strive for maximum performance. In this regard, we distinguish between hard and soft non-functional requirements.

  • Hard criteria are must-haves that have to be addressed in the design and must be implemented.
  • Soft criteria do not necessarily have to be implemented but must be covered in the submission's description.

Examples of hard criteria are: configurability, scalability (horizontal scalability is preferred), operational reliability/resilience, accessibility of the solution's source code, integration with standards (tools/protocols), and documentation.

Examples of soft criteria are: security measures implemented/addressed, deployment support, portability/maintainability, and support of special hardware (e.g., FPGAs, GPUs, SDNs, ...).

In order to achieve all non-functional requirements, we encourage participants not to build their solution from scratch, but to consider widely-used, industry-strength open-source platforms like those curated by recognized open source foundations (https://opensource.com/resources/organizations). We promote these platforms not only because they already have a diverse user base and ecosystem, but because most of them already address non-functional requirements that are paramount for use in practice.

There are two ways for teams to become finalists and get a presentation slot in the Grand Challenge session during the DEBS Conference: (1) up to two teams with the best performance (according to the final evaluation) will be nominated; (2) the Grand Challenge Program Committee will review the submitted papers for each solution and nominate up to two teams with the most novel concepts. All submissions of sufficient quality that do not make it to the finals will get a chance to be presented at the DEBS conference as posters. The quality of the submissions will be determined based on the review process performed by the DEBS Grand Challenge Program Committee; in this process, the non-functional requirements will receive special attention. Furthermore, participants are expected to present, in their presentation slot, details on which non-functional requirements are implemented and to what level of sophistication. The audience will be encouraged to pay special attention to these details when voting for the best solution.

Important Dates

  • Release of the challenge, initial data set, and a reference implementation: December 15th, 2020
  • API endpoint for development: February 15th, 2021
  • Announcement of challenge parameters (batch size, latency/throughput): February 15th, 2021
  • Evaluation platform (VMs): February 15th, 2021
  • Deadline for uploading the final solution to the evaluation platform: April 12th, 2021 (extended from April 5th, 2021)
  • Deadline for short paper submission: April 26th, 2021 (extended from April 19th, 2021)
  • Notification of acceptance: May 10th, 2021 (extended from May 3rd, 2021)

Evaluation for Performance Award

The evaluation of both queries addresses two aspects: (1) correctness of results and (2) processing performance. The first is assessed by comparing the results of a proposed solution with those of our baseline; only solutions that produce correct results are considered valid. The performance is captured with two metrics: the total runtime (rank_th) and the latency (rank_q1, rank_q2). The total runtime (rank_th) is the time span between the RPCs startBenchmark and endBenchmark. Between start and end, each Batch and the corresponding results need to be submitted, and the results of each query must be submitted in order.

For each query, the latency is the time span between retrieving a batch and submitting the corresponding results; we take the 90th percentile of this time span. Each query is measured and ranked individually. The lower the latency, the higher the position in the ranking.

The final rank is calculated as (rank_th x 4 + rank_q1 + rank_q2) / 6. The lowest final rank wins the performance award. The paper review score will be used in case of ties.
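
For illustration: a team ranked 2nd on total runtime, 3rd on Query 1 latency, and 1st on Query 2 latency would obtain a final score of (2 x 4 + 3 + 1) / 6 = 2.0.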

Challenge Platform and API

We provide a gRPC-based API for the challenge. Documentation and example code are available on the challenge platform, and a dashboard is available as well.

Send us an e-mail and we will grant you access to the platform, which is reachable at http://challenge.msrg.in.tum.de. There you can also find further documentation and example code to get started.

Important Dates

All dates are Anywhere on Earth (AoE).

  • Abstract Submission for Research Track: March 15th, 2021 (extended from February 26th, 2021)

Submission Dates
  • Research Paper Submission: March 22nd, 2021 (extended from March 5th, 2021)
  • Industry Paper Submission: March 29th, 2021 (extended from March 15th, 2021)
  • Tutorial Proposal Submission: April 12th, 2021
  • Grand Challenge Solution Submission: April 26th, 2021 (extended from April 19th, 2021)
  • Doctoral Symposium Submission: May 12th, 2021 (extended from May 10th, 2021)
  • Poster & Demo Submission: May 12th, 2021 (extended from May 10th, 2021)

Notification Dates
  • Author Notification Research Track: May 3rd, 2021 (extended from April 19th, 2021)
  • Author Notification Industry Track: May 3rd, 2021 (extended from April 19th, 2021)
  • Author Notification Tutorials: April 26th, 2021
  • Author Notification Grand Challenge: May 10th, 2021 (extended from May 3rd, 2021)
  • Author Notification Doctoral Symposium: May 24th, 2021 (extended from May 3rd, 2021)
  • Author Notification Poster & Demo: May 24th, 2021 (extended from May 10th, 2021)

Conference
  • Camera Ready for All Tracks: May 28th, 2021 (extended from May 14th, 2021)
  • Conference: June 28th – July 2nd, 2021

Sponsored by

Euranova, SIGHUB, and Infront