ReLAQS: Reducing Latency for Multi-Tenant Approximate Queries via Scheduling

Authors: Logan Stafman (Princeton University), Andrew Or (Princeton University), Michael J. Freedman (Princeton University)

Middleware '19: Proceedings of the 20th International Middleware Conference, December 2019, Pages 280–292. https://doi.org/10.1145/3361525.3361553

Published: 09 December 2019

ABSTRACT

Approximate query processing has become increasingly popular as growing data sizes have increased query latency in distributed query processing systems. To provide approximate results, these systems return intermediate results and iteratively refine them as they process more data. In shared clusters, however, they waste resources by allocating them to queries whose results are no longer improving for users.

We describe ReLAQS, a cluster scheduling system for online aggregation queries that aims to reduce latency by assigning resources to the queries with the most potential for improvement. ReLAQS uses the approximate results each query returns to periodically estimate how much progress each concurrent query is currently making. It then uses these estimates to predict how much progress each query is expected to make in the near future, and redistributes resources in real time to maximize the overall quality of the answers returned across the cluster. Experiments show that ReLAQS reduces latency by up to 47% compared to traditional fair schedulers.
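
The estimate, predict, and redistribute loop described above can be illustrated with a minimal sketch. This is not the ReLAQS implementation: the abstract does not specify the prediction model or the allocation rule, so the sketch assumes that a query's error shrinks in proportion to 1/sqrt(rows processed), which is the usual behavior of sampling-based aggregates, and it assigns cores greedily to whichever query has the largest predicted error reduction. The names `QueryState`, `predicted_error`, `allocate`, and `rows_per_core` are hypothetical.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass
class QueryState:
    name: str
    rows_seen: int   # rows processed so far (must be positive)
    error: float     # current error estimate, e.g. relative CI width


def predicted_error(q: QueryState, cores: int, rows_per_core: int) -> float:
    """Predict the error after one more epoch with `cores` cores.

    Assumption (not from the paper): error scales as 1/sqrt(rows processed).
    """
    if q.rows_seen <= 0:
        raise ValueError("rows_seen must be positive")
    total = q.rows_seen + cores * rows_per_core
    return q.error * math.sqrt(q.rows_seen / total)


def allocate(queries, total_cores: int, rows_per_core: int) -> dict:
    """Greedily hand out cores by largest predicted marginal error reduction."""
    alloc = {q.name: 0 for q in queries}

    def gain(q: QueryState, k: int) -> float:
        # Predicted error reduction from giving q its (k + 1)-th core.
        return (predicted_error(q, k, rows_per_core)
                - predicted_error(q, k + 1, rows_per_core))

    # Max-heap via negated gains; the index breaks ties deterministically.
    heap = [(-gain(q, 0), i) for i, q in enumerate(queries)]
    heapq.heapify(heap)
    for _ in range(total_cores):
        if not heap:
            break
        _, i = heapq.heappop(heap)
        q = queries[i]
        alloc[q.name] += 1
        heapq.heappush(heap, (-gain(q, alloc[q.name]), i))
    return alloc


if __name__ == "__main__":
    queries = [
        QueryState("A", rows_seen=1_000, error=0.20),    # still improving quickly
        QueryState("B", rows_seen=100_000, error=0.01),  # nearly converged
    ]
    print(allocate(queries, total_cores=4, rows_per_core=1_000))
```

Under the assumed error curve each extra core yields a smaller gain than the one before it, which is why a one-core-at-a-time greedy pass is a reasonable fit here. A fair scheduler would instead split the four cores evenly, giving two to the nearly converged query B.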

Published in

Middleware '19: Proceedings of the 20th International Middleware Conference
December 2019
342 pages
ISBN: 9781450370097
DOI: 10.1145/3361525

Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
approximate computing
scheduling
utility-aware scheduling
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 203 of 948 submissions, 21%
