ABSTRACT
Approximate Query Processing has become increasingly popular as growing data sizes have increased query latency in distributed query processing systems. To provide approximate results, these systems return intermediate results and iteratively refine them as they process more data. In shared clusters, however, they waste resources by continuing to allocate them to queries whose results are no longer improving for users.
We describe ReLAQS, a cluster scheduling system for online aggregation queries that aims to reduce latency by assigning resources to the queries with the most potential for improvement. ReLAQS uses the approximate results each query returns to periodically estimate how much progress each concurrent query is currently making. It then uses this information to predict how much progress each query is expected to make in the near future and redistributes resources in real time to maximize the overall quality of the answers returned across the cluster. Experiments show that ReLAQS reduces latency by up to 47% compared to traditional fair schedulers.
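The estimate, predict, and redistribute loop described above can be illustrated with a minimal sketch. This is not the ReLAQS algorithm itself: the function names, the error-history input, and the geometric per-core decay model are all assumptions made for illustration. The sketch predicts each query's error reduction for the next epoch from its last two error estimates, then hands out cores one at a time to whichever query has the largest predicted marginal gain.

```python
from typing import Dict, List


def predicted_gain(errors: List[float], cores: int) -> float:
    """Predicted error reduction over the next epoch with `cores` cores.

    Assumption (not from the paper): each extra core shrinks the error by
    the same factor that was observed between the last two epochs.
    """
    if len(errors) < 2 or errors[-2] <= 0 or cores == 0:
        return 0.0
    # Ratio of consecutive errors, clamped to [0, 1]; 1 means no progress.
    ratio = min(max(errors[-1] / errors[-2], 0.0), 1.0)
    return errors[-1] * (1.0 - ratio ** cores)


def allocate(history: Dict[str, List[float]], total_cores: int) -> Dict[str, int]:
    """Greedily assign cores to the query with the largest marginal gain."""
    alloc = {query: 0 for query in history}
    if not history:
        return alloc
    for _ in range(total_cores):
        best = max(
            history,
            key=lambda q: predicted_gain(history[q], alloc[q] + 1)
            - predicted_gain(history[q], alloc[q]),
        )
        alloc[best] += 1
    return alloc


# Hypothetical error histories: "a" is still improving quickly,
# while "b" has nearly converged.
history = {"a": [0.5, 0.25], "b": [0.10, 0.099]}
allocation = allocate(history, total_cores=4)
```

With these made-up inputs, all four cores go to query `a`, because its predicted marginal gain stays above that of the nearly converged query `b`. A fair scheduler would instead split the cores evenly, which is the kind of waste the abstract describes.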
Index Terms
- ReLAQS: Reducing Latency for Multi-Tenant Approximate Queries via Scheduling