
MLI: An API for Distributed Machine Learning

Evan R. Sparks, Ameet Talwalkar, Virginia Smith, Jey Kottalam, Xinghao Pan, Joseph Gonzalez, Michael J. Franklin, Michael I. Jordan, Tim Kraska
2013 2013 IEEE 13th International Conference on Data Mining  
MLI is an Application Programming Interface designed to address the challenges of building Machine Learning algorithms in a distributed setting based on data-centric computing.  ...  Our initial results show that, relative to existing systems, this interface can be used to build distributed implementations of a wide variety of common Machine Learning algorithms with minimal complexity  ...  CONCLUSION We have presented MLI, an API for building scalable distributed machine learning algorithms.  ... 
doi:10.1109/icdm.2013.158 dblp:conf/icdm/SparksTSKPGFJK13 fatcat:zs2qid3eqbbwlk4w2o7bi6k3xm

MLI: An API for Distributed Machine Learning [article]

Evan R. Sparks, Ameet Talwalkar, Virginia Smith, Jey Kottalam, Xinghao Pan, Joseph Gonzalez, Michael J. Franklin, Michael I. Jordan, Tim Kraska
2013 arXiv   pre-print
MLI is an Application Programming Interface designed to address the challenges of building Machine Learning algorithms in a distributed setting based on data-centric computing.  ...  Our initial results show that, relative to existing systems, this interface can be used to build distributed implementations of a wide variety of common Machine Learning algorithms with minimal complexity  ...  CONCLUSION We have presented MLI, an API for building scalable distributed machine learning algorithms.  ... 
arXiv:1310.5426v2 fatcat:tzjolo6bubbfvkra5iddgdg4pu

Towards an Analytics Query Engine

Nantia Makrynioti, Vasilis Vassalos
2016 International Conference on Extending Database Technology  
This vision paper presents new challenges and opportunities in the area of distributed data analytics, at the core of which are data mining and machine learning.  ...  We argue that these aspects will emerge as important issues for the data management community in the next years and propose promising research directions for solving them.  ...  We also consider the sets of operations provided by MLI API, MLbase and Spark ML [3] good attempts, as except from relational operators, they are also closer to the semantics of machine learning area by  ... 
dblp:conf/edbt/MakryniotiV16 fatcat:yhrewhzlhne3tdcevvyhigguha

Comparison of Proprioception between Kinesiology and Adhesive Ankle Taping: A Randomised Experimental Study

Dimitrios Athanasiadis, Konstantinos Papadopoulos
2019 Journal of Novel Physiotherapies  
The purpose of the first day was to eliminate the learning effect. Participants were introduced to the BBS machine with a full-scale protocol. No measurements were recorded.  ...  Although the Tests of Within-Subjects Effects for the API and MLI variables were not statistically significant, the comparison between the non-taped and KT parameters at the API variable were statistically  ... 
doi:10.4172/2165-7025.1000406 fatcat:tb3hnprymfhd7ozfdvwgzct2qu

RABID: A Distributed Parallel R for Large Datasets

Hao Lin, Shuo Yang, Samuel P. Midkiff
2014 2014 IEEE International Congress on Big Data  
This paper describes highly parallel R system called RABID (R Analytics for BIg Data) that maintains R compatibility, leverages the MapReduce-like distributed Spark [22] and achieves high performance and  ...  Large-scale data mining and deep data analysis are increasingly important for both enterprise and scientific applications.  ...  MLI [19] is a set of higher level APIs providing machine learning programming abstractions.  ... 
doi:10.1109/bigdata.congress.2014.107 dblp:conf/bigdata/LinYM14 fatcat:zncxc5ygkzfp5abmpc5aovoefq

Contaminant Removal for Android Malware Detection Systems [article]

Lichao Sun, Xiaokai Wei, Jiawei Zhang, Lifang He, Philip S. Yu and Witawas Srisa-an
2017 arXiv   pre-print
To address this issue, we introduce PUDROID (Positive and Unlabeled learning-based malware detection for Android) to automatically and effectively remove contaminants from training datasets, allowing machine  ...  Second, analysts and researchers who rely on machine learning based detection techniques may also download these apps and mistakenly label them as benign since they have not been disclosed as malware.  ...  The network traffic is then recorded and then analyzed as features for machine learning. For example, Shabtai et al.  ... 
arXiv:1711.02715v2 fatcat:6ilmjrcntfhrpnxayqwfl5brku

Mining Big Data Using Modified Induction Tree Approach

Chintan Bhatt, C Bhensdadia
2016 International Journal of Intelligent Engineering and Systems  
Induction Tree, for example, C4.5 is the most favored technique since it functions well under any dataset set being utilized.  ...  MLlib and Spark (the other segments are MLI) are an API for computation advancement, and ML Optimizer, for robotization of the alteration of hyperparameters.  ...  : It is open-source, distributed learning library for deep-learning, written in Java and Scala.  ... 
doi:10.22266/ijies2016.0630.03 fatcat:4zhmb4aqtffndobb6mbvu5zbbm

A survey of open source tools for machine learning with big data in the Hadoop ecosystem

Sara Landset, Taghi M. Khoshgoftaar, Aaron N. Richter, Tawfiq Hasanin
2015 Journal of Big Data  
With an ever-increasing amount of options, the task of selecting machine learning tools for big data can be difficult.  ...  The world's data is growing rapidly, and traditional tools for machine learning are becoming insufficient as we move towards distributed and real-time processing.  ...  In a comparison between MLI (an API for distributed machine learning built on Spark), GraphLab, Mahout, and MATLAB of collaborative filtering with alternating least squares [106], it was observed that  ... 
doi:10.1186/s40537-015-0032-1 fatcat:zgcsiokrynfhzbmaudqf7rcll4

InferSpark: Statistical Inference at Scale [article]

Zhuoyue Zhao, Jialing Pei, Eric Lo, Kenny Q. Zhu, Chris Liu
2017 arXiv   pre-print
Efficient statistical inference can be easily implemented on this framework and inference process can leverage the distributed main memory processing power of Spark.  ...  These frameworks have the potential of automatically generating inference algorithms for the user defined models and answering various statistical queries about the model.  ...  MLI [20] is an API on top of MLBase (and Spark) to ease the development of various distributed machine learning algorithms (e.g., SGD).  ... 
arXiv:1707.02047v2 fatcat:x264bzypyrdtlk3taztdfvgvgu

Pilot-Abstraction: A Valid Abstraction for Data-Intensive Applications on HPC, Hadoop and Cloud Infrastructures? [article]

Andre Luckow, Pradeep Mantha, Shantenu Jha
2015 arXiv   pre-print
I/O intensive workloads (e.g. for data preparation, transformation and SQL).  ...  We propose the extension of the Pilot-Abstraction to Hadoop to serve as interoperability layer for allocating and managing resources across different infrastructures.  ...  The usage of (distributed) memory for caching of input or intermediate data (e. g. for iterative machine learning) is not supported.  ... 
arXiv:1501.05041v1 fatcat:eiu3inxk7bblrcoh7orjkrimjq

Declarative Data Analytics: a Survey [article]

Nantia Makrynioti (Athens University of Economics and Business)
2019 arXiv   pre-print
The area of declarative data analytics explores the application of the declarative paradigm on data science and machine learning.  ...  It proposes declarative languages for expressing data analysis tasks and develops systems which optimize programs written in those languages.  ...  ACKNOWLEDGMENTS We thank Panagiotis-Ioannis Betchavas for the implementation of Linear Regression using DML in section 4.4.1.  ... 
arXiv:1902.01304v1 fatcat:mixepfprkjc5xayhz76bwu3px4

A Multispectral Light Field Dataset and Framework for Light Field Deep Learning

Maximilian Schambach, Michael Heizmann
2020 IEEE Access  
While we do not want to judge which distribution is better suited for machine learning, it does reflect the design choices we have made upon the random scene generation as described in Section II-B.  ...  To this end we present a Python framework for light field-related deep learning applications, based on TensorFlow and the Keras API.  ... 
doi:10.1109/access.2020.3033056 fatcat:mziripjsivepnbbv52w37g5gny

TF-Replicator: Distributed Machine Learning for Researchers [article]

Peter Buchlovsky, David Budden, Dominik Grewe, Chris Jones, John Aslanides, Frederic Besse, Andy Brock, Aidan Clark, Sergio Gómez Colmenarejo, Aedan Pope, Fabio Viola, Dan Belov
2019 arXiv   pre-print
We describe TF-Replicator, a framework for distributed machine learning designed for DeepMind researchers and implemented as an abstraction over TensorFlow.  ...  image generation, and (3) a D4PG reinforcement learning agent for continuous control.  ...  RELATED WORK Early systems for distributed machine learning built upon the popular MapReduce batch dataflow architecture (Dean & Ghemawat, 2008 (Shvachko et al., 2010) , and MLI (Sparks et al., 2013  ... 
arXiv:1902.00465v1 fatcat:2ihyygokh5c2foqyxyxcxqjhia

Deep Learning for Real Time Satellite Pose Estimation on Low Power Edge TPU [article]

Alessandro Lotti, Dario Modenini, Paolo Tortora, Massimiliano Saponara, Maria A. Perino
2022 arXiv   pre-print
We designed our pipeline to be compatible with Edge Tensor Processing Units to show how low power machine learning accelerators could enable Artificial Intelligence exploitation in space.  ...  Pose estimation of an uncooperative space resident object is a key asset towards autonomy in close proximity operations.  ...  The former is a Machine Learning (ML) library for on-device inference while the latter consists of M. Saponara and M. A.  ... 
arXiv:2204.03296v2 fatcat:sqqvcxfuanfc7nkixlwcclx5ni

Distributed Machine Learning Using Data Parallelism on Mobile Platform

Máté Szabó
2020 Journal of Mobile Multimedia  
Machine learning has many challenges, and one of them is to deal with large datasets, because the size of them grows continuously year by year. One solution to this problem is data parallelism.  ...  The results show that doing distributed training on mobile cluster is possible and safe, but its performance depends on the algorithm's implementation.  ...  an own architecture for distribution, or MLI [25] , what defines an API, or more widely used solutions, like Apache Spark's MLlib [26] .  ... 
doi:10.13052/jmm1550-4646.1633 fatcat:wwupp7udxzcojfeas55s32bxoy