A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit the original URL.
The file type is application/pdf.
Live Recovery of Bit Corruptions in Datacenter Storage Systems
[article]
2018
arXiv
pre-print
To do so, we present DIRECT, a novel set of policies that leverages latent redundancy in distributed storage systems to recover from bit corruption errors with minimal performance and recovery overhead ...
Due to its high performance and decreasing cost per bit, flash is becoming the main storage medium in datacenters for hot data. ...
We also show how increasing the resiliency to bit-level errors can significantly reduce storage costs and improve live recovery speed in datacenter environments. ...
arXiv:1805.02790v2
fatcat:htzwylssjnemzlkpc7jx5m67ju
Who's Afraid of Uncorrectable Bit Errors? Online Recovery of Flash Errors with Distributed Redundancy
2019
USENIX Annual Technical Conference
Due to its high performance and decreasing cost per bit, flash storage is the main storage medium in datacenters for hot data. ...
By significantly increasing the availability of distributed storage systems in the face of bit errors, DIRECT helps extend flash lifetimes. ...
We also show how increasing the resiliency to bit errors can significantly reduce storage costs and improve live recovery speed in datacenter environments. ...
dblp:conf/usenix/TaiKKJFC19
fatcat:4jjnw5fafzc25cz5gasrfziufe
Reliable Multi-cloud Storage Architecture Based on Erasure Code to Improve Storage Performance and Failure Recovery
2017
International Journal of Advanced Cloud Computing and Applied Research
To keep storage costs stable while following best practice for recovery from content failure or loss, we applied a Maximum Distance Separable code such as Reed-Solomon to our RMCSA. ...
The increasing popularity of cloud storage services has led companies that handle critical content to think about using these services for their daily storage needs. ...
The authors express their appreciation to Nanjing University of Science and Technology for creating a research fostering environment. ...
doi:10.23953/cloud.ijaccar.260
fatcat:cwifw7w2j5ch3p5nlx5mxms3da
Toward a high availability cloud: Techniques and challenges
2012
IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN 2012)
Multicore processing, virtualization, distributed storage systems and an overarching management framework that enable a Cloud, offer a plethora of possibilities to provide high availability using commodity ...
With its geographical spread and value proposition comes the need to provide a guaranteed level of availability in the infrastructure and in its services. ...
ACKNOWLEDGMENT This work was supported in part by NSF grant CNS 10-18503 CISE, the Department of Energy under Award Number DE-OE0000097, the Air Force Office of Scientific Research, under agreement number ...
doi:10.1109/dsnw.2012.6264687
dblp:conf/dsn/PhamCKI12
fatcat:chins7svy5h2rihrlnqcsyhc7e
PaxosStore
2017
Proceedings of the VLDB Endowment
In this paper, we present PaxosStore, a high-availability storage system developed to support the comprehensive business of WeChat. ...
It employs a combinational design in the storage layer to engage multiple storage engines constructed for different storage models. ...
Acknowledgments We would like to thank the anonymous reviewers and shepherd of the paper for their insightful feedback that helped improve the paper. ...
doi:10.14778/3137765.3137778
fatcat:rydvf7xt7rb5vmpkjzijw4tfui
Datacenter Ethernet and RDMA: Issues at Hyperscale
[article]
2023
arXiv
pre-print
Now, a decade later, we revisit RoCE's design points and conclude that several of its shortcomings must be addressed to fulfill the demands of hyperscale datacenters. ...
We observe that emerging artificial intelligence, high-performance computing, and storage workloads pose new challenges for large-scale datacenter networking. ...
This implies that packets can only be dropped if they are corrupted by bit errors, a very rare event. ...
arXiv:2302.03337v1
fatcat:kra475y64bcp3jb54frgd5j5ra
Proactive Detection and Repair of Data Corruption: Towards a Hassle-free Declarative Approach with Amulet
2011
Proceedings of the VLDB Endowment
Occasional corruption of stored data is an unfortunate byproduct of the complexity of modern systems. ...
The dominant practice to deal with data corruption today involves administrators writing ad hoc scripts that run data-integrity tests at the application, database, file-system, and storage levels. ...
INTRODUCTION Data corruption (where bits of data in persistent storage differ from what they are supposed to be) is an ugly reality that database and storage administrators have to deal with occasionally ...
dblp:journals/pvldb/BorisovB11
fatcat:t4dwjixfjvb2tfn4e4gqx5pzgu
The RAMCloud Storage System
2015
ACM Transactions on Computer Systems
In a large datacenter with 100,000 nodes, we expect small reads to complete in less than 10μs, which is 50 to 1,000 times faster than the storage systems commonly used today. ...
Over the past 15 years, the use of DRAM in storage systems has accelerated, driven by the needs of large-scale Web applications. ...
Coordinator crashes; corruption of segments, either in DRAM or on secondary storage. Multiple failures can occur simultaneously. ...
doi:10.1145/2806887
fatcat:fg3r5yahbjhxhcor6m2w2q6bxy
Flat Datacenter Storage
2012
USENIX Symposium on Operating Systems Design and Implementation
We measure recovery of 92 GB data lost to disk failure in 6.2 s and recovery from a total machine failure with 655 GB of data in 33.7 s. ...
Flat Datacenter Storage (FDS) is a high-performance, fault-tolerant, large-scale, locality-oblivious blob store. ...
Johnson Apacible, Rich Draves, and Reuben Olinsky were part of the sort record team. Trevor Eberl, Jamie Lee, Oleg Losinets and Lucas Williamson provided systems support. ...
dblp:conf/osdi/NightingaleEFHHS12
fatcat:5ulzufamjnhnhblg53d6ibwtiq
The Design, Implementation, and Deployment of a System to Transparently Compress Hundreds of Petabytes of Image Files for a File-Storage Service
2017
arXiv
pre-print
We report the design, implementation, and deployment of Lepton, a fault-tolerant system that losslessly compresses JPEG images to 77% of their original size on average. ...
Lepton matches the compression efficiency of the best prior work, while decoding more than nine times faster and in a streaming manner. ...
KW's participation was as a paid consultant and was not part of his Stanford duties or responsibilities. ...
arXiv:1704.06192v1
fatcat:6n2sefbsnba3fbi4kmovpeyj54
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second edition
2013
Synthesis Lectures on Computer Architecture
Acknowledgments While we draw from our direct involvement in Google's infrastructure design and operation over the past several years, most of what we have learned and now report here is the result of ...
Thanks in advance for taking the time to contribute. ...
The Autopilot system from Microsoft [87] offers an example design for some of this functionality for Windows Live datacenters. ...
doi:10.2200/s00516ed2v01y201306cac024
fatcat:435o455inbcmrakl6l7jp4gope
A Taxonomy and Future Directions for Sustainable Cloud Computing: 360 Degree View
[article]
2018
arXiv
pre-print
In this paper, we propose a comprehensive taxonomy of sustainable cloud computing. ...
The use of a large number of cloud datacenters increases cost as well as carbon footprint, which in turn affects the sustainability of cloud services. ...
datacenter powering systems and local flash storage with low power CPUs. ...
arXiv:1712.02899v2
fatcat:t26xxbgiijesneqgzi2mqz4gta
A Survey on Security Mechanisms of Leading Cloud Service Providers
2014
International Journal of Computer Applications
With an unprecedented pace of development in cloud computing technology, there has been an exponential increase in users of these services and an equal rise in cloud service providers. ...
Cloud computing is a virtual pool of resources provided to users as a service through a web interface. These resources may include Software, Infrastructure, Storage, Network, Platform, etc. ...
Data Services (Storage, SQL Database, HDInsight, Cache, Backup, Recovery Manager) III. ...
doi:10.5120/17149-7184
fatcat:262geqqsqbet7gofec7wyrn2di
We report on experiences with Swift congestion control in Google datacenters. Swift targets an end-to-end delay by using AIMD control, with pacing under extreme congestion. ...
In large-scale testbed experiments, Swift delivers a tail latency of <50µs for short RPCs, with near-zero packet drops, while sustaining ∼100Gbps throughput per server. ...
Manya Ghobadi, Emily Blem, Vinh The Lam, Philip Wells and Ashish Naik contributed to the work in the early days of Swift. ...
doi:10.1145/3387514.3406591
dblp:conf/sigcomm/KumarDJWWMWSARW20
fatcat:ks4dtoy7hfdo5nerw6pcfmhcky
Higher SLA satisfaction in datacenters with continuous VM placement constraints
2013
Proceedings of the 9th Workshop on Hot Topics in Dependable Systems - HotDep '13
In a virtualized datacenter, the Service Level Agreement for an application restricts the Virtual Machines (VMs) placement. ...
We propose a Byzantine fault tolerant pub/sub system, on a tree-based overlay, tolerating a configurable number of failures in any part of the system, with minimal divergence from traditional pub/sub specifications ...
We thank the developers and users of SQLite and LevelDB for helping us understand their software in detail. ...
doi:10.1145/2524224.2524226
dblp:conf/hotdep/DangH13
fatcat:xe5soaxhengtxcufr5oahrxyoq