Abstract
The paper presents the architecture and design of three versions of fail-safe data storage in a distributed cache that uses NVRAM in cluster nodes. In the first version, cache consistency is ensured by additionally buffering write requests. The second relies on additional write log managers running on different nodes. The third synchronizes with a Parallel File System (PFS) and saves data into a new file, which preserves file history at the cost of space. We show that the three-level fail-safe mode incorporating these versions introduces minimal overhead for a random walk microbenchmark application with a 1 GB file and checkpoints created every 2000 iterations and for computing powers of a graph with 10000 vertices, and up to 20% overhead for parallel processing of images of up to 1000 megapixels, compared to the basic NVRAM cache without fail-safe modes. We also present checkpoint creation and restore times for sizes up to 10 GB.
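The ideas of logging write requests and checkpointing into a new file can be illustrated with a minimal single-process sketch. It is not the paper's MPI I/O implementation: the class `LoggedCache`, its methods, and the `ckpt_NNNN.bin` file naming are hypothetical, a plain `bytearray` stands in for the NVRAM-backed cache, and an ordinary directory stands in for the PFS.

```python
import os


class LoggedCache:
    """Toy byte cache with a write log and versioned checkpoints (illustrative only)."""

    def __init__(self, size, checkpoint_dir):
        self.data = bytearray(size)   # stands in for the NVRAM-backed cache
        self.log = []                 # (offset, payload) writes since the last checkpoint
        self.checkpoint_dir = checkpoint_dir
        # Continue numbering after any checkpoints already present in the directory.
        self.version = len(
            [n for n in os.listdir(checkpoint_dir) if n.startswith("ckpt_")]
        )

    def write(self, offset, payload):
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("write outside cache bounds")
        # Record the request before applying it so that it can be replayed later.
        self.log.append((offset, bytes(payload)))
        self.data[offset:offset + len(payload)] = payload

    def checkpoint(self):
        # Each checkpoint goes into a new file: history is kept, space use grows.
        self.version += 1
        path = os.path.join(self.checkpoint_dir, f"ckpt_{self.version:04d}.bin")
        with open(path, "wb") as f:
            f.write(self.data)
            f.flush()
            os.fsync(f.fileno())
        self.log.clear()
        return path

    @classmethod
    def restore(cls, path, checkpoint_dir, log=()):
        # Load a checkpoint, then replay a log copy that survived the failure.
        with open(path, "rb") as f:
            content = f.read()
        cache = cls(len(content), checkpoint_dir)
        cache.data[:] = content
        for offset, payload in log:
            cache.write(offset, payload)
        return cache
```

In this sketch the caller is assumed to hold a copy of the log somewhere that outlives the failed cache, which loosely mirrors keeping write logs on different nodes; restoring from an older checkpoint file is what the retained file history makes possible.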
Citations
- CrossRef: 2
- Web of Science: 0
- Scopus: 2
Full text
- Publication version: Accepted or Published Version
- DOI: 10.1016/j.procs.2018.08.237
Details
- Category: Articles
- Type: publication in another foreign scientific journal (foreign language only)
- Published in: Procedia Computer Science, no. 136, pages 52-61, ISSN: 1877-0509
- Language: English
- Publication year: 2018
- Bibliographic description: Malinowski A., Czarnul P. Three levels of fail-safe mode in MPI I/O NVRAM distributed cache. Procedia Computer Science, 2018, Vol. 136, pp. 52-61
- Verified by: Gdańsk University of Technology