Applying byte-addressable NVRAM to improve the performance of selected parallel applications that use MPI I/O
Abstract
Many current studies address the growing problem of file I/O performance in cluster environments. At the same time, recent reports on the development of computer memory technologies indicate that byte-addressable, persistent random-access memory devices should reach the market in the near future. This dissertation shows that such memory can be used to improve the performance of selected applications that process data stored in shared files. The work focuses on an original solution: a distributed cache compatible with the interface of MPI I/O, a popular standard for file access in clusters. The performance improvement was verified with two synthetic benchmarks and four real-world applications, and the tests were run on a hardware simulator of memory with the new parameters. The dissertation comprises a theoretical introduction with particular attention to the most recent research, a detailed description of the architecture and implementation of the proposed solution, a characterization of the set of demonstration applications together with the results of performance experiments and a discussion of those results, and a summary that outlines directions for further work.
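To make the idea of a byte-addressable cache placed in front of file I/O more concrete, here is a minimal single-process sketch. It is not the dissertation's implementation: the class `ByteCache`, its counters, and the helper `demo` are hypothetical names introduced only for illustration, ordinary RAM stands in for NVRAM, and there is no distribution across nodes. The method names `read_at` and `write_at` only loosely echo the explicit-offset style of `MPI_File_read_at` and `MPI_File_write_at`.

```python
import os
import tempfile


class ByteCache:
    """Toy write-back cache over one file (RAM stands in for NVRAM here)."""

    def __init__(self, path):
        self.path = path
        with open(path, "rb") as f:
            # One bulk read at open time; later requests do not touch the file.
            self.buf = bytearray(f.read())
        self.dirty = False
        self.backend_writes = 0

    def read_at(self, offset, size):
        # Served from the in-memory copy at byte granularity.
        return bytes(self.buf[offset:offset + size])

    def write_at(self, offset, data):
        end = offset + len(data)
        if end > len(self.buf):
            # Grow the cached image when the write extends past the file end.
            self.buf.extend(b"\0" * (end - len(self.buf)))
        self.buf[offset:end] = data
        self.dirty = True

    def close(self):
        # Write the cached image back once, and only if something changed.
        if self.dirty:
            with open(self.path, "wb") as f:
                f.write(self.buf)
            self.backend_writes += 1
            self.dirty = False


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "shared.dat")
        with open(path, "wb") as f:
            f.write(bytes(range(16)))
        cache = ByteCache(path)
        first = cache.read_at(4, 3)               # small read, no file access
        cache.write_at(14, b"\xff\xff\xff\xff")   # small write past the end
        cache.close()                             # single write-back
        with open(path, "rb") as f:
            final = f.read()
        return first, final, cache.backend_writes
```

In this sketch many small reads and writes are absorbed by the in-memory copy and the file system sees one bulk read and at most one bulk write, which is the general effect a cache layer of this kind aims for.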
Details
- Category:
- Thesis, nostrification
- Type:
- Thesis, nostrification
- Publication year:
- 2019
- Bibliography:
- List of figures
- Illustrative graphic of the challenges facing HPC today
- Simplified diagram of the memory hierarchy
- Marketing material from AgigA Tech illustrating not only the architecture of NVDIMM-N modules but also the need to provide them with backup power in case of failure (source: agigatech.wpengine.com)
- Marketing material from Intel showing the potential position of NVRAM devices (here labeled with the technology name 3D XPoint) in the memory hierarchy (source: intel.com)
- Possible shapes of a memory hierarchy extended with NVRAM
- Classification of the MPI I/O functions responsible for file data access (source: MPI: A Message-Passing Interface Standard [81], copyright: University of Tennessee)
- Diagram of the technology stack used for file operations in high-performance computing
- Operation of the data sieving (a) and two-phase I/O (b) algorithms used in the popular ROMIO implementation
- Catwalk-ROMIO architecture; compute nodes add their requests to the buffer in turn (in the figure, the buffer is currently being handled by node no. 3)
- Operation of the PLDA tool
- Basic concept of S4D-Cache, in which an additional layer located between MPI I/O and the server of the distributed file system can redirect requests to the cache
- ROMPIO benchmark: effect of data block size on read and write speed
- ROMPIO benchmark: effect of cluster size on read and write speed
- Random walk benchmark: experimental results
- Original photograph and the same photograph processed with the implemented filters (photo: Krzysztof Krzempek)
- Large image processing: experimental results
- Computing the power of a graph: experimental results
- Searching a two-dimensional map: experimental results
- Architecture showing the components of the application that simulates crowd behavior
- Graphic presenting the crowd behavior simulation; blue points represent agents
- Crowd behavior simulation: experimental results
- Impact of data protection mechanisms on performance: experimental results
- Time to create successive file versions and to restore the cache: experimental results
- List of tables
- Comparison of solutions compatible with applications that use MPI I/O and that improve file processing performance in HPC
- Comparison of NVRAM-based solutions dedicated to I/O operations in HPC
- Differences between the three levels of data safety assurance in the proposed solution
- Hardware configuration of the test clusters
- Software used for the tests
- NVRAM simulator parameters (Lap06 cluster only)
- Aguilar Leonel, Lalith Maddegedara, Ichimura Tsuyoshi, Hori Muneo. On the performance and scalability of an HPC enhanced Multi Agent System based evacuation simulator. In: Procedia Computer Science, pp. 937-947, 2017. International Conference on Computational Science, ICCS 2017, Zurich, Switzerland.
- Åkerman Johan. Toward a Universal Memory. In: Science, no. 308 (5721), pp. 508-510, 2005.
- Alam Sadaf R., El-Harake Hussein N., Howard Kristopher, Stringfellow Neil, Verzelloni Fabio. Parallel I/O and the Metadata Wall. In: Proceedings of the Sixth Workshop on Parallel Data Storage, PDSW '11, pp. 13-18, New York, USA, 2011. ACM.
- Badam Anirudh, Pai Vivek S. SSDAlloc: Hybrid SSD/RAM Memory Management Made Easy. In: Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation, NSDI'11, pp. 211-224, Berkeley, USA, 2011. USENIX Association.
- Balakhontceva Marina, Karbovskii Vladislav, Sutulo Serge, Boukhanovsky Alexander. Multi-agent Simulation of Passenger Evacuation from a Damaged Ship under Storm Conditions. In: Procedia Computer Science, no. 80, pp. 2455-2464, 2016. International Conference on Computational Science 2016, ICCS 2016, San Diego, USA.
- Boito F. Z., Inacio E. C., Bez J. L., Navaux P., Dantas M. A Checkpoint of Research on Parallel I/O for High Performance Computing. In: ACM Computing Surveys, no. 51, 2018. ACM.
- Bourzac Katherine. Has Intel created a universal memory technology? In: IEEE Spectrum, no. 54 (5), pp. 9-10, 2017.
- Cannon Lynn Elliot. A cellular computer to implement the Kalman Filter Algorithm. PhD thesis. Montana State University, 1969.
- Cappelletti Paolo. Non volatile memory evolution and revolution. In: 2015 IEEE International Electron Devices Meeting (IEDM), Sections 10.1.1-10.1.4, 2015.
- Chaarawi Mohamad, Gabriel Edgar, Keller Rainer, Graham Richard L., Bosilca George, Dongarra Jack J. OMPIO: A Modular Software Architecture for MPI I/O. In: Recent Advances in the Message Passing Interface, pp. 81-89, Berlin, Germany, 2011. Springer Berlin Heidelberg.
- Chaimov Nicholas, Malony Allen, Canon Shane, Iancu Costin, Ibrahim Khaled Z., Srinivasan Jay. Scaling Spark on HPC Systems. In: Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC '16, pp. 97-110, New York, USA, 2016. ACM.
- Chen Feng, Mesnier Michael P., Hahn Scott. A protected block device for Persistent Memory. In: 2014 30th Symposium on Mass Storage Systems and Technologies (MSST), pp. 1-12, 2014.
- Coteus Paul et al. Packaging the Blue Gene/L supercomputer. In: IBM Journal of Research and Development, no. 49 (2.3), pp. 213-248, 2005.
- Cugnasco Cesare, Becerra Yolanda, Torres Jordi, Ayguadé Eduard. Exploiting Key-Value Data Stores Scalability for HPC. In: 2017 46th International Conference on Parallel Processing Workshops (ICPPW), pp. 85-94, 2017.
- Cutress Ian. Intel's 140GB Optane 3D XPoint PCIe SSD Spotted at IDF, 2016. http://www.anandtech.com/show/10604/intels-140gb-optane-3d-xpoint-pcie-ssd-spotted-at-idf.
- Chamberlain Bradford L., Deitz Steven J., Figueroa Samuel, Iten David M., Stone Andrew. Global HPC Challenge Benchmarks in Chapel. 2008.
- Ching A., Coloma K., Li J., Liao W., Choudhary A. High-Performance Techniques for Parallel I/O. In: Handbook of Parallel Computing: Models, Algorithms and Applications, pp. 166-189, 2001.
- de Charentenay Yann. STT-MRAM is moving to large scale commercialization (at last!). In: 2017 IEEE International Magnetics Conference (INTERMAG), pp. 1-2, 2017.
- Dorożynski Piotr, Czarnul Paweł, Malinowski Artur, Czuryło Krzysztof, Dorau Łukasz, Maciejewski Maciej, Skowron Paweł. Checkpointing of Parallel MPI Applications Using MPI One-sided API with Support for Byte-addressable Non-volatile RAM. In: Procedia Computer Science, no. 80, pp. 30-40, 2016.
- Dursi Jonathan. HPC is dying, and MPI is killing it, 2015. https://www.dursi.ca/post/hpc-is-dying-and-mpi-is-killing-it.html.
- Fan Bin, Tantisiriroj Wittawat, Xiao Lin, Gibson Garth. DiskReduce: RAID for Data-intensive Scalable Computing. In: Proceedings of the 4th Annual Workshop on Petascale Data Storage, PDSW '09, pp. 6-10, New York, USA, 2009. ACM.
- Fan Ziqi. Improving Storage Performance with Non-Volatile Memory-based Caching Systems. PhD thesis. University of Minnesota, 2017.
- Fernando Pradeep, Kannan Sudarsun, Gavrilovska Ada, Schwan Karsten. Phoenix: Memory Speed HPC I/O with NVM. In: 2016 IEEE 23rd International Conference on High Performance Computing (HiPC), pp. 121-131, 2016.
- Fong Scott W., Neumann Christopher M., Wong H. S. Philip. Phase-Change Memory: Towards a Storage-Class Memory. In: IEEE Transactions on Electron Devices, no. 64 (11), pp. 4374-4385, 2017.
- Foong Annie, Hady Frank. Storage As Fast As Rest of the System. In: 2016 IEEE 8th International Memory Workshop (IMW), pp. 1-4, 2016.
- Google Art Project in the Wikimedia Commons collection. Gigapixel images from the Google Art Project, 2019. https://commons.wikimedia.org/wiki/Category:Gigapixel_images_from_the_Google_Art_Project.
- Greengard Samuel. Better Memory. In: Commun. ACM, no. 59 (1), pp. 23-25, 2015.
- Gutierrez-Milla Albert, Borges Francisco, Suppi Remo, Luque Emilio. Individual-oriented Model Crowd Evacuations Distributed Simulation. In: Procedia Computer Science, no. 29, pp. 1600-1609, 2014.
- Hadri Bilel. Introduction to Parallel I/O, 2011. https://www.olcf.ornl.gov/wp-content/uploads/2011/10/Fall_IO.pdf.
- Hasselblad Press Release. Hasselblad introduces the H6D-400c MS, 2018. https://www.hasselblad.com/press/press-releases/hasselblad-introduces-the-h6d-400c-ms/.
- He Shuibing, Sun Xian-He, Feng Bo. S4D-Cache: Smart Selective SSD Cache for Parallel I/O Systems. In: 2014 IEEE 34th Int. Conference on Distributed Computing Systems (ICDCS 2014), pp. 514-523, 2014.
- He Shuibing, Sun Xian-He, Feng Bo, Huang Xin, Feng Kun. A cost-aware region-level data placement scheme for hybrid parallel I/O systems. In: 2013 IEEE International Conference on Cluster Computing (CLUSTER), pp. 1-8, 2013.
- He Shuibing, Yang Wang, Xian-He Sun. Improving Performance of Parallel I/O Systems through Selective and Layout-Aware SSD Cache. In: IEEE Transactions on Parallel and Distributed Systems, no. 27 (10), pp. 2940-2952, 2016.
- Henseler Dave, Landsteiner Benjamin, Petesch Doug, Wright Cornell, Wright Nicholas J. Architecture and design of Cray DataWarp. In: Cray User Group CUG, 2016.
- Hoefler Torsten, Snir Marc. Writing Parallel Libraries with MPI - Common Practice, Issues, and Extensions. In: Recent Advances in the Message Passing Interface, pp. 345-355, Berlin, Germany, 2011. Springer Berlin Heidelberg.
- Hori Atsushi, Yamamoto Keiji, Ishikawa Yutaka. Catwalk-ROMIO: A Cost-Effective MPI-IO. In: 2011 IEEE 17th Int. Conference on Parallel Distributed Systems (ICPADS), pp. 120-126, 2011.
- Huang Dachuan, Zhang Xuechen, Shi Wei, Zheng Mai, Jiang Song, Qin Feng. LiU: Hiding Disk Access Latency for HPC Applications with a New SSD-Enabled Data Layout. In: 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, pp. 111-120, 2013.
- Intel Corporation. Intel and Micron Produce Breakthrough Memory Technology, 2015. http://newsroom.intel.com/community/intel_newsroom/blog/2015/07/28/intel-and-micron-produce-breakthrough-memory-technology.
- Intel Corporation. Reimagining the Data Center Memory and Storage Hierarchy, 2018. https://newsroom.intel.com/editorials/re-architecting-data-center-memory-storage-hierarchy/.
- Jain Nikhil, Bhatele Abhinav, Ni Xiang, Wright Nicholas J., Kale Laxmikant V. Maximizing Throughput on a Dragonfly Network. In: SC14: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 336-347, 2014.
- Jarząbek Łukasz, Czarnul Paweł. Performance evaluation of unified memory and dynamic parallelism for selected parallel CUDA applications. In: The Journal of Supercomputing, no. 73 (12), pp. 5378-5401, 2017.
- JEDEC Press Release. JEDEC Announces Support for NVDIMM Hybrid Memory Modules, 2015. https://www.jedec.org/news/pressreleases/jedec-announces-support-nvdimm-hybrid-memory-modules.
- JEDEC Press Release. JEDEC DDR5 NVDIMM-P Standards Under Development, 2017. https://www.jedec.org/news/pressreleases/jedec-ddr5-nvdimm-p-standards-under-development.
- JEDEC Standards and Documents. Main Memory: DDR4 and DDR5 SDRAM, 2019. https://www.jedec.org/category/technology-focus-area/main-memory-ddr3-ddr4-sdram.
- Jung Myoungsoo, Choi Wonil, Srikantaiah Shekhar, Yoo Joonhyuk, Kandemir Mahmut T. HIOS: A Host Interface I/O Scheduler for Solid State Disks. In: SIGARCH Comput. Archit. News, no. 42 (3), pp. 289-300, 2014.
- Kaiser Nick, Burgett William, Chambers Ken, Denneau Larry, Heasley Jim, Jedicke Robert, Magnier Eugene, Morgan Jeff, Onaka Peter, Tonry John. The Pan-STARRS wide-field optical/NIR imaging survey. In: Proc. SPIE, no. 7733 (14), 2010.
- Kang Seok-Hoon, Koo Dong-Hyun, Kang Woon-Hak, Lee Sang-Won. A case for flash memory SSD in Hadoop applications. In: International Journal of Control and Automation, no. 6 (1), pp. 201-210, 2013.
- Kannan Sudarsun, Gavrilovska Ada, Schwan Karsten, Milojicic Dejan, Talwar Vanish. Using Active NVRAM for I/O Staging. In: Proceedings of the 2nd International
- Kettering Brett M., Nunez James A. The role of non-volatile memory from an application perspective. In: 2010 IEEE Globecom Workshops, pp. 1921-1925, 2010.
- Kim Jungwon, Lee Seyong, Vetter Jeffrey S. PapyrusKV: A High-performance Parallel Key-value Store for Distributed NVM Architectures. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '17, pp. 1-14, New York, USA, 2017. ACM.
- Kobayashi Hiroyuki, Ishimoto Yutaka, Fujioka Masaki, Ishibashi Kenichi. A multi-agent evacuation simulator to design safe cities for high quality of life with computer clustering. In: SICE, 2007 Annual Conference, pp. 3043-3046, 2007.
- Konishi Ryusuke, Amagai Yoshiji, Sato Koji, Hifumi Hisashi, Kihara Seiji, Moriai Satoshi. The Linux Implementation of a Log-structured File System. In: SIGOPS Oper. Syst. Rev., no. 40 (3), pp. 102-107, 2006.
- Kryder Mark H., Soo Kim Chang. After Hard Drives - What Comes Next? In: Magnetics, IEEE Transactions on, no. 45 (10), pp. 3406-3413, 2009.
- Kumar Gyanendra, Tomar Parul. A Novel Longest Distance First Page Replacement Algorithm. no. 10, pp. 1-6, 2017.
- Kyrola Aapo, Blelloch Guy, Guestrin Carlos. GraphChi: Large-Scale Graph Computation on Just a PC. In: Presented as part of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12), pp. 31-46, Hollywood, USA, 2012. USENIX.
- Kültürsay Emre, Kandemir Mahmut, Sivasubramaniam Anand, Mutlu Onur. Evaluating STT-RAM as an energy-efficient main memory alternative. In: 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 256-267, 2013.
- Lang Samuel, Carns Philip, Latham Robert, Ross Robert, Harms Kevin, Allcock William. I/O Performance Challenges at Leadership Scale. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09, no. 40, pp. 1-12, New York, USA, 2009. ACM.
- Larrosa Rafael, Asenjo Rafael, Navarro Angeles, Chamberlain Bradford L. A First Implementation of Parallel IO in Chapel for Block Data Distribution. In: Applications, Tools and Techniques on the Road to Exascale Computing, Advances in Parallel Computing, pp. 447-454, 2017.
- Latham Robert, Ross Robert, Thakur Rajeev. Implementing MPI-IO Atomic Mode and Shared File Pointers Using MPI One-Sided Communication. In: International Journal of High Performance Computing Applications, no. 21 (2), pp. 132-143, 2007.
- Li Xu, Lu Kai, Wang Xiaoping, Zhou Xu. NV-process: A Fault-tolerance Process Model Based on Non-volatile Memory. In: Proceedings of the Asia-Pacific Workshop on Systems, APSYS '12, pp. 1-6, New York, USA, 2012. ACM.
- Li Xu, Lu Kai, Zhou Xu. NV-TS: A Fault Tolerance Transaction System Based on Persistent Memory. In: 2012 International Conference on Computer Science and Electronics Engineering, no. 2, pp. 221-224, 2012.
- Liu Jialin, Racah Evan, Koziol Quincey, Canon Richard Shane. H5spark: bridging the I/O gap between Spark and scientific data formats on HPC systems. In: Cray User Group, 2016.
- Liu Ning, Cope Jason, Carns Philip, Carothers Christopher, Ross Robert, Grider Gary, Crume Adam, Maltzahn Carlos. On the role of burst buffers in leadership-class storage systems. In: 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), pp. 1-11, 2012.
- Liu Wei, Wu Kai, Liu Jialin, Chen Feng, Li Dong. Performance Evaluation and Modeling of HPC I/O on Non-Volatile Memory. In: 2017 International Conference on Networking, Architecture, and Storage (NAS), pp. 1-10, 2017.
- Luu Huong, Behzad Babak, Aydt Ruth, Winslett Marianne. A multi-level approach for understanding I/O activity in HPC applications. In: 2013 IEEE International Conference on Cluster Computing (CLUSTER), pp. 1-5, 2013.
- Luu Huong, Winslett Marianne, Gropp William, Ross Robert, Carns Philip, Harms Kevin, Prabhat Mr, Byna Suren, Yao Yushu. A Multiplatform Study of I/O Behavior on Petascale Supercomputers. In: Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, HPDC '15, pp. 33-44, New York, USA, 2015. ACM.
- Makinoshima Fumiyasu, Imamura Fumihiko, Abe Yoshi. Enhancing a tsunami evacuation simulation for a multi-scenario analysis using parallel computing. In: Simulation Modelling Practice and Theory, 2018.
- Malinowski Artur. NVRAM as Main Storage of Parallel File System. In: Journal of Computer Science and Control Systems, no. 9, pp. 18-21, 2016.
- Malinowski Artur. Using Redis supported by NVRAM in HPC applications. In: Computer Science (AGH), no. 18 (3), 2017.
- Malinowski Artur, Czarnul Paweł. Multi-agent large-scale parallel crowd simulation with NVRAM-based distributed cache. In: Journal of Computational Science, vol. 33, pp. 83-94, 2019.
- Malinowski Artur, Czarnul Paweł. Three levels of fail-safe mode in MPI I/O NVRAM distributed cache. In: Procedia Computer Science, no. 136, pp. 52-61, 2018. 7th International Young Scientists Conference on Computational Science, YSC 2018, Heraklion, Greece.
- Malinowski Artur, Czarnul Paweł, Matuszek Mariusz. Recommendations for Writing Parallel Libraries with C and MPI. Submitted for review.
- Malinowski Artur, Czarnul Paweł. Distributed NVRAM Cache - Optimization and Evaluation with Power of Adjacency Matrix. In: Computer Information Systems and Industrial Management, pp. 15-26, 2017. Springer International Publishing.
- Malinowski Artur, Czarnul Paweł. A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache. In: Scalable Computing: Practice and Experience, no. 19 (1), 2018.
- Malinowski Artur, Czarnul Paweł, Czuryło Krzysztof, Maciejewski Maciej, Skowron Paweł. Multi-agent large-scale parallel crowd simulation. In: Procedia Computer Science, no. 108, pp. 917-926, 2017. International Conference on Computational Science, ICCS 2017, Zurich, Switzerland.
- Malinowski Artur, Czarnul Paweł, Dorożynski Piotr, Czuryło Krzysztof, Dorau Łukasz, Maciejewski Maciej, Skowron Paweł. A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache. In: Position Papers of the 2016 Federated Conference on Computer Science and Information Systems, vol. 9 of Annals of Computer Science and Information Systems, pp. 133-140. PTI, 2016.
- Malinowski Artur, Czarnul Paweł, Maciejewski Maciej, Skowron Paweł. A Fail-Safe NVRAM Based Mechanism for Efficient Creation and Recovery of Data Copies in Parallel MPI Applications. In: Information Systems Architecture and Technology: Proceedings of 37th International Conference on Information Systems Architecture and Technology - ISAT 2016 - Part II, pp. 137-147, 2017. Springer International Publishing.
- Meena Jagan Singh, Sze Simon Min, Chand Umesh, Tseng Tseung-Yuen. Overview of emerging nonvolatile memory technologies. In: Nanoscale Research Letters, no. 9 (1), 2014.
- Mehta Kshitij, Gabriel Edgar, Chapman Barbara. Specification and Performance Evaluation of Parallel I/O Interfaces for OpenMP. In: OpenMP in a Heterogeneous World, pp. 1-14, Berlin, Germany, 2012. Springer Berlin Heidelberg.
- Message Passing Interface Forum. MPI: A Message-Passing Interface Standard Version 3.1, 2015. http://www.mpi-forum.org/docs/mpi-3.1/mpi31-report.pdf.
- Mogg Trevor. We could explore this astonishing 195-gigapixel panorama of Shanghai all day, 2018. https://www.digitaltrends.com/news/check-out-this-astonishing-195-gigapixel-image-of-shanghai/.
- Molka Daniel, Hackenberg Daniel, Schone Robert, Muller Matthias S. Memory Performance and Cache Coherency Effects on an Intel Nehalem Multiprocessor System. In: 2009 18th International Conference on Parallel Architectures and Compilation Techniques, pp. 261-270, 2009.
- NASA/ESA. Hubble's High-Definition Panoramic View of the Andromeda Galaxy, 2015. http://www.spacetelescope.org/images/heic1502a/.
- Nowak Janusz J. et al. Dependence of Voltage and Size on Write Error Rates in Spin-Transfer Torque Magnetic Random-Access Memory. In: IEEE Magnetics Letters, no. 7, pp. 1-4, 2016.
- Patil Onkar, Hukerikar Saurabh, Mueller Frank, Englemann Christian. Exploring Use-cases for Non-Volatile Memories in support of HPC Resilience, 2017.
- Pavlovic Milan, Puzovic Nikola, Ramirez Alex. Data placement in HPC architectures with heterogeneous off-chip memory. In: 2013 IEEE 31st International Conference on Computer Design (ICCD), pp. 193-200, 2013.
- Pawlowski J. Thomas. Memory as we approach a new horizon. In: 2016 IEEE Hot Chips 28 Symposium (HCS), pp. 1-23, 2016.
- Protopopov Boris V., Skjellum Anthony. A Multithreaded Message Passing Interface (MPI) Architecture: Performance and Program Issues. In: Journal of Parallel and Distributed Computing, no. 61 (4), pp. 449-466, 2001.
- Radulovic Milan, Zivanovic Darko, Ruiz Daniel, de Supinski Bronis R., McKee Sally A., Radojković Petar, Ayguadé Eduard. Another Trip to the Wall: How Much Will Stacked DRAM Benefit HPC? In: Proceedings of the 2015 International Symposium on Memory Systems, MEMSYS '15, pp. 31-36, New York, USA, 2015. ACM.
- Rajachandrasekar Raghunath, Moody Adam, Mohror Kathryn, Panda Dhabaleswar K. A 1 PB/s File System to Checkpoint Three Million MPI Tasks. In: Proceedings of the 22nd International Symposium on High-performance Parallel and Distributed Computing, HPDC '13, pp. 143-154, New York, USA, 2013. ACM.
- Rudoff Andy. Persistent Memory: The Value to HPC and the Challenges. In: Proceedings of the Workshop on Memory Centric Programming for HPC, MCHPC'17, pp. 7-10, New York, USA, 2017. ACM.
- Schaller Robert. Moore's law: past, present and future. In: IEEE Spectrum, no. 34 (6), pp. 52-59, 1997.
- Schenck Wolfram, El Sayed Salem, Foszczynski Maciej, Homberg Wilhelm, Pleiter Dirk. Early Evaluation of the "Infinite Memory Engine" Burst Buffer Solution. In: High Performance Computing, pp. 604-615, 2016. Springer International Publishing.
- Schulz Martin, de Supinski Bronis R. PnMPI Tools: A Whole Lot Greater Than the Sum of Their Parts. In: ACM/IEEE Supercomputing Conference (SC), pp. 1-10.
- Shantharam Manu, Iwabuchi Keita, Cicotti Pietro, Carrington Laura, Gokhale Maya, Pearce Roger. Performance Evaluation of Scale-Free Graph Algorithms in Low Latency Non-volatile Memory. In: 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 1021-1028, 2017.
- Skjellum Anthony, Doss Nathan E., Bangalore Purushotham V. Writing libraries in MPI. In: Proceedings of Scalable Parallel Libraries Conference, pp. 166-173, 1993.
- Song Huaiming, Yin Yanlong, Sun Xian-He, Thakur Rajeev, Lang Samuel. Server-side I/O Coordination for Parallel File Systems. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC '11, no. 17, pp. 1-11, New York, USA, 2011. ACM.
- Stallings William. Computer Organization and Architecture: Designing for Performance. Pearson, 9th edition, 2013.
- Storage Networking Industry Association. NVM Programming Model (NPM), 2017. https://www.snia.org/sites/default/files/technical_work/final/NVMProgrammingModel_v1.2.pdf.
- Storage Networking Industry Association. Persistent Memory and NVDIMM Special Interest Group. https://www.snia.org/forums/sssi/NVDIMM.
- Suresh Amoghavarsha, Cicotti Pietro, Carrington Laura. Evaluation of emerging memory technologies for HPC, data intensive applications. In: 2014 IEEE International Conference on Cluster Computing (CLUSTER), pp. 239-247, 2014.
- Tessier François, Malakar Preeti, Vishwanath Venkatram, Jeannot Emmanuel, Isaila Florin. Topology-Aware Data Aggregation for Intensive I/O on Large-Scale Supercomputers. In: 2016 First International Workshop on Communication Optimizations in HPC (COMHPC), pp. 73-81, 2016.
- Thakur Rajeev, Gropp William, Lusk Ewing. Data sieving and collective I/O in ROMIO. In: Frontiers '99 - Seventh Symposium On Frontiers Massively Parallel Computation, Proc., pp. 182-189, 1999.
- Tsujita Yuichi, Yoshinaga Kazumi, Hori Atsushi, Sato Mikiko, Namiki Mitaro, Ishikawa Yutaka. Multithreaded Two-Phase I/O: Improving Collective MPI-IO Performance on a Lustre File System. In: 2014 22nd Euromicro Int. Conference on Parallel, Distributed, Network-based Processing (PDP 2014), pp. 232-235, 2014.
- Turing Alan Mathison. On Computable Numbers, with an Application to the Entscheidungsproblem. In: Proceedings of the London Mathematical Society, pp. 230-265, 1937.
- Van Essen Brian, Pearce Roger, Ames Sasha, Gokhale Maya. On the Role of NVRAM in Data-intensive Architectures: An Evaluation. In: 2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp. 703-714, 2012.
- Vetter Jeffrey S., Mittal Sparsh. Opportunities for Nonvolatile Memory Systems in Extreme-Scale High-Performance Computing. In: Computing in Science & Engineering, no. 17 (2), pp. 73-82, 2015.
- Wang Teng, Mohror Kathryn, Moody Adam, Sato Kento, Yu Weikuan. An Ephemeral Burst-buffer File System for Scientific Applications. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '16, no. 69, pp. 1-12, New York, USA, 2016. IEEE Press.
- Wang Teng, Oral Sarp, Wang Yandong, Settlemyer Brad, Atchley Scott, Yu Weikuan. BurstMem: A high-performance burst buffer system for scientific applications. In: 2014 IEEE International Conference on Big Data (Big Data), pp. 71-79, 2014.
- Wasi-ur Rahman, Islam Nusrat Sharmin, Lu Xiaoyi, Panda Dhabaleswar K. Can Non-volatile Memory Benefit MapReduce Applications on HPC Clusters? In: 2016 1st Joint International Workshop on Parallel Data Storage and data Intensive Scalable Computing Systems (PDSW-DISCS), pp. 19-24, 2016.
- Wautelet Philippe. Best practices for parallel IO and MPI-IO hints, March 2015. http://www.idris.fr/media/docs/docu/idris/idris_patc_hints_proj.pdf.
- Wei Qingsong, Wang Chundong, Chen Cheng, Yang Yechao, Yang Jun, Xue Mingdi. Transactional NVM Cache with High Performance and Crash Consistency. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '17, no. 56, pp. 1-12, New York, USA, 2017. ACM.
- Wittmann Markus, Hager Georg, Zeiser Thomas, Wellein Gerhard. Asynchronous MPI for the Masses. In: CoRR, 2013.
- Wu Kai, Ober Frank, Hamlin Shari, Li Dong. Early Evaluation of Intel Optane Non-Volatile Memory with HPC I/O Workloads. In: CoRR, 2017.
- Wuttig Matthias. Phase-change materials: Towards a universal memory? In: Nature materials, no. 4, pp. 265-266, 2005.
- Xuan Pengfei, Ligon Walter B., Srimani Pradip K., Ge Rong, Luo Feng. Accelerating big data analytics on HPC clusters using two-level storage. In: Parallel Computing. Special Issue on 2015 Workshop on Data Intensive Scalable Computing Systems (DISCS-2015), no. 61, pp. 18-34, 2017.
- Yang Shuo, Wu Kai, Qiao Yifan, Li Dong, Zhai Jidong. Algorithm-Directed Crash Consistence in Non-volatile Memory for HPC. In: 2017 IEEE International Conference on Cluster Computing (CLUSTER), pp. 475-486, 2017.
- Yin Yanlong, Li Jibing, He Jun, Sun Xian-He, Thakur Rajeev. Pattern-Direct and Layout-Aware Replication Scheme for Parallel I/O Systems. In: 2013 IEEE 27th International Symposium on Parallel and Distributed Processing, pp. 345-356, 2013.
- Yu Songping, Deng Mingzhu, Xing Yuxuan, Xiao Nong, Liu Fang, Chen Wei. Pyramid: Revisiting Memory Extension with Remote Accessible Non-Volatile Main Memory. In: Security, Privacy, and Anonymity in Computation, Communication, and Storage, pp. 730-743, 2017. Springer International Publishing.
- Yu Songping, Xiao Nong, Deng Mingzhu, Xing Yuxuan, Liu Fang, Chen Wei. Megalloc: Fast Distributed Memory Allocator for NVM-Based Cluster. In: 2017 International Conference on Networking, Architecture, and Storage (NAS), pp. 1-9, 2017.
- Zhang Michael. Bentley Used NASA Tech to Create This 53-Gigapixel Car Photo, 2016. https://petapixel.com/2016/06/23/bentley-used-nasa-tech-create-53-gigapixel-photo-car/.
- Zhang Mingzhe, Lam King Tin, Yao Xin, Wang Cho-Li. SIMPO: A Scalable In-Memory Persistent Object Framework Using NVRAM for Reliable Big Data Computing. In: ACM Transactions on Architecture and Code Optimization
- Zhang Xuechen, Davis Kei, Jiang Song. iTransformer: Using SSD to Improve Disk Scheduling for High-performance I/O. In: 2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp. 715-726, 2012.
- Zhang Xuechen, Liu Ke, Davis Kei, Jiang Song. iBridge: Improving Unaligned Parallel File Access with Solid-State Drives. In: 2013 IEEE 27th International Symposium on Parallel and Distributed Processing, pp. 381-392, 2013.
- Zhou Ping, Zhao Bo, Yang Jun, Zhang Youtao. A Durable and Energy Efficient Main Memory Using Phase Change Memory Technology. In: SIGARCH Comput. Archit. News, no. 37 (3), pp. 14-23, 2009.