Proactive Data Containers (PDC)

Proactive Data Containers (PDC) are containers within a locus of storage (memory, NVRAM, disk, etc.) that store scientific data in an object-oriented manner. Managing data as objects opens up optimizations for data movement and transformation, and lets storage mechanisms exploit the deep storage hierarchy and support automated performance tuning.

Command-line and Python interface to an object-centric data management system

  • Topics: Python, object-centric data management, PDC
  • Skills: Linux, C, Python
  • Difficulty: Medium
  • Size: Large (350 hours)
  • Mentors: Houjun Tang, Suren Byna

Proactive Data Containers (PDC) is an object-centric data management system for scientific data on high-performance computing systems. It manages objects and their associated metadata within a locus of storage (memory, NVRAM, disk, etc.). Managing data as objects opens up optimizations for data movement and transformation, and lets storage mechanisms exploit the deep storage hierarchy and support automated performance tuning. This project involves developing and updating efficient, user-friendly command-line and Python interfaces for PDC.
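To make the object-centric idea concrete, here is a minimal, purely illustrative Python sketch of a container that keeps objects together with their metadata and a storage-tier label. It is not PDC's API: the names ToyContainer, DataObject, put, get, query, and migrate, and the tier strings, are all hypothetical and exist only in this in-memory toy.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A named blob plus its metadata (hypothetical, not a PDC type)."""
    name: str
    data: bytes
    tags: dict = field(default_factory=dict)
    tier: str = "memory"  # assumed tier label: "memory", "nvram", "disk"


class ToyContainer:
    """In-memory stand-in for an object-centric container."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def put(self, name, data, tier="memory", **tags):
        # Store the data together with its metadata tags and tier label.
        obj = DataObject(name, data, dict(tags), tier)
        self._objects[name] = obj
        return obj

    def get(self, name):
        # Look an object up by name; raises KeyError if it is absent.
        return self._objects[name]

    def query(self, **tags):
        # Return the sorted names of objects whose tags match every given pair.
        return sorted(
            obj.name
            for obj in self._objects.values()
            if all(obj.tags.get(key) == value for key, value in tags.items())
        )

    def migrate(self, name, tier):
        # Relabel the object's tier; a real system would also move the bytes.
        self._objects[name].tier = tier


container = ToyContainer("simulation-run")
container.put("temperature", b"\x00" * 8, tier="memory", step=1, units="K")
container.put("pressure", b"\x00" * 8, tier="disk", step=1)
container.migrate("pressure", "nvram")
```

Because each object carries its own metadata, a caller can find data by what it describes (for example, container.query(step=1)) rather than by file path, and the system is free to move an object between tiers without changing how it is addressed. A command-line or Python front end would expose operations of roughly this shape.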

Houjun Tang
Computer Research Scientist, Lawrence Berkeley National Laboratory