This site has been established as part of the ECP CODAR project.
This site provides reference scientific datasets, data reduction techniques, error metrics, error controls and error assessment tools for users and developers of scientific data reduction techniques.
Important: when publishing results from one or more datasets presented on this webpage, make sure to:
Mark Taylor (SNL)
Dataset1 : 79 fields: 2D, 1800 x 3600 ;
Dataset2 : 1 field :
Both are single precision, binary
1.47GB + 17GB
This dataset has been approved for unlimited release by Los Alamos National Laboratory and has been assigned LA-UR-18-25670.
Molecular dynamics simulation
6 fields: x,y,z,vx,vy,vz,
Each field stored separately,
13 fields: 3D, 100x500x500, single-precision, binary (dataset cleaned by replacing the background with 0)
Images from the LCLS instrument
HDF5 and binary
1 snapshot: 6 fields (x,y,z,vx,vy,vz)
Each field stored separately,
Lukic et al., “methods: numerical, intergalactic medium, quasars: absorption lines, large-scale structure of universe”, Monthly Notices of the Royal Astronomical Society
Adaptive mesh hydrodynamics + N-body cosmological simulation
6 fields, 3D,
512 x 512 x 512
Example molecular 2-electron integral values generated by libint, a library developed by the Valeev research group at Virginia Tech. See https://github.com/evaleev/libint. Libint is an integral evaluation option in NWChemEx.
Two-electron repulsion integrals computed over Gaussian-type orbital basis sets
3 fields, 1D,
QMCPACK performance test
(contact: Ye Luo: email@example.com)
Many-body ab initio Quantum Monte Carlo (electronic structure of atoms, molecules, and solids)
288 orbitals, 3D,
69 x 69 x 115,
Single precision and
Binary, Little endian
Hemanth Kolla (firstname.lastname@example.org)
11 fields, 3D, 500x500x500,
Princeton Plasma Physics Laboratory (PPPL)
9 timesteps, 3D,
(the mesh data is in the archive),
Michael Churchill, Princeton Plasma Physics Laboratory (PPPL)
- Copyright: Free to use, but check with Michael Churchill (email@example.com) before publishing results.
Fusion Gas Puff Image (GPI) data
2D time-series data (movie), 80x64 image with
Synthetic, generated to specified regularity
(3 datasets, each with a different regularity)
Note: This table will be augmented with metrics that matter for users of these datasets as well as recommended settings for error control (lossy compression).
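Most entries in the table are described as headerless single-precision binary files with known dimensions. As a rough illustration of how such a field might be read, here is a minimal sketch using only the Python standard library. It assumes one field per file, 4-byte IEEE floats, little-endian byte order (stated explicitly for only one dataset above), and no header; the function names are hypothetical and not part of any SDRBench tool.

```python
import os
import sys
from array import array
from math import prod


def expected_bytes(shape, itemsize=4):
    """Size in bytes of one raw field with the given dimensions."""
    return prod(shape) * itemsize


def load_raw_field(path, shape):
    """Read a headerless float32 file into a flat array of values.

    Assumption: the file is little-endian and holds exactly one field.
    """
    size = os.path.getsize(path)
    if size != expected_bytes(shape):
        raise ValueError(
            f"{path}: expected {expected_bytes(shape)} bytes, found {size}"
        )
    values = array("f")
    with open(path, "rb") as fh:
        values.fromfile(fh, prod(shape))
    if sys.byteorder == "big":
        # Convert little-endian file data to the host byte order.
        values.byteswap()
    return values
```

For example, `expected_bytes((100, 500, 500))` gives 100,000,000 bytes, so each field of a 100x500x500 single-precision dataset should occupy about 100 MB on disk. Checking the file size before reading is a cheap way to catch a wrong shape or precision. With NumPy installed, `np.fromfile(path, dtype="<f4").reshape(shape)` is the usual equivalent.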
In general, data file extensions follow this naming convention:
Others: please submit dataset proposals to codar-reduction (at) cels.anl.gov. Requirements: datasets must be open to public access. Each dataset should be linked to a simulation application or a scientific instrument. Metadata should explain the origin of the dataset and how it was produced (what simulation, what instrument, what settings). After review by the SDRBenchmarks committee, the dataset will (or will not) be added to the SDRBenchmarks repository.
Commonly used metrics for reduction technique assessment:
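As an illustration only, the sketch below computes a few metrics that often appear in lossy compression studies: maximum absolute error, PSNR, and compression ratio. The choice of metrics, the function names, and the use of the value range as the PSNR peak are assumptions made for this example; they are not taken from this site's own list.

```python
import math


def max_abs_error(original, decompressed):
    """Largest pointwise absolute difference between the two sequences."""
    return max(abs(a - b) for a, b in zip(original, decompressed))


def psnr(original, decompressed):
    """Peak signal-to-noise ratio in dB.

    Assumption: the peak is the value range (max - min) of the original data.
    """
    value_range = max(original) - min(original)
    if value_range == 0:
        raise ValueError("PSNR is undefined for a constant field")
    mse = sum((a - b) ** 2 for a, b in zip(original, decompressed)) / len(original)
    if mse == 0:
        return math.inf  # lossless reconstruction
    return 20 * math.log10(value_range) - 10 * math.log10(mse)


def compression_ratio(original_bytes, compressed_bytes):
    """Original size divided by compressed size."""
    return original_bytes / compressed_bytes
```

These functions take plain Python sequences of numbers, so they work on lists as well as on `array` objects. For fields of the sizes listed above, a vectorized implementation would be far faster.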
Assessment tools, metrics and error control: