Data Loaders

The load module is a set of methods for loading the output files of a GPUMD simulation into a Python environment. The format of the loaded data depends on the file, but is usually either an array or a dictionary of arrays and scalars. Refer to the documentation of each method for details.

List of all methods

gpyumd.load.load_compute(quantities: List[str], directory: Optional[str] = None, filename: str = 'compute.out') → Dict[str, Union[ndarray, int, Any]]

Loads data from the compute.out GPUMD output file. Currently supports loading a single run.

Parameters

quantities – Quantities to extract from compute.out. Accepted quantities are: [‘temperature’, ‘potential’, ‘force’, ‘virial’, ‘jp’, ‘jk’]. Other quantities will be ignored.
directory – Directory to load compute file from
filename – file to load compute from

Returns

Dictionary containing the data from compute.out. Units are: [temperature -> K], [potential, virial, Ein, Eout -> eV], [force -> eV/A], [jp, jk -> (eV^(3/2))(amu^(-1/2))]
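The handling of the quantities argument can be pictured with a small sketch. The helper filter_quantities below is hypothetical and is not part of gpyumd; it only illustrates that names outside the accepted list are dropped, and it assumes the order of the requested names is preserved.

```python
# Hypothetical helper, not part of gpyumd: mimics how unsupported
# quantity names are ignored when requesting data from compute.out.
ACCEPTED = ["temperature", "potential", "force", "virial", "jp", "jk"]


def filter_quantities(requested):
    """Return only the requested names that are in the accepted list."""
    return [q for q in requested if q in ACCEPTED]


# 'pressure' is not an accepted quantity, so it is dropped.
print(filter_quantities(["temperature", "pressure", "jp"]))
```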

gpyumd.load.load_dos(num_dos_points: Union[int, List[int]], filename: str = 'dos.out', directory: Optional[str] = None) → Dict[str, Dict[str, ndarray]]

Loads data from dos.out GPUMD output file.

Parameters

num_dos_points – Number of frequency points the DOS is computed for.
filename – File to load DOS from.
directory – Directory to load ‘dos.out’ file from (dir. of simulation)

Returns

Dictionary with DOS data. The outermost dictionary stores each individual run. Units are [nu -> THz], [DOSx, DOSy, DOSz -> THz^-1]
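The nested layout (one inner dictionary per run) can be walked as in the sketch below. The data are mocked with plain lists standing in for NumPy arrays, and the run keys "run0" and "run1" are placeholders chosen for illustration; inspect the keys of the dictionary actually returned before relying on them.

```python
# Mock data shaped like the documented return value. The run keys are
# assumptions; only 'nu' and 'DOSx' are names taken from the documentation.
dos = {
    "run0": {"nu": [0.0, 1.0, 2.0], "DOSx": [0.1, 0.5, 0.2]},
    "run1": {"nu": [0.0, 1.0, 2.0], "DOSx": [0.3, 0.2, 0.6]},
}


def peak_frequency(run, component="DOSx"):
    """Return the frequency (THz) at which the chosen DOS component peaks."""
    values = run[component]
    return run["nu"][values.index(max(values))]


# Loop over the outermost dictionary, one entry per run.
peaks = {name: peak_frequency(run) for name, run in dos.items()}
print(peaks)
```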

gpyumd.load.load_force(num_atoms: int, filename: str = 'force.out', directory: Optional[str] = None) → ndarray

Loads data from force.out GPUMD output file. Currently supports loading a single run.

Parameters

num_atoms – Number of atoms force is output for
filename – Name of force data file
directory – Directory to load force file from

Returns

NumPy array of shape (-1, n, 3), where n is num_atoms, containing all forces (eV/A) from filename
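The returned shape can be read as (frames, atoms, components). The sketch below shows the same grouping in plain Python, assuming the rows of the file are ordered frame by frame and atom by atom; group_into_frames is a hypothetical helper and is not part of gpyumd.

```python
def group_into_frames(rows, num_atoms):
    """Group per-atom rows [fx, fy, fz] into frames of num_atoms rows each."""
    if len(rows) % num_atoms != 0:
        raise ValueError("number of rows is not a multiple of num_atoms")
    return [rows[i:i + num_atoms] for i in range(0, len(rows), num_atoms)]


# Four rows for a 2-atom system correspond to two output frames.
rows = [
    [0.1, 0.0, -0.2],
    [-0.1, 0.0, 0.2],
    [0.3, 0.1, 0.0],
    [-0.3, -0.1, 0.0],
]
frames = group_into_frames(rows, num_atoms=2)
print(len(frames), len(frames[0]), len(frames[0][0]))
```

The leading -1 in the documented shape plays the role of len(frames) here: it is inferred from the file length rather than passed in.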

gpyumd.load.load_frequency_info(num_atoms: int, bin_f_size: float, eigfile: str = 'eigenvector.out', directory: Optional[str] = None) → dict

Gathers eigen-frequency information from the eigenvector file and sorts it appropriately based on the selected frequency bins (identical to internal GPUMD representation).

Parameters

num_atoms – The number of atoms in the structure
bin_f_size – The frequency-based bin size (in THz)
directory – Directory eigfile is stored
eigfile – The filename of the eigenvector output/input file created by GPUMD phonon package

Returns

Dictionary with the system eigen-frequency information along with binning information. Units are [fq, fmax, fmin, bin_f_size -> THz], [shift, nbins, bin_count -> N/A].

gpyumd.load.load_hac(num_corr_points: Union[int, List[int]], output_interval: Union[int, List[int]], filename: str = 'hac.out', directory: Optional[str] = None) → Dict[str, Dict[str, ndarray]]

Loads data from hac.out GPUMD output file.

Parameters

num_corr_points – Number of correlation steps
output_interval – Output interval for HAC and RTC data
filename – The hac data file
directory – Directory containing hac data file

Returns

Dictionary containing the data from hac runs. Units are [t -> ps], [kxi, kxo, kyi, kyo, kz -> W(m^-1)(K^-1)], [jxijx, jxojx, jyijy, jyojy, jzjz -> (eV^3)(amu^-1)]

gpyumd.load.load_heatmode(nbins: int, nsamples: int, inputfile: str = 'heatmode.out', directory: Optional[str] = None, directions: str = 'xyz', ndiv: Optional[int] = None, outputfile: str = 'heatmode.npy', save: bool = False, multiprocessing: bool = False, ncore: Optional[int] = None, block_size: int = 65536, return_data: bool = True) → Union[None, Dict[str, ndarray]]

Loads data from heatmode.out GPUMD file. Option to save as binary file for fast re-load later. WARNING: If using multiprocessing, memory usage may be significantly larger than file size.

Parameters

nbins – Number of bins used during the GPUMD simulation
nsamples – Number of times heat flux was sampled with GKMA during GPUMD simulation
inputfile – Modal heat flux file output by GPUMD
directory – Name of directory storing the input file to read
directions – Directions to gather data from. Any order of ‘xyz’ is accepted. Directions may also be excluded (e.g. ‘xz’ is accepted)
ndiv – Integer used to shrink number of bins output. If originally have 10 bins, but want 5, ndiv=2. nbins/ndiv need not be an integer
outputfile – File name to save read data to. Output file is a binary dictionary. Loading from a binary file is much faster than re-reading data files and saving is recommended
save – Toggle saving data to binary dictionary. Loading from save file is much faster and recommended
multiprocessing – Toggle using multi-core processing for conversion of text file
ncore – Number of cores to use for multiprocessing. Ignored if multiprocessing is False
block_size – Size of block (in bytes) to be read per read operation. File reading performance depends on this parameter and the file size
return_data – Toggle returning the loaded modal heat flux data. If this is False, the user should ensure that save is True
Returns

Dictionary with all modal heat fluxes requested. Units are [nbins, nsamples -> N/A], [jmxi, jmxo, jmyi, jmyo, jmz -> (eV^(3/2))(amu^(-1/2))(x^-1)]. Here x is the size of the bins in THz. For example, if there are 4 bins per THz, x = 0.25 THz.
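The bin-size unit above is easy to misread, so here is a minimal arithmetic sketch. It assumes the loaded values are expressed per bin of width x (in THz), as the unit string suggests, so dividing by x gives a value per THz. The helper per_bin_to_per_thz is hypothetical and is not part of gpyumd.

```python
def per_bin_to_per_thz(values, bins_per_thz):
    """Convert per-bin values to per-THz values given the bin density."""
    x = 1.0 / bins_per_thz  # bin width in THz, e.g. 4 bins per THz -> 0.25
    return [v / x for v in values]


# With 4 bins per THz, a value of 0.5 per bin corresponds to 2.0 per THz.
print(per_bin_to_per_thz([0.5, 1.0], bins_per_thz=4))
```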

gpyumd.load.load_kappa(filename: str = 'kappa.out', directory: Optional[str] = None) → Dict[str, ndarray]

Loads data from kappa.out GPUMD output file which contains HNEMD kappa.

Parameters

filename – The kappa data file
directory – Directory containing kappa data file

Returns

Dictionary with keys corresponding to the columns in ‘kappa.out’. Units are [kxi, kxo, kyi, kyo, kz -> W(m^-1)(K^-1)]
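A common post-processing step for HNEMD output is a cumulative running average of the conductivity. The sketch below uses mock values in plain lists rather than data loaded from a file, and it assumes that the in-plane and out-of-plane parts (kxi, kxo) add up to the total conductivity along x. The function running_average is a hypothetical helper and is not part of gpyumd.load.

```python
def running_average(values):
    """Return the cumulative mean of values, one entry per sample."""
    averages = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        averages.append(total / count)
    return averages


# Mock columns with the documented key names (W m^-1 K^-1).
kappa = {"kxi": [1.0, 3.0, 2.0], "kxo": [0.0, 1.0, 2.0]}

# Assumed decomposition: total x-conductivity = in-plane + out-of-plane.
kx_total = [a + b for a, b in zip(kappa["kxi"], kappa["kxo"])]
print(running_average(kx_total))
```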

gpyumd.load.load_kappamode(nbins: int, nsamples: int, inputfile: str = 'kappamode.out', directory: Optional[str] = None, directions: str = 'xyz', ndiv: Optional[int] = None, outputfile: str = 'kappamode.npy', save: bool = False, multiprocessing: bool = False, ncore: Optional[int] = None, block_size: int = 65536, return_data: bool = True) → Union[None, Dict[str, ndarray]]

Loads data from kappamode.out GPUMD file. Option to save as binary file for fast re-load later. WARNING: If using multiprocessing, memory usage may be significantly larger than file size.

Parameters

nbins – Number of bins used during the GPUMD simulation
nsamples – Number of times thermal conductivity was sampled with HNEMA during GPUMD simulation
inputfile – Modal thermal conductivity file output by GPUMD
directory – Name of directory storing the input file to read
directions – Directions to gather data from. Any order of ‘xyz’ is accepted. Directions may also be excluded (e.g. ‘xz’ is accepted)
ndiv – Integer used to shrink number of bins output. If originally have 10 bins, but want 5, ndiv=2. nbins/ndiv need not be an int
outputfile – File name to save read data to. Output file is a binary dictionary. Loading from a binary file is much faster than re-reading data files and saving is recommended
save – Toggle saving data to binary dictionary. Loading from save file is much faster and recommended
multiprocessing – Toggle using multi-core processing for conversion of text file
ncore – Number of cores to use for multiprocessing. Ignored if multiprocessing is False
block_size – Size of block (in bytes) to be read per read operation. File reading performance depends on this parameter and the file size
return_data – Toggle returning the loaded modal thermal conductivity data. If this is False, the user should ensure that save is True
Returns

Dictionary with all modal thermal conductivities requested. Units are [nbins, nsamples -> N/A], [kmxi, kmxo, kmyi, kmyo, kmz -> W(m^-1)(K^-1)(x^-1)]. Here x is the size of the bins in THz. For example, if there are 4 bins per THz, x = 0.25 THz.
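The effect of ndiv on the number of bins can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: it assumes neighbouring bins are summed in groups of ndiv and that a short leftover group at the end becomes its own bin, which is one way to allow nbins/ndiv to be a non-integer. The helper shrink_bins is hypothetical.

```python
import math


def shrink_bins(values, ndiv):
    """Sum consecutive groups of ndiv bins; a short final group is kept."""
    nbins_new = math.ceil(len(values) / ndiv)
    return [sum(values[i * ndiv:(i + 1) * ndiv]) for i in range(nbins_new)]


ten_bins = [1.0] * 10
print(len(shrink_bins(ten_bins, 2)))  # 10 bins with ndiv=2
print(len(shrink_bins(ten_bins, 3)))  # 10 bins with ndiv=3, not divisible
```

Merging bins also changes the bin width x used in the units, so any per-THz conversion should use the new, wider bin size.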

gpyumd.load.load_omega2(filename: str = 'omega2.out', directory: Optional[str] = None) → ndarray

Loads data from omega2.out GPUMD output file.

Parameters

filename – Name of omega2 data file
directory – Directory to load omega2 file from

Returns

Array of shape (N_kpoints,3*N_basis) in units of THz. N_kpoints is number of k points in kpoint.in and N_basis is the number of basis atoms defined in basis.in

gpyumd.load.load_saved_heatmode(filename: str = 'heatmode.npy', directory: Optional[str] = None)

Loads data saved by the ‘load_heatmode’ or ‘get_gkma_kappa’ function and returns the original dictionary.

Parameters

filename – Name of the file to load
directory – Directory the data file is located in

Returns

Dictionary with all modal heat fluxes previously requested

gpyumd.load.load_saved_kappamode(filename: str = 'kappamode.npy', directory: Optional[str] = None)

Loads data saved by the ‘load_kappamode’ function and returns the original dictionary.

Parameters

filename – Name of the file to load
directory – Directory the data file is located in

Returns

Dictionary with all modal thermal conductivities previously requested

gpyumd.load.load_sdc(num_corr_points: Union[int, List[int]], filename: str = 'sdc.out', directory: Optional[str] = None) → Dict[str, Dict[str, ndarray]]

Loads data from sdc.out GPUMD output file.

Parameters

num_corr_points – Number of time correlation points the VAC/SDC is computed for
filename – File to load SDC from
directory – Directory to load ‘sdc.out’ file from (dir. of simulation)

Returns

Dictionary with SDC/VAC data. The outermost dictionary stores each individual run. Units are [t -> ps], [VACx, VACy, VACz -> (A^2)(ps^-2)], [SDCx, SDCy, SDCz -> (A^2)(ps^-1)]

gpyumd.load.load_shc(num_corr_points: Union[int, List[int]], num_omega: Union[int, List[int]], filename: str = 'shc.out', directory: Optional[str] = None) → Dict[str, Dict[str, ndarray]]

Loads the data from shc.out GPUMD output file.

Parameters

num_corr_points – Maximum number of correlation steps. If there are multiple SHC runs, a list of num_corr_points can be provided.
num_omega – Number of frequency points. If there are multiple SHC runs, a list of num_omega can be provided.
filename – File to load SHC from.
directory – Directory to load ‘shc.out’ file from (dir. of simulation)

Returns

Dictionary of in- and out-of-plane SHC results (average). Units are [t -> ps], [Ki, Ko -> A eV ps^-1], [nu -> THz], [jwi, jwo -> A eV (ps^-1)(THz^-1)]

gpyumd.load.load_thermo(filename: str = 'thermo.out', directory: Optional[str] = None) → Dict[str, ndarray]

Loads data from thermo.out GPUMD output file.

Parameters

filename – Name of thermal data file
directory – Directory to load thermal data file from

Returns

Dictionary containing the data from thermo.out. Units are [temperature -> K], [K, U -> eV], [Px, Py, Pz, Pyz, Pxz, Pxy -> GPa], [Lx, Ly, Lz, ax, ay, az, bx, by, bz, cx, cy, cz -> A]

gpyumd.load.load_vac(num_corr_points: Union[int, List[int]], filename: str = 'mvac.out', directory: Optional[str] = None) → Dict[str, Dict[str, ndarray]]

Loads data from mvac.out GPUMD output file.

Parameters

num_corr_points – Number of time correlation points the VAC is computed for
filename – File to load VAC from
directory – Directory to load ‘mvac.out’ file from

Returns

Dictionary with VAC data. The outermost dictionary stores each individual run. Units are [t -> ps], [VACx, VACy, VACz -> (A^2)(ps^-2)]

gpyumd.load.load_velocity(num_atoms: int, filename: str = 'velocity.out', directory: Optional[str] = None) → ndarray

Loads data from velocity.out GPUMD output file. Currently supports loading a single run.

Parameters

num_atoms – Number of atoms velocity is output for
filename – Name of velocity data file
directory – Directory to load velocity file from

Returns

NumPy array of shape (-1, n, 3), where n is num_atoms, containing all velocities (A/ps) from filename

gpyumd.load.read_modal_analysis_file(nbins: int, nsamples: int, datapath: str, ndiv: int, multiprocessing: bool = False, ncore: Optional[int] = None, block_size: int = 65536) → ndarray

Core reader for the modal analysis methods. Using the load_heatmode or load_kappamode functions instead is recommended.

Parameters

nbins – Number of frequency bins
nsamples – Number of samples for simulation
datapath – Full path of the data file
ndiv – Divisor for shrinking the number of bins
multiprocessing – Whether or not to use multi-core processing
ncore – Number of cores to use if using multiprocessing
block_size – Number of bytes to read at once from the output files

Returns

3D array of data with dimension (nbins, nsamples, 5). Note that ndiv will change nbins.

Return type

ndarray