ValueError: No dataset in HDF5 file with pandas.read_hdf from a MATLAB h5 file

I get a ValueError: No dataset in HDF5 file. when using:
In [1]: import pandas as pda

In [2]: store = pda.read_hdf('X.h5')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-72e9d80a2c5b> in <module>()
----> 1 store = pda.read_hdf('X.h5')

/usr/local/miniconda3/envs/tensorFlow-GPU/lib/python3.6/site-packages/pandas/io/pytables.py in read_hdf(path_or_buf, key, mode, **kwargs)
    356             groups = store.groups()
    357             if len(groups) == 0:
--> 358                 raise ValueError('No dataset in HDF5 file.')
    359             candidate_only_group = groups[0]
    360
ValueError: No dataset in HDF5 file.
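Judging from the traceback, read_hdf asks the HDFStore for its groups and raises when none are found; HDFStore appears to report only nodes written with pandas metadata (the attributes DataFrame.to_hdf attaches), so a file containing a bare dataset looks empty to it. A minimal sketch of that check against the same file (my illustration, not code from the question):

import pandas as pd

# A non-pandas HDF5 file opened as an HDFStore looks empty, because
# keys()/groups() only report nodes carrying pandas metadata; this is
# why read_hdf raises "No dataset in HDF5 file" above.
store = pd.HDFStore('X.h5', mode='r')
print(store.keys())   # expected: []
store.close()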
h5dump shows:
$ h5dump -n X.h5
HDF5 "X.h5" {
FILE_CONTENTS {
 group      /
 dataset    /DS
 }
}
And if I use h5py, I can see the data:
In [3]: import h5py
/usr/local/miniconda3/envs/tensorFlow-GPU/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
In [4]: f = h5py.File('X.h5','r')
In [5]: f.keys()
Out[5]: KeysView(<HDF5 file "X.h5" (mode r)>)
In [6]: list( f.keys() )
Out[6]: ['DS']
In [7]: f['DS']
Out[7]: <HDF5 dataset "DS": shape (10, 20), type "<f8">
In [8]: f['DS'][:]
Out[8]:
array([[1., 0., 1., 1., 0., 0., 1., 1., 1., 1., 0., 1., 0., 0., 0., 1.,
        0., 1., 0., 0.],
       [0., 0., 0., 1., 0., 1., 1., 0., 1., 0., 1., 1., 1., 1., 0., 0.,
        1., 1., 0., 0.],
       [0., 1., 1., 1., 1., 0., 1., 1., 1., 0., 1., 0., 1., 1., 0., 0.,
        1., 1., 0., 0.],
       [0., 0., 1., 1., 1., 1., 1., 1., 1., 0., 0., 1., 1., 1., 1., 1.,
        1., 0., 1., 0.],
       [0., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 0., 1., 1., 0., 1.,
        0., 1., 0., 0.],
       [0., 0., 1., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1., 0., 1., 1.,
        1., 1., 0., 0.],
       [0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 0., 0., 1., 1., 0., 1.,
        1., 1., 0., 1.],
       [0., 0., 1., 1., 1., 0., 1., 0., 0., 0., 0., 0., 0., 1., 0., 0.,
        0., 1., 0., 1.],
       [0., 0., 0., 1., 1., 1., 1., 1., 0., 0., 0., 0., 1., 1., 1., 0.,
        1., 0., 0., 0.],
       [0., 1., 0., 1., 1., 1., 1., 1., 1., 0., 0., 1., 1., 1., 0., 1.,
        1., 0., 0., 0.]])
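Since h5py can see the data, one workaround (a sketch of mine, not from the thread) is to read the array with h5py and build the DataFrame by hand; note that MATLAB stores arrays column-major, so the result may arrive transposed relative to what MATLAB showed:

import h5py
import pandas as pd

# Read the raw dataset with h5py and wrap it in a DataFrame manually.
# 'X.h5' and 'DS' are the file and dataset names from the question.
with h5py.File('X.h5', 'r') as f:
    data = f['DS'][:]      # loads the (10, 20) float64 array into memory

df = pd.DataFrame(data)    # add .T here if MATLAB's column-major layout
                           # has swapped the dimensions for your file
print(df.shape)            # (10, 20)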
Does read_hdf provide any parameters for reading a file that wasn't written by pandas/pytables? The h5py read shows that 'DS' is not embedded in any group; about as plain an h5 file as possible (the dump confirms that). I'm a bit surprised, since the MATLAB h5 examples that I've seen have the data several layers down, with added type and shape information. – hpaulj Jul 2 at 16:11
@hpaulj Hi, these MATLAB examples using h5write clearly show they are storing their datasets in the root group (/). – SebMa Jul 2 at 16:22
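For anyone who wants to reproduce such a file layout without MATLAB, here is a rough h5py equivalent of those h5create/h5write examples (the file name and data are made up for this sketch):

import numpy as np
import h5py

# Write a single dataset directly under the root group, mimicking the
# layout MATLAB's h5create/h5write examples produce ('X_like.h5' is a
# made-up name).
with h5py.File('X_like.h5', 'w') as f:
    f.create_dataset('DS', data=np.random.randint(0, 2, (10, 20)).astype('f8'))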
I've only looked at h5 files written by save, the newer HDF5-based version of the original .mat format (and more specifically the Octave equivalent). – hpaulj Jul 2 at 16:30
read_hdf
says it Retrieve pandas object stored in file
. So it's expecting a file created with df.to_hdf()
, where df
is a dataframe or other pandas object.– hpaulj
Jul 2 at 16:40
read_hdf
Retrieve pandas object stored in file
df.to_hdf()
df
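For contrast, a minimal round trip of the kind read_hdf does accept (the file name and key below are illustrative, not from the thread):

import numpy as np
import pandas as pd

# to_hdf stores pandas metadata alongside the data; that metadata is
# what read_hdf looks for ('pandas_made.h5' is an illustrative name).
df = pd.DataFrame(np.zeros((10, 20)))
df.to_hdf('pandas_made.h5', key='DS')

# Reading it back now works, because the file contains a pandas object.
df2 = pd.read_hdf('pandas_made.h5', key='DS')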
@hpaulj OK, thank you :) So, to sum things up: if the HDF5 file was generated using pandas, then I can use pandas to read it; otherwise I need to use h5py. Is that correct? – SebMa Jul 2 at 17:09
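That summary seems right. A small helper along those lines (the function name and the transpose caveat are my own additions, not from pandas or the thread):

import h5py
import pandas as pd

def load_plain_h5(path, name):
    """Load a 2-D dataset from a plain (non-pandas) HDF5 file into a
    DataFrame. MATLAB writes column-major, so a .T may be needed
    depending on how the file was produced."""
    with h5py.File(path, 'r') as f:
        return pd.DataFrame(f[name][:])

# Usage with the file from the question:
# df = load_plain_h5('X.h5', 'DS')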