Data:

This module contains functions written to load and save data.

API 

aopy.data.base.align_neuropixel_recoring_drive(neuropixel_drive, drive2, subject, theta=0, center=(0, 0))[source]

This function aligns one drive to another drive type. In the current iteration, this function only supports aligning neuropixels drives (‘NP_Insert72’/‘NP_Insert137’) to each other or to ‘ECoG244’/‘Opto32’ drives. This function assumes a fixed mapping between subject and alignment and is not currently compatible with selecting subsets of channels. The mapping between subject and alignment is defined in aopy/config/neuropixel_insert_ch_mapping/NP_insert_angle_alignment.xlsx. The following images depict the alignment between neuropixels insert grid hole locations and ECoG channel locations for two subjects.

_images/NP_Insert137_ECoG244_alignment.png

_images/NP_Insert72_ECoG244_alignment.png

Parameters:

neuropixel_drive (str) – Neuropixel drive to align. Currently supports ‘NP_Insert72’ and ‘NP_Insert137’
drive2 (str) – Other drive to align. Currently supports ‘ECoG244’, ‘Opto32’, ‘NP_Insert72’, and ‘NP_Insert137’
subject (str) – Subject recordings were performed on. Currently supports ‘Affi’ and ‘Beignet’
theta (float) – rotation (in degrees) to apply to positions. Rotations are applied clockwise. Default 0.
center (2-tuple) – chamber coordinates of the center of the drive in mm. Defaults to (0,0).

Returns:

Tuple Containing:: aligned_np_drive_coordinates (nelec, 2): X and Y coordinates of each neuropixel insert recording site relative to drive2

aligned_drive2_coordinates (nelec, 2): X and Y coordinates of each drive2 recording site

recording_sites (nelec): Neuropixel insert recording site numbers

acq_ch (nelec): Acquisition channels (0-indexed) for each drive2 recording site

Return type:

tuple

aopy.data.base.find_preproc_ids_from_day(preproc_dir, subject, date, data_source)[source]

Returns the task entry ids that have preprocessed files in the given directory matching the subject, date, and data source given.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
date (str) – Date of recording
data_source (str) – Processed data type (exp, eye, broadband, lfp, etc.)

Returns: list of ids: task entry id for each matching file found in the given folder

aopy.data.base.get_filenames_in_dir(base_dir, te)[source]

Gets the filenames for available systems in a given task entry. Requires that files are organized by system in the base directory, and named with their task entry somewhere in their filename or directory name.

Parameters:

base_dir (str) – directory where the files will be
te (int) – block number for the task entry

Returns:

dictionary of files indexed by system

Return type:

dict
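As a rough illustration of the directory layout this function expects, the sketch below scans one subdirectory per system and keeps the first entry whose name contains the task entry number. The helper name `sketch_filenames_in_dir`, the system folder names, and the substring-matching rule are assumptions made for illustration; this is not the aopy implementation.

```python
import os
import tempfile

def sketch_filenames_in_dir(base_dir, te):
    """Hypothetical helper: map each system folder to an entry naming the task entry."""
    files = {}
    for system in sorted(os.listdir(base_dir)):
        system_dir = os.path.join(base_dir, system)
        if not os.path.isdir(system_dir):
            continue  # only subdirectories are treated as systems
        for name in sorted(os.listdir(system_dir)):
            if str(te) in name:  # assumed rule: te appears somewhere in the name
                files[system] = os.path.join(system, name)
                break
    return files

# Build a throwaway layout (made-up system and file names) to exercise the sketch
layout = [
    ("ecube", "2024-05-03_BMI3D_te17269"),
    ("hdf", "affi_te17269.hdf"),
    ("optitrack", "te9999.csv"),
]
with tempfile.TemporaryDirectory() as base_dir:
    for system, fname in layout:
        os.makedirs(os.path.join(base_dir, system))
        open(os.path.join(base_dir, system, fname), "w").close()
    found = sketch_filenames_in_dir(base_dir, 17269)
print(found)
```

Plain substring matching would also accept a longer number that merely contains the task entry digits, so a stricter pattern may be preferable in practice.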

aopy.data.base.get_hdf_dictionary(data_dir, hdf_filename, show_tree=False)[source]

Lists the hdf contents in a dictionary. Does not read any data! For example, calling get_hdf_dictionary() with show_tree will result in something like this:

>>> hdf_dict = get_hdf_dictionary('/exampledir', 'example.hdf', show_tree=True)
example.hdf
└──group1
|  └──group_data: [shape: (1000,), type: int64]
└──test_data: [shape: (1000,), type: int64]
>>> print(hdf_dict)
{
    'group1': {
        'group_data': ((1000,), dtype('int64'))
    },
    'test_data': ((1000,), dtype('int64'))
}

Parameters:

data_dir (str) – folder where data is located
hdf_filename (str) – name of hdf file
show_tree (bool, optional) – whether to print a tree of the file contents (default False)

Returns:

contents of the file keyed by name as tuples containing:: shape (tuple): size of the data

dtype (np.dtype): type of the data

Return type:

dict
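The returned dictionary mirrors the file hierarchy: groups appear as nested dictionaries and datasets as (shape, dtype) tuples. A minimal sketch of walking such a dictionary to list full dataset paths is given here; `flatten_hdf_dictionary` is a hypothetical helper, and dtypes are written as plain strings so the example runs without numpy.

```python
def flatten_hdf_dictionary(contents, prefix=""):
    """Hypothetical helper: yield (path, shape, dtype) for every dataset entry."""
    for name, value in contents.items():
        path = f"{prefix}/{name}"
        if isinstance(value, dict):
            # a nested dict represents a group, so recurse into it
            yield from flatten_hdf_dictionary(value, prefix=path)
        else:
            shape, dtype = value
            yield path, shape, dtype

# Same structure as the example output above, with dtypes as strings
contents = {
    "group1": {"group_data": ((1000,), "int64")},
    "test_data": ((1000,), "int64"),
}
datasets = list(flatten_hdf_dictionary(contents))
for path, shape, dtype in datasets:
    print(path, shape, dtype)
```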

aopy.data.base.get_kilosort_foldername(subject, te_id, date, data_source)[source]

Generates a folder name string to access the Kilosort output.

Parameters:

subject (str) – The subject name.
te_id (int or list of int) – The experiment task entry(s) to use.
date (str) – The experiment date.
data_source (str) – The data source (e.g., ‘Neuropixel’)

Returns:

A formatted folder name string for the Kilosort output in the format: "{date}_{data_source}_{subject}_te{te_id1}_te{te_id2}…".

Return type:

str
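The documented pattern can be sketched in a few lines. The function below is a hypothetical re-creation based only on the format string above, not the library code; the second task entry id in the usage is made up for illustration.

```python
def sketch_kilosort_foldername(subject, te_id, date, data_source):
    """Hypothetical re-creation of the documented folder-name pattern."""
    # accept a single task entry or a list of task entries
    te_ids = te_id if isinstance(te_id, (list, tuple)) else [te_id]
    te_part = "_".join(f"te{te}" for te in te_ids)
    return f"{date}_{data_source}_{subject}_{te_part}"

single = sketch_kilosort_foldername("affi", 17269, "2024-05-03", "Neuropixel")
merged = sketch_kilosort_foldername("affi", [17269, 17270], "2024-05-03", "Neuropixel")
print(single)
print(merged)
```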

aopy.data.base.get_preprocessed_filename(subject, te_id, date, data_source)[source]

Generates preprocessed filenames as per our naming conventions. Format: preproc_<Date>_<MonkeyName>_<TaskEntry>_<DataSource>.hdf

Parameters:

subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
data_source (str) – Processed data type (exp, eye, broadband, lfp, etc.)

Returns:

filename

Return type:

str
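A minimal sketch of the naming convention, assuming the fields are joined by underscores exactly as written in the format string above. `sketch_preprocessed_filename` is a hypothetical stand-in, not the aopy function.

```python
def sketch_preprocessed_filename(subject, te_id, date, data_source):
    """Hypothetical stand-in following preproc_<Date>_<MonkeyName>_<TaskEntry>_<DataSource>.hdf."""
    return f"preproc_{date}_{subject}_{te_id}_{data_source}.hdf"

filename = sketch_preprocessed_filename("affi", 17269, "2024-05-03", "lfp")
print(filename)
```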

aopy.data.base.is_table_in_hdf(table_name: str, hdf_filename: str)[source]

Checks if a table exists in the first-level directory of an hdf file (i.e. non-recursively).

Parameters:

table_name (str) – table name to be checked
hdf_filename (str) – full path to the hdf file

Returns:

Boolean

aopy.data.base.list_root_groups(data_dir, hdf_filename)[source]

List the name of groups directly under the root in HDF5 files.

Parameters:

data_dir (str) – folder where data is located
hdf_filename (str) – name of hdf file

Returns:

Name of groups

Return type:

list

aopy.data.base.load_chmap(drive_type='ECoG244', acq_ch_subset=None, theta=0, center=(0, 0), **kwargs)[source]

Load the centered mapping between acquisition channels and electrode position for supported drives. Currently supports ‘ECoG244’, ‘Opto32’, ‘NP_Insert72’, and ‘NP_Insert137’ drives.

Parameters:

drive_type (str, optional) – Drive type of the method used to record neural activity. - ‘ECoG244’: Viventi 244 channel ECoG array - ‘Opto32’: Orsborn 32 channel fiber optic array - ‘NP_Insert72’: Orsborn 72 site Neuropixel grid - ‘NP_Insert137’: Orsborn 137 site Neuropixel grid
acq_ch_subset (nacq, optional) – Subset of acquisition channels to load. If not specified, all acquisition channels and connected electrodes will be returned.
theta (float) – rotation (in degrees) to apply to positions. rotations are applied clockwise, e.g., theta = 90 rotates the map clockwise by 90 degrees, -90 rotates the map anti-clockwise by 90 degrees. Default 0.
center (2-tuple) – chamber coordinates of the center of the drive in mm. This function translates the coordinates of the drive to be centered on this value. Defaults to (0,0).
kwargs (dict) – Additional keyword arguments to pass to map_acq2pos()

Returns:

Tuple Containing:: acq_ch_position (nelec, 2): X and Y coordinates (in mm) of the electrodes corresponding to each acquisition channel. X position is in the first column and Y position is in the second column

acq_chs (nelec): Acquisition channels that map to electrodes (e.g. 240/256 for viventi ECoG array)

connected_elecs (nelec): Electrodes used (e.g. 240/244 for viventi ECoG array)

Return type:

tuple

Examples

plot_ECoG244_data_map(np.zeros(256,), cmap='Greys')
annotate_spatial_map_channels(drive_type='ECoG244', color='k')
annotate_spatial_map_channels(drive_type='Opto32', color='b')
annotate_spatial_map_channels(drive_type='ECoG244', color='r', theta=90)
annotate_spatial_map_channels(drive_type='Opto32', color='g', theta=90)

plt.figure()
plot_spatial_drive_map(np.zeros(64,), drive_type='EMG_GR08MM1305', cmap='Greys', theta=0)
annotate_spatial_map_channels(drive_type='EMG_GR08MM1305', color='k', theta=0)
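To make the theta and center conventions concrete, the following sketch rotates a few (x, y) positions clockwise and then translates them. It assumes the rotation is taken about the origin and applied before the translation; that ordering, and the helper name `rotate_and_center`, are assumptions rather than documented behaviour.

```python
import math

def rotate_and_center(positions, theta=0.0, center=(0.0, 0.0)):
    """Rotate (x, y) points clockwise by theta degrees about the origin, then shift to center."""
    t = math.radians(theta)
    cos_t, sin_t = math.cos(t), math.sin(t)
    out = []
    for x, y in positions:
        # clockwise rotation matrix: [[cos, sin], [-sin, cos]]
        xr = x * cos_t + y * sin_t
        yr = -x * sin_t + y * cos_t
        out.append((xr + center[0], yr + center[1]))
    return out

# A point at the top of the map moves to the right under a 90 degree clockwise rotation
rotated = rotate_and_center([(0.0, 1.0), (1.0, 0.0)], theta=90, center=(2.0, 3.0))
print(rotated)
```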

aopy.data.base.load_electrode_pos(data_dir, pos_file)[source]

Reads an electrode position map file and returns the x and y positions. The file should have the columns ‘topdown_x’ and ‘topdown_y’.

Parameters:

data_dir (str) – where to find the file
pos_file (str) – the excel file

Returns:

Tuple containing:: x_pos (nch): x position of each electrode

y_pos (nch): y position of each electrode

Return type:

tuple

aopy.data.base.load_hdf_data(data_dir, hdf_filename, data_name, data_group='/', cached=False)[source]

Simple wrapper to get the data from an hdf file as a numpy array

Parameters:

data_dir (str) – folder where data is located
hdf_filename (str) – name of hdf file
data_name (str) – table to load
data_group (str, optional) – from which group to load data
cached (bool, optional) – whether to allow loading cached data or not

Returns:

numpy array of data from hdf

Return type:

ndarray

aopy.data.base.load_hdf_group(data_dir, hdf_filename, group='/', cached=False)[source]

Loads any datasets from the given hdf group into a dictionary. Also will recursively load other groups if any exist under the given group

Parameters:

data_dir (str) – folder where data is located
hdf_filename (str) – name of hdf file
group (str, optional) – name of the group to load
cached (bool, optional) – whether to allow loading cached data or not

Returns:

all the datasets contained in the given group

Return type:

dict

aopy.data.base.load_hdf_ts_segment(preproc_dir, filename, data_group, data_name, samplerate, start_time, end_time, channels=None)[source]

Load a segment of HDF timeseries data given a start and end time and a sampling rate.

Parameters:

preproc_dir (str) – base directory where the files live
filename (str) – filename of the hdf file where the data resides
data_group (str) – hdf group of the desired dataset
data_name (str) – hdf name of the desired dataset
samplerate (float) – the sampling rate of the data in Hz
start_time (float) – time (in seconds) in the recording at which the desired segment starts
end_time (float) – time (in seconds) in the recording at which the desired segment ends
channels (list, optional) – list of channels to include in the segment (default all channels)

Raises:

ValueError – if the dataset cannot be found in the file

Returns:

tuple containing:: segment (nt, nch): data segment from the given preprocessed file

samplerate (float): sampling rate of the returned data

Return type:

tuple
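The core of segment loading is converting a time window into sample indices. The sketch below shows one way to do that on an in-memory list of rows; truncating toward zero and clipping to the data length are assumptions, and `segment_indices` is a hypothetical helper rather than part of aopy.

```python
def segment_indices(samplerate, start_time, end_time, n_samples):
    """Hypothetical helper: convert a time window in seconds to a clipped sample slice."""
    start = max(int(start_time * samplerate), 0)
    end = min(int(end_time * samplerate), n_samples)
    return start, end

samplerate = 1000.0
# toy data with shape (nt, nch) = (5000, 2)
data = [[t, -t] for t in range(5000)]
start, end = segment_indices(samplerate, 1.5, 2.0, len(data))

# keep only the requested channels from each time point
channels = [1]
segment = [[row[ch] for ch in channels] for row in data[start:end]]
print(start, end, len(segment), segment[0])
```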

aopy.data.base.load_hdf_ts_trial(preproc_dir, filename, data_group, data_name, samplerate, trigger_time, time_before, time_after, channels=None)[source]

Load a segment of HDF timeseries data around a trigger time, given the time to include before and after the trigger and a sampling rate.

Parameters:

preproc_dir (str) – base directory where the files live
filename (str) – filename of the hdf file where the data resides
data_group (str) – hdf group of the desired dataset
data_name (str) – hdf name of the desired dataset
samplerate (float) – the sampling rate of the data in Hz
trigger_time (float) – time (in seconds) in the recording of the trigger around which the segment is loaded
time_before (float) – time (in seconds) to include before the trigger times
time_after (float) – time (in seconds) to include after the trigger times
channels (list, optional) – list of channels to include in the segment (default all channels)

Raises:

ValueError – if the dataset cannot be found in the file

Returns:

tuple containing:: segment (nt, nch): data segment from the given preprocessed file

samplerate (float): sampling rate of the returned data

Return type:

tuple
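A short sketch of how the trial parameters relate to a time window, assuming the segment spans from trigger_time - time_before to trigger_time + time_after. `trial_window` is a hypothetical helper used only for illustration.

```python
def trial_window(trigger_time, time_before, time_after):
    """Hypothetical helper: start and end times (s) of a trial around a trigger."""
    return trigger_time - time_before, trigger_time + time_after

samplerate = 1000.0
start_time, end_time = trial_window(trigger_time=12.0, time_before=0.5, time_after=1.0)
# expected number of samples in the window at this sampling rate
n_samples = round((end_time - start_time) * samplerate)
print(start_time, end_time, n_samples)
```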

aopy.data.base.load_matlab_cell_strings(data_dir, hdf_filename, object_name)[source]

This function extracts strings from an object within a .mat file that was saved from MATLAB in version 7.3 format (-v7.3).

example:

>>> testfile = 'matlab_cell_str.mat'
>>> strings = load_matlab_cell_strings(data_dir, testfile, 'bmiSessions')
>>> print(strings)
['jeev070412j', 'jeev070512g', 'jeev070612d', 'jeev070712e', 'jeev070812d']

Parameters:

data_dir (str) – where the matlab file is located
hdf_filename (str) – .mat filename
object_name (str) – Name of object to load. This is typically the variable name saved from matlab

Returns:

List of strings in the hdf file object

Return type:

(list of strings)

aopy.data.base.load_preproc_analog_data(preproc_dir, subject, te_id, date, cached=True)[source]

Loads analog data from a preprocessed file.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

analog data dict: dictionary of analog metadata

Return type:

dict

aopy.data.base.load_preproc_ap_data(preproc_dir, subject, te_id, date, drive_number=None, cached=True)[source]

Loads spike band time series from a preprocessed file. When drive_number is None, the AP data and metadata are loaded directly. Specify drive_number when the hdf file contains drives.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
drive_number (int) – drive number for multiple recordings. 1-based indexing.
cached (bool, optional) – whether to allow loading cached version of data (default True)

Raises:

ValueError – if drives are detected when drive number is None.

Returns:

numpy array of ap data from hdf dict: Dictionary of ap metadata

Return type:

ndarray

aopy.data.base.load_preproc_broadband_data(preproc_dir, subject, te_id, date, cached=True)[source]

Loads broadband data from a preprocessed file.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

broadband data dict: Dictionary of broadband metadata

Return type:

dict

aopy.data.base.load_preproc_digital_data(preproc_dir, subject, te_id, date, cached=True)[source]

Loads digital data from a preprocessed file.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

digital data dict: dictionary of digital metadata

Return type:

dict

aopy.data.base.load_preproc_emg_data(preproc_dir, subject, te_id, date, cached=True)[source]

Loads emg data from a preprocessed file.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

Dictionary of emg data dict: Dictionary of emg metadata

Return type:

dict

aopy.data.base.load_preproc_exp_data(preproc_dir, subject, te_id, date, verbose=True, cached=True)[source]

Loads experiment data from a preprocessed file.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
verbose (bool, optional) – check for preprocessing errors and print them (default True)
cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

Dictionary of exp data dict: Dictionary of exp metadata

Return type:

dict

aopy.data.base.load_preproc_eye_data(preproc_dir, subject, te_id, date, cached=True)[source]

Loads eye data from a preprocessed file.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

Dictionary of eye data dict: Dictionary of eye metadata

Return type:

dict

aopy.data.base.load_preproc_lfp_data(preproc_dir, subject, te_id, date, drive_number=None, cached=True)[source]

Loads LFP data from a preprocessed file. When drive_number is None, lfp_data and lfp_metadata are loaded directly. Specify drive_number when the hdf file contains drives.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
drive_number (int) – drive number for multiple recordings. 1-based indexing.
cached (bool, optional) – whether to allow loading cached version of data (default True)

Raises:

ValueError – if drives are detected when drive number is None.

Returns:

numpy array of lfp data from hdf dict: Dictionary of lfp metadata

Return type:

ndarray

aopy.data.base.load_preproc_spike_data(preproc_dir, subject, te_id, date, drive_number=1, cached=True)[source]

Loads spike data from a preprocessed file.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
drive_number (int) – drive number for multiple recordings. 1-based indexing.
cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

spike data dict: Dictionary of spike metadata

Return type:

dict

aopy.data.base.load_spike_waveforms(preproc_dir, subject, te_id, date, drive_number=1, cached=True)[source]

Loads spike waveforms from a preprocessed file.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
drive_number (int) – drive number for multiple recordings. 1-based indexing.
cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

spike waveforms

Return type:

dict

aopy.data.base.load_yaml_config(filename)[source]

Load a yaml configuration file into a dictionary

Parameters:: config_file (str) – path to the yaml configuration file
Returns:: dictionary containing the configuration parameters
Return type:: dict

aopy.data.base.lookup_acq2elec(data_dir, signal_path_file, acq, zero_index=True)[source]

Looks up the electrode number for a given acquisition channel using an excel map file (from Dr. Map)

Parameters:

data_dir (str) – where the signal path file is located
signal_path_file (str) – signal path definition file
acq (int) – which channel to look up
zero_index (bool, optional) – use 0-indexing for acq and elec (default True)

Returns:

matching electrode number. If no matching electrode is found, returns -1 (or 0 with zero_index=False)

Return type:

int
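The return convention (-1 when nothing matches, or 0 with zero_index=False) is consistent with a 1-indexed table in which 0 marks "not found". The sketch below illustrates that reading on a list of rows; the column names, the 1-indexed table, and the helper `sketch_lookup` are all assumptions, not the library's implementation.

```python
def sketch_lookup(rows, from_column, to_column, lookup_value, zero_index=True):
    """Hypothetical lookup over rows of a 1-indexed signal path table."""
    # shift a 0-indexed query to the table's assumed 1-indexed numbering
    key = lookup_value + 1 if zero_index else lookup_value
    match = 0  # assumed sentinel: 0 means "not found" in 1-indexed terms
    for row in rows:
        if row[from_column] == key:
            match = row[to_column]
            break
    # shift the result back to 0-indexing, which turns the sentinel into -1
    return match - 1 if zero_index else match

signal_path = [
    {"acq": 1, "electrode": 12},
    {"acq": 2, "electrode": 7},
]
found = sketch_lookup(signal_path, "acq", "electrode", 0)
missing = sketch_lookup(signal_path, "acq", "electrode", 5)
missing_one_indexed = sketch_lookup(signal_path, "acq", "electrode", 5, zero_index=False)
print(found, missing, missing_one_indexed)
```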

aopy.data.base.lookup_elec2acq(data_dir, signal_path_file, elec, zero_index=True)[source]

Looks up the acquisition channel for a given electrode number using an excel map file (from Dr. Map)

Parameters:

data_dir (str) – where the signal path file is located
signal_path_file (str) – signal path definition file
elec (int) – which electrode to look up
zero_index (bool, optional) – use 0-indexing for acq and elec (default True)

Returns:

matching acquisition channel. If no matching channel is found, returns -1 (or 0 with zero_index=False)

Return type:

int

aopy.data.base.lookup_excel_value(data_dir, excel_file, from_column, to_column, lookup_value)[source]

Finds a matching value for the given key in an excel file. Used for looking up electrode and acquisition channels for signal path files, but can also be useful as a lookup table for other numeric mappings.

Parameters:

data_dir (str) – where the signal path file is located
excel_file (str) – excel file containing the lookup table
from_column (str) – the name of the column to search (e.g. the electrode column)
to_column (str) – the name of the column to return values from (e.g. the acquisition column)
lookup_value (int) – match this value in the from_column

Returns:

the corresponding value in the lookup table, or 0 if none is found

Return type:

int

aopy.data.base.map_acq2elec(signalpath_table, acq_ch_subset=None)[source]

Create index mapping from acquisition channel to electrode number. Excel files can be loaded as a pandas dataframe using pd.read_excel

Parameters:

signalpath_table (pd dataframe) – Signal path information in a pandas dataframe. (Mapping between electrode and acquisition ch)
acq_ch_subset (nacq) – Subset of acquisition channels to call. If not specified, all acquisition channels and connected electrodes will be returned. If a requested acquisition channel isn’t returned, a warning will be displayed.

Returns:

Tuple containing:: acq_chs (nelec): Acquisition channels that map to electrodes (e.g. 240/256 for viventi ECoG array)

connected_elecs (nelec): Electrodes used (e.g. 240/244 for viventi ECoG array)

Return type:

tuple

aopy.data.base.map_acq2pos(signalpath_table, eleclayout_table, acq_ch_subset=None, theta=0, rotation_offset=(0, 0), xpos_name='topdown_x', ypos_name='topdown_y')[source]

Create index mapping from acquisition channel to electrode position by calling aopy.data.map_acq2elec. Excel files can be loaded as a pandas dataframe using pd.read_excel.

Parameters:

signalpath_table (pd dataframe) – Signal path information in a pandas dataframe. (Mapping between electrode and acquisition ch)
eleclayout_table (pd dataframe) – Electrode position information in a pandas dataframe. (Mapping between electrode and position on array)
acq_ch_subset (nacq) – Subset of acquisition channels to call. If not specified, all acquisition channels and connected electrodes will be returned. If a requested acquisition channel isn’t returned, a warning will be displayed.
theta (float) – rotation (in degrees) to apply to positions. rotations are applied clockwise, e.g., theta = 90 rotates the map clockwise by 90 degrees, -90 rotates the map anti-clockwise by 90 degrees. Default 0.
rotation_offset (tuple) – X and Y coordinates of the rotation center. Defaults to (0,0)
xpos_name (str) – Column name for the electrode ‘x’ position. Defaults to ‘topdown_x’ used with the viventi ECoG array
ypos_name (str) – Column name for the electrode ‘y’ position. Defaults to ‘topdown_y’ used with the viventi ECoG array

Returns:

Tuple Containing:: acq_ch_position (nelec, 2): X and Y coordinates of the electrode each acquisition channel gets data from. X position is in the first column and Y position is in the second column

acq_chs (nelec): Acquisition channels that map to electrodes (e.g. 240/256 for viventi ECoG array)

connected_elecs (nelec): Electrodes used (e.g. 240/244 for viventi ECoG array)

Return type:

tuple

aopy.data.base.map_data2elec(datain, signalpath_table, acq_ch_subset=None, zero_indexing=False)[source]

Map data from its acquisition channel to the electrodes recorded from. Wrapper for aopy.data.map_acq2elec. Excel files can be loaded as a pandas dataframe using pd.read_excel.

Parameters:

datain (nt, nacqch) – Data recorded from an array.
signalpath_table (pd dataframe) – Signal path information in a pandas dataframe. (Mapping between electrode and acquisition ch)
acq_ch_subset (nacq) – Subset of acquisition channels to call. If not specified, all acquisition channels and connected electrodes will be returned. If a requested acquisition channel isn’t returned, a warning will be displayed.
zero_indexing (bool) – Set true if acquisition channel numbers start with 0. Defaults to False.

Returns:

Tuple containing:: dataout (nt, nelec): Data from the connected electrodes

acq_chs (nelec): Acquisition channels that map to electrodes (e.g. 240/256 for viventi ECoG array)

connected_elecs (nelec): Electrodes used (e.g. 240/244 for viventi ECoG array)

Return type:

tuple

aopy.data.base.map_data2elecandpos(datain, signalpath_table, eleclayout_table, acq_ch_subset=None, theta=0, rotation_offset=(0, 0), xpos_name='topdown_x', ypos_name='topdown_y', zero_indexing=False)[source]

Map data from its acquisition channel to the electrodes recorded from and their position. Wrapper for aopy.data.map_acq2pos. Excel files can be loaded as a pandas dataframe using pd.read_excel.

Parameters:

datain (nt, nacqch) – Data recorded from an array.
signalpath_table (pd dataframe) – Signal path information in a pandas dataframe. (Mapping between electrode and acquisition ch)
eleclayout_table (pd dataframe) – Electrode position information in a pandas dataframe. (Mapping between electrode and position on array)
acq_ch_subset (nacq) – Subset of acquisition channels to call. If not specified, all acquisition channels and connected electrodes will be returned. If a requested acquisition channel isn’t returned, a warning will be displayed.
theta (float) – rotation (in degrees) to apply to positions. rotations are applied clockwise, e.g., theta = 90 rotates the map clockwise by 90 degrees, -90 rotates the map anti-clockwise by 90 degrees. Default 0.
rotation_offset (tuple) – X and Y coordinates of the rotation center. Defaults to (0,0)
xpos_name (str) – Column name for the electrode ‘x’ position. Defaults to ‘topdown_x’ used with the viventi ECoG array
ypos_name (str) – Column name for the electrode ‘y’ position. Defaults to ‘topdown_y’ used with the viventi ECoG array
zero_indexing (bool) – Set true if acquisition channel numbers start with 0. Defaults to False.

Returns:

Tuple containing:: dataout (nt, nelec): Data from the connected electrodes

acq_ch_position (nelec, 2): X and Y coordinates of the electrode each acquisition channel gets data from. X position is in the first column and Y position is in the second column

acq_chs (nelec): Acquisition channels that map to electrodes (e.g. 240/256 for viventi ECoG array)

connected_elecs (nelec): Electrodes used (e.g. 240/244 for viventi ECoG array)

Return type:

tuple

aopy.data.base.map_elec2acq(signalpath_table, elecs)[source]

This function finds the acquisition channels that correspond to the input electrode numbers given the signal path table input. This function works by calling aopy.data.map_acq2elec and subsampling the output. If a requested electrode isn’t connected to an acquisition channel a warning will be displayed alerting the user and the corresponding index in the output array will be a np.nan value.

Parameters:

signalpath_table (pd dataframe) – Signal path information in a pandas dataframe. (Mapping between electrode and acquisition ch)
elecs (nelec) – Electrodes to find the acquisition channels for

Returns:

Acquisition channels that map to electrodes (e.g. nelec/256 for viventi ECoG array)

Return type:

acq_chs
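The behaviour for unconnected electrodes (a warning plus a NaN entry) can be sketched without pandas. The example assumes the acquisition-channel and connected-electrode arrays are already available as parallel lists, as map_acq2elec is documented to return; `sketch_map_elec2acq` and the channel numbers are hypothetical.

```python
import math
import warnings

def sketch_map_elec2acq(acq_chs, connected_elecs, elecs):
    """Hypothetical helper: acquisition channel for each electrode, NaN if unconnected."""
    elec_to_acq = dict(zip(connected_elecs, acq_chs))
    out = []
    for elec in elecs:
        if elec in elec_to_acq:
            out.append(float(elec_to_acq[elec]))
        else:
            # warn and keep a NaN placeholder so the output stays aligned with elecs
            warnings.warn(f"Electrode {elec} is not connected to an acquisition channel")
            out.append(math.nan)
    return out

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = sketch_map_elec2acq(
        acq_chs=[1, 2, 4], connected_elecs=[10, 11, 13], elecs=[11, 12, 13])
print(result, len(caught))
```

Keeping a NaN in place of a missing channel means the output has the same length and order as the requested electrodes, which is why the values are stored as floats.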

aopy.data.base.parse_str_list(strings, str_include=None, str_avoid=None)[source]

This function parses a list of strings to return the strings that include/avoid specific substrings. It was designed to parse dictionary keys.

Parameters:

strings (list of strings) – List of strings
str_include (list of strings) – List of substrings that must be included in a string to keep it
str_avoid (list of strings) – List of substrings that can not be included in a string to keep it

Returns:

List of strings fitting the input conditions

Return type:

(list of strings)

Example::

>>> str_list = ['sig001i_wf', 'sig001i_wf_ts', 'sig002a_wf', 'sig002a_wf_ts',
                'sig002b_wf', 'sig002b_wf_ts', 'sig002i_wf', 'sig002i_wf_ts']
>>> parsed_strings = parse_str_list(str_list, str_include=['sig002', 'wf'], str_avoid=['b_wf', 'i_wf'])
>>> print(parsed_strings)
['sig002a_wf', 'sig002a_wf_ts']
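One plausible reading of the filtering rule is that a string is kept only if it contains every substring in str_include and none of the substrings in str_avoid. The sketch below implements that reading and reproduces the example above; the all/any semantics and the name `sketch_parse_str_list` are assumptions.

```python
def sketch_parse_str_list(strings, str_include=None, str_avoid=None):
    """Hypothetical filter: keep strings with all required and no forbidden substrings."""
    str_include = str_include or []
    str_avoid = str_avoid or []
    return [
        s for s in strings
        if all(sub in s for sub in str_include)
        and not any(sub in s for sub in str_avoid)
    ]

str_list = ['sig001i_wf', 'sig001i_wf_ts', 'sig002a_wf', 'sig002a_wf_ts',
            'sig002b_wf', 'sig002b_wf_ts', 'sig002i_wf', 'sig002i_wf_ts']
parsed = sketch_parse_str_list(
    str_list, str_include=['sig002', 'wf'], str_avoid=['b_wf', 'i_wf'])
print(parsed)
```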

aopy.data.base.pkl_read(file_to_read, read_dir)[source]

Reads data stored in a pickle file.

Parameters:

file_to_read (str) – filename with ‘.pkl’ extension
read_dir (str) – Path to folder where the file is stored

Returns:

data in a format as it is stored

aopy.data.base.pkl_write(file_to_write, values_to_dump, write_dir)[source]

Write data into a pickle file. Note: HDF5 (HDF) files cannot be pickled. Refer to aopy.data.save_hdf() for saving HDF data.

Parameters:

file_to_write (str) – filename with ‘.pkl’ extension
values_to_dump (any) – values to write in a pickle file
write_dir (str) – Path - where do you want to write this file

Returns:

None

examples: pkl_write('meta.pkl', data, '/data_dir')
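A write followed by a read can be sketched with the standard pickle module. The two functions below are hypothetical stand-ins that mirror the documented argument order (filename first, directory last); they are not the aopy implementations.

```python
import os
import pickle
import tempfile

def sketch_pkl_write(file_to_write, values_to_dump, write_dir):
    """Hypothetical stand-in: pickle values into write_dir/file_to_write."""
    with open(os.path.join(write_dir, file_to_write), "wb") as f:
        pickle.dump(values_to_dump, f)

def sketch_pkl_read(file_to_read, read_dir):
    """Hypothetical stand-in: load pickled values from read_dir/file_to_read."""
    with open(os.path.join(read_dir, file_to_read), "rb") as f:
        return pickle.load(f)

data = {"subject": "affi", "te_id": 17269}
with tempfile.TemporaryDirectory() as tmp_dir:
    sketch_pkl_write("meta.pkl", data, tmp_dir)
    restored = sketch_pkl_read("meta.pkl", tmp_dir)
print(restored == data)
```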

aopy.data.base.save_hdf(data_dir, hdf_filename, data_dict, data_group='/', compression=0, append=False, debug=False)[source]

Writes data_dict and params into a hdf file in the data_dir folder

Parameters:

data_dir (str) – destination file directory
hdf_filename (str) – name of the hdf file to be saved
data_dict (dict) – the data to be saved as a hdf file
data_group (str, optional) – where to store the data in the hdf
compression (int, optional) – gzip compression level. 0 indicate no compression. Compression not added to existing datasets. (default: 0)
append (bool, optional) – append an existing hdf file or create a new hdf file

Returns:

None

aopy.data.base.yaml_read(filename)[source]

The FullLoader parameter handles the conversion from YAML scalar values to the Python dictionary format.

Parameters:

filename (str) – Filename including the full path

Returns:

Params data that was dumped into the yaml file

Return type:

dict

Example

>>> params_file = '/test_data/task_codes.yaml'
>>> task_codes = yaml_read(params_file)

aopy.data.base.yaml_write(filename, data)[source]

YAML stands for "YAML Ain't Markup Language" (originally "Yet Another Markup Language"). It can be used to save params or configuration files.

Parameters:

filename (str) – Filename including the full path
data (dict) – Params data to be dumped into a yaml file

Returns:

None

Example

>>> params = [{'CENTER_TARGET_ON': 16, 'CURSOR_ENTER_CENTER_TARGET': 80, 'REWARD': 48, 'DELAY_PENALTY': 66}]
>>> params_file = '/test_data/task_codes.yaml'
>>> yaml_write(params_file, params)

BMI3D 

aopy.data.bmi3d.extract_lfp_features(preproc_dir, subject, te_id, date, decoder, samplerate=None, channels=None, start_time=None, end_time=None, latency=0.02, datatype='lfp', preproc=None, **kwargs)[source]

Extracts features from a BMI3D experiment using data aligned to the timestamps of the experiment. Using this function, you can replicate closely the features that would have been extracted from a real-time BMI3D experiment, even if the experiment did not include a decoder.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
decoder (riglib.bmi.Decoder) – decoder object with binlen and call_rate attributes
samplerate (float, optional) – optionally choose the samplerate of the data in Hz. Default None, uses the sampling rate of the experiment.
channels (int array, optional) – which channel indices to load. If None (the default), uses the channels specified in the decoder.
start_time (float, optional) – time (in seconds) in the recording at which the desired segment starts
end_time (float, optional) – time (in seconds) in the recording at which the desired segment ends
latency (float, optional) – time (in seconds) to include before the trigger times
datatype (str, optional) – choice of ‘lfp’ or ‘broadband’ data to load. Defaults to ‘lfp’. If the sampling rate of the data is different from the decoder, the data will be downsampled by decimation.
preproc (fn, optional) – function mapping (state, fs) data to (state_new, fs_new). For example, a smoothing function.
kwargs – additional keyword arguments to pass to sample_timestamped_data

Returns:

tuple containing:: feats (nt, nfeats): lfp features for the given channels after preprocessing

samplerate (float): the sampling rate of the states after preprocessing

Return type:

tuple

Note

For best accuracy, use ‘broadband’ or other datatype without any filtering. Using filtered ‘lfp’ results in DC shifted features.

Examples

subject = 'affi'
te_id = 17269
date = '2024-05-03'
preproc_dir = data_dir
start_time = 10
end_time = 30

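Extract features using extract_lfp_features() and states using extract_lfp_features() with decode=True:

features_offline, samplerate_offline = extract_lfp_features(
    preproc_dir, subject, te_id, date, decoder,
    start_time=start_time, end_time=end_time)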

Get online extracted features from get_extracted_features() and states from get_decoded_states() for comparison:

features_online, samplerate_online = get_extracted_features(
    preproc_dir, subject, te_id, date, decoder,
    start_time=start_time, end_time=end_time)

Plot the online and offline features:

time_offline = np.arange(len(features_offline))/samplerate_offline + start_time
time_online = np.arange(len(features_online))/samplerate_online + start_time

plt.figure(figsize=(8,3))
plt.plot(time_offline, features_offline[:,1], alpha=0.8, label='offline')
plt.plot(time_online, features_online[:,1], alpha=0.8, label='online')
plt.xlabel('time (s)')
plt.ylabel('power')
plt.legend()
plt.title('readout 1')

aopy.data.bmi3d.filter_lfp_from_broadband(broadband_filepath, result_filepath, drive_number=1, mean_subtract=True, dtype='int16', max_memory_gb=1.0, **filter_kwargs)[source]

Filters local field potential (LFP) data from a given broadband signal file into an hdf file.

Parameters:

broadband_filepath (str) – Path to the input broadband signal file.
result_filepath (str) – Path to save the filtered LFP data.
mean_subtract (bool, optional) – Whether to subtract the mean from the filtered LFP signal. Default is True.
dtype (str, optional) – Data type for the filtered LFP signal. Default is ‘int16’.
max_memory_gb (float, optional) – Maximum memory (in gigabytes) to use for filtering. Default is 1.0 GB.
**filter_kwargs – Additional keyword arguments to customize the filtering process. These arguments will be passed to the filtering function.

Raises:

IOError – If the input broadband file is not found.
MemoryError – If the specified max_memory_gb is insufficient for the filtering process.

Note

This function is used in the proc_lfp() wrapper.

aopy.data.bmi3d.filter_lfp_from_ecube(ecube_filepath, result_filepath, drive_number=1, mean_subtract=True, dtype='int16', max_memory_gb=1.0, **filter_kwargs)[source]

Filters local field potential (LFP) data from an eCube recording file.

Parameters:

ecube_filepath (str) – Path to the input eCube recording file.
result_filepath (str) – Path to save the filtered LFP data.
mean_subtract (bool, optional) – Whether to subtract the mean from the filtered LFP signal. Default is True.
dtype (str, optional) – Data type for the filtered LFP signal. Default is ‘int16’.
max_memory_gb (float, optional) – Maximum memory (in gigabytes) to use for filtering. Default is 1.0 GB.
**filter_kwargs – Additional keyword arguments to customize the filtering process. These arguments will be passed to the filtering function.

Raises:

IOError – If the input eCube recording file is not found.
MemoryError – If the specified max_memory_gb is insufficient for the filtering process.

Note

This function is used in the proc_lfp() wrapper.

aopy.data.bmi3d.get_decoded_states(preproc_dir, subject, te_id, date, decoder, samplerate=None, start_time=None, end_time=None, preproc=None, **kwargs)[source]

Fetches online decoded states from readouts in a BCI experiment. Wrapper around get_task_data.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
decoder (riglib.bmi.Decoder) – decoder object with binlen and call_rate attributes
samplerate (float, optional) – optionally choose the samplerate of the data in Hz. Default None, uses the sampling rate of the experiment.
start_time (float, optional) – start time of the segment to load (in seconds). Default None, which loads from the beginning of the data.
end_time (float, optional) – end time of the segment to load (in seconds). Default None, which loads until the end of the data.
preproc (fn, optional) – function mapping (state, fs) data to (state_new, fs_new). For example, a smoothing function.
kwargs – additional keyword arguments to pass to sample_timestamped_data

Returns:

tuple containing:: state (nt, nstate): decoded states from the given experiment after preprocessing

samplerate (float): the sampling rate of the states after preprocessing

Return type:

tuple
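The preproc argument takes a function mapping (state, fs) to (state_new, fs_new). Below is a minimal sketch of such a function, assuming a moving-average smoother. The name smooth_states and the window_s argument are hypothetical and not part of aopy.

```python
import numpy as np

def smooth_states(state, fs, window_s=0.1):
    """Hypothetical preproc function: moving-average smoothing along time.

    Maps (state, fs) to (state_new, fs_new); the sampling rate is unchanged.
    """
    state = np.asarray(state, dtype=float)
    n = max(1, int(round(window_s * fs)))       # window length in samples
    kernel = np.ones(n) / n
    # Smooth each column (state dimension) independently
    smoothed = np.column_stack(
        [np.convolve(state[:, i], kernel, mode='same') for i in range(state.shape[1])]
    )
    return smoothed, fs

# Synthetic (nt, nstate) data standing in for decoded states
fs = 100.
state = np.ones((50, 2))
state_new, fs_new = smooth_states(state, fs)
```

A function with this shape could then be supplied as preproc=smooth_states.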

aopy.data.bmi3d.get_e3v_video_frame_data(digital_data, sync_channel_idx, trigger_channel_idx, samplerate)[source]

Compute pulse times and duty cycles from e3vision video data frames collected on an ecube digital panel.

Parameters:

digital_data (nt, nch) – array of data read from ecube digital panel
sync_channel_idx (int) – sync channel to read from digital_data. Indicates each video frame.
trigger_channel_idx (int) – trigger channel to read from digital_data. Indicates start/end video triggers.
samplerate (numeric) – data sampling rate (Hz)

Returns:

tuple containing:: pulse_times (np.array): array of floats indicating pulse start times

duty_cycle (np.array): array of floats indicating pulse duty cycle (quotient of pulse width and pulse period)

Return type:

tuple
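The duty cycle described above is the quotient of pulse width and pulse period. Below is a minimal sketch of that computation on a single 0/1 signal, assuming clean edges and counting only complete periods. The helper name is hypothetical and this is not aopy's implementation.

```python
import numpy as np

def pulse_times_and_duty_cycle(signal, samplerate):
    """Hypothetical helper: pulse start times (s) and duty cycles of a 0/1 signal."""
    signal = np.asarray(signal).astype(int)
    edges = np.diff(signal)
    rising = np.flatnonzero(edges == 1) + 1     # sample index where the signal goes high
    falling = np.flatnonzero(edges == -1) + 1   # sample index where the signal goes low
    if len(rising) < 2:
        return np.array([]), np.array([])       # need two rising edges to define a period
    # Keep only complete periods: rising -> falling -> next rising
    n = len(rising) - 1
    falling = falling[falling > rising[0]][:n]
    width = falling - rising[:n]
    period = np.diff(rising)
    return rising[:n] / samplerate, width / period

# Synthetic sync line: high for 2 of every 5 samples
sync = np.tile([0, 1, 1, 0, 0], 4)
pulse_times, duty_cycle = pulse_times_and_duty_cycle(sync, samplerate=1000)
```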

aopy.data.bmi3d.get_ecube_data_sources(data_dir)[source]

Lists the available data sources in a given data directory

Parameters:: data_dir (str) – eCube data directory
Returns:: available sources (AnalogPanel, Headstages, etc.)
Return type:: str array

aopy.data.bmi3d.get_ecube_digital_input_times(path, data_dir, ch)[source]

Computes the times when a digital input turns on or off in ecube. For synchronizing openephys with ecube, use ch=-1.

Parameters:

path (str) – base directory where ecube data is stored
data_dir (str) – folder you want to load
ch (str) – digital channel

Returns:

Tuple containing:: on_times (n_times): times at which sync line turned on

off_times (n_times): times at which sync line turned off

Return type:

tuple
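As an illustration of on and off times, the sketch below finds the samples at which one digital channel changes state. It assumes the digital data has already been unpacked into an (nt, nch) array with one column per channel. The helper name is hypothetical.

```python
import numpy as np

def on_off_times(digital_channels, ch, samplerate):
    """Hypothetical helper: times (s) at which one digital channel turns on or off."""
    line = np.asarray(digital_channels)[:, ch].astype(np.int8)
    change = np.diff(line)
    on_times = (np.flatnonzero(change == 1) + 1) / samplerate
    off_times = (np.flatnonzero(change == -1) + 1) / samplerate
    return on_times, off_times

# Synthetic (nt, nch) digital data: the last channel is high for samples 2-4
data = np.zeros((8, 3), dtype=bool)
data[2:5, -1] = True
on_times, off_times = on_off_times(data, ch=-1, samplerate=1000)
```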

aopy.data.bmi3d.get_extracted_features(preproc_dir, subject, te_id, date, decoder, samplerate=None, start_time=None, end_time=None, datatype='lfp_power', preproc=None, **kwargs)[source]

Fetches online extracted features from readouts of a BCI experiment. Wrapper around get_task_data.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
decoder (riglib.bmi.Decoder) – decoder object with binlen and call_rate attributes
samplerate (float, optional) – optionally choose the samplerate of the data in Hz. Default None, uses the sampling rate of the experiment.
start_time (float, optional) – start time of the segment to load (in seconds). Default None, which loads from the beginning of the data.
end_time (float, optional) – end time of the segment to load (in seconds). Default None, which loads until the end of the data.
datatype (str, optional) – type of features to load. Defaults to ‘lfp_power’.
preproc (fn, optional) – function mapping (state, fs) data to (state_new, fs_new). For example, a smoothing function.
kwargs – additional keyword arguments to pass to sample_timestamped_data

Returns:

tuple containing:: features (nt, nfeats): extracted features from the given experiment after preprocessing

samplerate (float): the sampling rate of the features after preprocessing

Return type:

tuple

aopy.data.bmi3d.get_interp_task_data(exp_data, exp_metadata, datatype='cursor', samplerate=1000, step=1, **kwargs)[source]

Gets interpolated data from preprocessed experiment task cycles to the desired sampling rate. Cursor kinematics are returned in screen coordinates, while user input kinematics are returned either in their original raw coordinate system with datatype=’user_raw’ (e.g. optitrack coordinates), in world coordinates with datatype=’user_world’, or in screen coordinates with datatype=’user_screen’ (similar to cursor kinematics but without any bounding under position control).

Parameters:

exp_data (dict) – A dictionary containing the experiment data.
exp_metadata (dict) – A dictionary containing the experiment metadata.
datatype (str, optional) – The type of kinematic data to interpolate. - ‘cursor’ for cursor kinematics - ‘user_raw’ for raw input coordinates - ‘user_world’ for user input in world coordinates - ‘user_screen’ for user input in screen coordinates - ‘reference’ for reference kinematics - ‘disturbance’ for disturbance kinematics - ‘targets’ for target positions - other datatypes if they exist as exp_data[‘task’][<datatype>]
samplerate (float, optional) – The desired output sampling rate in Hz. Defaults to 1000.
step (int, optional) – task data will be decimated with steps this big. Default 1.
**kwargs – Additional keyword arguments to pass to sample_timestamped_data()

Returns:

data_time: kinematic data interpolated and filtered to the desired sampling rate

Return type:

(ns, …)

Examples

Cursor kinematics in screen coordinates (datatype ‘cursor’)

exp_data, exp_metadata = load_preproc_exp_data(preproc_dir, 'test',  3498, '2021-12-13')
cursor_interp = get_interp_task_data(exp_data, exp_metadata, datatype='cursor', samplerate=100)

plt.figure()
visualization.plot_trajectories([cursor_interp], [-10, 10, -10, 10])

Raw input kinematics (datatype ‘user_raw’, ‘hand’, or ‘manual_input’)

hand_interp = get_interp_task_data(exp_data, exp_metadata, datatype='hand', samplerate=100)
ax = plt.axes(projection='3d')
visualization.plot_trajectories([hand_interp], [-10, 10, -10, 10, -10, 10])

User input kinematics in world coordinates (datatype ‘user_world’)

user_world = get_interp_task_data(exp_data, exp_metadata, datatype='user_world', samplerate=100)
ax = plt.axes(projection='3d')
visualization.plot_trajectories([user_world], [-10, 10, -10, 10, -10, 10])

User input kinematics in screen coordinates (datatype ‘user_screen’)

user_screen = get_interp_task_data(exp_data, exp_metadata, datatype='user_screen', samplerate=100)
ax = plt.axes(projection='3d')
visualization.plot_trajectories([user_screen], [-10, 10, -10, 10, -10, 10])

Target positions (datatype ‘targets’)

targets_interp = get_interp_task_data(exp_data, exp_metadata, datatype='targets', samplerate=100)
time = np.arange(len(targets_interp))/100
plt.plot(time, targets_interp[:,:,0]) # plot just the x coordinate
plt.xlim(10, 20)
plt.xlabel('time (s)')
plt.ylabel('x position (cm)')

_images/get_interp_targets_centerout.png

Cursor and target (datatype ‘reference’) kinematics

exp_data, exp_metadata = load_preproc_exp_data(data_dir, 'test', 8461, '2023-02-25')
cursor_interp = get_interp_task_data(exp_data, exp_metadata, datatype='cursor', samplerate=exp_metadata['fps'])
ref_interp = get_interp_task_data(exp_data, exp_metadata, datatype='reference', samplerate=exp_metadata['fps'])
time = np.arange(exp_metadata['fps']*120)/exp_metadata['fps']
plt.plot(time, cursor_interp[:int(exp_metadata['fps']*120),1], color='blueviolet', label='cursor') # plot just the y coordinate
plt.plot(time, ref_interp[:int(exp_metadata['fps']*120),1], color='darkorange', label='ref')
plt.xlabel('time (s)')
plt.ylabel('y position (cm)'); plt.ylim(-10,10)
plt.legend()

User, reference, and disturbance kinematics

user_interp = get_interp_task_data(exp_data, exp_metadata, datatype='user', samplerate=exp_metadata['fps'])
ref_interp = get_interp_task_data(exp_data, exp_metadata, datatype='reference', samplerate=exp_metadata['fps'])
dis_interp = get_interp_task_data(exp_data, exp_metadata, datatype='disturbance', samplerate=exp_metadata['fps'])
time = np.arange(exp_metadata['fps']*120)/exp_metadata['fps']
plt.plot(time, user_interp[:int(exp_metadata['fps']*120),1], color='darkturquoise', label='user')
plt.plot(time, ref_interp[:int(exp_metadata['fps']*120),1], color='darkorange', label='ref')
plt.plot(time, dis_interp[:int(exp_metadata['fps']*120),1], color='tab:red', linestyle='--', label='dis')
plt.xlabel('time (s)')
plt.ylabel('y position (cm)'); plt.ylim(-10,10)
plt.legend()

Changes::
2023-10-20: Added support for ‘targets’ datatype
2024-01-29: Removed kinematic filtering below 15 Hz. See filter_kinematics().
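To illustrate what resampling timestamped task data to a fixed rate involves, here is a minimal sketch using linear interpolation with np.interp. The helper name is hypothetical, and sample_timestamped_data() may use a different interpolation scheme.

```python
import numpy as np

def interp_to_samplerate(data, timestamps, samplerate):
    """Hypothetical helper: linearly interpolate timestamped (nt, ndim) data onto a uniform grid."""
    data = np.asarray(data, dtype=float)
    timestamps = np.asarray(timestamps, dtype=float)
    n_samples = int(np.floor((timestamps[-1] - timestamps[0]) * samplerate)) + 1
    t_new = timestamps[0] + np.arange(n_samples) / samplerate
    # Interpolate each coordinate independently
    interp = np.column_stack(
        [np.interp(t_new, timestamps, data[:, i]) for i in range(data.shape[1])]
    )
    return interp, t_new

# Three timestamped 2D samples resampled at 4 Hz
timestamps = np.array([0.0, 0.5, 1.0])
cursor = np.array([[0., 0.], [1., 2.], [2., 4.]])
cursor_interp, t_new = interp_to_samplerate(cursor, timestamps, samplerate=4)
```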

aopy.data.bmi3d.get_kinematic_segments(preproc_dir, subject, te_id, date, trial_start_codes, trial_end_codes, trial_filter=<function <lambda>>, datatype='cursor', deriv=0, norm=False, samplerate=1000, **kwargs)[source]

Loads x,y,z cursor, hand, or eye trajectories for each “trial” from a preprocessed HDF file. Trials can be specified by numeric start and end codes. Trials can also be filtered so that only successful trials are included, for example. The filter is applied to numeric code segments for each trial. Finally, the cursor data can be preprocessed by a supplied function to, for example, convert position to velocity estimates. The preprocessing function is applied to the (time, position) cursor or eye data.

See also

get_kinematic_segment(), get_kinematics()

Example

subject = 'beignet'
te_id = 4301
date = '2021-01-01'
trial_filter = lambda t: TRIAL_END not in t
trajectories, segments = get_kinematic_segments(preproc_dir, subject, te_id, date,
    [CURSOR_ENTER_CENTER_TARGET], [REWARD, TRIAL_END], trial_filter=trial_filter)

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
trial_start_codes (list) – list of numeric codes representing the start of a trial
trial_end_codes (list) – list of numeric codes representing the end of a trial
trial_filter (fn, optional) – function mapping trial segments to boolean values. Any trials for which the filter returns False will not be included in the output
datatype (str, optional) – type of kinematics to load. Defaults to ‘cursor’.
deriv (int, optional) – order of the derivative to compute. Default 0, no derivative.
norm (bool, optional) – if the output segments should be vector normalized at each timepoint. Default False.
samplerate (float, optional) – optionally choose the samplerate of the data in Hz. Default 1000.
kwargs – additional keyword arguments to pass to get_kinematics

Returns:

tuple containing:: trajectories (ntrial): array of filtered cursor trajectories for each trial

trial_segments (ntrial): array of numeric code segments for each trial

Return type:

tuple

Note

The sampling rate of the returned data might be different from the requested sampling rate if the preprocessing function does any modification to the length of the data.

Modified September 2023 to include optional sampling rate argument.
Modified July 2025 to include optional deriv and norm arguments.
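The interaction between start codes, end codes, and trial_filter can be sketched on a plain list of event codes. The helper and the numeric code values below are made up for illustration, and the real segmentation logic in aopy may differ in detail.

```python
def find_segments(codes, times, start_codes, end_codes, trial_filter=lambda t: True):
    """Hypothetical helper: split an event stream into trials.

    A trial starts at a code in start_codes and ends at the next code in end_codes.
    Segments for which trial_filter returns False are dropped.
    """
    segments, segment_times = [], []
    start_idx = None
    for i, code in enumerate(codes):
        if code in start_codes:
            start_idx = i                       # (re)start at the most recent start code
        elif code in end_codes and start_idx is not None:
            segment = list(codes[start_idx:i + 1])
            if trial_filter(segment):
                segments.append(segment)
                segment_times.append((times[start_idx], times[i]))
            start_idx = None
    return segments, segment_times

# Made-up numeric codes, for illustration only
START, REWARD, TRIAL_END = 80, 48, 239
codes = [START, 17, REWARD, START, 17, TRIAL_END]
times = [0.0, 0.5, 1.0, 2.0, 2.5, 3.0]
trial_filter = lambda t: TRIAL_END not in t     # drop segments that contain TRIAL_END
segments, segment_times = find_segments(codes, times, [START], [REWARD, TRIAL_END], trial_filter)
```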

aopy.data.bmi3d.get_kinematics(preproc_dir, subject, te_id, date, samplerate, datatype='cursor', deriv=0, norm=False, filter_kinematics=False, **kwargs)[source]

Returns kinematics of the requested datatype from preprocessed data. Caches the data for faster loading.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
samplerate (float) – the desired samplerate of the data in Hz.
datatype (str, optional) – type of kinematics to load. Defaults to ‘cursor’.
deriv (int, optional) – order of the derivative to compute. Default 0, no derivative.
norm (bool, optional) – if the output segments should be vector normalized at each timepoint. Default False.
filter_kinematics (bool, optional) – if True, the kinematics will be filtered. Default False.
kwargs – additional keyword arguments to pass to get_interp_task_data

Raises:

ValueError – if the datatype is invalid

Returns:

tuple containing:: kinematics (nt, nch): kinematics from the given experiment after preprocessing

samplerate (float): the sampling rate of the kinematics after preprocessing

Return type:

tuple
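The deriv and norm options can be illustrated with a short numpy sketch. It assumes that norm reduces each timepoint to the vector magnitude and that derivatives are estimated by finite differences. The helper name is hypothetical and this is not the aopy implementation.

```python
import numpy as np

def derivative_and_norm(position, samplerate, deriv=1, norm=False):
    """Hypothetical helper: time derivative of (nt, ndim) kinematics, optionally reduced to its magnitude."""
    kin = np.asarray(position, dtype=float)
    for _ in range(deriv):
        kin = np.gradient(kin, 1.0 / samplerate, axis=0)   # finite differences along time
    if norm:
        kin = np.linalg.norm(kin, axis=1)                  # magnitude at each timepoint
    return kin

# Straight-line motion at 3 cm/s in x and 4 cm/s in y gives a speed of 5 cm/s
samplerate = 100
t = np.arange(50) / samplerate
position = np.column_stack([3 * t, 4 * t])
speed = derivative_and_norm(position, samplerate, deriv=1, norm=True)
```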

aopy.data.bmi3d.get_lfp_aligned(preproc_dir, subject, te_id, date, trial_start_codes, trial_end_codes, time_before, time_after, drive_number=None, trial_filter=<function <lambda>>)[source]

Loads lfp data (same length for each trial) from a preprocessed HDF file. Trials can be specified by numeric start and end codes. Trials can also be filtered so that only successful trials are included, for example. The filter is applied to numeric code segments for each trial.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
trial_start_codes (list) – list of numeric codes representing the start of a trial
trial_end_codes (list) – list of numeric codes representing the end of a trial
time_before (float) – time before the trial start to include in the aligned lfp (in seconds)
time_after (float) – time after the trial end to include in the aligned lfp (in seconds)
trial_filter (fn, optional) – function mapping trial segments to boolean values. Any trials for which the filter returns False will not be included in the output

Returns:

aligned lfp data output from aopy.preproc.trial_align_data()

Return type:

(ntrials, nt, nch)
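The (ntrials, nt, nch) output shape can be illustrated by cutting fixed-length windows around trigger times. This is a sketch under the assumption that every window is aligned to a single trigger time per trial. The helper name is hypothetical and it does not reproduce trial_align_data.

```python
import numpy as np

def align_to_triggers(data, trigger_times, time_before, time_after, samplerate):
    """Hypothetical helper: cut fixed-length windows of (nt, nch) data around trigger times."""
    data = np.asarray(data)
    n_before = int(round(time_before * samplerate))
    n_after = int(round(time_after * samplerate))
    aligned = np.full((len(trigger_times), n_before + n_after, data.shape[1]), np.nan)
    for i, t in enumerate(trigger_times):
        start = int(round(t * samplerate)) - n_before
        stop = start + n_before + n_after
        if start >= 0 and stop <= len(data):    # windows running off the data stay NaN
            aligned[i] = data[start:stop]
    return aligned

# Synthetic (nt, nch) data whose value equals the sample index
samplerate = 1000
lfp = np.arange(2000, dtype=float).reshape(-1, 1) * np.ones((1, 4))
aligned = align_to_triggers(lfp, [0.5, 1.2], time_before=0.1, time_after=0.3, samplerate=samplerate)
```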

aopy.data.bmi3d.get_lfp_segments(preproc_dir, subject, te_id, date, trial_start_codes, trial_end_codes, drive_number=None, trial_filter=<function <lambda>>)[source]

Loads lfp segments (different length for each trial) from a preprocessed HDF file. Trials can be specified by numeric start and end codes. Trials can also be filtered so that only successful trials are included, for example. The filter is applied to numeric code segments for each trial.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
trial_start_codes (list) – list of numeric codes representing the start of a trial
trial_end_codes (list) – list of numeric codes representing the end of a trial
trial_filter (fn, optional) – function mapping trial segments to boolean values. Any trials for which the filter returns False will not be included in the output

Returns:

tuple containing:: lfp_segments (ntrial): array of filtered lfp segments for each trial

trial_segments (ntrial): array of numeric code segments for each trial

Return type:

tuple

aopy.data.bmi3d.get_source_files(preproc_dir, subject, te_id, date)[source]

Retrieves the dictionary of source files from a preprocessed file

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording

Returns:

tuple containing:: files (dict): dictionary of (source, filepath) files that are associated with the given experiment

data_dir (str): directory where the source files were located

Return type:

tuple

aopy.data.bmi3d.get_spike_data_aligned(preproc_dir, subject, te_id, date, trigger_times, time_before, time_after, drive=1, bin_width=0.01)[source]

Loads spike data for a given subject and experiment, then aligns binned spikes to trigger times.

Parameters:

preproc_dir (str) – Path to the preprocessed data directory.
subject (str) – Subject name.
te_id (str) – Task entry number.
date (str) – The date of the experiment.
trigger_times (numpy.ndarray) – 1D Array of trigger times (in seconds) for each trial to which spike data should be aligned.
time_before (float) – The amount of time (in seconds) before each trigger time to include in the aligned spike data.
time_after (float) – The amount of time (in seconds) after each trigger time to include in the aligned spike data.
drive (int) – The drive number corresponding to the spike data.
bin_width (float, optional) – The width of the bins [s]. Default is 0.01 s (10 ms).

Returns:

A tuple containing:

spike_aligned (numpy.ndarray): A 3D array of aligned spike data with shape (ntime, nunits, ntrials), where:
- ntime is the number of time bins between time_before and time_after around each trigger.
- nunits is the number of units.
- ntrials is the number of trials (trigger events).
unit_labels (list of str): A list of unit labels corresponding to the ‘nunits’ dimension in the aligned spike data.
bins (numpy.ndarray): The time bin centers relative to the trigger times.

Return type:

tuple
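The (ntime, nunits, ntrials) layout can be sketched with np.histogram. The example assumes spike times are stored as a dictionary mapping unit label to spike times in seconds. The helper name is hypothetical and the binning conventions in aopy may differ.

```python
import numpy as np

def bin_spikes_aligned(spike_times, trigger_times, time_before, time_after, bin_width=0.01):
    """Hypothetical helper: spike counts per bin around each trigger.

    Returns (ntime, nunits, ntrials) counts, unit labels, and bin centers relative to the trigger.
    """
    n_bins = int(round((time_before + time_after) / bin_width))
    edges = -time_before + np.arange(n_bins + 1) * bin_width
    unit_labels = list(spike_times.keys())
    aligned = np.zeros((n_bins, len(unit_labels), len(trigger_times)), dtype=int)
    for u, label in enumerate(unit_labels):
        for k, trig in enumerate(trigger_times):
            # Express spike times relative to this trigger, then count per bin
            counts, _ = np.histogram(np.asarray(spike_times[label]) - trig, bins=edges)
            aligned[:, u, k] = counts
    bins = edges[:-1] + bin_width / 2
    return aligned, unit_labels, bins

# Two synthetic units and two triggers
spikes = {'0': np.array([0.95, 1.05, 2.05]), '1': np.array([1.15])}
aligned, unit_labels, bins = bin_spikes_aligned(spikes, [1.0, 2.0], 0.1, 0.2, bin_width=0.1)
```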

aopy.data.bmi3d.get_spike_data_segment(preproc_dir, subject, te_id, date, start_time, end_time, drive=1, bin_width=0.01)[source]

Loads and extracts a segment of spiking data for a given subject and experiment, optionally binning the spike times.

Parameters:

preproc_dir (str) – Path to the preprocessed data directory.
subject (str) – Subject name.
te_id (str) – Task entry number.
date (str) – The date of the experiment.
start_time (float) – The start time [s] of the segment to extract.
end_time (float) – The end time [s] of the segment to extract.
drive (int, optional) – Which drive (port) to load data from.
bin_width (float, optional) – The width of the bins [s]. Default is 0.01 s (10 ms). If set to None, no binning is applied and spike times are returned.

Returns:

A tuple containing:

spike_segment (dict): A dictionary where keys are unit labels and values are arrays of spike times (or binned spike counts) for that unit.
bins (numpy.ndarray or None): An array of bin edges if binning was applied, otherwise None.

Return type:

tuple

aopy.data.bmi3d.get_target_locations(preproc_dir, subject, te_id, date, target_indices)[source]

Loads the x,y,z location of targets in a preprocessed HDF file given by their index. Requires that the preprocessed exp_data includes a trials structured array containing index and target fields (the default behavior of aopy.preproc.proc_exp()).

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
target_indices (ntarg) – a list of which targets to fetch

Returns:

(ntarg x 3) array of coordinates of the given targets

Return type:

ndarray
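As an illustration of the required trials structure, the sketch below builds a small structured array with index and target fields and looks up the location for each requested index. The helper name and the synthetic values are assumptions, not aopy code.

```python
import numpy as np

# Synthetic trials table with the two fields the function relies on
trials = np.array(
    [(0, [0., 0., 0.]), (1, [5., 0., 0.]), (0, [0., 0., 0.]), (2, [0., 0., 5.])],
    dtype=[('index', 'i4'), ('target', 'f8', (3,))],
)

def lookup_target_locations(trials, target_indices):
    """Hypothetical helper: first recorded x,y,z location for each target index."""
    locations = []
    for idx in target_indices:
        match = trials['target'][trials['index'] == idx]
        locations.append(match[0])
    return np.array(locations)

locs = lookup_target_locations(trials, [1, 2])
```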

aopy.data.bmi3d.get_task_data(preproc_dir, subject, te_id, date, datatype, samplerate=None, step=1, preproc=None, **kwargs)[source]

Return interpolated task data. Wraps get_interp_task_data() but caches the data for faster loading.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
datatype (str) – column of task data to load.
samplerate (float) – choose the samplerate of the data in Hz. Default None, which uses the sampling rate of the experiment.
step (int, optional) – task data will be decimated with steps this big. Default 1.
preproc (fn, optional) – function mapping (position, fs) data to (kinematics, fs_new). For example, a smoothing function or an estimate of velocity from position
kwargs – additional keyword arguments to pass to get_interp_task_data

Raises:

ValueError – if the datatype is invalid

Returns:

tuple containing:: kinematics (nt, nch): kinematics from the given experiment after preprocessing

samplerate (float): the sampling rate of the kinematics after preprocessing

Return type:

tuple

Examples

subject = 'beignet'
te_id = 4301
date = '2021-01-01'
ts_data, samplerate = get_task_data(preproc_dir, subject, te_id, date, 'cycle')
time = np.arange(len(ts_data))/samplerate
plt.figure()
plt.plot(time[1:], 1/np.diff(ts_data), 'ko')
plt.xlabel('time (s)')
plt.ylabel('cycle step')
plt.ylim(0, 2)

aopy.data.bmi3d.get_trajectory_frequencies(preproc_dir, subject, te_id, date)[source]

For continuous tracking tasks, get the set of frequencies (in Hz) used to generate the trajectories that were presented on each trial of the experiment, using get_ref_dis_frequencies().

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording

Returns:

Tuple containing:: freq_r (list of arrays): (ntrial) list of (nfreq,) frequencies used to generate reference trajectory

freq_d (list of arrays): (ntrial) list of (nfreq,) frequencies used to generate disturbance trajectory

Return type:

tuple

aopy.data.bmi3d.get_ts_data_segment(preproc_dir, subject, te_id, date, start_time, end_time, drive_number=None, channels=None, datatype='lfp')[source]

Simple wrapper around load_hdf_ts_segment for lfp or broadband data.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
start_time (float) – time (in seconds) in the recording at which the desired segment starts
end_time (float) – time (in seconds) in the recording at which the desired segment ends
channels (int array, optional) – which channel indices to load
datatype (str, optional) – choice of ‘lfp’ or ‘broadband’ data to load. Defaults to ‘lfp’.

Returns:

tuple containing:: segment (nt, nch): data segment from the given preprocessed file

samplerate (float): sampling rate of the returned data

Return type:

tuple

aopy.data.bmi3d.get_ts_data_trial(preproc_dir, subject, te_id, date, trigger_time, time_before, time_after, drive_number=None, channels=None, datatype='lfp')[source]

Simple wrapper around load_hdf_ts_trial for lfp or broadband data.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
te_id (int) – Block number of Task entry object
date (str) – Date of recording
trigger_time (float) – time (in seconds) in the recording at which the desired segment starts
time_before (float) – time (in seconds) to include before the trigger times
time_after (float) – time (in seconds) to include after the trigger times
channels (int array, optional) – which channel indices to load
datatype (str, optional) – choice of ‘lfp’ or ‘broadband’ data to load. Defaults to ‘lfp’.

Returns:

tuple containing:: segment (nt, nch): data segment from the given preprocessed file

samplerate (float): sampling rate of the returned data

Return type:

tuple

aopy.data.bmi3d.get_velocity_segments(*args, norm=True, **kwargs)[source]

Estimates velocity from cursor position, then finds the trial segments for velocity using get_kinematic_segments().

Parameters:

*args – arguments for get_kinematic_segments()
norm (bool) – if the output segments should be normalized. Set to False to output component velocities.
**kwargs – parameters for get_kinematic_segments()

Returns:

tuple containing:: velocities (ntrial): array of velocity estimates for each trial

trial_segments (ntrial): array of numeric code segments for each trial

Return type:

tuple

aopy.data.bmi3d.load_bmi3d_hdf_table(data_dir, filename, table_name)[source]

Loads data and metadata from a table in an hdf file generated by BMI3D

Parameters:

data_dir (str) – path to the data
filename (str) – name of the file to load from
table_name (str) – name of the table you want to load

Returns:

Tuple containing:: data (ndarray): data from bmi3d

metadata (dict): attributes associated with the table

Return type:

tuple

aopy.data.bmi3d.load_bmi3d_lasers(filename='lasers.yaml')[source]

Load the config metadata for BMI3D lasers.

Parameters:

filename (str, optional) – filename of the laser config to load. Defaults to ‘lasers.yaml’.

Returns:

list of lasers available in the config. Each laser is a dictionary with keys

name: name of the laser
stimulation_site: name of the metadata key for the stimulation site
trigger: name of the metadata key for the trigger channel
trigger_dch: index of the trigger digital channel
sensor: name of the metadata key for the sensor channel
sensor_ach: index of the sensor analog channel

Return type:

list

aopy.data.bmi3d.load_bmi3d_root_metadata(data_dir, filename)[source]

Loads the root metadata from a BMI3D hdf file. Root metadata is not accessible using pytables, so h5py is used instead.

Parameters:

data_dir (str) – path to the data
filename (str) – name of the file to load from

Returns:

key-value attributes

Return type:

dict

aopy.data.bmi3d.load_bmi3d_task_codes(filename='task_codes.yaml')[source]

Load the default BMI3D task codes. File-specific codes can be found in exp_metadata[‘event_sync_dict’]

Parameters:: filename (str, optional) – filename of the task codes to load. Defaults to ‘task_codes.yaml’.
Returns:: (name, code) task code dictionary
Return type:: dict

aopy.data.bmi3d.load_ecube_analog(path, data_dir, channels=None)[source]

Just a wrapper around load_ecube_data() and load_ecube_metadata()

Parameters:

path (str) – base directory where ecube data is stored
data_dir (str) – folder you want to load
channels (int array, optional) – which channels to load

Returns:

Tuple containing:: data (nt, nch): analog data for the requested channels

metadata (dict): metadata (see load_ecube_metadata() for details)

Return type:

tuple

aopy.data.bmi3d.load_ecube_data(data_dir, data_source, channels=None)[source]

Loads data from eCube for a given directory and datasource

Requires load_ecube_metadata(), process_channels()

Parameters:

data_dir (str) – folder containing the data you want to load
data_source (str) – type of data (“Headstages”, “AnalogPanel”, “DigitalPanel”)
channels (int array or None) – list of channel numbers (0-indexed) to load. If None, will load all channels by default

Returns:

all the data for the given source

Return type:

(nt, nch)

aopy.data.bmi3d.load_ecube_data_chunked(data_dir, data_source, channels=None, chunksize=728)[source]

Loads a data file one “chunk” at a time. Useful for replaying files as if they were online data.

Parameters:

data_dir (str) – folder containing the data you want to load
data_source (str) – type of data (“Headstages”, “AnalogPanel”, “DigitalPanel”)
channels (int array or None) – list of channel numbers (0-indexed) to load. If None, will load all channels by default
chunksize (int) – how many samples to include in each chunk

Yields:

(chunksize, nch) – one chunk of data for the given source
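The chunked access pattern can be sketched as a generator over an in-memory array. The helper name is hypothetical, and whether the real function yields a shorter final chunk is an assumption made here.

```python
import numpy as np

def iter_chunks(data, chunksize=728):
    """Hypothetical helper: yield consecutive (chunksize, nch) blocks of an (nt, nch) array."""
    for start in range(0, len(data), chunksize):
        yield data[start:start + chunksize]     # the final chunk may be shorter

# Replay 2000 samples of synthetic 8-channel data chunk by chunk
data = np.zeros((2000, 8))
chunk_lengths = [len(chunk) for chunk in iter_chunks(data, chunksize=728)]
```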

aopy.data.bmi3d.load_ecube_digital(path, data_dir)[source]

Just a wrapper around load_ecube_data() and load_ecube_metadata()

Parameters:

path (str) – base directory where ecube data is stored
data_dir (str) – folder you want to load

Returns:

Tuple containing:: data (nt): digital data, arranged as 64-bit numbers representing the 64 channels

metadata (dict): metadata (see load_ecube_metadata() for details)

Return type:

tuple
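Because the digital data is packed as 64-bit numbers, individual channels have to be unpacked bit by bit before use. Below is a minimal sketch, assuming bit k of each word corresponds to channel k (least-significant bit first). The helper name is hypothetical.

```python
import numpy as np

def unpack_digital_words(words, n_channels=64):
    """Hypothetical helper: expand (nt,) packed uint64 words into an (nt, n_channels) boolean array."""
    words = np.asarray(words, dtype=np.uint64)
    shifts = np.arange(n_channels, dtype=np.uint64)
    # Bit k of each word becomes column k (least-significant bit first, an assumption)
    return ((words[:, None] >> shifts[None, :]) & np.uint64(1)).astype(bool)

# Three synthetic samples with channels 0 and 2 toggling
words = np.array([0b0001, 0b0100, 0b0101], dtype=np.uint64)
channels = unpack_digital_words(words)
```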

aopy.data.bmi3d.load_ecube_headstages(path, data_dir, channels=None)[source]

Just a wrapper around load_ecube_data() and load_ecube_metadata()

Parameters:

path (str) – base directory where ecube data is stored
data_dir (str) – folder you want to load
channels (int array, optional) – which channels to load

Returns:

Tuple containing:: data (nt, nch): analog data for the requested channels

metadata (dict): metadata (see load_ecube_metadata() for details)

Return type:

tuple

aopy.data.bmi3d.load_ecube_metadata(data_dir, data_source)[source]

Sums the number of channels and samples across all files in the data_dir

Parameters:

data_dir (str) – eCube data directory
data_source (str) – selects the source (AnalogPanel, Headstages, etc.)

Returns:

Dictionary of metadata with fields:: samplerate (float): sampling rate of data for this source

data_source (str): copied from the function argument

n_channels (int): number of channels

n_samples (int): number of samples for one channel

Return type:

dict

aopy.data.bmi3d.load_emg_analog(data_dir, emg_filename)[source]

Loads emg analog data

Parameters:

data_dir (str) – base directory where emg data is stored
emg_filename (str) – hdf file you want to load

Returns:

Tuple containing:: data (nt): analog data

metadata (dict): metadata from the emg file containing samplerate

Return type:

tuple

aopy.data.bmi3d.load_emg_data(data_dir, emg_filename)[source]

Loads emg data

Parameters:

data_dir (str) – base directory where emg data is stored
emg_filename (str) – hdf file you want to load

Returns:

Tuple containing:: data (nt): emg data

metadata (dict): metadata from the emg file containing samplerate

Return type:

tuple

aopy.data.bmi3d.load_emg_digital(data_dir, emg_filename)[source]

Loads and converts emg analog data to 64-bit digital data.

Parameters:

data_dir (str) – base directory where emg data is stored
emg_filename (str) – hdf file you want to load

Returns:

Tuple containing:: data (nt): digital data, arranged as 64-bit numbers

metadata (dict): metadata from the emg file containing samplerate

Return type:

tuple

aopy.data.bmi3d.proc_ecube_data(data_path, data_source, result_filepath, result_name='broadband_data', max_memory_gb=1.0)[source]

Loads and saves eCube data into an HDF file

Requires load_ecube_metadata()

Parameters:

data_path (str) – path to folder containing the ecube data you want to load
data_source (str) – type of data (“Headstages”, “AnalogPanel”, “DigitalPanel”)
result_filepath (str) – path to hdf file to be written (or appended)
result_name (str, optional) – name of the hdf dataset to write. Defaults to ‘broadband_data’.
max_memory_gb (float, optional) – max memory used to load binary data at one time

Returns:

tuple containing:: dset (h5py.Dataset): the new hdf dataset

metadata (dict): the ecube metadata

Return type:

tuple
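
max_memory_gb bounds how much binary data is held in memory at one time. One way such a limit could translate into a chunk size is sketched below; the formula, the 2-byte sample width, and the helper names samples_per_chunk and chunk_bounds are assumptions for illustration, not the actual implementation.

```python
def samples_per_chunk(max_memory_gb, n_channels, bytes_per_sample=2):
    """Number of timepoints (one sample per channel) that fit in the memory budget."""
    budget_bytes = max_memory_gb * 1e9
    bytes_per_timepoint = n_channels * bytes_per_sample
    # Always load at least one timepoint, even for a tiny budget
    return max(1, int(budget_bytes // bytes_per_timepoint))


def chunk_bounds(n_samples, chunk_size):
    """Yield (start, stop) sample indices covering the whole recording."""
    for start in range(0, n_samples, chunk_size):
        yield start, min(start + chunk_size, n_samples)


chunk = samples_per_chunk(max_memory_gb=1.0, n_channels=256)
print(chunk)  # 1953125
print(list(chunk_bounds(10, 4)))  # [(0, 4), (4, 8), (8, 10)]
```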

aopy.data.bmi3d.tabulate_behavior_data(preproc_dir, subjects, ids, dates, start_events, end_events, reward_events, penalty_events, metadata=[], df=None, event_code_type='code', return_bad_entries=False, repeating_start_codes=False)[source]

Concatenate trials from across experiments. Experiments are given as lists of subjects, task entry ids, and dates. Each list must be the same length. Trials are defined by intervals between the given trial start and end codes.

Parameters:

preproc_dir (str) – base directory where the files live
subjects (list of str) – Subject name for each recording
ids (list of int) – Block number of Task entry object for each recording
dates (list of str) – Date for each recording
start_events (list) – list of numeric codes representing the start of a trial
end_events (list) – list of numeric codes representing the end of a trial
reward_events (list) – list of numeric codes representing rewards
penalty_events (list) – list of numeric codes representing penalties
metadata (list, optional) – list of metadata keys that should be included in the df
df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.
event_code_type (str, optional) – type of event codes to use. Defaults to ‘code’. Other choices include ‘event’ and ‘data’.
return_bad_entries (bool, optional) – If True, returns the list of task entries that could not be loaded. Defaults to False.
repeating_start_codes (bool, optional) – whether the start codes might occur multiple times within one segment. If False, the last start code within a segment is always used. Setting this to True may lead to segments spanning multiple trials if used improperly. Defaults to False.

Returns:

pandas DataFrame containing the concatenated trial data with columns:: subject (str): subject name

te_id (str): task entry id

date (str): date of recording

event_codes (ntrial): numeric code segments for each trial (specified by event_code_type)

event_times (ntrial): time segments for each trial

event_idx (ntrial): index segments for each trial

reward (ntrial): boolean values indicating whether each trial was rewarded

penalty (ntrial): boolean values indicating whether each trial was penalized

%metadata_key% (ntrial): requested metadata values for each key requested

Return type:

pd.DataFrame
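
The segmentation described above (trials as intervals between start and end codes, using the last start code within a segment) can be illustrated on a plain event stream. This is a simplified sketch, not aopy's implementation: segment_trials is a hypothetical helper and the numeric codes are made up.

```python
def segment_trials(codes, times, start_codes, end_codes, reward_codes, penalty_codes):
    """Split an event stream into trials bounded by start and end codes."""
    trials = []
    start_idx = None
    for idx, code in enumerate(codes):
        if code in start_codes:
            # A later start code replaces an earlier one, so each segment
            # begins at the last start code seen before the end code.
            start_idx = idx
        elif code in end_codes and start_idx is not None:
            seg_codes = codes[start_idx:idx + 1]
            trials.append({
                "event_codes": seg_codes,
                "event_times": times[start_idx:idx + 1],
                "reward": any(c in reward_codes for c in seg_codes),
                "penalty": any(c in penalty_codes for c in seg_codes),
            })
            start_idx = None
    return trials


# Made-up event stream: 16 = start, 48 = reward, 64 = penalty, 239 = end
codes = [16, 16, 48, 239, 16, 64, 239]
times = [0.0, 0.5, 1.2, 1.5, 2.0, 2.4, 2.6]
trials = segment_trials(codes, times, [16], [239], [48], [64])
for trial in trials:
    print(trial["event_codes"], trial["reward"], trial["penalty"])
# [16, 48, 239] True False
# [16, 64, 239] False True
```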

aopy.data.bmi3d.tabulate_behavior_data_center_out(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for center-out experiments. Makes use of the task codes saved in /config/task_codes.yaml to automatically assign event codes for trial start, trial end, reward, penalty, and targets.

Parameters:

preproc_dir (str) – base directory where the files live
subjects (list of str) – Subject name for each recording
ids (list of int) – Block number of Task entry object for each recording
dates (list of str) – Date for each recording
metadata (list, optional) – list of metadata keys that should be included in the df
df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:: subject (str): subject name

te_id (str): task entry id

date (str): date of recording

event_codes (ntrial): numeric code segments for each trial

event_times (ntrial): time segments for each trial

reward (ntrial): boolean values indicating whether each trial was rewarded

penalty (ntrial): boolean values indicating whether each trial was penalized

%metadata_key% (ntrial): requested metadata values for each key requested

target_idx (ntrial): index of the target that was presented

target_location (ntrial): location of the target that was presented

center_target_on_time (ntrial): time at which the trial started

prev_trial_end_time (ntrial): time at which the previous trial ended

trial_end_time (ntrial): time at which the trial ended

trial_initiated (ntrial): boolean values indicating whether the trial was initiated

hold_start_time (ntrial): time at which the hold period started

hold_completed (ntrial): boolean values indicating whether the hold period was completed

delay_start_time (ntrial): time at which the delay period started

delay_completed (ntrial): boolean values indicating whether the delay period was completed

go_cue_time (ntrial): time at which the go cue was presented

reach_completed (ntrial): boolean values indicating whether the reach was completed

reach_end_time (ntrial): time at which the reach was completed

reward_start_time (ntrial): time at which the reward was presented

penalty_start_time (ntrial): time at which the penalty was presented

penalty_event (ntrial): numeric code for the penalty event

pause_start_time (ntrial): time at which the pause occurred

pause_event (ntrial): numeric code for the pause event

Return type:

pd.DataFrame

Example

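from datetime import date
from aopy.data import db
from aopy.data.bmi3d import tabulate_behavior_data_center_out

# preproc_dir (base directory of the preprocessed files) must be defined beforehand
subject = 'test'
start_date = '2025-08-15'
end_date = '2025-08-16'
entries = db.lookup_mc_sessions(subject=subject, date=(date.fromisoformat(start_date), date.fromisoformat(end_date)), task_desc='center out with random delay')
subjects, te_ids, te_dates = db.list_entry_details(entries)

df = tabulate_behavior_data_center_out(preproc_dir, subjects, te_ids, te_dates)
display(df.head(8))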

_images/tabulate_behavior_data_center_out.png

aopy.data.bmi3d.tabulate_behavior_data_corners(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for corner reaching experiments. Makes use of the task codes saved in /config/task_codes.yaml to automatically assign event codes for trial start, trial end, reward, penalty, and targets.

Parameters:

preproc_dir (str) – base directory where the files live
subjects (list of str) – Subject name for each recording
ids (list of int) – Block number of Task entry object for each recording
dates (list of str) – Date for each recording
metadata (list, optional) – list of metadata keys that should be included in the df
df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:: subject (str): subject name

te_id (str): task entry id

date (str): date of recording

event_codes (ntrial): numeric code segments for each trial

event_times (ntrial): time segments for each trial

event_idx (ntrial): index segments for each trial

reward (ntrial): boolean values indicating whether each trial was rewarded

penalty (ntrial): boolean values indicating whether each trial was penalized

%metadata_key% (ntrial): requested metadata values for each key requested

sequence_params (ntrial): string of params used to generate all trajectories in the same task entry

chain_length (ntrial): number of targets presented in each trial

target_idx (ntrial): list of indices of the targets presented

target_location (ntrial): list of locations of the targets presented

prev_trial_end_time (ntrial): time at which the previous trial ended

trial_end_time (ntrial): time at which the trial ended

first_target_on_time (ntrial): time at which the trial started

trial_initiated (ntrial): boolean values indicating whether the trial was initiated

hold_start_time (ntrial): time at which the hold period started

hold_completed (ntrial): boolean values indicating whether the hold period was completed

delay_start_time (ntrial): time at which the delay period started

delay_completed (ntrial): boolean values indicating whether the delay period was completed

go_cue_time (ntrial): time at which the go cue was presented

reach_completed (ntrial): boolean values indicating whether the reach was completed

reach_end_time (ntrial): time at which the reach was completed

reward_start_time (ntrial): time at which the reward was presented

penalty_start_time (ntrial): time at which the penalty occurred

penalty_event (ntrial): numeric code for the penalty event

pause_start_time (ntrial): time at which the pause occurred

pause_event (ntrial): numeric code for the pause event

Return type:

pd.DataFrame

Example

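from datetime import date
from aopy.data import db
from aopy.data.bmi3d import tabulate_behavior_data_corners

# preproc_dir (base directory of the preprocessed files) must be defined beforehand
subject = 'churro'
start_date = '2025-01-17'
end_date = '2025-01-18'
entries = db.lookup_mc_sessions(subject=subject, date=(date.fromisoformat(start_date), date.fromisoformat(end_date)))
subjects, te_ids, te_dates = db.list_entry_details(entries)

df = tabulate_behavior_data_corners(preproc_dir, subjects, te_ids, te_dates)
display(df.head(8))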

_images/tabulate_behavior_data_corners.png

aopy.data.bmi3d.tabulate_behavior_data_flash(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for flash experiments. Uses the task event names (b’TARGET_ON’, b’REWARD’, and b’TRIAL_END’, specifically) to find start and end times for flash experiments.

Parameters:

preproc_dir (str) – base directory where the files live
subjects (list of str) – Subject name for each recording
ids (list of int) – Block number of Task entry object for each recording
dates (list of str) – Date for each recording
metadata (list, optional) – list of metadata keys that should be included in the df
df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:: subject (str): subject name

te_id (str): task entry id

date (str): date of recording

event_names (ntrial): event name segments for each trial

event_times (ntrial): time segments for each trial

%metadata_key% (ntrial): requested metadata values for each key requested

flash_start_time (ntrial): time the flash started

flash_end_time (ntrial): time the flash ended

Return type:

pd.DataFrame

aopy.data.bmi3d.tabulate_behavior_data_out(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for out experiments (similar to center-out but without a trial-initiating center target). Makes use of the task codes saved in /config/task_codes.yaml to automatically assign event codes for trial start, trial end, reward, penalty, and targets.

Parameters:

preproc_dir (str) – base directory where the files live
subjects (list of str) – Subject name for each recording
ids (list of int) – Block number of Task entry object for each recording
dates (list of str) – Date for each recording
metadata (list, optional) – list of metadata keys that should be included in the df
df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:: subject (str): subject name

te_id (str): task entry id

date (str): date of recording

event_codes (ntrial): numeric code segments for each trial

event_times (ntrial): time segments for each trial

reward (ntrial): boolean values indicating whether each trial was rewarded

penalty (ntrial): boolean values indicating whether each trial was penalized

%metadata_key% (ntrial): requested metadata values for each key requested

target_idx (ntrial): index of the target that was presented

target_location (ntrial): location of the target that was presented

trial_start_time (ntrial): time at which the trial started

trial_end_time (ntrial): time at which the trial ended

reach_completed (ntrial): boolean values indicating whether the reach was completed

reach_end_time (ntrial): time at which the reach was completed

reward_start_time (ntrial): time at which the reward was presented

penalty_start_time (ntrial): time at which the penalty was presented

penalty_event (ntrial): numeric code for the penalty event

pause_start_time (ntrial): time at which the pause occurred

pause_event (ntrial): numeric code for the pause event

Return type:

pd.DataFrame

aopy.data.bmi3d.tabulate_behavior_data_random_targets(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for random target location experiments. Uses the task event names (b’TARGET_ON’ and b’TRIAL_END’, specifically) to find start and end times for experiments.

Parameters:

preproc_dir (str) – base directory where the files live
subjects (list of str) – Subject name for each recording
ids (list of int) – Block number of Task entry object for each recording
dates (list of str) – Date for each recording
metadata (list, optional) – list of metadata keys that should be included in the df
df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:: subject (str): subject name

te_id (str): task entry id

date (str): date of recording

event_names (ntrial): event name segments for each trial

event_times (ntrial): time segments for each trial

%metadata_key% (ntrial): requested metadata values for each key requested

target_idx (ntrial): target index for each trial within a unique data session

target_loc (ntrial): target locations (x,y,z) for each trial

prev_trial_end_time (ntrial): time at which the previous trial ended

trial_end_time (ntrial): time at which the trial ended

target_on (ntrial): time at which the target appeared

reach_completed (ntrial): boolean indicating whether the reach was completed

cursor_enter_target (ntrial): time at which the cursor entered the target

reward_start_time (ntrial): time of reward

penalty_start_time (ntrial): penalty start time (if applicable)

penalty_event (ntrial): event description of penalty

Examples

Visualization of 5 reaches to random targets.

   subjects = ['Leo', 'Leo']
   ids = [1957, 1959]
   dates = ['2025-02-13', '2025-02-13']

   df = tabulate_behavior_data_random_targets(data_dir, subjects, ids, dates, metadata = ['sequence_params'])
   example_reaches = df[-5:] #last 5 reaches in the earlier dataframe
   example_traj = tabulate_kinematic_data(data_dir, example_reaches['subject'], example_reaches['te_id'],
                                   example_reaches['date'], example_reaches['target_on'],
                                   example_reaches['cursor_enter_target'], datatype = 'cursor')
   ex_targets = example_reaches['target_location'].to_numpy()
   bounds = [-5,5,-5,5,-5,5] #equal bounds to make visualization appear as spheres
   default_colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
   colors = default_colors[:len(ex_targets)] #match colors from the trajectories

   fig = plt.figure()
   ax = fig.add_subplot(111, projection = '3d')
   for idx, path in enumerate(example_traj):
       ax.plot(*path.T)
       visualization.plot_sphere(ex_targets[idx], color = colors[idx], radius = 0.5,
                          bounds = bounds, ax = ax)

_images/tabulate_behavior_random_targets.png

Return type:

pd.DataFrame

aopy.data.bmi3d.tabulate_behavior_data_tracking_task(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for tracking task experiments. Makes use of the task codes saved in /config/task_codes.yaml to automatically assign event codes for trial start, trial end, reward, penalty.

Parameters:

preproc_dir (str) – base directory where the files live
subjects (list of str) – Subject name for each recording
ids (list of int) – Block number of Task entry object for each recording
dates (list of str) – Date for each recording
metadata (list, optional) – list of metadata keys that should be included in the df
df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:: subject (str): subject name

te_id (str): task entry id

date (str): date of recording

event_codes (ntrial): numeric code segments for each trial

event_times (ntrial): time segments for each trial

event_idx (ntrial): index segments for each trial

reward (ntrial): boolean values indicating whether each trial was rewarded

penalty (ntrial): boolean values indicating whether each trial was penalized

%metadata_key% (ntrial): requested metadata values for each key requested

sequence_params (ntrial): string of params used to generate all trajectories in the same task entry

ref_freqs (ntrial): array of frequencies used to generate reference trajectory for each trial

dis_freqs (ntrial): array of frequencies used to generate disturbance trajectory for each trial

prev_trial_end_time (ntrial): time at which the previous trial ended

target_on_time (ntrial): time at which the trial started

trial_initiated (ntrial): boolean values indicating whether the trial was initiated (i.e. hold was attempted)

hold_start_time (ntrial): time at which the hold period started

hold_completed (ntrial): boolean values indicating whether the hold period was completed

tracking_start_time (ntrial): time at which the hold period ended and tracking started

trajectory_start_time (ntrial): time at which the ref & dis trajectories started (excluding the ramp up period)

trajectory_end_time (ntrial): time at which the ref & dis trajectories ended (excluding the ramp down period if the trial was rewarded)

tracking_end_time (ntrial): time at which tracking ended (whether with a reward or tracking out penalty)

reward_start_time (ntrial): time at which the reward was presented

penalty_start_time (ntrial): time at which the penalty occurred

penalty_event (ntrial): numeric code for the penalty event

pause_start_time (ntrial): time at which the pause occurred

pause_event (ntrial): numeric code for the pause event

trial_end_time (ntrial): time at which the trial ended

Return type:

pd.DataFrame

Example

_images/tabulate_behavior_data_tracking_task.png

aopy.data.bmi3d.tabulate_feature_data(preproc_dir, subjects, te_ids, dates, start_times, end_times, decoders, datatype='lfp_power', samplerate=None, preproc=None, **kwargs)[source]

Grab (online extracted) decoder feature segments across arbitrary preprocessed files. Wrapper around tabulate_task_data.

Parameters:

preproc_dir (str) – base directory where the files live
subjects (list of str) – Subject name for each recording
te_ids (list of int) – Block number of Task entry object for each recording
dates (list of str) – Date for each recording
datatype (str, optional) – column of task data to load. Default ‘lfp_power’.
samplerate (float, optional) – choose the samplerate of the data in Hz. Default None, which uses the sampling rate of the experiment.
start_times (list of float) – times in the recording at which the desired segments starts
end_times (list of float) – times in the recording at which the desired segments ends
decoders (list of riglib.bmi.Decoder) – decoder object with binlen and call_rate attributes. If only one decoder is supplied, it will be applied to all recordings.
preproc (fn, optional) – function mapping (position, fs) data to (kinematics, fs_new). For example, a smoothing function or an estimate of velocity from position
kwargs – additional keyword arguments to pass to get_interp_task_data

Returns:

tuple containing:: segments (ntrial,): list of tensors of (nt, nfeat) feature data from each trial

samplerate (float): samplerate of the feature data

Return type:

tuple
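
The decoders argument accepts either one decoder per recording or a single decoder that is reused for all recordings. A minimal sketch of that broadcasting rule follows; broadcast_to_recordings is a hypothetical helper, and strings stand in for decoder objects.

```python
def broadcast_to_recordings(items, n_recordings):
    """Return one item per recording, repeating a lone item as needed."""
    if not isinstance(items, (list, tuple)):
        items = [items]
    if len(items) == 1:
        return list(items) * n_recordings
    if len(items) != n_recordings:
        raise ValueError(
            f"expected 1 or {n_recordings} items, got {len(items)}"
        )
    return list(items)


decoders = broadcast_to_recordings("decoder_A", 3)  # strings stand in for decoder objects
print(decoders)  # ['decoder_A', 'decoder_A', 'decoder_A']
```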

aopy.data.bmi3d.tabulate_kinematic_data(preproc_dir, subjects, te_ids, dates, start_times, end_times, samplerate=1000, deriv=0, norm=False, datatype='cursor', filter_kinematics=False, **kwargs)[source]

Grab kinematics data from trials across arbitrary preprocessed files. Before segmenting, optionally filters the data using filter_kinematics() (when filter_kinematics=True; default 15 Hz low-pass) and optionally applies a derivative to the data to get velocity, acceleration, or jerk.

Parameters:

preproc_dir (str) – base directory where the files live
subjects (list of str) – Subject name for each recording
te_ids (list of int) – Block number of Task entry object for each recording
dates (list of str) – Date for each recording
start_times (list of float) – times in the recording at which the desired segments starts
end_times (list of float) – times in the recording at which the desired segments ends
samplerate (float, optional) – optionally choose the samplerate of the data in Hz. Default 1000.
datatype (str, optional) – type of kinematics to tabulate. Defaults to ‘cursor’.
deriv (int, optional) – order of the derivative to compute. Default 0, no derivative.
norm (bool, optional) – if the output segments should be vector normalized at each timepoint. Default False.
filter_kinematics (bool, optional) – if True, filters the kinematics data before segmenting. Default False.
kwargs (dict, optional) – optional keyword arguments to pass to get_kinematic_segment()

Returns:

(ntrial,) list of tensors of (nt, nch) kinematics from each trial

Return type:

list
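
deriv and norm together turn position segments into speed or acceleration traces. The following is a minimal sketch of that idea using first-order finite differences on plain Python lists; it is not the numerical method aopy uses, and the helper names finite_difference and vector_norm are hypothetical.

```python
import math


def finite_difference(segment, samplerate):
    """First derivative of an (nt, nch) segment, returned as (nt - 1, nch)."""
    return [
        [(b - a) * samplerate for a, b in zip(prev, curr)]
        for prev, curr in zip(segment[:-1], segment[1:])
    ]


def vector_norm(segment):
    """Euclidean norm across channels at each timepoint."""
    return [math.sqrt(sum(v * v for v in row)) for row in segment]


# Cursor moving 3 cm in x and 4 cm in y per sample, sampled at 10 Hz
position = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
velocity = finite_difference(position, samplerate=10)
speed = vector_norm(velocity)
print(speed)  # [50.0, 50.0]
```

Applying finite_difference twice would give an acceleration estimate, mirroring deriv=2.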

Examples

subjects = ['test']
ids = [3498]
dates = ['2021-12-13']

df = tabulate_behavior_data_center_out(write_dir, subjects, ids, dates, df=None)

# Only consider completed reaches
df = df[df['reach_completed']]
kin = tabulate_kinematic_data(write_dir, df['subject'], df['te_id'], df['date'], df['go_cue_time'], df['reach_end_time'],
                            datatype='cursor', samplerate=1000)

plt.figure()
bounds = [-10, 10, -10, 10]
visualization.plot_trajectories(kin, bounds=bounds)

dst = tabulate_kinematic_data(write_dir, df['subject'], df['te_id'], df['date'], df['go_cue_time'], df['reach_end_time'],
                            deriv=0, norm=True, datatype='cursor', samplerate=1000)
spd = tabulate_kinematic_data(write_dir, df['subject'], df['te_id'], df['date'], df['go_cue_time'], df['reach_end_time'],
                            deriv=1, norm=True, datatype='cursor', samplerate=1000)
acc = tabulate_kinematic_data(write_dir, df['subject'], df['te_id'], df['date'], df['go_cue_time'], df['reach_end_time'],
                            deriv=2, norm=True, datatype='cursor', samplerate=1000)
plt.figure()
visualization.plot_timeseries(dst[0], 1000)
visualization.plot_timeseries(spd[0], 1000)
visualization.plot_timeseries(acc[0], 1000)
plt.legend(['distance', 'speed', 'acceleration'])
plt.xlabel('time from go cue (s)')
plt.ylabel('kinematics (cm)')

_images/tabulate_kinematics_derivative.png

subject = 'CES003'
te_id = 2234
date = '2025-03-04'
df = tabulate_behavior_data_center_out(data_dir, [subject], [te_id], [date])
df = df[df['reach_completed']]
plot_kin(df, 'go_cue_time', 'reach_end_time')

Different interpolation options:

raw = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='cursor', samplerate=1000)
raw_filt = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='cursor', samplerate=1000, low_cut=5, buttord=2,
                            filter_kinematics=True)
nan = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='user_screen', samplerate=1000, remove_nan=False)
nan_filt = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='user_screen', samplerate=1000, low_cut=5, buttord=2,
                            filter_kinematics=True, remove_nan=False)
pos = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='user_screen', samplerate=1000)
pos_filt = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='user_screen', samplerate=1000, low_cut=5, buttord=2,
                            filter_kinematics=True)
spd = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            deriv=1, norm=True, datatype='cursor', samplerate=1000,
                            filter_kinematics=True)
weird_trials = np.where([np.any(s > 500) for s in spd])[0]
plt.figure(figsize=(5,6))
plt.subplot(3,1,1)
for i in weird_trials:
    visualization.plot_timeseries(raw[i][:,0], 1000)
    visualization.plot_timeseries(raw_filt[i][:,0], 1000, color='k', alpha=0.5)
plt.ylabel('x position (cm)')
plt.xlabel('')
plt.title('cursor')
plt.legend(['raw', 'filtered'])
plt.subplot(3,1,2)
for i in weird_trials:
    visualization.plot_timeseries(nan[i][:,0], 1000)
    visualization.plot_timeseries(nan_filt[i][:,0], 1000, color='k', alpha=0.5)
plt.ylabel('x position (cm)')
plt.xlabel('time from go cue (s)')
plt.title('user_screen')
plt.subplot(3,1,3)
for i in weird_trials:
    visualization.plot_timeseries(pos[i][:,0], 1000)
    visualization.plot_timeseries(pos_filt[i][:,0], 1000, color='k', alpha=0.5)
plt.ylabel('x position (cm)')
plt.xlabel('time from go cue (s)')
plt.title('user_screen interp')
plt.tight_layout()

aopy.data.bmi3d.tabulate_lfp_features(preproc_dir, subjects, te_ids, dates, start_times, end_times, decoders, samplerate=None, channels=None, datatype='lfp', preproc=None, verbose=True, **kwargs)[source]

Extract (new, offline) lfp feature segments across arbitrary preprocessed files. Uses a decoder object to extract features from either lfp or broadband timeseries data. Can be applied offline to arbitrary channels. If used on broadband data from a BCI experiment, the features extracted will be (nearly) the same as the online features if the same decoder is used.

Parameters:

preproc_dir (str) – base directory where the files live
subjects (list of str) – Subject name for each recording
te_ids (list of int) – Block number of Task entry object for each recording
dates (list of str) – Date for each recording
start_times (list of float) – times in the recording at which the desired segments starts
end_times (list of float) – times in the recording at which the desired segments ends
decoders (list of riglib.bmi.Decoder) – decoder objects for each recording. If only one decoder is supplied, it will be applied to all recordings.
samplerate (float, optional) – choose the samplerate of the data in Hz. Default None, which uses the sampling rate of the experiment.
channels (list of int, optional) – list of channel indices to extract. Default None, which extracts all channels.
datatype (str, optional) – type of data to load. Default ‘lfp’.
preproc (fn, optional) – function mapping (position, fs) data to (kinematics, fs_new). For example, a smoothing function or an estimate of velocity from position
decode (bool, optional) – whether to decode the lfp features. Default False.
verbose (bool, optional) – whether to display a progress bar. Default True.
kwargs – additional keyword arguments

Returns:

tuple containing:: segments (ntrial,): list of tensors of (nt, nfeat) feature data from each trial

samplerate (float): samplerate of the feature data

Return type:

tuple

Examples

Plot online extracted lfp features and overlay offline extracted feature segments

subject = 'affi'
te_id = 17269
date = '2024-05-03'
subjects = [subject, subject, subject]
te_ids = [te_id, te_id, te_id]
dates = [date, date, date]
start_time = 10
end_time = 30
start_times = [10, 15, 20]
end_times = [14, 18, 28]

Load the decoder that was used in the experiment

with open(os.path.join(data_dir, 'test_decoder.pkl'), 'rb') as file:
    decoder = pickle.load(file)

Load the full features for comparison

features_offline, samplerate_offline = extract_lfp_features(
    preproc_dir, subject, te_id, date, decoder,
    start_time=start_time, end_time=end_time)
features_online, samplerate_online = get_extracted_features(
    preproc_dir, subject, te_id, date, decoder,
    start_time=start_time, end_time=end_time)

time_offline = np.arange(len(features_offline))/samplerate_offline + start_time
time_online = np.arange(len(features_online))/samplerate_online + start_time

plt.figure(figsize=(8,3))
plt.plot(time_offline, features_offline[:,1], alpha=0.8, label='offline')
plt.plot(time_online, features_online[:,1], alpha=0.8, label='online')
plt.xlabel('time (s)')
plt.ylabel('power')
plt.title('readout 1')

Tabulate the segments

features_offline, samplerate_offline = tabulate_lfp_features(
    preproc_dir, subjects, te_ids, dates, start_times, end_times, decoder)
features_online, samplerate_online = tabulate_feature_data(
    preproc_dir, subjects, te_ids, dates, start_times, end_times, decoder)

for idx in range(len(start_times)):
    time_offline = np.arange(len(features_offline[idx]))/samplerate_offline + start_times[idx]
    time_online = np.arange(len(features_online[idx]))/samplerate_online + start_times[idx]
    plt.plot(time_offline, features_offline[idx][:,1], 'k--')
    plt.plot(time_online, features_online[idx][:,1], 'k--')

Add legend

plt.plot([], [], 'k--', label='segments')
plt.legend()

Database 

Interface between database methods/models and data analysis code

class aopy.data.db.BMI3DDecoder(decoder, dbname='default')[source]

Wrapper for BMI3D Decoder objects. Written like this so that other database types can implement their own decoder classes with the same methods without needing to modify their database model.

property channels

The decoder channels

Returns:: channels used by the decoder
Return type:: list

property decoder

The decoder object

Returns:: decoder object
Return type:: object

property filt

The decoder filter

Returns:: decoder filter
Return type:: object

get(decoder_dir=None)[source]

Fetch the decoder object from the database, if there is one.

Returns:: decoder object (type depends on which decoder is being loaded)
Return type:: Decoder

class aopy.data.db.BMI3DTaskEntry(task_entry, dbname='default')[source]

Wrapper class for bmi3d database entry classes. Written like this so that other database types can implement their own classes with the same methods without needing to modify their database model.

property duration

Duration of recording in seconds

Returns:: duration
Return type:: float

property experimenter

Experimenter

Returns:: name of the experimenter
Return type:: str

property features

List of features that were enabled during recording

Returns:: enabled features
Return type:: list

get_db_object()[source]

Get the raw database object representing this task entry

Returns:: bmi3d task entry object
Return type:: models.TaskEntry

get_decoder(decoder_dir=None)[source]

Fetch the decoder object from the database, if there is one.

Returns:: decoder object (type depends on which decoder is being loaded)
Return type:: Decoder

get_decoder_record()[source]

Fetch the database models.Decoder record for this recording, if there is one.

Returns:: decoder record
Return type:: models.Decoder

get_exp_mapping(raw=False)[source]

Get the experiment mapping matrix for this task entry that maps from world to screen coordinates. Only useful for manual control experiments.

Parameters:: raw (bool, optional) – if True, return the mapping in BMI3D coordinates (x,z,y). Only useful for debugging. Defaults to False.
Returns:: 3x3 mapping matrix
Return type:: np.ndarray

get_preprocessed_sources()[source]

Returns a list of datasource names that should be preprocessed for this task entry. Always includes experiment data (exp) and eye data (eye), and additionally includes broadband, lfp, and spike data if there are associated datafiles with appropriate neural data.

Returns:: preprocessed sources for this task entry
Return type:: list

get_raw_files(system_subfolders=None)[source]

Gets the raw data files associated with this task entry

Parameters:: system_subfolders (dict, optional) – dictionary of system subfolders where the data for that system is located. If None, defaults to the system name
Returns:: list of (system, filepath) for each datafile associated with this task entry
Return type:: list

get_sequence_param(paramname, default=None)[source]

Get a specific sequence parameter

Parameters:

paramname (str) – name of the parameter to get
default (object, optional) – default value to return if the parameter is not found. Defaults to None.

Returns:

parameter value

Return type:

object

get_task_param(paramname, default=None)[source]

Get a specific task parameter

Parameters:

paramname (str) – name of the parameter to get
default (object, optional) – default value to return if the parameter is not found. Defaults to None.

Returns:

parameter value

Return type:

object

has_exp_perturbation()[source]

Check if this task entry has an experiment perturbation

Returns:: True if the task entry has a non-identity mapping matrix
Return type:: bool
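
As a rough illustration of the non-identity check described above, the sketch below compares a 3x3 mapping against the identity matrix using plain nested lists. The helper name is_perturbed and the tolerance tol are hypothetical and not part of aopy; the library's actual comparison may differ.

```python
def is_perturbed(mapping, tol=1e-9):
    """Return True if a 3x3 mapping differs from the identity matrix."""
    for i, row in enumerate(mapping):
        for j, value in enumerate(row):
            expected = 1.0 if i == j else 0.0
            if abs(value - expected) > tol:
                return True
    return False

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Hypothetical perturbation: a 90-degree rotation about the third axis
rotated = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]

print(is_perturbed(identity))  # False
print(is_perturbed(rotated))   # True
```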

has_feature(featname)[source]

Check whether a feature was included in this recording

Parameters:: featname (str) – name of the feature to check
Returns:: whether or not the feature was enabled
Return type:: bool

property n_rewards

Number of rewarded trials

Returns:: number of rewarded trials
Return type:: int

property n_trials

Number of total trials presented

Returns:: number of total trials
Return type:: int

property notes

Notes

Returns:: notes
Return type:: str

preprocess(data_dir, preproc_dir, overwrite=False, exclude_sources=[], system_subfolders=None, **kwargs)[source]

Preprocess the data associated with this task entry

Parameters:

data_dir (str) – directory where the raw data is stored
preproc_dir (str) – directory where the preprocessed data will be written
overwrite (bool, optional) – whether or not to overwrite existing preprocessed data. Defaults to False.
exclude_sources (list, optional) – list of sources to exclude from preprocessing. Defaults to [].
system_subfolders (dict, optional) – dictionary of system subfolders where the data for that system is located. If None, defaults to the system name
kwargs (dict, optional) – additional keyword arguments to pass to the preprocessing function

Returns:

error message if there was an error during preprocessing

Return type:

str

property sequence_name

Sequence name, e.g. centerout_2D

Returns:: sequence name
Return type:: str

property sequence_params

All sequence parameters, e.g. ntargets or target_radius

Returns:: sequence params
Return type:: dict

property task_desc

Task description, e.g. flash

Returns:: task description
Return type:: str

property task_name

Task name, e.g. manual control

Returns:: task name
Return type:: str

property task_params

All task parameters

Returns:: task params
Return type:: dict

aopy.data.db.add_metadata_columns(df, sessions, column_names, apply_fns)[source]

Adds metadata columns (in-place) to a dataframe keyed on session id (e.g. from tabulate_behavior_data()). Specify the same number of column names as functions. Each function should take a single session as input and return a single value of any type. The return value will be appended to the dataframe in all rows where the task entry id (te_id) matches the input session.

Parameters:

df (pd.DataFrame) – dataframe of session summaries
sessions (list) – list of bmi3d task entry objects
column_names (list of str) – list of column names to append to the dataframe
apply_fns (list of functions) – functions to apply to each session to generate metadata columns

Examples

Adding a metadata column to a dataframe of session summaries

date_obj = date.fromisoformat('2023-02-06')
entries = db.lookup_sessions(date=date_obj)
df = db.summarize_entries(entries)
db.add_metadata_columns(df, entries, 'hs_data', lambda x: x.get_task_param('record_headstage'))
display(df)

Adding session and experimenter info after tabulating behavior data

date_obj = date.fromisoformat('2023-02-06')
entries = db.lookup_sessions(date=date_obj)
df = aopy.data.tabulate_behavior_data(entries)
db.add_metadata_columns(df, entries, ['session', 'experimenter'],
                        [lambda x: x.session, lambda x: x.experimenter])
display(df)

More information about entries can be found in BMI3DTaskEntry
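
The matching rule described above (one value per session, written to every row whose te_id matches) can be sketched without a database or pandas. Everything here is a stand-in: add_columns is a hypothetical helper, rows are plain dicts rather than a DataFrame, and the session attribute id is an assumption about how sessions expose their task entry id.

```python
from types import SimpleNamespace

def add_columns(rows, sessions, column_names, apply_fns):
    """Add one value per (session, column) to every row with a matching te_id."""
    for session in sessions:
        for name, fn in zip(column_names, apply_fns):
            value = fn(session)  # each function returns a single value
            for row in rows:
                if row["te_id"] == session.id:
                    row[name] = value  # modifies the rows in place

# Stand-in sessions; the `id` attribute name is an assumption
sessions = [SimpleNamespace(id=1, experimenter="A"),
            SimpleNamespace(id=2, experimenter="B")]
rows = [{"te_id": 1, "trial": 0}, {"te_id": 1, "trial": 1}, {"te_id": 2, "trial": 0}]

add_columns(rows, sessions, ["experimenter"], [lambda s: s.experimenter])
print([r["experimenter"] for r in rows])  # ['A', 'A', 'B']
```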

aopy.data.db.create_decoder_parent(project, session, task_name='nothing', task_desc='decoder parent', **kwargs)[source]

Create a new decoder parent entry (a TaskEntry) in the database. These are used to keep track of decoders that weren’t trained on a specific session.

Parameters:

project (str) – project name
session (str) – session name
task_name (str, optional) – task name. Defaults to ‘nothing’.
task_desc (str, optional) – task description. Defaults to ‘decoder parent’.
kwargs (dict, optional) – optional keyword arguments, including dbname to specify the database

Returns:

the new decoder parent entry

Return type:

TaskEntry

aopy.data.db.encode_onehot_sequence_name(sessions, sequence_types)[source]

Generates a dataframe summarizing the id, subject, and date of each entry in the input session list, with one-hot encoded columns for the sequences of interest.

Parameters:

sessions (list) – list of bmi3d task entries
sequence_types (list) – list of sequence_name strings to encode. Must be a list of strings.

Returns:

Dataframe of entry summaries containing sequence name occurrence

Return type:

pd.DataFrame

Examples

sessions = db.lookup_mc_sessions()
sequence_types = ['rand_target_chain_2D', 'centerout_2D', 'out_2D',
                'rand_target_chain_3D', 'corners_2D', 'centerout_2D_different_center',
                'sequence_2D', 'centerout_2D_select', 'single_laser_pulse']

df = db.encode_onehot_sequence_name(sessions, sequence_types)
display(df)

_images/db_encode_onehot_sequence_name.png
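
For readers unfamiliar with one-hot encoding, here is a minimal sketch of the idea using plain dicts. The helper encode_onehot is hypothetical, and the 0/1 integer flags are an assumption; the real function returns a DataFrame and may encode the columns differently.

```python
def encode_onehot(names, sequence_types):
    """Return one dict per name with a 0/1 flag for each sequence type."""
    return [{seq: int(name == seq) for seq in sequence_types} for name in names]

names = ["centerout_2D", "out_2D", "centerout_2D"]
encoded = encode_onehot(names, ["centerout_2D", "out_2D"])
print(encoded[0])  # {'centerout_2D': 1, 'out_2D': 0}
```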

aopy.data.db.filter_has_features(features)[source]

Filter function to select sessions only if they had the given features enabled

Parameters:: features (list or str) – a list of feature names, or a single feature
Returns:: a filter function to pass to lookup_sessions
Return type:: function
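
A filter of this kind is just a function that returns another function of signature fn(session)->bool. The sketch below shows one way such a factory could look, using only the documented has_feature() method. FakeSession, make_feature_filter, and the feature names are hypothetical, and requiring all listed features (rather than any) is an assumption.

```python
class FakeSession:
    """Stand-in exposing only has_feature(), as documented for BMI3DTaskEntry."""
    def __init__(self, features):
        self._features = set(features)

    def has_feature(self, featname):
        return featname in self._features

def make_feature_filter(features):
    # Accept a single feature name or a list of names
    if isinstance(features, str):
        features = [features]

    def filter_fn(session):
        return all(session.has_feature(f) for f in features)

    return filter_fn

sessions = [FakeSession(["optitrack", "reward"]), FakeSession(["reward"])]
keep = make_feature_filter("optitrack")
selected = [s for s in sessions if keep(s)]
print(len(selected))  # 1
```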

aopy.data.db.filter_has_neural_data(datasource)[source]

Filter function to select sessions only if they contain neural data recordings

Parameters:: datasource (str) – ‘ecog’ or ‘neuropixel’
Returns:: a filter function to pass to lookup_sessions
Return type:: function

aopy.data.db.get_aba_perturbation_days(entries)[source]

Finds all days with ABA block design manual control perturbation experiments based on get_aba_perturbation_sessions().

Parameters:

entries (list) – list of task entries

Returns:

tuple containing: aba_days (list): list of dates with ABA block design sessions

aba_sessions (list): list of lists of session names for each date

Return type:

tuple

aopy.data.db.get_aba_perturbation_sessions(day_entries)[source]

Given a list of task entries, returns a list of session names that follow an ABA block design. The sessions must be on the same day and contain at least one session with an experiment perturbation within the day. The sessions are assigned as follows:

‘a’ for sessions until the first perturbation

‘b’ for sessions with a perturbation

‘aprime’ for the remaining (non-perturbed) sessions

Parameters:

day_entries (n_rec,) – list of task entries for a single day

Returns:

list of session names (‘a’, ‘b’, ‘aprime’) with the same length as the input list. If no sessions matching ABA format are found, returns None.

Return type:

(n_rec,) list or None
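
The labeling rule above can be sketched as follows, assuming each session has already been reduced to a boolean saying whether it has a perturbation. The helper label_aba is hypothetical; it only checks that at least one perturbed session exists, whereas the real function may apply stricter checks on what counts as an ABA design.

```python
def label_aba(perturbed):
    """Label each session 'a', 'b', or 'aprime' from per-session perturbation flags."""
    if not any(perturbed):
        return None  # no perturbation on this day, so no ABA design
    labels = []
    seen_perturbation = False
    for is_perturbed in perturbed:
        if is_perturbed:
            seen_perturbation = True
            labels.append("b")
        elif seen_perturbation:
            labels.append("aprime")
        else:
            labels.append("a")
    return labels

print(label_aba([False, False, True, True, False]))
# ['a', 'a', 'b', 'b', 'aprime']
```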

aopy.data.db.group_entries(sessions, grouping_fn=<function <lambda>>)[source]

Automatically group together a flat list of database IDs

Parameters:

sessions (list of task entries) – TaskEntry objects to group
grouping_fn (callable, optional) – grouping_fn(task_entry) takes a TaskEntry as its only argument and returns a hashable and sortable object by which to group the ids

Returns:

list of tuples, each tuple containing a group of sessions

Return type:

list
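
The requirement that grouping_fn return a hashable and sortable object suggests a sort-then-group approach. The sketch below shows that pattern with itertools.groupby; Entry and group_by are hypothetical stand-ins, and grouping by date is only an example since the actual default grouping_fn is not shown here.

```python
from collections import namedtuple
from itertools import groupby

# Stand-in for a task entry with only the fields needed here
Entry = namedtuple("Entry", ["id", "date"])

def group_by(sessions, grouping_fn):
    """Sort sessions by the grouping key, then collect each run into a tuple."""
    ordered = sorted(sessions, key=grouping_fn)
    return [tuple(group) for _, group in groupby(ordered, key=grouping_fn)]

entries = [Entry(3, "2023-02-07"), Entry(1, "2023-02-06"), Entry(2, "2023-02-06")]
groups = group_by(entries, lambda te: te.date)
print([[te.id for te in g] for g in groups])  # [[1, 2], [3]]
```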

aopy.data.db.list_entry_details(sessions)[source]

Returns (subject, te_id, date) for each given session.

Parameters:

sessions (list of TaskEntry) – list of sessions

Returns:

tuple containing: subject (list): list of subject names

te_id (list): list of task entry ids

date (list): list of dates

Return type:

tuple

aopy.data.db.lookup_bmi_sessions(bmi_task_name='bmi control', **kwargs)[source]: Returns list of entries for all bmi control sessions on the given date. See lookup_sessions() for details.

aopy.data.db.lookup_decoder_parent(task_name='nothing', task_desc='decoder parent', **kwargs)[source]: Looks up a decoder parent entry by project and session.

aopy.data.db.lookup_decoders(id=None, parent_id=None, **kwargs)[source]

Returns list of decoders with the given filter parameters

Parameters:

id (int or list, optional) – Lookup decoders with the given ids, if provided.
parent_id (int, optional) – Lookup decoders with the given parent ids, if provided.
kwargs (dict, optional) – optional keyword arguments to pass to database lookup function.

Returns:

list of Decoder records matching the query

Return type:

list

aopy.data.db.lookup_flash_sessions(mc_task_name='manual control', **kwargs)[source]: Returns list of entries for all flash sessions on the given date. See lookup_sessions() for details.

aopy.data.db.lookup_mc_sessions(mc_task_name='manual control', **kwargs)[source]: Returns list of entries for all manual control sessions on the given date. See lookup_sessions() for details.

aopy.data.db.lookup_sessions(id=None, subject=None, date=None, task_name=None, task_desc=None, session=None, project=None, experimenter=None, exclude_ids=[], filter_fn=<function <lambda>>, **kwargs)[source]

Returns list of entries for all sessions on the given date

Parameters:

id (int or list, optional) – Lookup sessions with the given ids, if provided.
subject (str, optional) – Lookup sessions with the given subject, if provided.
date (multiple, optional) – Lookup sessions from the given date, if provided. Accepts multiple formats: a datetime.date object; a (start, end) tuple of datetime.date objects; a (start, end) tuple of strings in the format ‘YYYY-MM-DD’; or a (year, month, day) tuple of integers.
task_name (str, optional) – Lookup sessions with the given task name, if provided. Examples include manual control, tracking, nothing, etc.
task_desc (str, optional) – Lookup sessions with the given task description, if provided. Examples include flash, simple center out, resting state, etc.
session (str, optional) – Lookup sessions with the given session name, if provided.
project (str, optional) – Lookup sessions with the given project name, if provided.
experimenter (str, optional) – Lookup sessions with the given experimenter, if provided.
exclude_ids (list, optional) – Exclude sessions with matching task entry ids, if provided.
filter_fn (function, optional) – Additional filtering, of signature fn(session)->bool. Defaults to lambda x: True.
kwargs (dict, optional) – optional keyword arguments to pass to database lookup function.

Returns:

list of TaskEntry sessions matching the query

Return type:

list
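
To make the accepted date formats concrete, the sketch below normalizes each of them into a (start, end) pair of datetime.date objects. The helper normalize_date is hypothetical and is not how lookup_sessions is necessarily implemented; in particular, treating a single date as a one-day range and leaving the end date as given are assumptions.

```python
from datetime import date

def normalize_date(value):
    """Convert any of the documented date formats to a (start, end) pair."""
    if isinstance(value, date):
        return value, value  # single day
    if len(value) == 3 and all(isinstance(v, int) for v in value):
        day = date(*value)  # (year, month, day) tuple of integers
        return day, day
    if len(value) == 2:
        # (start, end) tuple of date objects or 'YYYY-MM-DD' strings
        start, end = (date.fromisoformat(v) if isinstance(v, str) else v
                      for v in value)
        return start, end
    raise ValueError(f"unsupported date format: {value!r}")

single_day = normalize_date((2023, 2, 6))
date_range = normalize_date(("2023-02-01", "2023-02-06"))
```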

aopy.data.db.lookup_tracking_sessions(tracking_task_name='tracking', **kwargs)[source]: Returns list of entries for all tracking sessions on the given date. See lookup_sessions() for details.

aopy.data.db.save_decoder(decoder_parent, decoder, suffix, **kwargs)[source]

Save a new decoder to the database, associated with the given parent TaskEntry. If the decoder was trained on a specific session, use that as the parent. If not, use lookup_decoder_parent() or create_decoder_parent() to look up or create a new parent entry, respectively.

Parameters:

decoder_parent (TaskEntry) – the parent decoder entry
decoder (object) – the decoder object to save
suffix (str) – suffix to append to the decoder name
kwargs (dict, optional) – optional keyword arguments, including dbname to specify the database

Note

This only works if you have the bmi system path locally. See the BMI3D setup page to find this path and make it available on your system.

aopy.data.db.summarize_entries(entries, sum_trials=False)[source]

Generates a dataframe summarizing the subject, date, task, number of trials, and duration in minutes of each entry in the input list. Optionally sum the number of trials and duration for unique tasks across days for each subject

Parameters:

entries (list) – list of bmi3d task entries
sum_trials (bool, optional) – sum the number of trials and duration across unique tasks for each day for each subject

Returns:

dataframe of entry summaries

Return type:

pd.DataFrame

Examples

date_obj = date.fromisoformat('2023-02-06')
entries = db.lookup_sessions(date=date_obj)
df = db.summarize_entries(entries)
display(df)

df_unique = db.summarize_entries(entries, sum_trials=True)
display(df_unique)

Peslab 

aopy.data.peslab.get_exp_var(exp_data, *args)[source]

Generate a list of variable names from .MAT-formatted experiment data

Parameters:: exp_data (dict) – MAT file data dict
Returns:: list of variable names in exp_data
Return type:: list

aopy.data.peslab.load_ecog_clfp_data(data_file_name, t_range=(0, -1), exp_file_name=None, mask_file_name=None, compute_mask=True)[source]

Load ECoG data file from a goose wireless dataset file.

Parameters:

data_file_name (str) – file path to data file
t_range (listlike, optional) – Start and stop times to read data. (0, -1) reads the entire file. Defaults to (0,-1).
exp_file_name (str, optional) – File path to experiment data JSON file.
mask_file_name (str, optional) – File path to data quality mask file. Defaults to None.
compute_mask (bool, optional) – Compute a data quality mask array if no mask file is given or found. Defaults to True.

Raises:

NameError – If experiment file cannot be found, NameError is raised.
NameError – If mask file cannot be found, NameError is raised.

Returns:

tuple containing: data (nt, nch): numpy array of multichannel ECoG data

mask (numpy.array): binary mask indicating bad data samples

exp (dict): dictionary of experiment data

Return type:

tuple

aopy.data.peslab.load_experiment_data(exp_file_name)[source]

Reads experiment metadata from an experiment JSON file. Returns the complete data structure as a dictionary and returns electrode data as a pandas DataFrame.

Parameters:: exp_file_name (str) – JSON experiment data file path
Returns:

tuple containing: experiment (dict): dict data object containing experiment metadata. See lab documentation for more information.

electrode_df (DataFrame): pandas DataFrame containing microdrive electrode information. Individual channels are indexed along columns, column names are electrode IDs.

Return type:

tuple

aopy.data.peslab.load_mask_data(mask_file_name)[source]

Loads binary mask data from recording mask files. Binary True values indicate “bad” or noisy data not used in analyses.

Parameters:: mask_file_name (str) – file path to binary mask file
Returns:: numpy array of binary values. Length is equal to the number of time points in the respective data array.
Return type:: numpy.array

aopy.data.peslab.parse_file_info(file_path)[source]

Parses file strings for goose_wireless ECoG and LFP signal data into data parameters.

Parameters:: file_path (str) – path to the file’s location
Returns:

tuple containing: exp_file_name (str): JSON experiment data file path

mask_file_name (str): binary data mask file path

microdrive_name (str): string name of the microdrive type used to collect data in file_path

rec_type (str): recording modality reflected in this file (‘ECOG’, ‘LFP’, etc.)

Return type:

tuple

aopy.data.peslab.read_from_file(data_file_path, data_type, n_ch, n_read, n_offset, reshape_order='F')[source]

Reads recorded neural activity from a goose_wireless file.

Parameters:

data_file_path (str) – file path to data file
data_type (numeric type) – numpy numeric type reflecting the variable encoding in data_file_path
n_ch (int) – Number of channels in data_file_path
n_read (int) – Number of data samples read from data_file_path
n_offset (int) – Offset point defining where data reading starts
reshape_order (str, optional) – Data reshaping order. Defaults to ‘F’

Returns:

numpy array of neural activity stored in data_file_path

Return type:

np.array
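
The parameters above describe reading a window of samples from a flat binary file. The standard-library sketch below illustrates one plausible layout, and several points are assumptions rather than facts about read_from_file: that samples are interleaved by channel, that n_offset counts time points (not bytes), and that the result is arranged as (n_ch, n_read), which is what a column-major ('F') reshape of interleaved data would give. The helper read_interleaved is hypothetical.

```python
import os
import tempfile
from array import array

def read_interleaved(path, typecode, n_ch, n_read, n_offset):
    """Read n_read time points for n_ch channels, skipping n_offset time points."""
    values = array(typecode)
    with open(path, "rb") as f:
        f.seek(n_offset * n_ch * values.itemsize)
        values.fromfile(f, n_ch * n_read)
    # Equivalent to a column-major reshape: channel c is every n_ch-th value
    return [list(values[c::n_ch]) for c in range(n_ch)]

# Write a tiny 2-channel, 3-sample file of 16-bit integers to demonstrate
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    array("h", [10, 20, 11, 21, 12, 22]).tofile(tmp)

data = read_interleaved(tmp.name, "h", n_ch=2, n_read=2, n_offset=1)
os.remove(tmp.name)
print(data)  # [[11, 12], [21, 22]]
```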

aopy.data.peslab.read_from_start(data_file_path, data_type, n_ch, n_read)[source]

Read data from goose wireless data file. Reads a fixed number of samples from the start of the recording.

Parameters:

data_file_path (str) – file path to data file
data_type (numeric type) – numpy numeric type reflecting the variable encoding in data_file_path
n_ch (int) – number of channels saved in data_file_path
n_read (int) – number of time points to read from data_file_path

Returns:

numpy array of neural recording data saved in data_file_path

Return type:

np.array

Optitrack 

aopy.data.optitrack.load_optitrack_data(data_dir, filename)[source]

This function loads a series of x, y, z positional data from the optitrack .csv file that has 1 rigid body exported with the following settings:

Markers: Off

Unlabeled markers: Off

Quality Statistics: Off

Rigid Bodies: On

Rigid Body Markers: Off

Bones: Off

Bone Markers: Off

Header Information: On

Optitrack format Version(s): 1.23

Required packages: pandas, numpy

Parameters:

data_dir (string) – Directory to load data from
filename (string) – File name to load within the data directory

Returns:

Tuple containing:: mocap_data_pos (nt, 3): Positional mocap data

mocap_data_rot (nt, 4): Rotational mocap data

Return type:

tuple

aopy.data.optitrack.load_optitrack_metadata(data_dir, filename, metadata_row=0)[source]

This function loads optitrack metadata from .csv file that has 1 rigid body exported with the following settings:

Markers: Off

Unlabeled markers: Off

Quality Statistics: Off

Rigid Bodies: On

Rigid Body Markers: Off

Bones: Off

Bone Markers: Off

Header Information: On

Optitrack format Version(s): 1.23

Required packages: csv, pandas

Parameters:

data_dir (string) – Directory to load data from
filename (string) – File name to load within the data directory

Returns:

Dictionary of metadata for an optitrack datafile

Return type:

dict

aopy.data.optitrack.load_optitrack_time(data_dir, filename)[source]

This function loads timestamps from the optitrack .csv file

Required packages: pandas, numpy

Parameters:

data_dir (string) – Directory to load data from
filename (string) – File name to load within the data directory

Returns:

Array of timestamps for each captured frame

Return type:

(nt)

Eye 

aopy.data.eye.proc_eye_day(preproc_dir, subject, date, correlation_min=0.9, dry_run=False)[source]

Finds files from the given subject and date with the best eye calibration and automatically applies it to every recording on that day for that subject. If no good calibration is found, raises a ValueError exception.

Parameters:

preproc_dir (str) – base directory where the files live
subject (str) – Subject name
date (str) – Date of recording
correlation_min (float, optional) – correlation below which is unacceptable
dry_run (bool, optional) – if True, files will not be modified.

Raises:

ValueError – if no good calibration is found

Returns:

tuple containing:: best_id (int): the task entry id with the highest mean absolute value correlation coefficient

te_ids (list of int): the ids to which the coefficients were applied

Return type:

tuple
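
The selection rule (highest mean absolute correlation coefficient, with a ValueError when nothing is good enough) can be sketched as below. The helper pick_best_calibration and the input dictionary are hypothetical, and applying correlation_min to the mean absolute value rather than to each coefficient is an assumption.

```python
def pick_best_calibration(coeffs_by_id, correlation_min=0.9):
    """Return the task entry id with the highest mean absolute correlation."""
    scores = {te_id: sum(abs(c) for c in coeffs) / len(coeffs)
              for te_id, coeffs in coeffs_by_id.items()}
    best_id = max(scores, key=scores.get)
    if scores[best_id] < correlation_min:
        raise ValueError("no calibration meets correlation_min")
    return best_id

# Hypothetical correlation coefficients for two task entries
coeffs = {101: [0.95, -0.97], 102: [0.80, 0.85]}
print(pick_best_calibration(coeffs))  # 101
```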

Neuropixel 

aopy.data.neuropixel.get_channel_bank_name(ch_bank_data, ch_config_dir='/data/channel_config_np', filename='channel_bank.npy')[source]

Get the information about which channels are used for recording. This function assumes the channel configuration is one of the following: long-br, middle, long-tr, top, long-tl, long-bl, bottom.

Parameters:

ch_bank_data (nch) – channel bank information contained in neuropixel
ch_config_dir (str, optional) – directory that contains the channel configuration file
filename (str, optional) – filename that includes all bank information.

Returns:

channel name (long-br, middle, long-tr, top, long-tl, long-bl, bottom)

Return type:

str

aopy.data.neuropixel.get_neuropixel_digital_input_times(data_dir, data_folder, datatype, node_idx=0, ex_idx=0, port_number=1)[source]

Computes the times at which the sync line arrives at the digital channel in openephys. Openephys recordings don’t always begin at time index 0.

Parameters:

data_dir (str) – data directory where the data files are located
data_folder (str) – data folder where 1 experiment data is saved
datatype (str) – datatype. ‘ap’ or ‘lfp’
node_idx (int) – record node index. This is usually 0.
ex_idx (int) – experiment index. This is usually 0.
port_number (int) – port number that the probe is connected to. Integer from 1 to 4.

Returns:

Tuple containing:: on_times (n_times): times at which sync line turned on

off_times (n_times): times at which sync line turned off

Return type:

tuple
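
Because a recording may not start at sample 0, event sample numbers need to be referenced to the first recorded sample before converting to seconds. The sketch below shows that arithmetic under stated assumptions: sync_times is a hypothetical helper, events are given as (sample number, state) pairs with state 1 for on and 0 for off, and the offset is removed by subtracting the first sample number. The actual function may handle this differently.

```python
def sync_times(sample_numbers, states, first_sample, samplerate):
    """Split sync events into on/off times in seconds relative to recording start."""
    on_times, off_times = [], []
    for sample, state in zip(sample_numbers, states):
        t = (sample - first_sample) / samplerate
        (on_times if state else off_times).append(t)
    return on_times, off_times

# Hypothetical recording that starts at sample 30000, sampled at 30 kHz
on, off = sync_times([60000, 75000, 90000], [1, 0, 1],
                     first_sample=30000, samplerate=30000)
print(on, off)  # [1.0, 2.0] [1.5]
```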

aopy.data.neuropixel.load_neuropixel_configuration(data_dir, data_folder, ex_idx=0, port_number=1)[source]

Gets neuropixel probe information from the XML configuration files made by OpenEphys. When saved by OpenEphys, channel numbers and electrode x positions are sorted in order of y position. This function also sorts the x and y positions in order of channel number.

Parameters:

data_dir (str) – where to find the file
data_folder (str) – the xml file that describes neuropixel probe configuration
ex_idx (int) – experiment idx. This is usually 0.
port_number (int) – port number that the probe is connected to. Integer from 1 to 4.

Returns:

dictionary that contains the electrode configuration

Return type:

dict
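
The re-sorting step mentioned above amounts to reordering the position lists by channel number. A minimal sketch with plain lists follows; sort_by_channel and the example values are hypothetical, and the real function works on the parsed XML configuration rather than on bare lists.

```python
def sort_by_channel(channels, xpos, ypos):
    """Reorder channel numbers and positions so channels are ascending."""
    order = sorted(range(len(channels)), key=lambda i: channels[i])
    return ([channels[i] for i in order],
            [xpos[i] for i in order],
            [ypos[i] for i in order])

# Input ordered by y position, as described for the saved configuration
ch, x, y = sort_by_channel([2, 0, 1], [27, 11, 59], [0, 20, 20])
print(ch, x, y)  # [0, 1, 2] [11, 59, 27] [20, 20, 0]
```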

aopy.data.neuropixel.load_neuropixel_data(data_dir, data_folder, datatype, node_idx=0, ex_idx=0, port_number=1)[source]

Load neuropixel data object and metadata. The data object has four properties: samples, sample_numbers, timestamps, and metadata. See this link: https://github.com/open-ephys/open-ephys-python-tools/tree/main/src/open_ephys/analysis

Parameters:

data_dir (str) – data directory where the data files are located
data_folder (str) – data folder where 1 experiment data is saved
datatype (str) – datatype. ‘ap’ or ‘lfp’
node_idx (int) – record node index. This is usually 0.
ex_idx (int) – experiment index. This is usually 0.
port_number (int) – port number that the probe is connected to. Integer from 1 to 4.

Returns:

Tuple containing:: rawdata (object): data object

metadata (dict): metadata

Return type:

tuple

aopy.data.neuropixel.load_neuropixel_event(data_dir, data_folder, datatype, node_idx=0, ex_idx=0, port_number=1)[source]

Load neuropixel event data saved by openephys, according to datatype

Parameters:

data_dir (str) – data directory where the data files are located
data_folder (str) – data folder where 1 experiment data is saved
datatype (str) – datatype. ‘ap’ or ‘lfp’
node_idx (int) – record node index. This is usually 0.
ex_idx (int) – experiment index. This is usually 0.
port_number (int) – port number that the probe is connected to. Integer from 1 to 4.

Returns:

events data

Return type:

ndarray

aopy.data.neuropixel.load_parsed_ksdata(kilosort_dir, data_dir)[source]

Load kilosort data (spike indices, clusters, and label) parsed into the task entries. This data is not yet synchronized.

Parameters:

kilosort_dir (str) – kilosort directory (ex. ‘/data/preprocessed/kilosort’)
data_dir (str) – data directory that contains parsed data (ex. ‘2023-06-30_Neuropixel_ks_affi_bottom_port1_9847’)

Returns:

tuple containing: spike_indices (nspikes): spike indices detected by kilosort (not spike times)

spike_clusters (nspikes): unit label detected by kilosort

Return type:

tuple
