Data:

This module contains functions written to load and save data.

API

aopy.data.base.align_neuropixel_recoring_drive(neuropixel_drive, drive2, subject, theta=0, center=(0, 0))[source]

This function aligns one drive to another drive type. In the current iteration, it only supports aligning neuropixels drives (‘NP_Insert72’/‘NP_Insert137’) to each other or to ‘ECoG244’/‘Opto32’ drives. It assumes a fixed mapping between subject and alignment, and it is not currently compatible with selecting subsets of channels. The mapping between subject and alignment is defined in aopy/config/neuropixel_insert_ch_mapping/NP_insert_angle_alignment.xlsx. The following images depict the alignment between neuropixels insert grid hole locations and ECoG channel locations for two subjects.

_images/NP_Insert137_ECoG244_alignment.png _images/NP_Insert72_ECoG244_alignment.png
Parameters:
  • neuropixel_drive (str) – Neuropixel drive to align. Currently supports ‘NP_Insert72’ and ‘NP_Insert137’

  • drive2 (str) – Other drive to align. Currently supports ‘ECoG244’, ‘Opto32’, ‘NP_Insert72’, and ‘NP_Insert137’

  • subject (str) – Subject recordings were performed on. Currently supports ‘Affi’ and ‘Beignet’

  • theta (float) – rotation (in degrees) to apply to positions. Rotations are applied clockwise. Default 0.

  • center (2-tuple) – chamber coordinates of the center of the drive in mm. Defaults to (0,0).

Returns:

Tuple Containing:
aligned_np_drive_coordinates (nelec, 2): X and Y coordinates of each neuropixel insert recording site relative to drive2
aligned_drive2_coordinates (nelec, 2): X and Y coordinates of each drive2 recording site
recording_sites (nelec): Neuropixel insert recording site numbers
acq_ch (nelec): Acquisition channels (0-indexed) for each drive2 recording site

Return type:

tuple

aopy.data.base.find_preproc_ids_from_day(preproc_dir, subject, date, data_source)[source]

Returns the task entry ids that have preprocessed files in the given directory matching the subject, date, and data source given.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • date (str) – Date of recording

  • data_source (str) – Processed data type (exp, eye, broadband, lfp, etc.)

Returns:

list of ids: task entry id for each matching file found in the given folder
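The matching step can be illustrated with a small sketch that pulls task entry ids out of a list of filenames. It assumes the preproc_<Date>_<MonkeyName>_<TaskEntry>_<DataSource>.hdf naming convention documented for get_preprocessed_filename(). The helper name ids_from_filenames is hypothetical and not part of aopy, and the real function reads the filenames from disk rather than from a list.

```python
import re

def ids_from_filenames(filenames, subject, date, data_source):
    """Return task entry ids of files matching subject, date, and data source."""
    # Assumed convention: preproc_<date>_<subject>_<te_id>_<data_source>.hdf
    pattern = re.compile(
        rf"^preproc_{re.escape(date)}_{re.escape(subject)}_(\d+)_{re.escape(data_source)}\.hdf$"
    )
    ids = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            ids.append(int(match.group(1)))
    return ids

# Hypothetical directory listing
example_files = [
    "preproc_2024-05-03_affi_17269_exp.hdf",
    "preproc_2024-05-03_affi_17269_lfp.hdf",
    "preproc_2024-05-03_affi_17270_exp.hdf",
    "preproc_2024-05-04_affi_17280_exp.hdf",
]
print(ids_from_filenames(example_files, "affi", "2024-05-03", "exp"))
```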

aopy.data.base.get_filenames_in_dir(base_dir, te)[source]

Gets the filenames for available systems in a given task entry. Requires that files are organized by system in the base directory, and named with their task entry somewhere in their filename or directory name.

Parameters:
  • base_dir (str) – directory where the files will be

  • te (int) – block number for the task entry

Returns:

dictionary of files indexed by system

Return type:

dict
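The directory layout this function expects can be sketched with the standard library alone. The helper below is a hypothetical stand-in, not aopy's implementation: it treats each subfolder of the base directory as a system and keeps the first entry whose name contains the task entry number. The system names and filenames are made up for the demonstration.

```python
import os
import tempfile

def filenames_by_system(base_dir, te):
    """Map each system folder to the first entry whose name contains the task entry."""
    files = {}
    for system in sorted(os.listdir(base_dir)):
        system_dir = os.path.join(base_dir, system)
        if not os.path.isdir(system_dir):
            continue
        matches = sorted(name for name in os.listdir(system_dir) if str(te) in name)
        if matches:
            # Store the path relative to base_dir
            files[system] = os.path.join(system, matches[0])
    return files

# Build a throwaway directory tree with hypothetical system names
with tempfile.TemporaryDirectory() as base_dir:
    for system, name in [("hdf", "beig_te1234.hdf"),
                         ("ecube", "2024_te1234"),
                         ("optitrack", "te9999.csv")]:
        os.makedirs(os.path.join(base_dir, system))
        open(os.path.join(base_dir, system, name), "w").close()
    found = filenames_by_system(base_dir, 1234)

print(found)
```

Note that a plain substring match like this one would also accept a longer number that happens to contain the task entry digits, so stricter matching may be needed in practice.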

aopy.data.base.get_hdf_dictionary(data_dir, hdf_filename, show_tree=False)[source]

Lists the hdf contents in a dictionary. Does not read any data! For example, calling get_hdf_dictionary() with show_tree=True will result in something like this:

>>> dict = get_hdf_dictionary('/exampledir', 'example.hdf', show_tree=True)
example.hdf
└──group1
|  └──group_data: [shape: (1000,), type: int64]
└──test_data: [shape: (1000,), type: int64]
>>> print(dict)
{
    'group1': {
        'group_data': ((1000,), dtype('int64'))
    },
    'test_data': ((1000,), dtype('int64'))
}
Parameters:
  • data_dir (str) – folder where data is located

  • hdf_filename (str) – name of hdf file

  • show_tree (bool, optional) – whether to print the file contents as a tree (default False)

Returns:

contents of the file keyed by name as tuples containing:
shape (tuple): size of the data
dtype (np.dtype): type of the data

Return type:

dict
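Because groups appear as nested dictionaries and datasets as (shape, dtype) tuples, the returned structure is easy to walk. The sketch below flattens such a dictionary into dataset paths. The function name flatten_hdf_dictionary is hypothetical, and plain strings stand in for the np.dtype objects so the example runs without numpy.

```python
def flatten_hdf_dictionary(contents, prefix=""):
    """Yield (path, shape, dtype) for every dataset in a nested contents dict."""
    for name, value in contents.items():
        path = f"{prefix}/{name}"
        if isinstance(value, dict):
            # A nested dict represents a group, so recurse into it
            yield from flatten_hdf_dictionary(value, prefix=path)
        else:
            shape, dtype = value
            yield path, shape, dtype

# Same layout as the example above, with strings standing in for dtypes
contents = {
    "group1": {"group_data": ((1000,), "int64")},
    "test_data": ((1000,), "int64"),
}
datasets = list(flatten_hdf_dictionary(contents))
for path, shape, dtype in datasets:
    print(path, shape, dtype)
```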

aopy.data.base.get_kilosort_foldername(subject, te_id, date, data_source)[source]

Generates a folder name string to access the Kilosort output.

Parameters:
  • subject (str) – The subject name.

  • te_id (int or list of int) – The experiment task entry(s) to use.

  • date (str) – The experiment date.

  • data_source (str) – The data source (e.g., ‘Neuropixel’)

Returns:

A formatted folder name string for the kilosort output in the format:

“{date}_{data_source}_{subject}_te{te_id1}_te{te_id2}…”.

Return type:

str
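A minimal sketch of how a name in this format could be assembled is shown below. The function kilosort_foldername_sketch is a hypothetical stand-in written from the documented format string, not the library's code, and the date format in the example is an assumption.

```python
def kilosort_foldername_sketch(subject, te_id, date, data_source):
    """Build '{date}_{data_source}_{subject}_te{id1}_te{id2}...' from the inputs."""
    # Accept either a single task entry or a list of task entries
    te_ids = [te_id] if isinstance(te_id, int) else list(te_id)
    te_part = "_".join(f"te{te}" for te in te_ids)
    return f"{date}_{data_source}_{subject}_{te_part}"

print(kilosort_foldername_sketch("affi", 17269, "2024-05-03", "Neuropixel"))
print(kilosort_foldername_sketch("affi", [17269, 17270], "2024-05-03", "Neuropixel"))
```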

aopy.data.base.get_preprocessed_filename(subject, te_id, date, data_source)[source]

Generates preprocessed filenames as per our naming conventions. Format: preproc_<Date>_<MonkeyName>_<TaskEntry>_<DataSource>.hdf

Parameters:
  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • data_source (str) – Processed data type (exp, eye, broadband, lfp, etc.)

Returns:

filename

Return type:

str
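The naming convention can be written out as a one-line format string. The helper below is a hypothetical illustration of the documented format, and the subject, task entry, and date values are example inputs only.

```python
def preprocessed_filename_sketch(subject, te_id, date, data_source):
    """Follow the documented preproc_<Date>_<MonkeyName>_<TaskEntry>_<DataSource>.hdf format."""
    return f"preproc_{date}_{subject}_{te_id}_{data_source}.hdf"

filename = preprocessed_filename_sketch("affi", 17269, "2024-05-03", "lfp")
print(filename)
```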

aopy.data.base.is_table_in_hdf(table_name: str, hdf_filename: str)[source]

Checks if a table exists in the first level of an hdf file (i.e. non-recursively).

Parameters:
  • table_name (str) – table name to be checked

  • hdf_filename (str) – full path to the hdf file

Returns:

Boolean

aopy.data.base.list_root_groups(data_dir, hdf_filename)[source]

Lists the names of the groups directly under the root of an HDF5 file.

Parameters:
  • data_dir (str) – folder where data is located

  • hdf_filename (str) – name of hdf file

Returns:

Name of groups

Return type:

list

aopy.data.base.load_chmap(drive_type='ECoG244', acq_ch_subset=None, theta=0, center=(0, 0), **kwargs)[source]

Load the centered mapping between acquisition channels and electrode position for supported drives. Currently supports ‘ECoG244’, ‘Opto32’, ‘NP_Insert72’, and ‘NP_Insert137’ drives.

Parameters:
  • drive_type (str, optional) – Drive type of the method used to record neural activity. - ‘ECoG244’: Viventi 244 channel ECoG array - ‘Opto32’: Orsborn 32 channel fiber optic array - ‘NP_Insert72’: Orsborn 72 site Neuropixel grid - ‘NP_Insert137’: Orsborn 137 site Neuropixel grid

  • acq_ch_subset (nacq, optional) – Subset of acquisition channels to call. If not specified, all acquisition channels and connected electrodes will be returned.

  • theta (float) – rotation (in degrees) to apply to positions. rotations are applied clockwise, e.g., theta = 90 rotates the map clockwise by 90 degrees, -90 rotates the map anti-clockwise by 90 degrees. Default 0.

  • center (2-tuple) – chamber coordinates of the center of the drive in mm. This function translates the coordinates of the drive to be centered on this value. Defaults to (0,0).

  • kwargs (dict) – Additional keyword arguments to pass to map_acq2pos()

Returns:

Tuple Containing:
acq_ch_position (nelec, 2): X and Y coordinates (in mm) of the electrodes corresponding to each acquisition channel. X position is in the first column and Y position is in the second column
acq_chs (nelec): Acquisition channels that map to electrodes (e.g. 240/256 for viventi ECoG array)
connected_elecs (nelec): Electrodes used (e.g. 240/244 for viventi ECoG array)

Return type:

tuple

Examples

plot_ECoG244_data_map(np.zeros(256,), cmap='Greys')
annotate_spatial_map_channels(drive_type='ECoG244', color='k')
annotate_spatial_map_channels(drive_type='Opto32', color='b')
annotate_spatial_map_channels(drive_type='ECoG244', color='r', theta=90)
annotate_spatial_map_channels(drive_type='Opto32', color='g', theta=90)
_images/ecog244_opto32_theta90.png
plt.figure()
plot_spatial_drive_map(np.zeros(64,), drive_type='EMG_GR08MM1305', cmap='Greys', theta=0)
annotate_spatial_map_channels(drive_type='EMG_GR08MM1305', color='k', theta=0)
_images/emg64_gr08mm1305.png
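The theta and center conventions can be made concrete with a short sketch. It assumes the electrode positions are already centered on the origin and that the rotation is applied before the translation. That ordering is an assumption, and rotate_and_center is a hypothetical helper rather than part of aopy.

```python
import math

def rotate_and_center(positions, theta=0.0, center=(0.0, 0.0)):
    """Rotate (x, y) positions clockwise by theta degrees, then shift them to center."""
    angle = math.radians(theta)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = []
    for x, y in positions:
        # Clockwise rotation about the origin
        x_rot = x * cos_a + y * sin_a
        y_rot = -x * sin_a + y * cos_a
        # Translate so the drive is centered on the requested coordinates
        out.append((x_rot + center[0], y_rot + center[1]))
    return out

# With theta=90, a point at the top of the map moves to the right-hand side
rotated = rotate_and_center([(0.0, 1.0), (1.0, 0.0)], theta=90)
print(rotated)
```

A negative theta gives the anti-clockwise rotation described for the parameter above.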
aopy.data.base.load_electrode_pos(data_dir, pos_file)[source]

Reads an electrode position map file and returns the x and y positions. The file should have the columns ‘topdown_x’ and ‘topdown_y’.

Parameters:
  • data_dir (str) – where to find the file

  • pos_file (str) – the excel file

Returns:

Tuple containing:
x_pos (nch): x position of each electrode
y_pos (nch): y position of each electrode

Return type:

tuple

aopy.data.base.load_hdf_data(data_dir, hdf_filename, data_name, data_group='/', cached=False)[source]

Simple wrapper to get the data from an hdf file as a numpy array

Parameters:
  • data_dir (str) – folder where data is located

  • hdf_filename (str) – name of hdf file

  • data_name (str) – table to load

  • data_group (str, optional) – from which group to load data

  • cached (bool, optional) – whether to allow loading cached data or not

Returns:

numpy array of data from hdf

Return type:

ndarray

aopy.data.base.load_hdf_group(data_dir, hdf_filename, group='/', cached=False)[source]

Loads any datasets from the given hdf group into a dictionary. Also will recursively load other groups if any exist under the given group

Parameters:
  • data_dir (str) – folder where data is located

  • hdf_filename (str) – name of hdf file

  • group (str, optional) – name of the group to load

  • cached (bool, optional) – whether to allow loading cached data or not

Returns:

all the datasets contained in the given group

Return type:

dict

aopy.data.base.load_hdf_ts_segment(preproc_dir, filename, data_group, data_name, samplerate, start_time, end_time, channels=None)[source]

Load a segment of HDF timeseries data given a start and end time and a sampling rate.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • filename (str) – filename of the hdf file where the data resides

  • data_group (str) – hdf group of the desired dataset

  • data_name (str) – hdf name of the desired dataset

  • samplerate (float) – the sampling rate of the data in Hz

  • start_time (float) – time (in seconds) in the recording at which the desired segment starts

  • end_time (float) – time (in seconds) in the recording at which the desired segment ends

  • channels (list, optional) – list of channels to include in the segment (default all channels)

Raises:

ValueError – if the dataset cannot be found in the file

Returns:

tuple containing:
segment (nt, nch): data segment from the given preprocessed file
samplerate (float): sampling rate of the returned data

Return type:

tuple
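The time-to-sample conversion behind a segment load can be sketched on an in-memory list. The rounding rule and the exclusive end index are assumptions, and slice_segment is a hypothetical helper that stands in for reading from the hdf file.

```python
def slice_segment(data, samplerate, start_time, end_time, channels=None):
    """Return rows of data (nt, nch) between start_time and end_time, in seconds."""
    # Convert times to sample indices (rounding to the nearest sample is an assumption)
    start_idx = int(round(start_time * samplerate))
    end_idx = int(round(end_time * samplerate))
    segment = data[start_idx:end_idx]
    if channels is not None:
        # Keep only the requested channel columns
        segment = [[row[ch] for ch in channels] for row in segment]
    return segment, samplerate

samplerate = 10.0
data = [[t, -t] for t in range(100)]  # 10 s of fake data on 2 channels
segment, fs = slice_segment(data, samplerate, 2.0, 3.5, channels=[1])
print(len(segment), segment[0], fs)
```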

aopy.data.base.load_hdf_ts_trial(preproc_dir, filename, data_group, data_name, samplerate, trigger_time, time_before, time_after, channels=None)[source]

Load a segment of HDF timeseries data around a trigger time, given the time to include before and after the trigger and a sampling rate.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • filename (str) – filename of the hdf file where the data resides

  • data_group (str) – hdf group of the desired dataset

  • data_name (str) – hdf name of the desired dataset

  • samplerate (float) – the sampling rate of the data in Hz

  • trigger_time (float) – time (in seconds) in the recording to which the desired segment is aligned

  • time_before (float) – time (in seconds) to include before the trigger times

  • time_after (float) – time (in seconds) to include after the trigger times

  • channels (list, optional) – list of channels to include in the segment (default all channels)

Raises:

ValueError – if the dataset cannot be found in the file

Returns:

tuple containing:
segment (nt, nch): data segment from the given preprocessed file
samplerate (float): sampling rate of the returned data

Return type:

tuple
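The window implied by trigger_time, time_before, and time_after can be worked out as follows. This is a sketch under the assumption that the segment spans from trigger_time - time_before to trigger_time + time_after. The helper trial_window is hypothetical and the rounding rule is an assumption.

```python
def trial_window(trigger_time, time_before, time_after, samplerate):
    """Return the start/end times and sample indices of a trial-aligned segment."""
    start_time = trigger_time - time_before
    end_time = trigger_time + time_after
    # Convert to sample indices (rounding to the nearest sample is an assumption)
    start_idx = int(round(start_time * samplerate))
    end_idx = int(round(end_time * samplerate))
    return start_time, end_time, start_idx, end_idx

window = trial_window(trigger_time=12.0, time_before=0.5, time_after=1.0, samplerate=1000.0)
print(window)
```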

aopy.data.base.load_matlab_cell_strings(data_dir, hdf_filename, object_name)[source]

This function extracts strings from an object within a .mat file that was saved from MATLAB in the version 7.3 format (-v7.3).

example:

>>> testfile = 'matlab_cell_str.mat'
>>> strings = load_matlab_cell_strings(data_dir, testfile, 'bmiSessions')
>>> print(strings)
['jeev070412j', 'jeev070512g', 'jeev070612d', 'jeev070712e', 'jeev070812d']
Parameters:
  • data_dir (str) – where the matlab file is located

  • hdf_filename (str) – .mat filename

  • object_name (str) – Name of object to load. This is typically the variable name saved from matlab

Returns:

List of strings in the hdf file object

Return type:

(list of strings)

aopy.data.base.load_preproc_analog_data(preproc_dir, subject, te_id, date, cached=True)[source]

Loads analog data from a preprocessed file.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

tuple containing:
dict: analog data
dict: dictionary of analog metadata

Return type:

tuple

aopy.data.base.load_preproc_ap_data(preproc_dir, subject, te_id, date, drive_number=None, cached=True)[source]

Loads spike band (ap) time series from a preprocessed file. When drive_number is None, the ap data and metadata are loaded directly. Specify drive_number when the hdf file contains drives.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • drive_number (int) – drive number for multiple recordings. 1-based indexing.

  • cached (bool, optional) – whether to allow loading cached version of data (default True)

Raises:

ValueError – if drives are detected when drive number is None.

Returns:

tuple containing:
ndarray: numpy array of ap data from hdf
dict: dictionary of ap metadata

Return type:

tuple

aopy.data.base.load_preproc_broadband_data(preproc_dir, subject, te_id, date, cached=True)[source]

Loads broadband data from a preprocessed file.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

tuple containing:
dict: broadband data
dict: dictionary of broadband metadata

Return type:

tuple

aopy.data.base.load_preproc_digital_data(preproc_dir, subject, te_id, date, cached=True)[source]

Loads digital data from a preprocessed file.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

tuple containing:
dict: digital data
dict: dictionary of digital metadata

Return type:

tuple

aopy.data.base.load_preproc_emg_data(preproc_dir, subject, te_id, date, cached=True)[source]

Loads emg data from a preprocessed file.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

tuple containing:
dict: dictionary of emg data
dict: dictionary of emg metadata

Return type:

tuple

aopy.data.base.load_preproc_exp_data(preproc_dir, subject, te_id, date, verbose=True, cached=True)[source]

Loads experiment data from a preprocessed file.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • verbose (bool, optional) – check for preprocessing errors and print them (default True)

  • cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

tuple containing:
dict: dictionary of exp data
dict: dictionary of exp metadata

Return type:

tuple

aopy.data.base.load_preproc_eye_data(preproc_dir, subject, te_id, date, cached=True)[source]

Loads eye data from a preprocessed file.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

tuple containing:
dict: dictionary of eye data
dict: dictionary of eye metadata

Return type:

tuple

aopy.data.base.load_preproc_lfp_data(preproc_dir, subject, te_id, date, drive_number=None, cached=True)[source]

Loads LFP data from a preprocessed file. When drive_number is None, lfp_data and lfp_metadata are loaded directly. Specify drive_number when the hdf file contains drives.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • drive_number (int) – drive number for multiple recordings. 1-based indexing.

  • cached (bool, optional) – whether to allow loading cached version of data (default True)

Raises:

ValueError – if drives are detected when drive number is None.

Returns:

tuple containing:
ndarray: numpy array of lfp data from hdf
dict: dictionary of lfp metadata

Return type:

tuple

aopy.data.base.load_preproc_spike_data(preproc_dir, subject, te_id, date, drive_number=1, cached=True)[source]

Loads spike data from a preprocessed file.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • drive_number (int) – drive number for multiple recordings. 1-based indexing.

  • cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

tuple containing:
dict: spike data
dict: dictionary of spike metadata

Return type:

tuple

aopy.data.base.load_spike_waveforms(preproc_dir, subject, te_id, date, drive_number=1, cached=True)[source]

Loads spike waveforms from a preprocessed file.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • drive_number (int) – drive number for multiple recordings. 1-based indexing.

  • cached (bool, optional) – whether to allow loading cached version of data (default True)

Returns:

spike waveforms

Return type:

dict

aopy.data.base.load_yaml_config(filename)[source]

Load a yaml configuration file into a dictionary

Parameters:

filename (str) – path to the yaml configuration file

Returns:

dictionary containing the configuration parameters

Return type:

dict

aopy.data.base.lookup_acq2elec(data_dir, signal_path_file, acq, zero_index=True)[source]

Looks up the electrode number for a given acquisition channel using an excel map file (from Dr. Map)

Parameters:
  • data_dir (str) – where the signal path file is located

  • signal_path_file (str) – signal path definition file

  • acq (int) – which channel to look up

  • zero_index (bool, optional) – use 0-indexing for acq and elec (default True)

Returns:

matching electrode number. If no matching electrode is found, returns -1 (or 0 with zero_index=False)

Return type:

int

aopy.data.base.lookup_elec2acq(data_dir, signal_path_file, elec, zero_index=True)[source]

Looks up the acquisition channel for a given electrode number using an excel map file (from Dr. Map)

Parameters:
  • data_dir (str) – where the signal path file is located

  • signal_path_file (str) – signal path definition file

  • elec (int) – which electrode to look up

  • zero_index (bool, optional) – use 0-indexing for acq and elec (default True)

Returns:

matching acquisition channel. If no matching channel is found, returns -1 (or 0 with zero_index=False)

Return type:

int

aopy.data.base.lookup_excel_value(data_dir, excel_file, from_column, to_column, lookup_value)[source]

Finds a matching value for the given key in an excel file. Used for looking up electrode and acquisition channels for signal path files, but can also be useful as a lookup table for other numeric mappings.

Parameters:
  • data_dir (str) – where the excel file is located

  • excel_file (str) – the excel file containing the lookup table

  • from_column (str) – the name of the column to match against (e.g. the electrode column)

  • to_column (str) – the name of the column to read the result from (e.g. the acquisition column)

  • lookup_value (int) – match this value in the from_column

Returns:

the corresponding value in the lookup table, or 0 if none is found

Return type:

int
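The "not found" values documented for the lookup functions (0 here, -1 for lookup_acq2elec() and lookup_elec2acq() with zero_index=True) are consistent with a 1-indexed table whose results are shifted by one for 0-indexing. The sketch below shows one way those values can arise. It uses a list of dictionaries in place of an excel file, the column names are hypothetical, and it is not aopy's implementation.

```python
def lookup_table_value(rows, from_column, to_column, lookup_value):
    """Return the to_column value of the first row matching lookup_value, or 0."""
    for row in rows:
        if row[from_column] == lookup_value:
            return row[to_column]
    return 0

def lookup_acq2elec_sketch(rows, acq, zero_index=True):
    """Look up the electrode for an acquisition channel in a 1-indexed table."""
    # Assumption: the table itself is 1-indexed, so shift in and out for 0-indexing
    offset = 1 if zero_index else 0
    return lookup_table_value(rows, "acq", "electrode", acq + offset) - offset

# Hypothetical signal path table
signal_path = [
    {"acq": 1, "electrode": 3},
    {"acq": 2, "electrode": 1},
]
print(lookup_acq2elec_sketch(signal_path, 0))                    # 0-indexed match
print(lookup_acq2elec_sketch(signal_path, 7))                    # no match, 0-indexed
print(lookup_acq2elec_sketch(signal_path, 7, zero_index=False))  # no match, 1-indexed
```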

aopy.data.base.map_acq2elec(signalpath_table, acq_ch_subset=None)[source]

Create index mapping from acquisition channel to electrode number. Excel files can be loaded as a pandas dataframe using pd.read_excel

Parameters:
  • signalpath_table (pd dataframe) – Signal path information in a pandas dataframe. (Mapping between electrode and acquisition ch)

  • acq_ch_subset (nacq) – Subset of acquisition channels to call. If not specified, all acquisition channels and connected electrodes will be returned. If a requested acquisition channel isn’t returned, a warning will be displayed.

Returns:

Tuple containing:
acq_chs (nelec): Acquisition channels that map to electrodes (e.g. 240/256 for viventi ECoG array)
connected_elecs (nelec): Electrodes used (e.g. 240/244 for viventi ECoG array)

Return type:

tuple

aopy.data.base.map_acq2pos(signalpath_table, eleclayout_table, acq_ch_subset=None, theta=0, rotation_offset=(0, 0), xpos_name='topdown_x', ypos_name='topdown_y')[source]

Create index mapping from acquisition channel to electrode position by calling aopy.data.map_acq2elec. Excel files can be loaded as a pandas dataframe using pd.read_excel.

Parameters:
  • signalpath_table (pd dataframe) – Signal path information in a pandas dataframe. (Mapping between electrode and acquisition ch)

  • eleclayout_table (pd dataframe) – Electrode position information in a pandas dataframe. (Mapping between electrode and position on array)

  • acq_ch_subset (nacq) – Subset of acquisition channels to call. If not specified, all acquisition channels and connected electrodes will be returned. If a requested acquisition channel isn’t returned, a warning will be displayed.

  • theta (float) – rotation (in degrees) to apply to positions. rotations are applied clockwise, e.g., theta = 90 rotates the map clockwise by 90 degrees, -90 rotates the map anti-clockwise by 90 degrees. Default 0.

  • rotation_offset (tuple) – X and Y coordinates of the rotation center. Defaults to (0,0)

  • xpos_name (str) – Column name for the electrode ‘x’ position. Defaults to ‘topdown_x’ used with the viventi ECoG array

  • ypos_name (str) – Column name for the electrode ‘y’ position. Defaults to ‘topdown_y’ used with the viventi ECoG array

Returns:

Tuple Containing:
acq_ch_position (nelec, 2): X and Y coordinates of the electrode each acquisition channel gets data from. X position is in the first column and Y position is in the second column
acq_chs (nelec): Acquisition channels that map to electrodes (e.g. 240/256 for viventi ECoG array)
connected_elecs (nelec): Electrodes used (e.g. 240/244 for viventi ECoG array)

Return type:

tuple

aopy.data.base.map_data2elec(datain, signalpath_table, acq_ch_subset=None, zero_indexing=False)[source]

Map data from its acquisition channel to the electrodes recorded from. Wrapper for aopy.data.map_acq2elec. Excel files can be loaded as a pandas dataframe using pd.read_excel.

Parameters:
  • datain (nt, nacqch) – Data recorded from an array.

  • signalpath_table (pd dataframe) – Signal path information in a pandas dataframe. (Mapping between electrode and acquisition ch)

  • acq_ch_subset (nacq) – Subset of acquisition channels to call. If not specified, all acquisition channels and connected electrodes will be returned. If a requested acquisition channel isn’t returned, a warning will be displayed.

  • zero_indexing (bool) – Set true if acquisition channel numbers start with 0. Defaults to False.

Returns:

Tuple containing:
dataout (nt, nelec): Data from the connected electrodes
acq_chs (nelec): Acquisition channels that map to electrodes (e.g. 240/256 for viventi ECoG array)
connected_elecs (nelec): Electrodes used (e.g. 240/244 for viventi ECoG array)

Return type:

tuple

aopy.data.base.map_data2elecandpos(datain, signalpath_table, eleclayout_table, acq_ch_subset=None, theta=0, rotation_offset=(0, 0), xpos_name='topdown_x', ypos_name='topdown_y', zero_indexing=False)[source]

Map data from its acquisition channel to the electrodes recorded from and their position. Wrapper for aopy.data.map_acq2pos. Excel files can be loaded as a pandas dataframe using pd.read_excel.

Parameters:
  • datain (nt, nacqch) – Data recorded from an array.

  • signalpath_table (pd dataframe) – Signal path information in a pandas dataframe. (Mapping between electrode and acquisition ch)

  • eleclayout_table (pd dataframe) – Electrode position information in a pandas dataframe. (Mapping between electrode and position on array)

  • acq_ch_subset (nacq) – Subset of acquisition channels to call. If not specified, all acquisition channels and connected electrodes will be returned. If a requested acquisition channel isn’t returned, a warning will be displayed.

  • theta (float) – rotation (in degrees) to apply to positions. rotations are applied clockwise, e.g., theta = 90 rotates the map clockwise by 90 degrees, -90 rotates the map anti-clockwise by 90 degrees. Default 0.

  • rotation_offset (tuple) – X and Y coordinates of the rotation center. Defaults to (0,0)

  • xpos_name (str) – Column name for the electrode ‘x’ position. Defaults to ‘topdown_x’ used with the viventi ECoG array

  • ypos_name (str) – Column name for the electrode ‘y’ position. Defaults to ‘topdown_y’ used with the viventi ECoG array

  • zero_indexing (bool) – Set true if acquisition channel numbers start with 0. Defaults to False.

Returns:

Tuple containing:
dataout (nt, nelec): Data from the connected electrodes
acq_ch_position (nelec, 2): X and Y coordinates of the electrode each acquisition channel gets data from. X position is in the first column and Y position is in the second column
acq_chs (nelec): Acquisition channels that map to electrodes (e.g. 240/256 for viventi ECoG array)
connected_elecs (nelec): Electrodes used (e.g. 240/244 for viventi ECoG array)

Return type:

tuple

aopy.data.base.map_elec2acq(signalpath_table, elecs)[source]

This function finds the acquisition channels that correspond to the input electrode numbers given the signal path table input. This function works by calling aopy.data.map_acq2elec and subsampling the output. If a requested electrode isn’t connected to an acquisition channel a warning will be displayed alerting the user and the corresponding index in the output array will be a np.nan value.

Parameters:
  • signalpath_table (pd dataframe) – Signal path information in a pandas dataframe. (Mapping between electrode and acquisition ch)

  • elecs (nelec) – Electrodes to find the acquisition channels for

Returns:

Acquisition channels that map to electrodes (e.g. nelec/256 for viventi ECoG array)

Return type:

acq_chs
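The subsampling step described above can be sketched with two parallel lists standing in for the acq_chs and connected_elecs outputs of map_acq2elec. The helper elec2acq_sketch is hypothetical and uses plain Python lists with math.nan instead of numpy arrays with np.nan. The channel numbers are made up.

```python
import math
import warnings

def elec2acq_sketch(acq_chs, connected_elecs, elecs):
    """Return the acquisition channel for each requested electrode, or nan if unconnected."""
    elec_to_acq = dict(zip(connected_elecs, acq_chs))
    result = []
    for elec in elecs:
        if elec in elec_to_acq:
            result.append(float(elec_to_acq[elec]))
        else:
            # Alert the user and mark the missing electrode with nan
            warnings.warn(f"Electrode {elec} is not connected to an acquisition channel")
            result.append(math.nan)
    return result

# Capture the warning so the demonstration output stays clean
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    acq = elec2acq_sketch(acq_chs=[5, 6, 7], connected_elecs=[10, 11, 12], elecs=[12, 99, 10])

print(acq, len(caught))
```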

aopy.data.base.parse_str_list(strings, str_include=None, str_avoid=None)[source]

This function parses a list of strings to return the strings that include/avoid specific substrings. It was designed to parse dictionary keys.

Parameters:
  • strings (list of strings) – List of strings

  • str_include (list of strings) – List of substrings that must be included in a string to keep it

  • str_avoid (list of strings) – List of substrings that can not be included in a string to keep it

Returns:

List of strings fitting the input conditions

Return type:

(list of strings)

Example:
>>> str_list = ['sig001i_wf', 'sig001i_wf_ts', 'sig002a_wf', 'sig002a_wf_ts',
                'sig002b_wf', 'sig002b_wf_ts', 'sig002i_wf', 'sig002i_wf_ts']
>>> parsed_strings = parse_str_list(str_list, str_include=['sig002', 'wf'], str_avoid=['b_wf', 'i_wf'])
>>> print(parsed_strings)
['sig002a_wf', 'sig002a_wf_ts']
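The example output above is consistent with keeping a string only when it contains every str_include substring and none of the str_avoid substrings. A minimal sketch of that rule follows. The function parse_str_list_sketch is a hypothetical reimplementation for illustration, and the all/none semantics are inferred from the example rather than taken from the source code.

```python
def parse_str_list_sketch(strings, str_include=None, str_avoid=None):
    """Keep strings containing every str_include substring and no str_avoid substring."""
    str_include = str_include or []
    str_avoid = str_avoid or []
    return [
        s for s in strings
        if all(sub in s for sub in str_include)
        and not any(sub in s for sub in str_avoid)
    ]

str_list = ['sig001i_wf', 'sig001i_wf_ts', 'sig002a_wf', 'sig002a_wf_ts',
            'sig002b_wf', 'sig002b_wf_ts', 'sig002i_wf', 'sig002i_wf_ts']
parsed = parse_str_list_sketch(str_list, str_include=['sig002', 'wf'],
                               str_avoid=['b_wf', 'i_wf'])
print(parsed)
```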
aopy.data.base.pkl_read(file_to_read, read_dir)[source]

Reads data stored in a pickle file.

Parameters:
  • file_to_read (str) – filename with ‘.pkl’ extension

  • read_dir (str) – Path to folder where the file is stored

Returns:

data in a format as it is stored

aopy.data.base.pkl_write(file_to_write, values_to_dump, write_dir)[source]

Write data into a pickle file. Note: HDF5 (HDF) files cannot be pickled. Refer to aopy.data.save_hdf() for saving HDF data.

Parameters:
  • file_to_write (str) – filename with ‘.pkl’ extension

  • values_to_dump (any) – values to write in a pickle file

  • write_dir (str) – Path - where do you want to write this file

Returns:

None

Example: pkl_write('meta.pkl', data, '/data_dir')
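A write/read round trip can be sketched with the standard pickle module. The two helpers below mirror the argument order of pkl_write() and pkl_read() but are hypothetical stand-ins, not the library code, and the metadata values are made up.

```python
import os
import pickle
import tempfile

def pkl_write_sketch(file_to_write, values_to_dump, write_dir):
    """Pickle values_to_dump into write_dir/file_to_write."""
    with open(os.path.join(write_dir, file_to_write), "wb") as f:
        pickle.dump(values_to_dump, f)

def pkl_read_sketch(file_to_read, read_dir):
    """Load and return the object stored in read_dir/file_to_read."""
    with open(os.path.join(read_dir, file_to_read), "rb") as f:
        return pickle.load(f)

# Round trip through a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    meta = {"subject": "affi", "te_id": 17269}
    pkl_write_sketch("meta.pkl", meta, tmp_dir)
    loaded = pkl_read_sketch("meta.pkl", tmp_dir)

print(loaded)
```

The data comes back in the same format in which it was stored, as stated for pkl_read().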

aopy.data.base.save_hdf(data_dir, hdf_filename, data_dict, data_group='/', compression=0, append=False, debug=False)[source]

Writes data_dict and params into a hdf file in the data_dir folder

Parameters:
  • data_dir (str) – destination file directory

  • hdf_filename (str) – name of the hdf file to be saved

  • data_dict (dict) – the data to be saved as a hdf file

  • data_group (str, optional) – where to store the data in the hdf

  • compression (int, optional) – gzip compression level. 0 indicates no compression. Compression is not added to existing datasets. (default: 0)

  • append (bool, optional) – append an existing hdf file or create a new hdf file

Returns:

None

aopy.data.base.yaml_read(filename)[source]

The FullLoader parameter handles the conversion from YAML scalar values to the Python dictionary format.

Parameters:
  • filename (str) – Filename including the full path

Returns:

Params data dumped into a yaml file

Return type:

data (dict)

Example

>>> params_file = '/test_data/task_codes.yaml'
>>> task_codes = yaml_read(params_file)

aopy.data.base.yaml_write(filename, data)[source]

YAML stands for Yet Another Markup Language. It can be used to save params or configuration files.

Parameters:
  • filename (str) – Filename including the full path

  • data (dict) – Params data to be dumped into a yaml file

Returns: None

Example

>>> params = [{'CENTER_TARGET_ON': 16, 'CURSOR_ENTER_CENTER_TARGET': 80, 'REWARD': 48, 'DELAY_PENALTY': 66}]
>>> params_file = '/test_data/task_codes.yaml'
>>> yaml_write(params_file, params)

BMI3D

aopy.data.bmi3d.extract_lfp_features(preproc_dir, subject, te_id, date, decoder, samplerate=None, channels=None, start_time=None, end_time=None, latency=0.02, datatype='lfp', preproc=None, **kwargs)[source]

Extracts features from a BMI3D experiment using data aligned to the timestamps of the experiment. Using this function, you can replicate closely the features that would have been extracted from a real-time BMI3D experiment, even if the experiment did not include a decoder.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • decoder (riglib.bmi.Decoder) – decoder object with binlen and call_rate attributes

  • samplerate (float, optional) – optionally choose the samplerate of the data in Hz. Default None, uses the sampling rate of the experiment.

  • channels (int array, optional) – which channel indices to load. If None (the default), uses the channels specified in the decoder.

  • start_time (float, optional) – time (in seconds) in the recording at which the desired segment starts

  • end_time (float, optional) – time (in seconds) in the recording at which the desired segment ends

  • latency (float, optional) – time (in seconds) to include before the trigger times

  • datatype (str, optional) – choice of ‘lfp’ or ‘broadband’ data to load. Defaults to ‘lfp’. If the sampling rate of the data is different from the decoder, the data will be downsampled by decimation.

  • preproc (fn, optional) – function mapping (state, fs) data to (state_new, fs_new). For example, a smoothing function.

  • kwargs – additional keyword arguments to pass to sample_timestamped_data

Returns:

tuple containing:
feats (nt, nfeats): lfp features for the given channels after preprocessing
samplerate (float): the sampling rate of the states after preprocessing

Return type:

tuple

Note

For best accuracy, use ‘broadband’ or other datatype without any filtering. Using filtered ‘lfp’ results in DC shifted features.

Examples

subject = 'affi'
te_id = 17269
date = '2024-05-03'
preproc_dir = data_dir
start_time = 10
end_time = 30

Extract features using extract_lfp_features() and states using extract_lfp_features() with decode=True:

Get online extracted features from get_extracted_features() and states from get_decoded_states() for comparison:

features_online, samplerate_online = get_extracted_features(
    preproc_dir, subject, te_id, date, decoder,
    start_time=start_time, end_time=end_time)

Plot the online and offline features:

time_offline = np.arange(len(features_offline))/samplerate_offline + start_time
time_online = np.arange(len(features_online))/samplerate_online + start_time

plt.figure(figsize=(8,3))
plt.plot(time_offline, features_offline[:,1], alpha=0.8, label='offline')
plt.plot(time_online, features_online[:,1], alpha=0.8, label='online')
plt.xlabel('time (s)')
plt.ylabel('power')
plt.legend()
plt.title('readout 1')
_images/extract_decoder_features.png
aopy.data.bmi3d.filter_lfp_from_broadband(broadband_filepath, result_filepath, drive_number=1, mean_subtract=True, dtype='int16', max_memory_gb=1.0, **filter_kwargs)[source]

Filters local field potential (LFP) data from a given broadband signal file into an hdf file.

Parameters:
  • broadband_filepath (str) – Path to the input broadband signal file.

  • result_filepath (str) – Path to save the filtered LFP data.

  • mean_subtract (bool, optional) – Whether to subtract the mean from the filtered LFP signal. Default is True.

  • dtype (str, optional) – Data type for the filtered LFP signal. Default is ‘int16’.

  • max_memory_gb (float, optional) – Maximum memory (in gigabytes) to use for filtering. Default is 1.0 GB.

  • **filter_kwargs – Additional keyword arguments to customize the filtering process. These arguments will be passed to the filtering function.

Raises:
  • IOError – If the input broadband file is not found.

  • MemoryError – If the specified max_memory_gb is insufficient for the filtering process.

Note

This function is used in the proc_lfp() wrapper.
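As a rough illustration of what a max_memory_gb budget implies, the sketch below computes how many samples per channel fit in a given amount of memory. The helper name samples_per_chunk and the convention that 1 GB equals 1e9 bytes are assumptions made for this example, not the aopy implementation.

```python
import numpy as np

def samples_per_chunk(n_channels, dtype, max_memory_gb=1.0):
    """Number of samples (per channel) that fit in the memory budget.

    Illustrative arithmetic only; treats 1 GB as 1e9 bytes (an assumption).
    """
    bytes_per_sample = np.dtype(dtype).itemsize * n_channels
    return max(1, int(max_memory_gb * 1e9 // bytes_per_sample))

# 256 channels of int16 data with the default 1.0 GB budget
chunk = samples_per_chunk(n_channels=256, dtype='int16', max_memory_gb=1.0)
```

A smaller budget simply means more, shorter chunks are processed in sequence.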

aopy.data.bmi3d.filter_lfp_from_ecube(ecube_filepath, result_filepath, drive_number=1, mean_subtract=True, dtype='int16', max_memory_gb=1.0, **filter_kwargs)[source]

Filters local field potential (LFP) data from an eCube recording file.

Parameters:
  • ecube_filepath (str) – Path to the input eCube recording file.

  • result_filepath (str) – Path to save the filtered LFP data.

  • mean_subtract (bool, optional) – Whether to subtract the mean from the filtered LFP signal. Default is True.

  • dtype (str, optional) – Data type for the filtered LFP signal. Default is ‘int16’.

  • max_memory_gb (float, optional) – Maximum memory (in gigabytes) to use for filtering. Default is 1.0 GB.

  • **filter_kwargs – Additional keyword arguments to customize the filtering process. These arguments will be passed to the filtering function.

Raises:
  • IOError – If the input eCube recording file is not found.

  • MemoryError – If the specified max_memory_gb is insufficient for the filtering process.

Note

This function is used in the proc_lfp() wrapper.

aopy.data.bmi3d.get_decoded_states(preproc_dir, subject, te_id, date, decoder, samplerate=None, start_time=None, end_time=None, preproc=None, **kwargs)[source]

Fetches online decoded states from readouts in a BCI experiment. Wrapper around get_task_data.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • decoder (riglib.bmi.Decoder) – decoder object with binlen and call_rate attributes

  • samplerate (float, optional) – optionally choose the samplerate of the data in Hz. Default None, uses the sampling rate of the experiment.

  • start_time (float, optional) – start time of the segment to load (in seconds). Default None, which loads from the beginning of the data.

  • end_time (float, optional) – end time of the segment to load (in seconds). Default None, which loads until the end of the data.

  • preproc (fn, optional) – function mapping (state, fs) data to (state_new, fs_new). For example, a smoothing function.

  • kwargs – additional keyword arguments to pass to sample_timestamped_data

Returns:

tuple containing:
state (nt, nstate): decoded states from the given experiment after preprocessing
samplerate (float): the sampling rate of the states after preprocessing

Return type:

tuple
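The preproc argument is described as a function mapping (state, fs) to (state_new, fs_new). A minimal sketch of such a function is shown below, using a moving average as the smoothing step. The name smooth_states and the window_s parameter are hypothetical and not part of aopy; the data is synthetic.

```python
import numpy as np

def smooth_states(state, fs, window_s=0.1):
    """Hypothetical preproc function: moving-average smoothing along time.

    state: (nt, nstate) array, fs: sampling rate in Hz.
    Returns (state_new, fs_new); the sampling rate is unchanged here.
    """
    state = np.asarray(state, dtype=float)
    n = max(1, int(round(window_s * fs)))
    kernel = np.ones(n) / n
    # Smooth each column independently; mode='same' keeps the length nt
    smoothed = np.column_stack(
        [np.convolve(state[:, i], kernel, mode='same') for i in range(state.shape[1])]
    )
    return smoothed, fs

# Synthetic data standing in for decoded states (not real experiment output)
fs = 100.0
state = np.random.default_rng(0).normal(size=(500, 3))
state_new, fs_new = smooth_states(state, fs)
```

A function with this shape could then be passed as preproc=smooth_states.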

aopy.data.bmi3d.get_e3v_video_frame_data(digital_data, sync_channel_idx, trigger_channel_idx, samplerate)[source]

Compute pulse times and duty cycles from e3vision video data frames collected on an ecube digital panel.

Parameters:
  • digital_data (nt, nch) – array of data read from ecube digital panel

  • sync_channel_idx (int) – sync channel to read from digital_data. Indicates each video frame.

  • trigger_channel_idx (int) – trigger channel to read from digital_data. Indicates start/end video triggers.

  • samplerate (numeric) – data sampling rate (Hz)

Returns:

tuple containing:
pulse_times (np.array): array of floats indicating pulse start times
duty_cycle (np.array): array of floats indicating pulse duty cycle (quotient of pulse width and pulse period)

Return type:

tuple
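To make the two outputs concrete, the following sketch computes pulse start times and duty cycles from a single 0/1 digital channel. It is an illustration under simple assumptions (clean edges, one channel already extracted), and the helper name pulse_times_and_duty_cycle is hypothetical rather than the aopy implementation.

```python
import numpy as np

def pulse_times_and_duty_cycle(digital_signal, samplerate):
    """Illustrative helper (not the aopy implementation).

    digital_signal: (nt,) array of 0/1 samples from one digital channel.
    Returns pulse start times (s) and duty cycle (pulse width / pulse period)
    for every pulse that is followed by another rising edge.
    """
    sig = np.asarray(digital_signal).astype(int)
    edges = np.diff(sig)
    rising = np.where(edges == 1)[0] + 1    # first sample of each high phase
    falling = np.where(edges == -1)[0] + 1  # first sample of each low phase
    pulse_times, duty_cycle = [], []
    for start, next_start in zip(rising[:-1], rising[1:]):
        # Falling edge that ends this pulse
        end = falling[(falling > start) & (falling <= next_start)]
        if len(end) == 0:
            continue
        pulse_times.append(start / samplerate)
        duty_cycle.append((end[0] - start) / (next_start - start))
    return np.array(pulse_times), np.array(duty_cycle)

# Synthetic 1 kHz recording: a pulse every 10 samples, high for 3 samples
samplerate = 1000
sync = np.tile([0, 0, 1, 1, 1, 0, 0, 0, 0, 0], 5)
pulse_times, duty_cycle = pulse_times_and_duty_cycle(sync, samplerate)
```

The last pulse is dropped in this sketch because its period cannot be measured without a following rising edge.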

aopy.data.bmi3d.get_ecube_data_sources(data_dir)[source]

Lists the available data sources in a given data directory

Parameters:

data_dir (str) – eCube data directory

Returns:

available sources (AnalogPanel, Headstages, etc.)

Return type:

str array

aopy.data.bmi3d.get_ecube_digital_input_times(path, data_dir, ch)[source]

Computes the times when a digital input turns on or off in ecube. For synchronizing openephys with ecube, use ch=-1.

Parameters:
  • path (str) – base directory where ecube data is stored

  • data_dir (str) – folder you want to load

  • ch (str) – digital channel

Returns:

Tuple containing:
on_times (n_times): times at which sync line turned on
off_times (n_times): times at which sync line turned off

Return type:

tuple

aopy.data.bmi3d.get_extracted_features(preproc_dir, subject, te_id, date, decoder, samplerate=None, start_time=None, end_time=None, datatype='lfp_power', preproc=None, **kwargs)[source]

Fetches online extracted features from readouts of a BCI experiment. Wrapper around get_task_data.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • decoder (riglib.bmi.Decoder) – decoder object with binlen and call_rate attributes

  • samplerate (float, optional) – optionally choose the samplerate of the data in Hz. Default None, uses the sampling rate of the experiment.

  • start_time (float, optional) – start time of the segment to load (in seconds). Default None, which loads from the beginning of the data.

  • end_time (float, optional) – end time of the segment to load (in seconds). Default None, which loads until the end of the data.

  • datatype (str, optional) – type of features to load. Defaults to ‘lfp_power’.

  • preproc (fn, optional) – function mapping (state, fs) data to (state_new, fs_new). For example, a smoothing function.

  • kwargs – additional keyword arguments to pass to sample_timestamped_data

Returns:

tuple containing:
features (nt, nfeats): extracted features from the given experiment after preprocessing
samplerate (float): the sampling rate of the features after preprocessing

Return type:

tuple

aopy.data.bmi3d.get_interp_task_data(exp_data, exp_metadata, datatype='cursor', samplerate=1000, step=1, **kwargs)[source]

Gets interpolated data from preprocessed experiment task cycles to the desired sampling rate. Cursor kinematics are returned in screen coordinates, while user input kinematics are returned either in their original raw coordinate system with datatype=’user_raw’ (e.g. optitrack coordinates), in world coordinates with datatype=’user_world’, or in screen coordinates with datatype=’user_screen’ (similar to cursor kinematics but without any bounding under position control).

Parameters:
  • exp_data (dict) – A dictionary containing the experiment data.

  • exp_metadata (dict) – A dictionary containing the experiment metadata.

  • datatype (str, optional) – The type of kinematic data to interpolate:
    • ‘cursor’ for cursor kinematics

    • ‘user_raw’ for raw input coordinates

    • ‘user_world’ for user input in world coordinates

    • ‘user_screen’ for user input in screen coordinates

    • ‘reference’ for reference kinematics

    • ‘disturbance’ for disturbance kinematics

    • ‘targets’ for target positions

    • other datatypes if they exist as exp_data[‘task’][<datatype>]

  • samplerate (float, optional) – The desired output sampling rate in Hz. Defaults to 1000.

  • step (int, optional) – task data will be decimated with steps this big. Default 1.

  • **kwargs – Additional keyword arguments to pass to sample_timestamped_data()

Returns:

Kinematic data interpolated and filtered to the desired sampling rate.

Return type:

data_time (ns, …)

Examples

Cursor kinematics in screen coordinates (datatype ‘cursor’)

exp_data, exp_metadata = load_preproc_exp_data(preproc_dir, 'test',  3498, '2021-12-13')
cursor_interp = get_interp_task_data(exp_data, exp_metadata, datatype='cursor', samplerate=100)

plt.figure()
visualization.plot_trajectories([cursor_interp], [-10, 10, -10, 10])
_images/get_interp_cursor_centerout.png

Raw input kinematics (datatype ‘user_raw’, ‘hand’, or ‘manual_input’)

hand_interp = get_interp_task_data(exp_data, exp_metadata, datatype='hand', samplerate=100)
ax = plt.axes(projection='3d')
visualization.plot_trajectories([hand_interp], [-10, 10, -10, 10, -10, 10])
_images/get_interp_hand_centerout.png

User input kinematics in world coordinates (datatype ‘user_world’)

user_world = get_interp_task_data(exp_data, exp_metadata, datatype='user_world', samplerate=100)
ax = plt.axes(projection='3d')
visualization.plot_trajectories([user_world], [-10, 10, -10, 10, -10, 10])
_images/get_user_world.png

User input kinematics in screen coordinates (datatype ‘user_screen’)

user_screen = get_interp_task_data(exp_data, exp_metadata, datatype='user_screen', samplerate=100)
ax = plt.axes(projection='3d')
visualization.plot_trajectories([user_screen], [-10, 10, -10, 10, -10, 10])
_images/get_user_screen.png

Target positions (datatype ‘targets’)

targets_interp = get_interp_task_data(exp_data, exp_metadata, datatype='targets', samplerate=100)
time = np.arange(len(targets_interp))/100
plt.plot(time, targets_interp[:,:,0]) # plot just the x coordinate
plt.xlim(10, 20)
plt.xlabel('time (s)')
plt.ylabel('x position (cm)')
_images/get_interp_targets_centerout.png

Cursor and target (datatype ‘reference’) kinematics

exp_data, exp_metadata = load_preproc_exp_data(data_dir, 'test', 8461, '2023-02-25')
cursor_interp = get_interp_task_data(exp_data, exp_metadata, datatype='cursor', samplerate=exp_metadata['fps'])
ref_interp = get_interp_task_data(exp_data, exp_metadata, datatype='reference', samplerate=exp_metadata['fps'])
time = np.arange(exp_metadata['fps']*120)/exp_metadata['fps']
plt.plot(time, cursor_interp[:int(exp_metadata['fps']*120),1], color='blueviolet', label='cursor') # plot just the y coordinate
plt.plot(time, ref_interp[:int(exp_metadata['fps']*120),1], color='darkorange', label='ref')
plt.xlabel('time (s)')
plt.ylabel('y position (cm)'); plt.ylim(-10,10)
plt.legend()
_images/get_interp_cursor_tracking.png

User, reference, and disturbance kinematics

user_interp = get_interp_task_data(exp_data, exp_metadata, datatype='user', samplerate=exp_metadata['fps'])
ref_interp = get_interp_task_data(exp_data, exp_metadata, datatype='reference', samplerate=exp_metadata['fps'])
dis_interp = get_interp_task_data(exp_data, exp_metadata, datatype='disturbance', samplerate=exp_metadata['fps'])
time = np.arange(exp_metadata['fps']*120)/exp_metadata['fps']
plt.plot(time, user_interp[:int(exp_metadata['fps']*120),1], color='darkturquoise', label='user')
plt.plot(time, ref_interp[:int(exp_metadata['fps']*120),1], color='darkorange', label='ref')
plt.plot(time, dis_interp[:int(exp_metadata['fps']*120),1], color='tab:red', linestyle='--', label='dis')
plt.xlabel('time (s)')
plt.ylabel('y position (cm)'); plt.ylim(-10,10)
plt.legend()
_images/get_interp_user_tracking.png
Changes:

2023-10-20: Added support for ‘targets’ datatype
2024-01-29: Removed kinematic filtering below 15 Hz. See filter_kinematics().
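The core idea of bringing timestamped task data onto a uniform sampling grid can be sketched as follows. This example uses linear interpolation on synthetic data, which is an assumption for illustration and may differ from what sample_timestamped_data() does; the helper name resample_timestamped is hypothetical.

```python
import numpy as np

def resample_timestamped(values, timestamps, samplerate):
    """Illustrative resampling of timestamped data onto a uniform time grid.

    Uses linear interpolation (an assumption for this sketch).
    values: (n, ndim) samples, timestamps: (n,) times in seconds.
    """
    values = np.asarray(values, dtype=float)
    n_out = int(np.floor(timestamps[-1] * samplerate)) + 1
    t_uniform = np.arange(n_out) / samplerate
    out = np.column_stack(
        [np.interp(t_uniform, timestamps, values[:, i]) for i in range(values.shape[1])]
    )
    return out, t_uniform

# Synthetic cursor positions recorded at slightly irregular cycle times
timestamps = np.array([0.0, 0.017, 0.033, 0.051, 0.066, 0.105])
cursor = np.column_stack([timestamps * 2.0, timestamps * -1.0])  # x = 2t, y = -t
cursor_interp, t = resample_timestamped(cursor, timestamps, samplerate=100)
```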

aopy.data.bmi3d.get_kinematic_segments(preproc_dir, subject, te_id, date, trial_start_codes, trial_end_codes, trial_filter=<function <lambda>>, datatype='cursor', deriv=0, norm=False, samplerate=1000, **kwargs)[source]

Loads x,y,z cursor, hand, or eye trajectories for each “trial” from a preprocessed HDF file. Trials can be specified by numeric start and end codes. Trials can also be filtered so that only successful trials are included, for example. The filter is applied to numeric code segments for each trial. Finally, the cursor data can be preprocessed by a supplied function to, for example, convert position to velocity estimates. The preprocessing function is applied to the (time, position) cursor or eye data.

See also

get_kinematic_segment(), get_kinematics()

Example

subject = 'beignet'
te_id = 4301
date = '2021-01-01'
trial_filter = lambda t: TRIAL_END not in t
trajectories, segments = get_kinematic_segments(preproc_dir, subject, te_id, date,
    [CURSOR_ENTER_CENTER_TARGET], [REWARD, TRIAL_END], trial_filter=trial_filter)

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • trial_start_codes (list) – list of numeric codes representing the start of a trial

  • trial_end_codes (list) – list of numeric codes representing the end of a trial

  • trial_filter (fn, optional) – function mapping trial segments to boolean values. Any trials for which the filter returns False will not be included in the output

  • datatype (str, optional) – type of kinematics to load. Defaults to ‘cursor’.

  • deriv (int, optional) – order of the derivative to compute. Default 0, no derivative.

  • norm (bool, optional) – if the output segments should be vector normalized at each timepoint. Default False.

  • samplerate (float, optional) – optionally choose the samplerate of the data in Hz. Default 1000.

  • kwargs – additional keyword arguments to pass to get_kinematics

Returns:

tuple containing:
trajectories (ntrial): array of filtered cursor trajectories for each trial
trial_segments (ntrial): array of numeric code segments for each trial

Return type:

tuple

Note

The sampling rate of the returned data might be different from the requested sampling rate if the preprocessing function does any modification to the length of the data.

Modified September 2023 to include optional sampling rate argument
Modified July 2025 to include optional deriv and norm arguments
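How start codes, end codes, and a trial filter interact can be shown on a small synthetic event stream. The sketch below is a simplified stand-in for the segmentation step, not the aopy implementation; the numeric code values and the helper name find_segments are hypothetical.

```python
# Hypothetical numeric codes for illustration; real values come from the task codes file
CURSOR_ENTER_CENTER_TARGET = 80
REWARD = 48
TRIAL_END = 239

def find_segments(codes, times, start_codes, end_codes):
    """Split an event stream into code segments and their (start, end) times.

    A segment opens at any start code and closes at the next end code.
    """
    segments, seg_times = [], []
    current, t_start = None, None
    for code, t in zip(codes, times):
        if current is None:
            if code in start_codes:
                current, t_start = [code], t
        else:
            current.append(code)
            if code in end_codes:
                segments.append(current)
                seg_times.append((t_start, t))
                current = None
    return segments, seg_times

codes = [80, 32, 48, 239, 80, 64, 239, 80, 32, 48, 239]
times = [0.0, 0.5, 1.0, 1.2, 2.0, 2.4, 2.6, 3.0, 3.5, 4.0, 4.2]
segments, seg_times = find_segments(
    codes, times, [CURSOR_ENTER_CENTER_TARGET], [REWARD, TRIAL_END])

# Keep only segments that do not contain TRIAL_END (the rewarded trials here)
trial_filter = lambda seg: TRIAL_END not in seg
kept = [(s, t) for s, t in zip(segments, seg_times) if trial_filter(s)]
```

In this toy stream the second segment ends without a reward, so the filter removes it.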

aopy.data.bmi3d.get_kinematics(preproc_dir, subject, te_id, date, samplerate, datatype='cursor', deriv=0, norm=False, filter_kinematics=False, **kwargs)[source]

Return all kinds of kinematics from preprocessed data. Caches the data for faster loading.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • samplerate (float) – the desired samplerate of the data in Hz.

  • datatype (str, optional) – type of kinematics to load. Defaults to ‘cursor’.

  • deriv (int, optional) – order of the derivative to compute. Default 0, no derivative.

  • norm (bool, optional) – if the output segments should be vector normalized at each timepoint. Default False.

  • filter_kinematics (bool, optional) – if True, the kinematics will be filtered. Default False.

  • kwargs – additional keyword arguments to pass to get_interp_task_data

Raises:

ValueError – if the datatype is invalid

Returns:

tuple containing:
kinematics (nt, nch): kinematics from the given experiment after preprocessing
samplerate (float): the sampling rate of the kinematics after preprocessing

Return type:

tuple
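The deriv and norm options can be pictured with a short numerical sketch: differentiate position deriv times along the time axis, then optionally take the vector norm at each timepoint. The version below uses np.gradient on synthetic data; the derivative estimator is an assumption for illustration and the helper name derivative_and_norm is hypothetical.

```python
import numpy as np

def derivative_and_norm(position, samplerate, deriv=1, norm=False):
    """Illustrative derivative estimate (not the aopy implementation).

    position: (nt, nch) array. Applies np.gradient `deriv` times along time,
    then optionally returns the vector norm at each timepoint.
    """
    kin = np.asarray(position, dtype=float)
    for _ in range(deriv):
        kin = np.gradient(kin, 1.0 / samplerate, axis=0)
    if norm:
        kin = np.linalg.norm(kin, axis=1)
    return kin

# Synthetic straight-line movement at 3 cm/s in x and 4 cm/s in y
samplerate = 100
t = np.arange(200) / samplerate
position = np.column_stack([3 * t, 4 * t])
speed = derivative_and_norm(position, samplerate, deriv=1, norm=True)
```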

aopy.data.bmi3d.get_lfp_aligned(preproc_dir, subject, te_id, date, trial_start_codes, trial_end_codes, time_before, time_after, drive_number=None, trial_filter=<function <lambda>>)[source]

Loads lfp data (same length for each trial) from a preprocessed HDF file. Trials can be specified by numeric start and end codes. Trials can also be filtered so that only successful trials are included, for example. The filter is applied to numeric code segments for each trial.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • trial_start_codes (list) – list of numeric codes representing the start of a trial

  • trial_end_codes (list) – list of numeric codes representing the end of a trial

  • time_before (float) – time before the trial start to include in the aligned lfp (in seconds)

  • time_after (float) – time after the trial end to include in the aligned lfp (in seconds)

  • trial_filter (fn, optional) – function mapping trial segments to boolean values. Any trials for which the filter returns False will not be included in the output

Returns:

aligned lfp data output from aopy.preproc.trial_align_data()

Return type:

(ntrials, nt, nch)
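The (ntrials, nt, nch) output shape can be illustrated with a simplified alignment that cuts a fixed window around one trigger time per trial. This is a sketch on synthetic data, not aopy.preproc.trial_align_data(); the helper name align_to_triggers and the choice to skip out-of-range triggers are assumptions.

```python
import numpy as np

def align_to_triggers(data, samplerate, trigger_times, time_before, time_after):
    """Illustrative fixed-length alignment around trigger times.

    data: (nt, nch) timeseries. Returns (ntrials, nt_window, nch); triggers whose
    window would run past either end of the data are skipped.
    """
    n_before = int(round(time_before * samplerate))
    n_after = int(round(time_after * samplerate))
    trials = []
    for trig in trigger_times:
        idx = int(round(trig * samplerate))
        if idx - n_before < 0 or idx + n_after > len(data):
            continue
        trials.append(data[idx - n_before: idx + n_after])
    return np.array(trials)

# Synthetic 2-channel "lfp" at 1 kHz, 5 s long
samplerate = 1000
lfp = np.random.default_rng(1).normal(size=(5000, 2))
aligned = align_to_triggers(lfp, samplerate, [1.0, 2.5, 4.9],
                            time_before=0.2, time_after=0.5)
```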

aopy.data.bmi3d.get_lfp_segments(preproc_dir, subject, te_id, date, trial_start_codes, trial_end_codes, drive_number=None, trial_filter=<function <lambda>>)[source]

Loads lfp segments (different length for each trial) from a preprocessed HDF file. Trials can be specified by numeric start and end codes. Trials can also be filtered so that only successful trials are included, for example. The filter is applied to numeric code segments for each trial.

Parameters:
  • preproc_dir (str) – base directory where the preprocessed files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • trial_start_codes (list) – list of numeric codes representing the start of a trial

  • trial_end_codes (list) – list of numeric codes representing the end of a trial

  • trial_filter (fn, optional) – function mapping trial segments to boolean values. Any trials for which the filter returns False will not be included in the output

Returns:

tuple containing:
lfp_segments (ntrial): array of filtered lfp segments for each trial
trial_segments (ntrial): array of numeric code segments for each trial

Return type:

tuple

aopy.data.bmi3d.get_source_files(preproc_dir, subject, te_id, date)[source]

Retrieves the dictionary of source files from a preprocessed file

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

Returns:

tuple containing:
files (dict): dictionary of (source, filepath) files that are associated with the given experiment
data_dir (str): directory where the source files were located

Return type:

tuple

aopy.data.bmi3d.get_spike_data_aligned(preproc_dir, subject, te_id, date, trigger_times, time_before, time_after, drive=1, bin_width=0.01)[source]

Loads spike data for a given subject and experiment, then aligns binned spike to trigger times.

_images/spike_align_example.png
Parameters:
  • preproc_dir (str) – Path to the preprocessed data directory.

  • subject (str) – Subject name.

  • te_id (str) – Task entry number.

  • date (str) – The date of the experiment.

  • trigger_times (numpy.ndarray) – 1D Array of trigger times (in seconds) for each trial to which spike data should be aligned.

  • time_before (float) – The amount of time (in seconds) before each trigger time to include in the aligned spike data.

  • time_after (float) – The amount of time (in seconds) after each trigger time to include in the aligned spike data.

  • drive (int) – The drive number corresponding to the spike data.

  • bin_width (float, optional) – The width of the bins [s]. Default is 0.01 s (10 ms).

Returns:

A tuple containing:
  • spike_aligned (numpy.ndarray): A 3D array of aligned spike data with shape (ntime, nunits, ntrials), where:
    • ntime is the number of time bins between time_before and time_after around each trigger.

    • nunits is the number of units.

    • ntrials is the number of trials (trigger events).

  • unit_labels (list of str): A list of unit labels corresponding to the ‘nunits’ dimension in the aligned spike data.

  • bins (numpy.ndarray): The time bin centers relative to the trigger times.

Return type:

tuple
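The binning and alignment described above can be sketched with np.histogram on synthetic spike times. This is an illustration of the output layout (ntime, nunits, ntrials) only; the helper name bin_and_align_spikes and the dictionary input format are assumptions, not the aopy implementation.

```python
import numpy as np

def bin_and_align_spikes(spike_times, trigger_times, time_before, time_after,
                         bin_width=0.01):
    """Illustrative binning and alignment of spike times.

    spike_times: dict mapping unit label -> array of spike times in seconds.
    Returns spike counts with shape (ntime, nunits, ntrials), the unit labels,
    and the bin centers relative to the trigger.
    """
    n_bins = int(round((time_before + time_after) / bin_width))
    edges = -time_before + bin_width * np.arange(n_bins + 1)
    unit_labels = list(spike_times.keys())
    aligned = np.zeros((n_bins, len(unit_labels), len(trigger_times)))
    for u, label in enumerate(unit_labels):
        for k, trig in enumerate(trigger_times):
            # Histogram spike times expressed relative to this trigger
            counts, _ = np.histogram(np.asarray(spike_times[label]) - trig, bins=edges)
            aligned[:, u, k] = counts
    bins = (edges[:-1] + edges[1:]) / 2
    return aligned, unit_labels, bins

# Synthetic spike times for two units and two triggers
spikes = {'unit_a': [0.95, 1.02, 1.03, 2.01], 'unit_b': [1.5, 2.25]}
spike_aligned, unit_labels, bins = bin_and_align_spikes(
    spikes, trigger_times=[1.0, 2.0], time_before=0.1, time_after=0.3, bin_width=0.1)
```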

aopy.data.bmi3d.get_spike_data_segment(preproc_dir, subject, te_id, date, start_time, end_time, drive=1, bin_width=0.01)[source]

Loads and extracts a segment of spiking data for a given subject and experiment, optionally binning the spike times.

Parameters:
  • preproc_dir (str) – Path to the preprocessed data directory.

  • subject (str) – Subject name.

  • te_id (str) – Task entry number.

  • date (str) – The date of the experiment.

  • start_time (float) – The start time [s] of the segment to extract.

  • end_time (float) – The end time [s] of the segment to extract.

  • drive (int, optional) – Which drive (port) to load data from.

  • bin_width (float, optional) – The width of the bins [s]. Default is 0.01 s (10 ms). If set to None, no binning is applied and spike times are returned.

Returns:

A tuple containing:
  • spike_segment (dict): A dictionary where keys are unit labels and values are arrays of spike times (or binned spike counts) for that unit.

  • bins (numpy.ndarray or None): An array of bin edges if binning was applied, otherwise None.

Return type:

tuple

aopy.data.bmi3d.get_target_locations(preproc_dir, subject, te_id, date, target_indices)[source]

Loads the x,y,z location of targets in a preprocessed HDF file given by their index. Requires that the preprocessed exp_data includes a trials structured array containing index and target fields (the default behavior of aopy.preproc.proc_exp()).

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • target_indices (ntarg) – a list of which targets to fetch

Returns:

(ntarg x 3) array of coordinates of the given targets

Return type:

ndarray
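The lookup from target index to location can be illustrated with a small structured array that has the index and target fields mentioned above. The trials table below is synthetic, and the helper name target_locations is hypothetical; it assumes each target index always maps to the same location.

```python
import numpy as np

# Synthetic trials table with 'index' and 'target' fields
trials = np.array(
    [(0, [0., 0., 0.]), (1, [5., 0., 0.]), (0, [0., 0., 0.]), (2, [0., 5., 0.])],
    dtype=[('index', 'i4'), ('target', 'f8', (3,))],
)

def target_locations(trials, target_indices):
    """Return the (ntarg, 3) location of each requested target index."""
    locations = []
    for idx in target_indices:
        matches = trials['target'][trials['index'] == idx]
        if len(matches) == 0:
            raise ValueError(f"target index {idx} not found")
        # Assumes a given target index always has the same location
        locations.append(matches[0])
    return np.array(locations)

locs = target_locations(trials, [2, 1])
```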

aopy.data.bmi3d.get_task_data(preproc_dir, subject, te_id, date, datatype, samplerate=None, step=1, preproc=None, **kwargs)[source]

Return interpolated task data. Wraps get_interp_task_data() but caches the data for faster loading.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • datatype (str) – column of task data to load.

  • samplerate (float) – choose the samplerate of the data in Hz. Default None, which uses the sampling rate of the experiment.

  • step (int, optional) – task data will be decimated with steps this big. Default 1.

  • preproc (fn, optional) – function mapping (position, fs) data to (kinematics, fs_new). For example, a smoothing function or an estimate of velocity from position

  • kwargs – additional keyword arguments to pass to get_interp_task_data

Raises:

ValueError – if the datatype is invalid

Returns:

tuple containing:
kinematics (nt, nch): kinematics from the given experiment after preprocessing
samplerate (float): the sampling rate of the kinematics after preprocessing

Return type:

tuple

Examples

subject = 'beignet'
te_id = 4301
date = '2021-01-01'
ts_data, samplerate = get_task_data(preproc_dir, subject, te_id, date, 'cycle')
time = np.arange(len(ts_data))/samplerate
plt.figure()
plt.plot(time[1:], 1/np.diff(ts_data), 'ko')
plt.xlabel('time (s)')
plt.ylabel('cycle step')
plt.ylim(0, 2)
_images/get_cycle_data.png
aopy.data.bmi3d.get_trajectory_frequencies(preproc_dir, subject, te_id, date)[source]

For continuous tracking tasks, get the set of frequencies (in Hz) used to generate the trajectories that were presented on each trial of the experiment, using get_ref_dis_frequencies().

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

Returns:

Tuple containing:
freq_r (list of arrays): (ntrial) list of (nfreq,) frequencies used to generate reference trajectory
freq_d (list of arrays): (ntrial) list of (nfreq,) frequencies used to generate disturbance trajectory

Return type:

tuple

aopy.data.bmi3d.get_ts_data_segment(preproc_dir, subject, te_id, date, start_time, end_time, drive_number=None, channels=None, datatype='lfp')[source]

Simple wrapper around load_hdf_ts_segment for lfp or broadband data.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • start_time (float) – time (in seconds) in the recording at which the desired segment starts

  • end_time (float) – time (in seconds) in the recording at which the desired segment ends

  • channels (int array, optional) – which channel indices to load

  • datatype (str, optional) – choice of ‘lfp’ or ‘broadband’ data to load. Defaults to ‘lfp’.

Returns:

tuple containing:
segment (nt, nch): data segment from the given preprocessed file
samplerate (float): sampling rate of the returned data

Return type:

tuple

aopy.data.bmi3d.get_ts_data_trial(preproc_dir, subject, te_id, date, trigger_time, time_before, time_after, drive_number=None, channels=None, datatype='lfp')[source]

Simple wrapper around load_hdf_ts_trial for lfp or broadband data.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • te_id (int) – Block number of Task entry object

  • date (str) – Date of recording

  • trigger_time (float) – time (in seconds) in the recording at which the desired segment starts

  • time_before (float) – time (in seconds) to include before the trigger times

  • time_after (float) – time (in seconds) to include after the trigger times

  • channels (int array, optional) – which channel indices to load

  • datatype (str, optional) – choice of ‘lfp’ or ‘broadband’ data to load. Defaults to ‘lfp’.

Returns:

tuple containing:
segment (nt, nch): data segment from the given preprocessed file
samplerate (float): sampling rate of the returned data

Return type:

tuple

aopy.data.bmi3d.get_velocity_segments(*args, norm=True, **kwargs)[source]

Estimates velocity from cursor position, then finds the trial segments for velocity using get_kinematic_segments().

Parameters:
  • *args – arguments for get_kinematic_segments()

  • norm (bool) – if the output segments should be normalized. Set to False to output component velocities.

  • **kwargs – parameters for get_kinematic_segments()

Returns:

tuple containing:
velocities (ntrial): array of velocity estimates for each trial
trial_segments (ntrial): array of numeric code segments for each trial

Return type:

tuple

aopy.data.bmi3d.load_bmi3d_hdf_table(data_dir, filename, table_name)[source]

Loads data and metadata from a table in an hdf file generated by BMI3D

Parameters:
  • data_dir (str) – path to the data

  • filename (str) – name of the file to load from

  • table_name (str) – name of the table you want to load

Returns:

Tuple containing:
data (ndarray): data from bmi3d
metadata (dict): attributes associated with the table

Return type:

tuple

aopy.data.bmi3d.load_bmi3d_lasers(filename='lasers.yaml')[source]

Load the config metadata for BMI3D lasers.

Parameters:

filename (str, optional) – filename of the laser config to load. Defaults to ‘lasers.yaml’.

Returns:

list of lasers available in the config. Each laser is a dictionary with keys
  • name: name of the laser

  • stimulation_site: name of the metadata key for the stimulation site

  • trigger: name of the metadata key for the trigger channel

  • trigger_dch: index of the trigger digital channel

  • sensor: name of the metadata key for the sensor channel

  • sensor_ach: index of the sensor analog channel

Return type:

list

aopy.data.bmi3d.load_bmi3d_root_metadata(data_dir, filename)[source]

Loads the root metadata of a BMI3D hdf file. Root metadata is not accessible using pytables, so h5py is used instead.

Parameters:
  • data_dir (str) – path to the data

  • filename (str) – name of the file to load from

Returns:

key-value attributes

Return type:

dict

aopy.data.bmi3d.load_bmi3d_task_codes(filename='task_codes.yaml')[source]

Load the default BMI3D task codes. File-specific codes can be found in exp_metadata[‘event_sync_dict’]

Parameters:

filename (str, optional) – filename of the task codes to load. Defaults to ‘task_codes.yaml’.

Returns:

(name, code) task code dictionary

Return type:

dict

aopy.data.bmi3d.load_ecube_analog(path, data_dir, channels=None)[source]

Just a wrapper around load_ecube_data() and load_ecube_metadata()

Parameters:
  • path (str) – base directory where ecube data is stored

  • data_dir (str) – folder you want to load

  • channels (int array, optional) – which channels to load

Returns:

Tuple containing:
data (nt, nch): analog data for the requested channels
metadata (dict): metadata (see load_ecube_metadata() for details)

Return type:

tuple

aopy.data.bmi3d.load_ecube_data(data_dir, data_source, channels=None)[source]

Loads data from eCube for a given directory and datasource

Requires load_ecube_metadata(), process_channels()

Parameters:
  • data_dir (str) – folder containing the data you want to load

  • data_source (str) – type of data (“Headstages”, “AnalogPanel”, “DigitalPanel”)

  • channels (int array or None) – list of channel numbers (0-indexed) to load. If None, will load all channels by default

Returns:

all the data for the given source

Return type:

(nt, nch)

aopy.data.bmi3d.load_ecube_data_chunked(data_dir, data_source, channels=None, chunksize=728)[source]

Loads a data file one “chunk” at a time. Useful for replaying files as if they were online data.

Parameters:
  • data_dir (str) – folder containing the data you want to load

  • data_source (str) – type of data (“Headstages”, “AnalogPanel”, “DigitalPanel”)

  • channels (int array or None) – list of channel numbers (0-indexed) to load. If None, will load all channels by default

  • chunksize (int) – how many samples to include in each chunk

Yields:

(chunksize, nch) – one chunk of data for the given source
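The chunked loading pattern can be sketched as a generator over an in-memory array. The example below only illustrates the iteration pattern on synthetic data; the helper name iter_chunks is hypothetical, and how aopy handles a final partial chunk is not specified here.

```python
import numpy as np

def iter_chunks(data, chunksize=728):
    """Yield consecutive (chunksize, nch) blocks of a (nt, nch) array.

    The final chunk may be shorter when nt is not a multiple of chunksize.
    """
    for start in range(0, len(data), chunksize):
        yield data[start:start + chunksize]

# Synthetic stand-in for a recording: 2000 samples, 4 channels
data = np.arange(2000 * 4).reshape(2000, 4)
chunk_shapes = [chunk.shape for chunk in iter_chunks(data, chunksize=728)]
```

Consuming chunks one at a time like this mimics how samples would arrive during an online experiment.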

aopy.data.bmi3d.load_ecube_digital(path, data_dir)[source]

Just a wrapper around load_ecube_data() and load_ecube_metadata()

Parameters:
  • path (str) – base directory where ecube data is stored

  • data_dir (str) – folder you want to load

Returns:

Tuple containing:
data (nt): digital data, arranged as 64-bit numbers representing the 64 channels
metadata (dict): metadata (see load_ecube_metadata() for details)

Return type:

tuple
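Since the digital data is packed as 64-bit numbers, individual channels have to be unpacked before edges can be detected. The sketch below shows one way to do this on synthetic data. It assumes channel ch is stored in bit ch (least-significant bit is channel 0), which should be checked against real recordings; both helper names are hypothetical.

```python
import numpy as np

def unpack_channel(digital_words, ch):
    """Extract one channel from 64-bit packed digital samples.

    Assumes channel `ch` is stored in bit `ch` (least-significant bit = 0).
    """
    words = np.asarray(digital_words, dtype=np.uint64)
    return ((words >> np.uint64(ch)) & np.uint64(1)).astype(np.int8)

def on_off_times(bits, samplerate):
    """Times (s) at which a 0/1 signal turns on and off."""
    edges = np.diff(bits.astype(int))
    on_times = (np.where(edges == 1)[0] + 1) / samplerate
    off_times = (np.where(edges == -1)[0] + 1) / samplerate
    return on_times, off_times

# Synthetic example: channel 3 is high for samples 2 to 4
words = np.zeros(8, dtype=np.uint64)
words[2:5] = np.uint64(1 << 3)
bits = unpack_channel(words, 3)
on_times, off_times = on_off_times(bits, samplerate=1000)
```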

aopy.data.bmi3d.load_ecube_headstages(path, data_dir, channels=None)[source]

Just a wrapper around load_ecube_data() and load_ecube_metadata()

Parameters:
  • path (str) – base directory where ecube data is stored

  • data_dir (str) – folder you want to load

  • channels (int array, optional) – which channels to load

Returns:

Tuple containing:
data (nt, nch): analog data for the requested channels
metadata (dict): metadata (see load_ecube_metadata() for details)

Return type:

tuple

aopy.data.bmi3d.load_ecube_metadata(data_dir, data_source)[source]

Sums the number of channels and samples across all files in the data_dir

Parameters:
  • data_dir (str) – eCube data directory

  • data_source (str) – selects the source (AnalogPanel, Headstages, etc.)

Returns:

Dictionary of metadata with fields:
samplerate (float): sampling rate of data for this source
data_source (str): copied from the function argument
n_channels (int): number of channels
n_samples (int): number of samples for one channel

Return type:

dict

aopy.data.bmi3d.load_emg_analog(data_dir, emg_filename)[source]

Loads emg analog data

Parameters:
  • data_dir (str) – base directory where emg data is stored

  • emg_filename (str) – hdf file you want to load

Returns:

Tuple containing:
data (nt): analog data
metadata (dict): metadata from the emg file containing samplerate

Return type:

tuple

aopy.data.bmi3d.load_emg_data(data_dir, emg_filename)[source]

Loads emg data

Parameters:
  • data_dir (str) – base directory where emg data is stored

  • emg_filename (str) – hdf file you want to load

Returns:

Tuple containing:
data (nt): emg data
metadata (dict): metadata from the emg file containing samplerate

Return type:

tuple

aopy.data.bmi3d.load_emg_digital(data_dir, emg_filename)[source]

Loads and converts emg analog data to 64-bit digital data.

Parameters:
  • data_dir (str) – base directory where emg data is stored

  • emg_filename (str) – hdf file you want to load

Returns:

Tuple containing:
data (nt): digital data, arranged as 64-bit numbers
metadata (dict): metadata from the emg file containing samplerate

Return type:

tuple

aopy.data.bmi3d.proc_ecube_data(data_path, data_source, result_filepath, result_name='broadband_data', max_memory_gb=1.0)[source]

Loads and saves eCube data into an HDF file

Requires load_ecube_metadata()

Parameters:
  • data_path (str) – path to folder containing the ecube data you want to load

  • data_source (str) – type of data (“Headstages”, “AnalogPanel”, “DigitalPanel”)

  • result_filepath (str) – path to hdf file to be written (or appended)

  • max_memory_gb (float, optional) – max memory used to load binary data at one time

Returns:

tuple containing:
dset (h5py.Dataset): the new hdf dataset
metadata (dict): the ecube metadata

Return type:

tuple
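
The role of max_memory_gb can be illustrated by converting a memory budget into a number of time points per read. This is a sketch under stated assumptions, not the aopy implementation: both helpers are hypothetical, samples are assumed to occupy 2 bytes each, and 1 GB is taken as 1e9 bytes.

```python
def samples_per_chunk(max_memory_gb, n_channels, bytes_per_sample=2):
    """Number of time points that fit in the memory budget (hypothetical helper)."""
    bytes_per_timepoint = n_channels * bytes_per_sample
    return max(1, int(max_memory_gb * 1e9) // bytes_per_timepoint)

def chunk_bounds(n_samples, chunk):
    """(start, stop) index pairs covering the whole recording."""
    return [(s, min(s + chunk, n_samples)) for s in range(0, n_samples, chunk)]

chunk = samples_per_chunk(1.0, n_channels=64)
bounds = chunk_bounds(10, 4)
```

Each (start, stop) pair would then be read from the binary file and written into the matching slice of the HDF dataset, so that only one chunk is held in memory at a time.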

aopy.data.bmi3d.tabulate_behavior_data(preproc_dir, subjects, ids, dates, start_events, end_events, reward_events, penalty_events, metadata=[], df=None, event_code_type='code', return_bad_entries=False, repeating_start_codes=False)[source]

Concatenate trials from across experiments. Experiments are given as lists of subjects, task entry ids, and dates. Each list must be the same length. Trials are defined by intervals between the given trial start and end codes.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • start_events (list) – list of numeric codes representing the start of a trial

  • end_events (list) – list of numeric codes representing the end of a trial

  • reward_events (list) – list of numeric codes representing rewards

  • penalty_events (list) – list of numeric codes representing penalties

  • metadata (list, optional) – list of metadata keys that should be included in the df

  • df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

  • event_code_type (str, optional) – type of event codes to use. Defaults to ‘code’. Other choices include ‘event’ and ‘data’.

  • return_bad_entries (bool, optional) – If True, returns the list of task entries that could not be loaded. Defaults to False.

  • repeating_start_codes (bool, optional) – whether the start codes may occur multiple times within one segment. If False, the last start code within a segment is always used. If True, segments may span multiple trials when used improperly. Defaults to False.

Returns:

pandas DataFrame containing the concatenated trial data with columns:
subject (str): subject name
te_id (str): task entry id
date (str): date of recording
event_codes (ntrial): numeric code segments for each trial (specified by event_code_type)
event_times (ntrial): time segments for each trial
event_idx (ntrial): index segments for each trial
reward (ntrial): boolean values indicating whether each trial was rewarded
penalty (ntrial): boolean values indicating whether each trial was penalized
%metadata_key% (ntrial): requested metadata values for each key requested

Return type:

pd.DataFrame
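
The idea of defining trials as intervals between start and end codes can be sketched as follows. This is a simplified illustration, not the aopy implementation: `segment_trials` is a hypothetical helper, the event codes are made up, and only the default behavior (use the last start code before each end code) is shown.

```python
def segment_trials(codes, times, start_codes, end_codes):
    """Split an event stream into (codes, times) segments, one per trial."""
    segments = []
    start_idx = None
    for idx, code in enumerate(codes):
        if code in start_codes:
            start_idx = idx  # a later start code replaces an earlier one
        elif code in end_codes and start_idx is not None:
            segments.append((codes[start_idx:idx + 1], times[start_idx:idx + 1]))
            start_idx = None
    return segments

# Made-up codes: 16 = trial start, 48 = reward, 64 = penalty, 239 = trial end
codes = [16, 32, 16, 32, 48, 239, 16, 64, 239]
times = [0.0, 0.5, 1.0, 1.4, 2.0, 2.5, 3.0, 3.6, 4.0]

trials = segment_trials(codes, times, start_codes=[16], end_codes=[239])
rewarded = [48 in seg_codes for seg_codes, _ in trials]
```

The first start code at time 0.0 is discarded because a second start code arrives before any end code, which mirrors the "last start code within a segment" rule described for repeating_start_codes=False.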

aopy.data.bmi3d.tabulate_behavior_data_center_out(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for center-out experiments. Makes use of the task codes saved in /config/task_codes.yaml to automatically assign event codes for trial start, trial end, reward, penalty, and targets.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • metadata (list, optional) – list of metadata keys that should be included in the df

  • df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:
subject (str): subject name
te_id (str): task entry id
date (str): date of recording
event_codes (ntrial): numeric code segments for each trial
event_times (ntrial): time segments for each trial
reward (ntrial): boolean values indicating whether each trial was rewarded
penalty (ntrial): boolean values indicating whether each trial was penalized
%metadata_key% (ntrial): requested metadata values for each key requested
target_idx (ntrial): index of the target that was presented
target_location (ntrial): location of the target that was presented
center_target_on_time (ntrial): time at which the trial started
prev_trial_end_time (ntrial): time at which the previous trial ended
trial_end_time (ntrial): time at which the trial ended
trial_initiated (ntrial): boolean values indicating whether the trial was initiated
hold_start_time (ntrial): time at which the hold period started
hold_completed (ntrial): boolean values indicating whether the hold period was completed
delay_start_time (ntrial): time at which the delay period started
delay_completed (ntrial): boolean values indicating whether the delay period was completed
go_cue_time (ntrial): time at which the go cue was presented
reach_completed (ntrial): boolean values indicating whether the reach was completed
reach_end_time (ntrial): time at which the reach was completed
reward_start_time (ntrial): time at which the reward was presented
penalty_start_time (ntrial): time at which the penalty was presented
penalty_event (ntrial): numeric code for the penalty event
pause_start_time (ntrial): time at which the pause occurred
pause_event (ntrial): numeric code for the pause event

Return type:

pd.DataFrame

Example

subject = 'test'
start_date = '2025-08-15'
end_date = '2025-08-16'
entries = db.lookup_mc_sessions(subject=subject, date=(date.fromisoformat(start_date), date.fromisoformat(end_date)), task_desc='center out with random delay')
subjects, te_ids, te_dates = db.list_entry_details(entries)

df = tabulate_behavior_data_center_out(preproc_dir, subjects, te_ids, te_dates)
display(df.head(8))
_images/tabulate_behavior_data_center_out.png
aopy.data.bmi3d.tabulate_behavior_data_corners(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for corner reaching experiments. Makes use of the task codes saved in /config/task_codes.yaml to automatically assign event codes for trial start, trial end, reward, penalty, and targets.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • metadata (list, optional) – list of metadata keys that should be included in the df

  • df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:
subject (str): subject name
te_id (str): task entry id
date (str): date of recording
event_codes (ntrial): numeric code segments for each trial
event_times (ntrial): time segments for each trial
event_idx (ntrial): index segments for each trial
reward (ntrial): boolean values indicating whether each trial was rewarded
penalty (ntrial): boolean values indicating whether each trial was penalized
%metadata_key% (ntrial): requested metadata values for each key requested
sequence_params (ntrial): string of params used to generate all trajectories in the same task entry
chain_length (ntrial): number of targets presented in each trial
target_idx (ntrial): list of indices of the targets presented
target_location (ntrial): list of locations of the targets presented
prev_trial_end_time (ntrial): time at which the previous trial ended
trial_end_time (ntrial): time at which the trial ended
first_target_on_time (ntrial): time at which the trial started
trial_initiated (ntrial): boolean values indicating whether the trial was initiated
hold_start_time (ntrial): time at which the hold period started
hold_completed (ntrial): boolean values indicating whether the hold period was completed
delay_start_time (ntrial): time at which the delay period started
delay_completed (ntrial): boolean values indicating whether the delay period was completed
go_cue_time (ntrial): time at which the go cue was presented
reach_completed (ntrial): boolean values indicating whether the reach was completed
reach_end_time (ntrial): time at which the reach was completed
reward_start_time (ntrial): time at which the reward was presented
penalty_start_time (ntrial): time at which the penalty occurred
penalty_event (ntrial): numeric code for the penalty event
pause_start_time (ntrial): time at which the pause occurred
pause_event (ntrial): numeric code for the pause event

Return type:

pd.DataFrame

Example

subject = 'churro'
start_date = '2025-01-17'
end_date = '2025-01-18'
entries = db.lookup_mc_sessions(subject=subject, date=(date.fromisoformat(start_date), date.fromisoformat(end_date)))
subjects, te_ids, te_dates = db.list_entry_details(entries)

df = tabulate_behavior_data_corners(preproc_dir, subjects, te_ids, te_dates)
display(df.head(8))
_images/tabulate_behavior_data_corners.png
aopy.data.bmi3d.tabulate_behavior_data_flash(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for flash experiments. Uses the task event names (b’TARGET_ON’, b’REWARD’, and b’TRIAL_END’, specifically) to find start and end times for flash experiments.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • metadata (list, optional) – list of metadata keys that should be included in the df

  • df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:
subject (str): subject name
te_id (str): task entry id
date (str): date of recording
event_names (ntrial): event name segments for each trial
event_times (ntrial): time segments for each trial
%metadata_key% (ntrial): requested metadata values for each key requested
flash_start_time (ntrial): time the flash started
flash_end_time (ntrial): time the flash ended

Return type:

pd.DataFrame

aopy.data.bmi3d.tabulate_behavior_data_out(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for out experiments (similar to center-out but without a trial-initiating center target). Makes use of the task codes saved in /config/task_codes.yaml to automatically assign event codes for trial start, trial end, reward, penalty, and targets.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • metadata (list, optional) – list of metadata keys that should be included in the df

  • df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:
subject (str): subject name
te_id (str): task entry id
date (str): date of recording
event_codes (ntrial): numeric code segments for each trial
event_times (ntrial): time segments for each trial
reward (ntrial): boolean values indicating whether each trial was rewarded
penalty (ntrial): boolean values indicating whether each trial was penalized
%metadata_key% (ntrial): requested metadata values for each key requested
target_idx (ntrial): index of the target that was presented
target_location (ntrial): location of the target that was presented
trial_start_time (ntrial): time at which the trial started
trial_end_time (ntrial): time at which the trial ended
reach_completed (ntrial): boolean values indicating whether the reach was completed
reach_end_time (ntrial): time at which the reach was completed
reward_start_time (ntrial): time at which the reward was presented
penalty_start_time (ntrial): time at which the penalty was presented
penalty_event (ntrial): numeric code for the penalty event
pause_start_time (ntrial): time at which the pause occurred
pause_event (ntrial): numeric code for the pause event

Return type:

pd.DataFrame

aopy.data.bmi3d.tabulate_behavior_data_random_targets(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for random target location experiments. Uses the task event names (b’TARGET_ON’ and b’TRIAL_END’, specifically) to find start and end times for experiments.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • metadata (list, optional) – list of metadata keys that should be included in the df

  • df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:
subject (str): subject name
te_id (str): task entry id
date (str): date of recording
event_names (ntrial): event name segments for each trial
event_times (ntrial): time segments for each trial
%metadata_key% (ntrial): requested metadata values for each key requested
target_idx (ntrial): target index for each trial within a unique data session
target_loc (ntrial): target locations (x,y,z) for each trial
prev_trial_end_time (ntrial): time at which the previous trial ended
trial_end_time (ntrial): time at which the trial ended
target_on (ntrial): time at which the target appeared
reach_completed (ntrial): boolean indicating whether the reach was completed
cursor_enter_target (ntrial): time at which the cursor entered the target
reward_start_time (ntrial): time of reward
penalty_start_time (ntrial): penalty start time (if applicable)
penalty_event (ntrial): event description of the penalty

Examples

Visualization of 5 reaches to random targets.

   subjects = ['Leo', 'Leo']
   ids = [1957, 1959]
   dates = ['2025-02-13', '2025-02-13']

   df = tabulate_behavior_data_random_targets(data_dir, subjects, ids, dates, metadata = ['sequence_params'])
   example_reaches = df[-5:] #last 5 reaches in the earlier dataframe
   example_traj = tabulate_kinematic_data(data_dir, example_reaches['subject'], example_reaches['te_id'],
                                   example_reaches['date'], example_reaches['target_on'],
                                   example_reaches['cursor_enter_target'], datatype = 'cursor')
   ex_targets = example_reaches['target_location'].to_numpy()
   bounds = [-5,5,-5,5,-5,5] #equal bounds to make visualization appear as spheres
   default_colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
   colors = default_colors[:len(ex_targets)] #match colors from the trajectories

   fig = plt.figure()
   ax = fig.add_subplot(111, projection = '3d')
   for idx, path in enumerate(example_traj):
       ax.plot(*path.T)
       visualization.plot_sphere(ex_targets[idx], color = colors[idx], radius = 0.5,
                          bounds = bounds, ax = ax)

_images/tabulate_behavior_random_targets.png

Return type:

pd.DataFrame

aopy.data.bmi3d.tabulate_behavior_data_tracking_task(preproc_dir, subjects, ids, dates, metadata=[], df=None)[source]

Wrapper around tabulate_behavior_data() specifically for tracking task experiments. Makes use of the task codes saved in /config/task_codes.yaml to automatically assign event codes for trial start, trial end, reward, penalty.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • metadata (list, optional) – list of metadata keys that should be included in the df

  • df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data with columns:
subject (str): subject name
te_id (str): task entry id
date (str): date of recording
event_codes (ntrial): numeric code segments for each trial
event_times (ntrial): time segments for each trial
event_idx (ntrial): index segments for each trial
reward (ntrial): boolean values indicating whether each trial was rewarded
penalty (ntrial): boolean values indicating whether each trial was penalized
%metadata_key% (ntrial): requested metadata values for each key requested
sequence_params (ntrial): string of params used to generate all trajectories in the same task entry
ref_freqs (ntrial): array of frequencies used to generate reference trajectory for each trial
dis_freqs (ntrial): array of frequencies used to generate disturbance trajectory for each trial
prev_trial_end_time (ntrial): time at which the previous trial ended
target_on_time (ntrial): time at which the trial started
trial_initiated (ntrial): boolean values indicating whether the trial was initiated (i.e. hold was attempted)
hold_start_time (ntrial): time at which the hold period started
hold_completed (ntrial): boolean values indicating whether the hold period was completed
tracking_start_time (ntrial): time at which the hold period ended and tracking started
trajectory_start_time (ntrial): time at which the ref & dis trajectories started (excluding the ramp up period)
trajectory_end_time (ntrial): time at which the ref & dis trajectories ended (excluding the ramp down period if the trial was rewarded)
tracking_end_time (ntrial): time at which tracking ended (whether with a reward or tracking out penalty)
reward_start_time (ntrial): time at which the reward was presented
penalty_start_time (ntrial): time at which the penalty occurred
penalty_event (ntrial): numeric code for the penalty event
pause_start_time (ntrial): time at which the pause occurred
pause_event (ntrial): numeric code for the pause event
trial_end_time (ntrial): time at which the trial ended

Return type:

pd.DataFrame

Example

_images/tabulate_behavior_data_tracking_task.png
aopy.data.bmi3d.tabulate_feature_data(preproc_dir, subjects, te_ids, dates, start_times, end_times, decoders, datatype='lfp_power', samplerate=None, preproc=None, **kwargs)[source]

Grab (online extracted) decoder feature segments across arbitrary preprocessed files. Wrapper around tabulate_task_data.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • te_ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • datatype (str, optional) – column of task data to load. Default ‘lfp_power’.

  • samplerate (float, optional) – choose the samplerate of the data in Hz. Default None, which uses the sampling rate of the experiment.

  • start_times (list of float) – times in the recording at which the desired segments start

  • end_times (list of float) – times in the recording at which the desired segments end

  • decoders (list of riglib.bmi.Decoder) – decoder object with binlen and call_rate attributes. If only one decoder is supplied, it will be applied to all recordings.

  • preproc (fn, optional) – function mapping (position, fs) data to (kinematics, fs_new). For example, a smoothing function or an estimate of velocity from position

  • kwargs – additional keyword arguments to pass to get_interp_task_data

Returns:

tuple containing:
segments (ntrial,): list of tensors of (nt, nfeat) feature data from each trial
samplerate (float): samplerate of the feature data

Return type:

tuple
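
Cutting a continuous feature timeseries into segments by start and end time can be sketched like this. It is an illustration only: `slice_segments` is a hypothetical helper, the data is synthetic, and the conversion from time to sample index by rounding is an assumption.

```python
def slice_segments(data, samplerate, start_times, end_times):
    """Return one slice of `data` per (start, end) pair, with times in seconds."""
    segments = []
    for start, end in zip(start_times, end_times):
        first = int(round(start * samplerate))
        last = int(round(end * samplerate))
        segments.append(data[first:last])
    return segments

samplerate = 10  # Hz, made-up value
features = [[float(i)] for i in range(300)]  # 30 s of a single feature

segments = slice_segments(features, samplerate, [10, 15, 20], [14, 18, 28])
lengths = [len(seg) for seg in segments]
```

Because the segments have different durations, the result is a list of arrays with different lengths rather than a single rectangular array.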

aopy.data.bmi3d.tabulate_kinematic_data(preproc_dir, subjects, te_ids, dates, start_times, end_times, samplerate=1000, deriv=0, norm=False, datatype='cursor', filter_kinematics=False, **kwargs)[source]

Grab kinematics data from trials across arbitrary preprocessed files. Before segmenting, optionally filters the data using filter_kinematics() (default 15 Hz low-pass) and optionally applies a derivative to the data to get velocity, acceleration, or jerk.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • te_ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • start_times (list of float) – times in the recording at which the desired segments start

  • end_times (list of float) – times in the recording at which the desired segments end

  • samplerate (float, optional) – optionally choose the samplerate of the data in Hz. Default 1000.

  • datatype (str, optional) – type of kinematics to tabulate. Defaults to ‘cursor’.

  • deriv (int, optional) – order of the derivative to compute. Default 0, no derivative.

  • norm (bool, optional) – if the output segments should be vector normalized at each timepoint. Default False.

  • filter_kinematics (bool, optional) – if True, filters the kinematics data before segmenting. Default False.

  • kwargs (dict, optional) – optional keyword arguments to pass to get_kinematic_segment()

Returns:

list of tensors of (nt, nch) kinematics from each trial

Return type:

(ntrial,)
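
The effect of the deriv and norm options can be sketched on a single synthetic segment. This is not the aopy implementation: `differentiate` and `vector_norm` are hypothetical helpers, and plain finite differences are used, which shorten the segment by one sample per derivative (the library may handle edges differently).

```python
import math

def differentiate(segment, samplerate, order=1):
    """Finite-difference derivative of an (nt, nch) list of rows."""
    for _ in range(order):
        segment = [
            [(b - a) * samplerate for a, b in zip(prev, curr)]
            for prev, curr in zip(segment[:-1], segment[1:])
        ]
    return segment

def vector_norm(segment):
    """Euclidean norm across channels at each time point."""
    return [math.sqrt(sum(v * v for v in row)) for row in segment]

samplerate = 10  # Hz, made-up value
# Synthetic cursor position (cm) moving at a constant velocity of (3, 4) cm/s
position = [[0.3 * t, 0.4 * t] for t in range(5)]

velocity = differentiate(position, samplerate, order=1)  # like deriv=1
speed = vector_norm(velocity)                            # like deriv=1, norm=True
```

With norm=False the output keeps one column per kinematic dimension; with norm=True each time point collapses to a single magnitude, which is what the distance, speed, and acceleration traces in the examples below plot.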

Examples

subjects = ['test']
ids = [3498]
dates = ['2021-12-13']

df = tabulate_behavior_data_center_out(write_dir, subjects, ids, dates, df=None)

# Only consider completed reaches
df = df[df['reach_completed']]
kin = tabulate_kinematic_data(write_dir, df['subject'], df['te_id'], df['date'], df['go_cue_time'], df['reach_end_time'],
                            datatype='cursor', samplerate=1000)

plt.figure()
bounds = [-10, 10, -10, 10]
visualization.plot_trajectories(kin, bounds=bounds)
_images/tabulate_kinematics.png
dst = tabulate_kinematic_data(write_dir, df['subject'], df['te_id'], df['date'], df['go_cue_time'], df['reach_end_time'],
                            deriv=0, norm=True, datatype='cursor', samplerate=1000)
spd = tabulate_kinematic_data(write_dir, df['subject'], df['te_id'], df['date'], df['go_cue_time'], df['reach_end_time'],
                            deriv=1, norm=True, datatype='cursor', samplerate=1000)
acc = tabulate_kinematic_data(write_dir, df['subject'], df['te_id'], df['date'], df['go_cue_time'], df['reach_end_time'],
                            deriv=2, norm=True, datatype='cursor', samplerate=1000)
plt.figure()
visualization.plot_timeseries(dst[0], 1000)
visualization.plot_timeseries(spd[0], 1000)
visualization.plot_timeseries(acc[0], 1000)
plt.legend(['distance', 'speed', 'acceleration'])
plt.xlabel('time from go cue (s)')
plt.ylabel('kinematics (cm)')
_images/tabulate_kinematics_derivative.png
subject = 'CES003'
te_id = 2234
date = '2025-03-04'
df = tabulate_behavior_data_center_out(data_dir, [subject], [te_id], [date])
df = df[df['reach_completed']]
plot_kin(df, 'go_cue_time', 'reach_end_time')
_images/tabulate_kinematics_ces.png

Different interpolation options:

raw = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='cursor', samplerate=1000)
raw_filt = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='cursor', samplerate=1000, low_cut=5, buttord=2,
                            filter_kinematics=True)
nan = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='user_screen', samplerate=1000, remove_nan=False)
nan_filt = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='user_screen', samplerate=1000, low_cut=5, buttord=2,
                            filter_kinematics=True, remove_nan=False)
pos = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='user_screen', samplerate=1000)
pos_filt = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            datatype='user_screen', samplerate=1000, low_cut=5, buttord=2,
                            filter_kinematics=True)
spd = tabulate_kinematic_data(data_dir, df['subject'], df['te_id'], df['date'],
                            df['go_cue_time'], df['reach_end_time'],
                            deriv=1, norm=True, datatype='cursor', samplerate=1000,
                            filter_kinematics=True)
weird_trials = np.where([np.any(s > 500) for s in spd])[0]
plt.figure(figsize=(5,6))
plt.subplot(3,1,1)
for i in weird_trials:
    visualization.plot_timeseries(raw[i][:,0], 1000)
    visualization.plot_timeseries(raw_filt[i][:,0], 1000, color='k', alpha=0.5)
plt.ylabel('x position (cm)')
plt.xlabel('')
plt.title('cursor')
plt.legend(['raw', 'filtered'])
plt.subplot(3,1,2)
for i in weird_trials:
    visualization.plot_timeseries(nan[i][:,0], 1000)
    visualization.plot_timeseries(nan_filt[i][:,0], 1000, color='k', alpha=0.5)
plt.ylabel('x position (cm)')
plt.xlabel('time from go cue (s)')
plt.title('user_screen')
plt.subplot(3,1,3)
for i in weird_trials:
    visualization.plot_timeseries(pos[i][:,0], 1000)
    visualization.plot_timeseries(pos_filt[i][:,0], 1000, color='k', alpha=0.5)
plt.ylabel('x position (cm)')
plt.xlabel('time from go cue (s)')
plt.title('user_screen interp')
plt.tight_layout()
_images/kinematics_interpolation.png
aopy.data.bmi3d.tabulate_lfp_features(preproc_dir, subjects, te_ids, dates, start_times, end_times, decoders, samplerate=None, channels=None, datatype='lfp', preproc=None, verbose=True, **kwargs)[source]

Extract (new, offline) lfp feature segments across arbitrary preprocessed files. Uses a decoder object to extract features from either lfp or broadband timeseries data. Can be applied offline to arbitrary channels. If used on broadband data from a BCI experiment, the features extracted will be (nearly) the same as the online features if the same decoder is used.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • te_ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • start_times (list of float) – times in the recording at which the desired segments start

  • end_times (list of float) – times in the recording at which the desired segments end

  • decoders (list of riglib.bmi.Decoder) – decoder objects for each recording. If only one decoder is supplied, it will be applied to all recordings.

  • samplerate (float, optional) – choose the samplerate of the data in Hz. Default None, which uses the sampling rate of the experiment.

  • channels (list of int, optional) – list of channel indices to extract. Default None, which extracts all channels.

  • datatype (str, optional) – type of data to load. Default ‘lfp’.

  • preproc (fn, optional) – function mapping (position, fs) data to (kinematics, fs_new). For example, a smoothing function or an estimate of velocity from position

  • decode (bool, optional) – whether to decode the lfp features. Default False.

  • verbose (bool, optional) – whether to display a progress bar. Default True.

  • kwargs – additional keyword arguments

Returns:

tuple containing:
segments (ntrial,): list of tensors of (nt, nfeat) feature data from each trial
samplerate (float): samplerate of the feature data

Return type:

tuple

Examples

Plot online extracted lfp features and overlay offline extracted feature segments

subject = 'affi'
te_id = 17269
date = '2024-05-03'
subjects = [subject, subject, subject]
te_ids = [te_id, te_id, te_id]
dates = [date, date, date]
start_time = 10
end_time = 30
start_times = [10, 15, 20]
end_times = [14, 18, 28]

Load the decoder that was used in the experiment

with open(os.path.join(data_dir, 'test_decoder.pkl'), 'rb') as file:
    decoder = pickle.load(file)

Load the full features for comparison

features_offline, samplerate_offline = extract_lfp_features(
    preproc_dir, subject, te_id, date, decoder,
    start_time=start_time, end_time=end_time)
features_online, samplerate_online = get_extracted_features(
    preproc_dir, subject, te_id, date, decoder,
    start_time=start_time, end_time=end_time)

time_offline = np.arange(len(features_offline))/samplerate_offline + start_time
time_online = np.arange(len(features_online))/samplerate_online + start_time

plt.figure(figsize=(8,3))
plt.plot(time_offline, features_offline[:,1], alpha=0.8, label='offline')
plt.plot(time_online, features_online[:,1], alpha=0.8, label='online')
plt.xlabel('time (s)')
plt.ylabel('power')
plt.title('readout 1')

Tabulate the segments

features_offline, samplerate_offline = tabulate_lfp_features(
    preproc_dir, subjects, te_ids, dates, start_times, end_times, decoder)
features_online, samplerate_online = tabulate_feature_data(
    preproc_dir, subjects, te_ids, dates, start_times, end_times, decoder)

for idx in range(len(start_times)):
    time_offline = np.arange(len(features_offline[idx]))/samplerate_offline + start_times[idx]
    time_online = np.arange(len(features_online[idx]))/samplerate_online + start_times[idx]
    plt.plot(time_offline, features_offline[idx][:,1], 'k--')
    plt.plot(time_online, features_online[idx][:,1], 'k--')

Add legend

plt.plot([], [], 'k--', label='segments')
plt.legend()
_images/tabulate_lfp_features.png

See also

tabulate_feature_data() tabulate_state_data()

aopy.data.bmi3d.tabulate_poisson_trial_times(preproc_dir, subjects, ids, dates, metadata=[], poisson_mu=0.25, refractory_period=0.1, df=None)[source]

Generate poisson-spaced trial times for the given recordings. Recordings are given as lists of subjects, task entry ids, and dates. Each list must be the same length. See generate_poisson_timestamps() for more information on the poisson-spaced trial times that are generated.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • metadata (list, optional) – list of metadata keys that should be included in the df. By default empty.

  • poisson_mu (float, optional) – mean of the inter-trial times in seconds. Default 0.25.

  • refractory_period (float, optional) – minimum time between trials in seconds. Default 0.1.

  • df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

Returns:

pandas DataFrame containing the concatenated trial data
subject (str): subject name
te_id (str): task entry id
date (str): date of each trial
%metadata_key% (ntrial): requested metadata values for each key requested
trial_time (float): time generated within recording

Return type:

pd.DataFrame
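
As a rough illustration of what poisson-spaced trial times with a refractory period look like, the sketch below draws exponential gaps and redraws any gap shorter than the refractory period. The helper name sketch_poisson_times and the redraw strategy are assumptions for illustration only; generate_poisson_timestamps() may handle the refractory period differently.

```python
import random

def sketch_poisson_times(duration, poisson_mu=0.25, refractory_period=0.1, seed=0):
    """Hypothetical helper (not part of aopy): event times with exponential gaps."""
    rng = random.Random(seed)
    times = []
    t = 0.0
    while True:
        # Draw an exponential gap with mean poisson_mu (rate = 1 / mean)
        gap = rng.expovariate(1.0 / poisson_mu)
        # Assumption: gaps shorter than the refractory period are redrawn
        while gap < refractory_period:
            gap = rng.expovariate(1.0 / poisson_mu)
        t += gap
        if t >= duration:
            break
        times.append(t)
    return times

trial_times = sketch_poisson_times(duration=10.0)
```

Note that redrawing short gaps raises the average gap above poisson_mu, so this is only one possible reading of the two parameters.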

aopy.data.bmi3d.tabulate_spike_data_segments(preproc_dir, subjects, te_ids, dates, start_times, end_times, drives, bin_width=0.01)[source]

Grab nonrectangular timeseries data from trials across arbitrary preprocessed files.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • te_ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • start_times (list of float) – times in the recording at which the desired segments start

  • end_times (list of float) – times in the recording at which the desired segments end

  • drives (list) – Defines which drive to load data from. For neuropixel data this is usually ‘1’ or ‘2’

  • bin_width (float, optional) – Bin width used to bin spike times. If None, the unbinned segments of spike times will be returned. Default 0.01.

Returns:

A tuple containing:
  • segments (list of dicts): A list where each element is a dictionary of spike data for a unit in a specific experiment.

  • bins (numpy.ndarray or None): An array of bin edges if binning was applied, otherwise None.

Return type:

tuple
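
The effect of bin_width on a single segment can be sketched as follows. This is a minimal stand-alone illustration, not the aopy implementation: bin_spike_segment is a hypothetical helper, and the choice to count spikes in half-open bins starting at start_time is an assumption.

```python
def bin_spike_segment(spike_times, start_time, end_time, bin_width=None):
    """Hypothetical helper (not part of aopy): cut one segment and optionally bin it."""
    # Keep only the spikes that fall inside the segment
    segment = [t for t in spike_times if start_time <= t < end_time]
    if bin_width is None:
        # No binning requested: return the spike times themselves
        return segment
    n_bins = round((end_time - start_time) / bin_width)
    counts = [0] * n_bins
    for t in segment:
        # Index of the bin containing this spike, clipped to the last bin
        idx = min(int((t - start_time) / bin_width), n_bins - 1)
        counts[idx] += 1
    return counts

spikes = [0.2, 1.1, 1.4, 1.6, 2.9, 3.5]
counts = bin_spike_segment(spikes, start_time=1.0, end_time=3.0, bin_width=0.5)
unbinned = bin_spike_segment(spikes, start_time=1.0, end_time=3.0)
```

Because segments can have different durations, each segment yields a different number of bins, which is why the data is described as nonrectangular.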

aopy.data.bmi3d.tabulate_state_data(preproc_dir, subjects, te_ids, dates, start_times, end_times, decoders, datatype='decoder_state', samplerate=None, preproc=None, **kwargs)[source]

Grab (online decoded) state segments across arbitrary preprocessed files. Wrapper around tabulate_task_data.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • te_ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • datatype (str, optional) – column of task data to load. Default ‘decoder_state’.

  • samplerate (float, optional) – choose the samplerate of the data in Hz. Default None, which uses the sampling rate of the experiment.

  • start_times (list of float) – times in the recording at which the desired segments start

  • end_times (list of float) – times in the recording at which the desired segments end

  • decoders (list of riglib.bmi.Decoder) – decoder object with binlen and call_rate attributes. If only one decoder is supplied, it will be applied to all recordings.

  • preproc (fn, optional) – function mapping (position, fs) data to (kinematics, fs_new). For example, a smoothing function or an estimate of velocity from position

  • kwargs – additional keyword arguments to pass to get_interp_task_data

Returns:

tuple containing:
segments (ntrial,): list of tensors of (nt, nfeat) state data from each trial
samplerate (float): samplerate of the state data

Return type:

tuple

aopy.data.bmi3d.tabulate_stim_data(preproc_dir, subjects, ids, dates, metadata=['stimulation_site'], debug=True, df=None, **kwargs)[source]

Concatenate stimulation data from across experiments. Experiments are given as lists of subjects, task entry ids, and dates. Each list must be the same length.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • metadata (list, optional) – list of metadata keys that should be included in the df. By default, only stimulation_site is included.

  • debug (bool, optional) – Passed to find_stim_times(). If True, prints a laser sensor alignment plot for each trial. Defaults to True.

  • df (DataFrame, optional) – pandas DataFrame object to append. Defaults to None.

  • kwargs (dict, optional) – optional keyword arguments to pass to find_stim_times()

Returns:

pandas DataFrame containing the concatenated trial data
subject (str): subject name
te_id (str): task entry id
date (str): date of stimulation
stimulation_site (int): site of stimulation
%metadata_key% (ntrial): requested metadata values for each key requested
trial_time (float): time of stimulation within recording
trial_width (float): width of stimulation pulse
trial_gain (float): fraction of maximum laser power setting
trial_power (float): power (in mW) of stimulation pulse at the fiber output

Return type:

pd.DataFrame

Note

Only supports single-site stimulation.

aopy.data.bmi3d.tabulate_task_data(preproc_dir, subjects, te_ids, dates, start_times, end_times, datatype, samplerate=None, steps=1, preproc=None, **kwargs)[source]

Grab task data from trials across arbitrary preprocessed files.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • te_ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • start_times (list of float) – times in the recording at which the desired segments start

  • end_times (list of float) – times in the recording at which the desired segments end

  • datatype (str) – column of task data to load.

  • samplerate (float, optional) – choose the samplerate of the data in Hz. Default None, which uses the sampling rate of the experiment.

  • steps (list of int, optional) – task data will be decimated with steps this big. If a single integer is given, it will be applied to all trials. Default 1.

  • preproc (fn, optional) – function mapping (position, fs) data to (kinematics, fs_new). For example, a smoothing function or an estimate of velocity from position

  • kwargs – additional keyword arguments to pass to get_interp_task_data

Returns:

tuple containing:
segments (ntrial,): list of tensors of (nt, nch) task data from each trial
samplerate (float): samplerate of the task data

Return type:

tuple

aopy.data.bmi3d.tabulate_ts_data(preproc_dir, subjects, te_ids, dates, trigger_times, time_before, time_after, drive_number=None, channels=None, datatype='lfp')[source]

Grab rectangular timeseries data from trials across arbitrary preprocessed files.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • te_ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • trigger_times (list of float) – times in the recording at which the desired trials start

  • time_before (float) – time (in seconds) to include before the trigger times

  • time_after (float) – time (in seconds) to include after the trigger times

  • channels (list of int, optional) – list of channel indices to include. Defaults to None.

  • datatype (str, optional) – choice of ‘lfp’ or ‘broadband’ data to load. Defaults to ‘lfp’.

Returns:

tuple containing:
data (nt, nch, ntr): tensor of data from each channel and trial
samplerate (float): sampling rate of the data

Return type:

tuple
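
To see why trigger-aligned data is rectangular, the sketch below converts trigger times into sample index windows. It is an illustration only: window_indices is a hypothetical helper, and rounding to the nearest sample is an assumption about how times map to indices.

```python
def window_indices(trigger_times, time_before, time_after, samplerate):
    """Hypothetical helper (not part of aopy): sample windows around each trigger."""
    n_before = round(time_before * samplerate)
    n_after = round(time_after * samplerate)
    windows = []
    for t in trigger_times:
        center = round(t * samplerate)
        # Every window spans the same number of samples: n_before + n_after
        windows.append((center - n_before, center + n_after))
    return windows

windows = window_indices([1.0, 2.5], time_before=0.1, time_after=0.4, samplerate=1000)
lengths = [stop - start for start, stop in windows]
```

Since every trial contributes the same number of samples, the trials can be stacked into a single (nt, nch, ntr) tensor.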

aopy.data.bmi3d.tabulate_ts_segments(preproc_dir, subjects, te_ids, dates, start_times, end_times, drive_number=None, channels=None, datatype='lfp')[source]

Grab nonrectangular timeseries data from trials across arbitrary preprocessed files.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subjects (list of str) – Subject name for each recording

  • te_ids (list of int) – Block number of Task entry object for each recording

  • dates (list of str) – Date for each recording

  • start_times (list of float) – times in the recording at which the desired segments start

  • end_times (list of float) – times in the recording at which the desired segments end

  • channels (list of int, optional) – list of channel indices to include. Defaults to None.

  • datatype (str, optional) – choice of ‘lfp’ or ‘broadband’ data to load. Defaults to ‘lfp’.

Returns:

tuple containing:
data (list of (nt, nch)): list of data segments
samplerate (float): sampling rate of the data

Return type:

tuple

Database

Interface between database methods/models and data analysis code

class aopy.data.db.BMI3DDecoder(decoder, dbname='default')[source]

Wrapper for BMI3D Decoder objects. Written like this so that other database types can implement their own decoder classes with the same methods without needing to modify their database model.

property channels

The decoder channels

Returns:

channels used by the decoder

Return type:

list

property decoder

The decoder object

Returns:

decoder object

Return type:

object

property filt

The decoder filter

Returns:

decoder filter

Return type:

object

get(decoder_dir=None)[source]

Fetch the decoder object from the database, if there is one.

Returns:

decoder object (type depends on which decoder is being loaded)

Return type:

Decoder

class aopy.data.db.BMI3DTaskEntry(task_entry, dbname='default')[source]

Wrapper class for bmi3d database entry classes. Written like this so that other database types can implement their own classes with the same methods without needing to modify their database model.

property duration

Duration of recording in seconds

Returns:

duration

Return type:

float

property experimenter

Experimenter

Returns:

name of the experimenter

Return type:

str

property features

List of features that were enabled during recording

Returns:

enabled features

Return type:

list

get_db_object()[source]

Get the raw database object representing this task entry

Returns:

bmi3d task entry object

Return type:

models.TaskEntry

get_decoder(decoder_dir=None)[source]

Fetch the decoder object from the database, if there is one.

Returns:

decoder object (type depends on which decoder is being loaded)

Return type:

Decoder

get_decoder_record()[source]

Fetch the database models.Decoder record for this recording, if there is one.

Returns:

decoder record

Return type:

models.Decoder

get_exp_mapping(raw=False)[source]

Get the experiment mapping matrix for this task entry that maps from world to screen coordinates. Only useful for manual control experiments.

Parameters:

raw (bool, optional) – if True, return the mapping in BMI3D coordinates (x,z,y). Only useful for debugging. Defaults to False.

Returns:

3x3 mapping matrix

Return type:

np.ndarray

get_preprocessed_sources()[source]

Returns a list of datasource names that should be preprocessed for this task entry. Always includes experiment data (exp) and eye data (eye), and additionally includes broadband, lfp, and spike data if there are associated datafiles with appropriate neural data.

Returns:

preprocessed sources for this task entry

Return type:

list

get_raw_files(system_subfolders=None)[source]

Gets the raw data files associated with this task entry

Parameters:

system_subfolders (dict, optional) – dictionary of system subfolders where the data for that system is located. If None, defaults to the system name

Returns:

list of (system, filepath) for each datafile associated with this task entry

Return type:

list

get_sequence_param(paramname, default=None)[source]

Get a specific sequence parameter

Parameters:
  • paramname (str) – name of the parameter to get

  • default (object, optional) – default value to return if the parameter is not found. Defaults to None.

Returns:

parameter value

Return type:

object

get_task_param(paramname, default=None)[source]

Get a specific task parameter

Parameters:
  • paramname (str) – name of the parameter to get

  • default (object, optional) – default value to return if the parameter is not found. Defaults to None.

Returns:

parameter value

Return type:

object

has_exp_perturbation()[source]

Check if this task entry has an experiment perturbation

Returns:

True if the task entry has a non-identity mapping matrix

Return type:

bool

has_feature(featname)[source]

Check whether a feature was included in this recording

Parameters:

featname (str) – name of the feature to check

Returns:

whether or not the feature was enabled

Return type:

bool

property n_rewards

Number of rewarded trials

Returns:

number of rewarded trials

Return type:

int

property n_trials

Number of total trials presented

Returns:

number of total trials

Return type:

int

property notes

Notes

Returns:

notes

Return type:

str

preprocess(data_dir, preproc_dir, overwrite=False, exclude_sources=[], system_subfolders=None, **kwargs)[source]

Preprocess the data associated with this task entry

Parameters:
  • data_dir (str) – directory where the raw data is stored

  • preproc_dir (str) – directory where the preprocessed data will be written

  • overwrite (bool, optional) – whether or not to overwrite existing preprocessed data. Defaults to False.

  • exclude_sources (list, optional) – list of sources to exclude from preprocessing. Defaults to [].

  • system_subfolders (dict, optional) – dictionary of system subfolders where the data for that system is located. If None, defaults to the system name

  • kwargs (dict, optional) – additional keyword arguments to pass to the preprocessing function

Returns:

error message if there was an error during preprocessing

Return type:

str

property sequence_name

Sequence name, e.g. centerout_2D

Returns:

sequence name

Return type:

str

property sequence_params

All sequence parameters, e.g. ntargets or target_radius

Returns:

sequence params

Return type:

dict

property task_desc

Task description, e.g. flash

Returns:

task description

Return type:

str

property task_name

Task name, e.g. manual control

Returns:

task name

Return type:

str

property task_params

All task parameters

Returns:

task params

Return type:

dict

aopy.data.db.add_metadata_columns(df, sessions, column_names, apply_fns)[source]

Adds metadata columns (in-place) to a dataframe keyed on session id (e.g. from tabulate_behavior_data()). Specify the same number of column names as functions. Each function should take a single session as input and return a single value of any type. The return value will be appended to the dataframe in all rows where the task entry id (te_id) matches the input session.

Parameters:
  • df (pd.DataFrame) – dataframe of session summaries

  • sessions (list) – list of bmi3d task entry objects

  • column_names (list of str) – list of column names to append to the dataframe

  • apply_fns (list of functions) – functions to apply to each session to generate metadata columns

Examples

Adding a metadata column to a dataframe of session summaries

from datetime import date
from aopy.data import db

date_obj = date.fromisoformat('2023-02-06')
entries = db.lookup_sessions(date=date_obj)
df = db.summarize_entries(entries)
db.add_metadata_columns(df, entries, ['hs_data'], [lambda x: x.get_task_param('record_headstage')])
display(df)

Adding session and experimenter info after tabulating behavior data

date_obj = date.fromisoformat('2023-02-06')
entries = db.lookup_sessions(date=date_obj)
df = aopy.data.tabulate_behavior_data(entries)
db.add_metadata_columns(df, entries, ['session', 'experimenter'],
                        [lambda x: x.session, lambda x: x.experimenter])
display(df)

More information about entries can be found in BMI3DTaskEntry
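
The row-matching rule described above can be sketched without a database. In this illustration plain dicts stand in for dataframe rows, and FakeSession with its id attribute is a hypothetical stand-in for a task entry; none of these names come from aopy.

```python
class FakeSession:
    """Hypothetical stand-in for a task entry object."""
    def __init__(self, id, experimenter):
        self.id = id
        self.experimenter = experimenter

def add_columns(rows, sessions, column_names, apply_fns):
    """Write fn(session) into every row whose te_id matches that session (in place)."""
    for session in sessions:
        for name, fn in zip(column_names, apply_fns):
            value = fn(session)
            for row in rows:
                if row['te_id'] == session.id:
                    row[name] = value

rows = [
    {'te_id': 1, 'reward': True},
    {'te_id': 1, 'reward': False},
    {'te_id': 2, 'reward': True},
]
sessions = [FakeSession(1, 'alice'), FakeSession(2, 'bob')]
add_columns(rows, sessions, ['experimenter'], [lambda s: s.experimenter])
```

Each function is evaluated once per session, and its value is repeated on every row belonging to that session.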

aopy.data.db.create_decoder_parent(project, session, task_name='nothing', task_desc='decoder parent', **kwargs)[source]

Create a new decoder parent entry (a TaskEntry) in the database. These are used to keep track of decoders that weren’t trained on a specific session.

Parameters:
  • project (str) – project name

  • session (str) – session name

  • task_name (str, optional) – task name. Defaults to ‘nothing’.

  • task_desc (str, optional) – task description. Defaults to ‘decoder parent’.

  • kwargs (dict, optional) – optional keyword arguments, including dbname to specify the database

Returns:

the new decoder parent entry

Return type:

TaskEntry

aopy.data.db.encode_onehot_sequence_name(sessions, sequence_types)[source]

Generates a dataframe summarizing the id, subject, and date of each entry in the input session list, along with a one-hot encoding of the sequences of interest.

Parameters:
  • sessions (list) – list of bmi3d task entries

  • sequence_types (list) – sequence_name strings to encode. Must be a list of strings.

Returns:

Dataframe of entry summaries containing sequence name occurrence

Return type:

pd.DataFrame
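
For readers unfamiliar with one-hot encoding, the idea can be sketched in plain Python: each sequence type becomes a column holding 1 where the entry's sequence name matches and 0 otherwise. The helper onehot_sequence_names is hypothetical, and the real function returns a dataframe that also includes id, subject, and date.

```python
def onehot_sequence_names(sequence_names, sequence_types):
    """Hypothetical helper (not part of aopy): one dict of 0/1 flags per entry."""
    return [
        {seq: int(name == seq) for seq in sequence_types}
        for name in sequence_names
    ]

onehot_rows = onehot_sequence_names(
    ['centerout_2D', 'out_2D', 'corners_2D'],
    ['centerout_2D', 'out_2D'],
)
```

An entry whose sequence name is not among the requested types, like corners_2D here, gets 0 in every column.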

Examples

sessions = db.lookup_mc_sessions()
sequence_types = ['rand_target_chain_2D', 'centerout_2D', 'out_2D',
                'rand_target_chain_3D', 'corners_2D', 'centerout_2D_different_center',
                'sequence_2D', 'centerout_2D_select', 'single_laser_pulse']

df = db.encode_onehot_sequence_name(sessions, sequence_types)
display(df)
_images/db_encode_onehot_sequence_name.png

aopy.data.db.filter_has_features(features)[source]

Filter function to select sessions only if they had the given features enabled

Parameters:

features (list or str) – a list of feature names, or a single feature

Returns:

a filter function to pass to lookup_sessions

Return type:

function
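
A filter of this kind is a closure: it captures the feature names and returns a function of a single session. The sketch below assumes that every listed feature must be enabled (the real function may differ) and uses a hypothetical FakeSession class that mimics the has_feature() method of BMI3DTaskEntry.

```python
def make_feature_filter(features):
    """Hypothetical helper (not part of aopy): build a session -> bool filter."""
    if isinstance(features, str):
        # Accept a single feature name as well as a list
        features = [features]

    def filter_fn(session):
        # Assumption: all listed features must be enabled
        return all(session.has_feature(f) for f in features)

    return filter_fn

class FakeSession:
    """Hypothetical stand-in exposing has_feature() like a task entry."""
    def __init__(self, features):
        self.features = features

    def has_feature(self, featname):
        return featname in self.features

sessions = [FakeSession(['eye', 'reward']), FakeSession(['reward'])]
keep = make_feature_filter('eye')
selected = [s for s in sessions if keep(s)]
```

A function built this way has the fn(session)->bool signature that lookup_sessions() expects for its filter_fn argument.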

aopy.data.db.filter_has_neural_data(datasource)[source]

Filter function to select sessions only if they contain neural data recordings

Parameters:

datasource (str) – ‘ecog’ or ‘neuropixel’

Returns:

a filter function to pass to lookup_sessions

Return type:

function

aopy.data.db.get_aba_perturbation_days(entries)[source]

Finds all days with ABA block design manual control perturbation experiments based on get_aba_perturbation_sessions().

Parameters:

entries (list) – list of task entries

Returns:

tuple containing
aba_days (list): list of dates with ABA block design sessions
aba_sessions (list): list of lists of session names for each date

Return type:

tuple

aopy.data.db.get_aba_perturbation_sessions(day_entries)[source]

Given a list of task entries, returns a list of session names that follow an ABA block design. The sessions must be on the same day and contain at least one session with an experiment perturbation within the day. The sessions are assigned as follows:

  • ‘a’ for sessions until the first perturbation

  • ‘b’ for sessions with a perturbation

  • ‘aprime’ for the remaining (non-perturbed) sessions

Parameters:

day_entries (n_rec,) – list of task entries for a single day

Returns:

list of session names (‘a’, ‘b’, ‘aprime’) with the same length as the input list. If no sessions matching ABA format are found, returns None.

Return type:

(n_rec,) list or None
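
The labeling rule can be sketched on a list of booleans marking which sessions had a perturbation. This is a simplified reading of the description: label_aba is a hypothetical helper, and the real function may apply extra checks (for example, requiring sessions before and after the perturbation).

```python
def label_aba(has_perturbation):
    """Hypothetical helper (not part of aopy): label sessions 'a', 'b', or 'aprime'."""
    if not any(has_perturbation):
        # No perturbed session on this day, so there is no ABA design
        return None
    labels = []
    seen_perturbation = False
    for perturbed in has_perturbation:
        if perturbed:
            seen_perturbation = True
            labels.append('b')
        elif seen_perturbation:
            labels.append('aprime')
        else:
            labels.append('a')
    return labels

labels = label_aba([False, False, True, True, False])
no_aba = label_aba([False, False])
```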

aopy.data.db.group_entries(sessions, grouping_fn=<function <lambda>>)[source]

Automatically group together a flat list of database IDs

Parameters:
  • sessions (list of task entries) – TaskEntry objects to group

  • grouping_fn (callable, optional) – grouping_fn(task_entry) takes a TaskEntry as its only argument and returns a hashable and sortable object by which to group the ids

Returns:

list of tuples, each tuple containing a group of sessions

Return type:

list
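
The requirement that grouping_fn return a hashable and sortable key suggests a sort-then-group approach, sketched below with itertools.groupby. The helper group_by and the (subject, date) tuples are illustrative assumptions, not the aopy implementation or its default grouping function.

```python
from itertools import groupby

def group_by(items, grouping_fn):
    """Hypothetical helper (not part of aopy): group items that share a key."""
    # groupby only merges adjacent items, so sort by the same key first
    ordered = sorted(items, key=grouping_fn)
    return [tuple(group) for _, group in groupby(ordered, key=grouping_fn)]

entries = [
    ('affi', '2023-02-06'),
    ('beignet', '2023-02-06'),
    ('affi', '2023-02-07'),
]
groups = group_by(entries, grouping_fn=lambda e: e[1])  # group by date
```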

aopy.data.db.list_entry_details(sessions)[source]

Returns (subject, te_id, date) for each given session.

Parameters:

sessions (list of TaskEntry) – list of sessions

Returns:

tuple containing
subject (list): list of subject names
te_id (list): list of task entry ids
date (list): list of dates

Return type:

tuple

aopy.data.db.lookup_bmi_sessions(bmi_task_name='bmi control', **kwargs)[source]

Returns list of entries for all bmi control sessions on the given date. See lookup_sessions() for details.

aopy.data.db.lookup_decoder_parent(task_name='nothing', task_desc='decoder parent', **kwargs)[source]

Look up a decoder parent entry by project and session.

aopy.data.db.lookup_decoders(id=None, parent_id=None, **kwargs)[source]

Returns list of decoders with the given filter parameters

Parameters:
  • id (int or list, optional) – Lookup decoders with the given ids, if provided.

  • parent_id (int, optional) – Lookup decoders with the given parent ids, if provided.

  • kwargs (dict, optional) – optional keyword arguments to pass to database lookup function.

Returns:

list of Decoder records matching the query

Return type:

list

aopy.data.db.lookup_flash_sessions(mc_task_name='manual control', **kwargs)[source]

Returns list of entries for all flash sessions on the given date. See lookup_sessions() for details.

aopy.data.db.lookup_mc_sessions(mc_task_name='manual control', **kwargs)[source]

Returns list of entries for all manual control sessions on the given date. See lookup_sessions() for details.

aopy.data.db.lookup_sessions(id=None, subject=None, date=None, task_name=None, task_desc=None, session=None, project=None, experimenter=None, exclude_ids=[], filter_fn=<function <lambda>>, **kwargs)[source]

Returns list of entries for all sessions on the given date

Parameters:
  • id (int or list, optional) – Lookup sessions with the given ids, if provided.

  • subject (str, optional) – Lookup sessions with the given subject, if provided.

  • date (multiple, optional) – Lookup sessions from the given date, if provided. Accepts multiple formats: a datetime.date object; a (start, end) tuple of datetime.date objects; a (start, end) tuple of strings in the format ‘YYYY-MM-DD’; or a (year, month, day) tuple of integers.

  • task_name (str, optional) – Lookup sessions with the given task name, if provided. Examples include manual control, tracking, nothing, etc.

  • task_desc (str, optional) – Lookup sessions with the given task description, if provided. Examples include flash, simple center out, resting state, etc.

  • session (str, optional) – Lookup sessions with the given session name, if provided.

  • project (str, optional) – Lookup sessions with the given project name, if provided.

  • experimenter (str, optional) – Lookup sessions with the given experimenter, if provided.

  • exclude_ids (list, optional) – Exclude sessions with matching task entry ids, if provided.

  • filter_fn (function, optional) – Additional filtering, of signature fn(session)->bool. Defaults to lambda x: True.

  • kwargs (dict, optional) – optional keyword arguments to pass to database lookup function.

Returns:

list of TaskEntry sessions matching the query

Return type:

list

aopy.data.db.lookup_tracking_sessions(tracking_task_name='tracking', **kwargs)[source]

Returns list of entries for all tracking sessions on the given date. See lookup_sessions() for details.

aopy.data.db.save_decoder(decoder_parent, decoder, suffix, **kwargs)[source]

Save a new decoder to the database, associated with the given parent TaskEntry. If the decoder was trained on a specific session, use that as the parent. If not, use lookup_decoder_parent() or create_decoder_parent() to look up or create a new parent entry, respectively.

Parameters:
  • decoder_parent (TaskEntry) – the parent decoder entry

  • decoder (object) – the decoder object to save

  • suffix (str) – suffix to append to the decoder name

  • kwargs (dict, optional) – optional keyword arguments, including dbname to specify the database

Note

This only works if you have the bmi system path locally. See the BMI3D setup page to find this path and make it available on your system.

aopy.data.db.summarize_entries(entries, sum_trials=False)[source]

Generates a dataframe summarizing the subject, date, task, number of trials, and duration in minutes of each entry in the input list. Optionally sums the number of trials and duration for unique tasks across days for each subject.

Parameters:
  • entries (list) – list of bmi3d task entries

  • sum_trials (bool, optional) – sum the number of trials and duration across unique tasks for each day for each subject

Returns:

dataframe of entry summaries

Return type:

pd.DataFrame

Examples

date_obj = date.fromisoformat('2023-02-06')
entries = db.lookup_sessions(date=date_obj)
df = db.summarize_entries(entries)
display(df)
_images/db_summarize_sessions.png
df_unique = db.summarize_entries(entries, sum_trials=True)
display(df_unique)
_images/db_summarize_sessions_sum.png

Peslab

aopy.data.peslab.get_exp_var(exp_data, *args)[source]

Generate a list of variable names from .MAT-formatted experiment data

Parameters:

exp_data (dict) – MAT file data dict

Returns:

list of variable names in exp_data

Return type:

list

aopy.data.peslab.load_ecog_clfp_data(data_file_name, t_range=(0, -1), exp_file_name=None, mask_file_name=None, compute_mask=True)[source]

Load ECoG data file from a goose wireless dataset file.

Parameters:
  • data_file_name (str) – file path to data file

  • t_range (listlike, optional) – Start and stop times to read data. (0, -1) reads the entire file. Defaults to (0,-1).

  • exp_file_name (str, optional) – File path to experiment data JSON file. Defaults to None.

  • mask_file_name (str, optional) – File path to data quality mask file. Defaults to None.

  • compute_mask (bool, optional) – Compute a data quality mask array if no mask file is given or found. Defaults to True.

Raises:
  • NameError – If experiment file cannot be found, NameError is raised.

  • NameError – If mask file cannot be found, NameError is raised.

Returns:

Tuple containing:
data (nt x nch): numpy array of multichannel ECoG data
mask (numpy.array): binary mask indicating bad data samples
exp (dict): dictionary of experiment data

Return type:

tuple

aopy.data.peslab.load_experiment_data(exp_file_name)[source]

Reads experiment metadata from an experiment JSON file. Returns the complete data structure as a dictionary and returns electrode data as a pandas DataFrame.

Parameters:

exp_file_name (str) – JSON experiment data file path

Returns:

Tuple containing:
experiment (dict): dict data object containing experiment metadata. See lab documentation for more information.
electrode_df (DataFrame): pandas DataFrame containing microdrive electrode information. Individual channels are indexed along columns, column names are electrode IDs.

Return type:

tuple

aopy.data.peslab.load_mask_data(mask_file_name)[source]

Loads binary mask data from recording mask files. Binary True values indicate “bad” or noisy data not used in analyses.

Parameters:

mask_file_name (str) – file path to binary mask file

Returns:

numpy array of binary values. Length is equal to the number of time points in the respective data array.

Return type:

numpy.array

aopy.data.peslab.parse_file_info(file_path)[source]

Parses file strings for goose_wireless ECoG and LFP signal data into data parameters.

Parameters:

file_path (str) – path to the file’s location

Returns:

Tuple containing:
exp_file_name (str): JSON experiment data file path
mask_file_name (str): binary data mask file path
microdrive_name (str): string name of the microdrive type used to collect data in file_path
rec_type (str): recording modality reflected in this file (‘ECOG’, ‘LFP’, etc.)

Return type:

tuple

aopy.data.peslab.read_from_file(data_file_path, data_type, n_ch, n_read, n_offset, reshape_order='F')[source]

Reads recorded neural activity from a goose_wireless file.

Parameters:
  • data_file_path (str) – file path to data file

  • data_type (numeric type) – numpy numeric type reflecting the variable encoding in data_file_path

  • n_ch (int) – Number of channels in data_file_path

  • n_read (int) – Number of data samples read from data_file_path

  • n_offset (int) – Offset point defining where data reading starts

  • reshape_order (str, optional) – Data reshaping order. Defaults to ‘F’

Returns:

numpy array of neural activity stored in data_file_path

Return type:

np.array

aopy.data.peslab.read_from_start(data_file_path, data_type, n_ch, n_read)[source]

Read data from goose wireless data file. Reads a fixed number of samples from the start of the recording.

Parameters:
  • data_file_path (str) – file path to data file

  • data_type (numeric type) – numpy numeric type reflecting the variable encoding in data_file_path

  • n_ch (int) – number of channels saved in data_file_path

  • n_read (int) – number of time points to read from data_file_path

Returns:

numpy array of neural recording data saved in data_file_path

Return type:

np.array

Optitrack

aopy.data.optitrack.load_optitrack_data(data_dir, filename)[source]

This function loads a series of x, y, z positional data from the optitrack .csv file that has 1 rigid body exported with the following settings:

Markers: Off
Unlabeled markers: Off
Quality Statistics: Off
Rigid Bodies: On
Rigid Body Markers: Off
Bones: Off
Bone Markers: Off
Header Information: On
Optitrack format Version(s): 1.23

Required packages: pandas, numpy

Parameters:
  • data_dir (string) – Directory to load data from

  • filename (string) – File name to load within the data directory

Returns:

Tuple containing:
mocap_data_pos (nt, 3): Positional mocap data
mocap_data_rot (nt, 4): Rotational mocap data

Return type:

tuple

aopy.data.optitrack.load_optitrack_metadata(data_dir, filename, metadata_row=0)[source]

This function loads optitrack metadata from .csv file that has 1 rigid body exported with the following settings:

Markers: Off
Unlabeled markers: Off
Quality Statistics: Off
Rigid Bodies: On
Rigid Body Markers: Off
Bones: Off
Bone Markers: Off
Header Information: On
Optitrack format Version(s): 1.23

Required packages: csv, pandas

Parameters:
  • data_dir (string) – Directory to load data from

  • filename (string) – File name to load within the data directory

Returns:

Dictionary of metadata for an optitrack datafile

Return type:

dict

aopy.data.optitrack.load_optitrack_time(data_dir, filename)[source]

This function loads timestamps from the optitrack .csv file

Required packages: pandas, numpy

Parameters:
  • data_dir (string) – Directory to load data from

  • filename (string) – File name to load within the data directory

Returns:

Array of timestamps for each captured frame

Return type:

(nt)

Eye

aopy.data.eye.proc_eye_day(preproc_dir, subject, date, correlation_min=0.9, dry_run=False)[source]

Finds files from the given subject and date with the best eye calibration and automatically applies it to every recording on that day for that subject. If no good calibration is found, raises a ValueError exception.

Parameters:
  • preproc_dir (str) – base directory where the files live

  • subject (str) – Subject name

  • date (str) – Date of recording

  • correlation_min (float, optional) – correlation below which is unacceptable

  • dry_run (bool, optional) – if True, files will not be modified.

Raises:

ValueError

Returns:

tuple containing:
best_id (int): the task entry id with the highest mean absolute value correlation coefficient
te_ids (list of int): the ids to which the coeff were applied

Return type:

tuple
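
The selection rule implied by the return values (highest mean absolute correlation, rejected if below correlation_min) can be sketched as follows. pick_best_calibration and the correlations dictionary are hypothetical; how aopy computes the coefficients and applies them to files is not shown here.

```python
def pick_best_calibration(correlations, correlation_min=0.9):
    """Hypothetical helper (not part of aopy).

    correlations maps task entry id -> list of correlation coefficients.
    """
    # Score each calibration by the mean of the absolute coefficients
    scores = {
        te_id: sum(abs(c) for c in coeffs) / len(coeffs)
        for te_id, coeffs in correlations.items()
    }
    best_id = max(scores, key=scores.get)
    if scores[best_id] < correlation_min:
        raise ValueError(f"no calibration with correlation >= {correlation_min}")
    return best_id, scores[best_id]

best_id, score = pick_best_calibration({101: [0.95, -0.97], 102: [0.5, 0.6]})
```

Taking the absolute value means a strongly negative coefficient counts as a good calibration, consistent with the description of best_id above.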

Neuropixel

aopy.data.neuropixel.get_channel_bank_name(ch_bank_data, ch_config_dir='/data/channel_config_np', filename='channel_bank.npy')[source]

Get the information about which channels are used for recording. This function assumes the channel configuration is one of the following: long-br, middle, long-tr, top, long-tl, long-bl, bottom.

Parameters:
  • ch_bank_data (nch) – channel bank information contained in neuropixel

  • ch_config_dir (str, optional) – directory that contains the channel configuration file

  • filename (str, optional) – filename that includes all bank information.

Returns:

channel name (long-br, middle, long-tr, top, long-tl, long-bl, bottom)

Return type:

chname (str)

aopy.data.neuropixel.get_neuropixel_digital_input_times(data_dir, data_folder, datatype, node_idx=0, ex_idx=0, port_number=1)[source]

Computes the times at which the sync line turns on and off on the digital channel in OpenEphys. OpenEphys recordings don’t always begin at time index 0.

Parameters:
  • data_dir (str) – data directory where the data files are located

  • data_folder (str) – data folder where the data for one experiment is saved

  • datatype (str) – datatype. ‘ap’ or ‘lfp’

  • node_idx (int) – record node index. This is usually 0.

  • ex_idx (int) – experiment index. This is usually 0.

  • port_number (int) – port number that the probe is connected to. Natural number from 1 to 4.

Returns:

Tuple containing:
on_times (n_times): times at which sync line turned on
off_times (n_times): times at which sync line turned off

Return type:

tuple
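Because OpenEphys recordings do not always begin at time index 0, event sample numbers need an offset before they can be read as times from the start of the recording. The sketch below shows one way to do this, assuming the correction is a plain subtraction of the first recorded sample number. The function name digital_edge_times and all of its arguments are hypothetical and are not part of aopy or OpenEphys.

```python
def digital_edge_times(sample_numbers, states, first_sample, samplerate):
    """Split digital events into on/off times in seconds from recording start.

    sample_numbers: sample number of each event on the sync line
    states: 1 where the line turned on, 0 where it turned off
    first_sample: sample number of the first recorded sample (may be > 0)
    samplerate: sampling rate in Hz
    """
    on_times, off_times = [], []
    for sample, state in zip(sample_numbers, states):
        t = (sample - first_sample) / samplerate  # remove the start offset
        (on_times if state else off_times).append(t)
    return on_times, off_times


# Made-up events from a recording whose first sample number is 3000
on_times, off_times = digital_edge_times(
    sample_numbers=[33000, 36000, 63000, 66000],
    states=[1, 0, 1, 0],
    first_sample=3000,
    samplerate=30000,
)
```

With these made-up numbers the sync line turns on 1 s and 2 s after the start of the recording, and turns off 0.1 s after each onset.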

aopy.data.neuropixel.load_neuropixel_configuration(data_dir, data_folder, ex_idx=0, port_number=1)[source]

Gets neuropixel probe information from the xml configuration files made by OpenEphys. Channel numbers and electrode x positions are sorted in order of y position when saved by OpenEphys. This function also sorts the x and y positions in order of channel number.

Parameters:
  • data_dir (str) – where to find the file

  • data_folder (str) – the xml file that describes neuropixel probe configuration

  • ex_idx (int) – experiment idx. This is usually 0.

  • port_number (int) – port number that the probe is connected to. Natural number from 1 to 4.

Returns:

dictionary that contains the electrode configuration

Return type:

config (dict)
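The reordering described for load_neuropixel_configuration, where positions saved in order of y position are re-sorted by channel number, can be illustrated with a small sketch. The helper sort_by_channel, the dictionary keys, and the example values are all hypothetical. The real function reads these values from the OpenEphys xml file and may use a different layout.

```python
def sort_by_channel(channels, xpos, ypos):
    """Reorder electrode positions so they follow ascending channel number."""
    order = sorted(range(len(channels)), key=lambda i: channels[i])
    return {
        "channel": [channels[i] for i in order],
        "xpos": [xpos[i] for i in order],
        "ypos": [ypos[i] for i in order],
    }


# Made-up values listed in order of y position, as they might be saved
config = sort_by_channel(
    channels=[1, 0, 3, 2],
    xpos=[11, 27, 43, 59],
    ypos=[0, 0, 20, 20],
)
```

After sorting, index i of every list refers to channel i, so positions can be looked up directly by channel number.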

aopy.data.neuropixel.load_neuropixel_data(data_dir, data_folder, datatype, node_idx=0, ex_idx=0, port_number=1)[source]

Loads the neuropixel data object and metadata. The data object has 4 properties: samples, sample_numbers, timestamps, and metadata. See this link: https://github.com/open-ephys/open-ephys-python-tools/tree/main/src/open_ephys/analysis

Parameters:
  • data_dir (str) – data directory where the data files are located

  • data_folder (str) – data folder where the data for one experiment is saved

  • datatype (str) – datatype. ‘ap’ or ‘lfp’

  • node_idx (int) – record node index. This is usually 0.

  • ex_idx (int) – experiment index. This is usually 0.

  • port_number (int) – port number that the probe is connected to. Natural number from 1 to 4.

Returns:

Tuple containing:
rawdata (object): data object
metadata (dict): metadata

Return type:

tuple

aopy.data.neuropixel.load_neuropixel_event(data_dir, data_folder, datatype, node_idx=0, ex_idx=0, port_number=1)[source]

Loads neuropixel event data saved by OpenEphys, according to datatype.

Parameters:
  • data_dir (str) – data directory where the data files are located

  • data_folder (str) – data folder where the data for one experiment is saved

  • datatype (str) – datatype. ‘ap’ or ‘lfp’

  • node_idx (int) – record node index. This is usually 0.

  • ex_idx (int) – experiment index. This is usually 0.

  • port_number (int) – port number that the probe is connected to. Natural number from 1 to 4.

Returns:

events data

Return type:

events (ndarray)

aopy.data.neuropixel.load_parsed_ksdata(kilosort_dir, data_dir)[source]

Loads kilosort data (spike indices, clusters, and labels) parsed into task entries. This data is not yet synchronized.

Parameters:
  • kilosort_dir (str) – kilosort directory (ex. ‘/data/preprocessed/kilosort’)

  • data_dir (str) – data directory that contains parsed data (ex. ‘2023-06-30_Neuropixel_ks_affi_bottom_port1_9847’)

Returns:

spike indices detected by kilosort (not spike times)
spike_clusters (nspikes): unit label detected by kilosort

Return type:

spike_indices (nspikes)
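Since the returned values are sample indices rather than spike times, a common next step is to group them by unit and divide by the sampling rate. The sketch below assumes a 30 kHz sampling rate and uses a hypothetical helper, spike_times_by_unit, that is not part of aopy. The resulting times are relative to the start of the recording and remain unsynchronized with other systems.

```python
from collections import defaultdict


def spike_times_by_unit(spike_indices, spike_clusters, samplerate):
    """Group spike sample indices by unit label and convert them to seconds."""
    times = defaultdict(list)
    for idx, unit in zip(spike_indices, spike_clusters):
        times[unit].append(idx / samplerate)
    return dict(times)


# Made-up spike indices and unit labels; 30000 Hz is an assumed sampling rate
unit_times = spike_times_by_unit(
    spike_indices=[300, 1500, 3000, 6000],
    spike_clusters=[4, 7, 4, 7],
    samplerate=30000,
)
```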