UW Neuroscience Intro MATLAB Bootcamp
Rat Social Interactions: Behavioral Data
The data here are a subset of RatSI, an open-source social interaction behavioral database described by Lorbach et al. (2018). The database includes nine videos, each about 15 minutes long, of two rats interacting in a 90 cm x 90 cm open field, along with complete manual annotations of the social interactions in these videos and tracking of the rats' centers, tails, and noses. Social behaviors noted include:
We're going to break down this dataset and use it to answer some basic questions about the behaviors of the rats: do the rates of different behaviors vary over time? Is there any correlation or significant difference between the speeds of the rats over the course of the recording? What is the average distance between rats for each annotated behavior and how long, on average, do they perform each behavior? There are a lot of questions you could delve into with this dataset, so definitely feel free to make your own hypotheses/experimental questions and test them!
MATLAB Concepts Covered:
importing variables from spreadsheets, parsing text variables, doing complex calculations on entire vectors (without using loops), visualizing results, using simple built-in statistical functions, binning data, automation
Beginner Project: Make some basic computations based on this behavioral data to test hypotheses about social behaviors.
1. Load in the data and examine the variables
You need two data files: one is in MATLAB format, and the other is in CSV (comma-separated values) format and must be imported. First, load tracking.mat, which has three variables:
frame: lists the frame number of the annotated videos
rat_marked: a structure that contains position information for the marked rat
rat_marked.center: (x, y) position of the center of the rat
rat_marked.tail: (x, y) position of the tail of the rat
rat_marked.nose: (x, y) position of the nose
rat_white: a structure with the position information for the other rat
Load annotations.csv by double clicking on it, which should pop up an interactive window for importing variables. There are three columns: frame, action, and subject. We want to load in the latter two columns as text (you can change the type by clicking on the arrow in the column heading). The frame variable here is the same as the frame variable that you already have in your workspace, so you don't need to load it in.
action: annotated behaviors from the videos
subject: a record of which rat was involved in each behavior (marked or white)
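If you would rather import the annotations from code than through the interactive window, readtable is one option. The sketch below writes a tiny stand-in CSV to a temporary file so that it runs on its own; the behavior labels in it are placeholders, not the real RatSI annotations, and with the real data you would pass 'annotations.csv' instead.

```matlab
% Write a tiny stand-in CSV so this sketch runs without the real file.
% The labels below are placeholders, not the real annotations.
csvFile = [tempname '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, 'frame,action,subject\n');
fprintf(fid, '1,Solitary,marked\n');
fprintf(fid, '2,Approaching,white\n');
fprintf(fid, '3,Approaching,white\n');
fclose(fid);

% readtable treats the first row as column names, so the headings
% do not end up inside the data.
T = readtable(csvFile);
action  = T.action;    % text columns arrive as cell arrays of character vectors by default
subject = T.subject;

delete(csvFile);       % clean up the temporary file
```

Imported this way, action and subject should already have one entry per frame.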
How do these variables differ from the position variables above, and what can you do to fix the problem? (click for answer)
These variables are one value longer than the position variables because of the headings. Remove the headings.
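A minimal sketch of that fix, assuming the import left the column heading as the first element of each variable. The values here are placeholders standing in for the imported data.

```matlab
% Stand-in for what the import tool might produce: the column heading
% sits in element 1. Labels are placeholders.
frame   = (1:3)';
action  = {'action';  'Solitary'; 'Approaching'; 'Approaching'};
subject = {'subject'; 'marked';   'white';       'white'};

% Delete the heading entry from each text variable.
action(1)  = [];
subject(1) = [];

% Every variable should now have one entry per frame.
sameLength = numel(action) == numel(frame) && numel(subject) == numel(frame);
```

The same indexing works whether the import produced a cell array or a string array.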
Additionally, I've included the raw video file if you want to see what it looks like (and it might come in handy below...)
2. Get to know the data.
How long (in minutes) was this recording? (click for answer)
14 minutes 20 seconds
Click for hint.
You're not going to be able to determine this from the numerical data alone.
What is the average duration of a single frame? (click for answer)
0.04 seconds
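The arithmetic behind that answer can be sketched as follows. The frame count used here is an assumption: it is the value implied by the two answers above, not one read from the file. With the real data you would use numel(frame) instead.

```matlab
% Recording length from the answer above: 14 minutes 20 seconds.
durationSec = 14*60 + 20;            % 860 s

% Assumed frame count (implied by the answers, not read from the file).
nFrames = 21500;

frameDur  = durationSec / nFrames;   % seconds per frame
frameRate = 1 / frameDur;            % frames per second
```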
How many different behavioral states are present in this dataset? (click for answer)
10
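One way to count the distinct states is unique(), sketched here on a short placeholder list rather than the real action variable.

```matlab
% Placeholder labels; the real action variable has one entry per frame.
action = {'Solitary'; 'Approaching'; 'Approaching'; 'Following'; 'Solitary'};

% behaviors is the sorted list of distinct labels; whichBehavior gives,
% for every frame, the position of its label in that list.
[behaviors, ~, whichBehavior] = unique(action);
nBehaviors = numel(behaviors);

% Number of frames carrying each label, in the same order as behaviors.
framesPerBehavior = accumarray(whichBehavior, 1);
```

The third output of unique() is handy later on, because it turns the text labels into numbers you can index with.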
What percentage of behaviors were both rats engaged in? (click for answer)
60.36%
Where was each rat's center at approximately 1 minute into the recording? (click for answer)
white: (280, 420) marked: (465, 224)
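Looking up a position at a given time comes down to finding the frame closest to that time. The sketch below uses synthetic positions and assumes that each position field is an N-by-2 matrix of [x y] rows and that the first frame is at t = 0; check both against the real variables.

```matlab
frameDur = 0.04;                      % seconds per frame
nFrames  = 2000;                      % synthetic recording, 80 s long
t = (0:nFrames-1)' * frameDur;        % time stamp of each frame

% Synthetic track (assumed layout: one [x y] row per frame).
rat_marked.center = [linspace(100, 500, nFrames)', linspace(400, 200, nFrames)'];

% Index of the frame closest to the 60 second mark.
[~, idx] = min(abs(t - 60));
posAt1Min = rat_marked.center(idx, :);   % [x y] at about 1 minute
```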
3. Make a list of information you will need from your data.
Our goals are to determine the average distance between the rats when they are interacting with each other, how participation in social behaviors changes over the course of the recording, and whether the rats' velocities covary over time. Click for the full list of information I think will be useful for these goals; you may want other information as well.
- Approximate time point of each frame
- Distance between the rats, as measured from their centers
- Distance between the rats, as measured from their noses
- Distance between the rats, as measured from their tails
- Velocity of each rat at each time point
- A way to make it easier to work with the action and subject variables (make them not strings)
4. Calculate each piece of information and put it in a variable.
- Time points should be relatively easy to compute based on the average frame length.
- Use the distance formula (here is a WikiHow with pictures about how to do this) to calculate three different distance vectors, one for each measurement point.
Click for hint.
Potentially helpful Google keywords: square root
- Use the same approach to find the distance each rat travels between consecutive measurements (I used the center measurement, but the choice is arbitrary). Then divide these distances by the frame duration (0.04 seconds).
What units are these velocities?
They're in tenths of a cm per second (that is, mm/s); divide by 10 to get cm/s.
- Brainstorm ways to make action and subject variables easier to work with, then click to continue.
I would create a time points x categories matrix of logicals, with each column recording whether or not a given category is happening at each time point. This process can be automated using a for loop and the unique() function.
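The calculations in this step could be sketched as follows. This is one possible approach on synthetic data, not the only way to do it. It assumes each position field is an N-by-2 matrix of [x y] rows and that the first frame is at t = 0; the behavior labels are placeholders.

```matlab
% Synthetic stand-in data (assumed layout: N-by-2 matrices of [x y]).
frameDur = 0.04;                              % seconds per frame (from step 2)
frame    = (1:5)';
rat_marked.center = [0 0; 3 4; 6 8; 6 8; 9 12];
rat_white.center  = [0 10; 3 4; 6 0; 0 0; 9 12];
action = {'Solitary'; 'Approaching'; 'Approaching'; 'Solitary'; 'Following'};

% Time stamp of each frame.
t = (frame - 1) * frameDur;

% Distance between the rats at every frame, computed on whole vectors.
d = rat_marked.center - rat_white.center;
centerDist = sqrt(sum(d.^2, 2));

% Speed of the marked rat: distance between consecutive positions
% divided by the time between them. Result is in position units per second.
step = diff(rat_marked.center);               % (N-1)-by-2 displacements
speedMarked = sqrt(sum(step.^2, 2)) / frameDur;

% Logical matrix: one row per frame, one column per behavior.
behaviors  = unique(action);
isBehavior = false(numel(action), numel(behaviors));
for k = 1:numel(behaviors)
    isBehavior(:, k) = strcmp(action, behaviors{k});
end
```

Note that diff() returns one fewer value than there are frames, so the speed vector is one element shorter than the distance vector. The nose and tail distances follow the same pattern with the corresponding fields.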
5. Do something with the data.
Use this information to answer the following questions. Think carefully about the best ways to visualize your results using one of the many plot types available in MATLAB.
- When both rats are recorded as being active in a behavior, what is the average distance between them? Plot your findings.
- What is the average time spent on each behavior? Plot your findings.
- Split the dataset into 10-second segments and count the instances of each behavior in each bin. Plot your findings.
- Is there any correlation between the rats' speeds over time? Is there a significant difference (by t-test) between the mean velocities of the two rats over time?
Feel free to ask your own questions instead of/in addition to these!
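Here is a hedged sketch of how the first three questions might be approached, using synthetic data with placeholder labels 'A' and 'B'. Two assumptions to flag: an "instance" of a behavior is taken to mean the start of an uninterrupted bout, and the filter on which rat is involved is left out, since it depends on the labels in your subject variable.

```matlab
frameDur = 0.04;                        % seconds per frame (from step 2)
nFrames  = 750;                         % 30 s of synthetic data
action   = repmat({'A'}, nFrames, 1);   % 'A' and 'B' are placeholder labels
action(251:500) = {'B'};
centerDist = 100 * ones(nFrames, 1);    % made-up distances
centerDist(251:500) = 20;

behaviors = unique(action);
nB = numel(behaviors);

% Assign every frame to a 10 second bin using whole frame counts.
framesPerBin = round(10 / frameDur);                 % 250 frames = 10 s
binIdx = floor((0:nFrames-1)' / framesPerBin) + 1;   % bin number of each frame
nBins  = max(binIdx);

meanDist    = zeros(nB, 1);       % mean distance during each behavior
meanBoutSec = zeros(nB, 1);       % mean length of one uninterrupted bout
onsetCounts = zeros(nBins, nB);   % bouts starting in each 10 s bin

for k = 1:nB
    active = strcmp(action, behaviors{k});     % logical column for behavior k
    onsets = diff([false; active]) == 1;       % frames where a bout begins
    meanDist(k)       = mean(centerDist(active));
    meanBoutSec(k)    = sum(active) * frameDur / sum(onsets);
    onsetCounts(:, k) = accumarray(binIdx(onsets), 1, [nBins 1]);
end
```

From here, bar(meanDist) or bar(onsetCounts) gives a quick first plot. For the speed question, corrcoef returns the correlation matrix of two vectors, and ttest (which requires the Statistics and Machine Learning Toolbox) runs a paired t-test.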
6. Iterate
Repeat this protocol with other files of rat data. Try to automate the process as much as possible. Are there any points where you will need to do manual work for each file? Are there any points where you need to modify your original code to make it more flexible?
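One possible shape for the automation is a loop over every .mat file in a folder. The sketch below first creates two small synthetic files in a temporary folder so that it runs on its own; the file names and contents are made up for illustration, and the variable layout (N-by-2 position matrices) is an assumption.

```matlab
% Create two small synthetic files so the loop has something to read.
% File names and contents are made up for this sketch.
folder = tempname;
mkdir(folder);
for i = 1:2
    frame = (1:2)';
    rat_marked.center = [0 0; 3 4] * i;
    rat_white.center  = [0 0; 0 0];
    save(fullfile(folder, sprintf('tracking_%02d.mat', i)), ...
         'frame', 'rat_marked', 'rat_white');
end

% Process every .mat file in the folder with the same code.
files = dir(fullfile(folder, '*.mat'));
meanDist = zeros(numel(files), 1);
for i = 1:numel(files)
    S = load(fullfile(folder, files(i).name));   % variables become fields of S
    d = S.rat_marked.center - S.rat_white.center;
    meanDist(i) = mean(sqrt(sum(d.^2, 2)));      % one summary value per file
end

rmdir(folder, 's');   % remove the temporary folder and its files
```

Loading into a struct (S = load(...)) keeps each file's variables from overwriting the ones already in your workspace, which matters once you process more than one file.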
Advanced Project Ideas
The purpose of this dataset is for cross-validation of new machine vision techniques to automate the labeling of behaviors. The authors of the paper associated with this dataset use combinations of the distances and angles between different points on the two rats' bodies.
Look for correlations between these distances and angles and different behavioral states and/or build a simple classifier that, when given these measurements, guesses the behavioral state. Test your classifier on another file of data (or see if the correlations you notice hold up in the other datasets).
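As a starting point, here is a minimal nearest-centroid classifier. This is a simple technique chosen for illustration, not the method used by the dataset's authors. The single feature and the labels 'Close' and 'Far' are synthetic placeholders; with real data each row of trainX would hold the distances and angles for one frame.

```matlab
% Synthetic training data: one feature per row, placeholder labels.
trainX = [10; 12; 14; 90; 95; 100];
trainY = {'Close'; 'Close'; 'Close'; 'Far'; 'Far'; 'Far'};

% "Training": the mean feature vector (centroid) of each class.
classes   = unique(trainY);
centroids = zeros(numel(classes), size(trainX, 2));
for k = 1:numel(classes)
    centroids(k, :) = mean(trainX(strcmp(trainY, classes{k}), :), 1);
end

% Classify new observations by the nearest class centroid.
testX = [11; 80];
predicted = cell(size(testX, 1), 1);
for i = 1:size(testX, 1)
    diffs = centroids - repmat(testX(i, :), numel(classes), 1);
    [~, nearest] = min(sum(diffs.^2, 2));   % smallest squared distance
    predicted{i} = classes{nearest};
end
```

To test it on another file, compare predicted against that file's annotated labels, for example with mean(strcmp(predicted, trueLabels)), where trueLabels is a hypothetical cell array of that file's annotations.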