WEBVTT 00:00:00.000 --> 00:00:05.260 align:middle line:90% 00:00:05.260 --> 00:00:09.040 align:middle line:84% Hi and welcome to this RITMO course on motion capture. 00:00:09.040 --> 00:00:11.230 align:middle line:84% My name is Alexander Jensenius, and I'm 00:00:11.230 --> 00:00:15.340 align:middle line:84% a professor of Music Technology here at the University of Oslo. 00:00:15.340 --> 00:00:17.260 align:middle line:90% And my name is Jonna Vuoskoski. 00:00:17.260 --> 00:00:20.530 align:middle line:84% And I'm an associate professor in Music Cognition, 00:00:20.530 --> 00:00:22.510 align:middle line:84% also here at the University of Oslo. 00:00:22.510 --> 00:00:27.010 align:middle line:84% And the topic of this course that we are going to teach you 00:00:27.010 --> 00:00:30.910 align:middle line:90% now is on motion capture. 00:00:30.910 --> 00:00:33.228 align:middle line:84% But, what is actually motion capture? 00:00:33.228 --> 00:00:35.020 align:middle line:84% Well, that's actually a very good question, 00:00:35.020 --> 00:00:37.150 align:middle line:84% because it's not so easy to answer. 00:00:37.150 --> 00:00:39.792 align:middle line:84% Because some people think about motion capture as these suits 00:00:39.792 --> 00:00:41.500 align:middle line:84% that you put on, with markers, et cetera. 00:00:41.500 --> 00:00:44.140 align:middle line:84% But you may also think about motion capture 00:00:44.140 --> 00:00:46.900 align:middle line:84% as, more generally, just the way of being able to capture 00:00:46.900 --> 00:00:49.120 align:middle line:90% some kind of human body motion. 00:00:49.120 --> 00:00:52.060 align:middle line:84% And in general, I would say that you can kind of separate 00:00:52.060 --> 00:00:55.150 align:middle line:84% between observational-based motion capture, where 00:00:55.150 --> 00:00:57.970 align:middle line:90% you can watch someone move. 
00:00:57.970 --> 00:01:00.490 align:middle line:84% And you can kind of just by looking at them, 00:01:00.490 --> 00:01:02.830 align:middle line:84% you can try to analyse what's going on. 00:01:02.830 --> 00:01:05.420 align:middle line:84% You can write it down with pen and paper, et cetera. 00:01:05.420 --> 00:01:07.990 align:middle line:84% So that's one type of motion capture. 00:01:07.990 --> 00:01:10.870 align:middle line:84% Another one is to use different types of technologies 00:01:10.870 --> 00:01:12.490 align:middle line:84% to be able to capture the motion. 00:01:12.490 --> 00:01:14.530 align:middle line:84% And that's the focus of the course 00:01:14.530 --> 00:01:16.810 align:middle line:84% that we're going to have here now, where 00:01:16.810 --> 00:01:20.650 align:middle line:84% we're going to look at two different types of motion 00:01:20.650 --> 00:01:21.910 align:middle line:90% capture. 00:01:21.910 --> 00:01:23.410 align:middle line:84% Camera-based on one side, where you 00:01:23.410 --> 00:01:27.070 align:middle line:84% use cameras of different types to be 00:01:27.070 --> 00:01:28.990 align:middle line:90% able to capture the motion. 00:01:28.990 --> 00:01:31.660 align:middle line:84% And sensor-based, where you use different types of sensors 00:01:31.660 --> 00:01:35.320 align:middle line:84% that you put on the body, for example, to measure this. 00:01:35.320 --> 00:01:37.145 align:middle line:84% But then, Jonna, you have also been 00:01:37.145 --> 00:01:39.520 align:middle line:84% working with motion capture quite a bit in your research. 00:01:39.520 --> 00:01:42.100 align:middle line:84% And then, why do we actually want 00:01:42.100 --> 00:01:44.690 align:middle line:84% to work with motion capture in the first place in music 00:01:44.690 --> 00:01:45.190 align:middle line:90% research? 00:01:45.190 --> 00:01:47.410 align:middle line:84% And also, outside of music research? 
00:01:47.410 --> 00:01:51.160 align:middle line:84% Yeah, well, that's a very good question. 00:01:51.160 --> 00:01:55.540 align:middle line:84% First of all, we know that the body is actually 00:01:55.540 --> 00:01:58.720 align:middle line:84% very important for different kinds of cognitive processes, 00:01:58.720 --> 00:02:00.880 align:middle line:84% also from the perspective of music cognition. 00:02:00.880 --> 00:02:02.350 align:middle line:84% If we want to really understand how 00:02:02.350 --> 00:02:06.820 align:middle line:84% we make sense of music, how we react to music, 00:02:06.820 --> 00:02:08.919 align:middle line:84% we also have to look at the body. 00:02:08.919 --> 00:02:13.400 align:middle line:84% And human body movement is a super complex phenomenon. 00:02:13.400 --> 00:02:16.990 align:middle line:84% And if we want to really scientifically study it, 00:02:16.990 --> 00:02:23.350 align:middle line:84% we have to have a method of objectively, reliably measuring 00:02:23.350 --> 00:02:24.100 align:middle line:90% it. 00:02:24.100 --> 00:02:26.560 align:middle line:84% And motion capture is one way of doing it, 00:02:26.560 --> 00:02:29.350 align:middle line:84% whether that's some sort of an observational method 00:02:29.350 --> 00:02:33.460 align:middle line:84% or like a technology-based method. 00:02:33.460 --> 00:02:36.070 align:middle line:84% And going more into these technology 00:02:36.070 --> 00:02:41.470 align:middle line:84% based measures or methods also, the human eye 00:02:41.470 --> 00:02:42.460 align:middle line:90% is not that accurate. 00:02:42.460 --> 00:02:44.860 align:middle line:84% If we want to look at very sort of fine details 00:02:44.860 --> 00:02:48.040 align:middle line:84% or very fine movements, we need to have 00:02:48.040 --> 00:02:51.970 align:middle line:84% something that goes beyond the capabilities of the human eye. 
00:02:51.970 --> 00:02:55.000 align:middle line:84% So motion capture methods are able to capture 00:02:55.000 --> 00:02:58.570 align:middle line:84% these very sort of minute, micro movements even. 00:02:58.570 --> 00:03:01.060 align:middle line:84% When we are, for example, trying to stand still 00:03:01.060 --> 00:03:04.280 align:middle line:90% while listening to music. 00:03:04.280 --> 00:03:09.790 align:middle line:84% And also, we are able to extract different kinds of features 00:03:09.790 --> 00:03:10.750 align:middle line:90% from motion. 00:03:10.750 --> 00:03:14.230 align:middle line:84% So like the velocity or acceleration 00:03:14.230 --> 00:03:16.960 align:middle line:84% or the size of movements, when we are engaging 00:03:16.960 --> 00:03:19.900 align:middle line:84% in different kinds of music-related activities, 00:03:19.900 --> 00:03:24.430 align:middle line:84% like dancing or playing a musical instrument. 00:03:24.430 --> 00:03:26.800 align:middle line:84% And then we can, in experimental settings, 00:03:26.800 --> 00:03:29.170 align:middle line:84% we can perhaps vary the conditions. 00:03:29.170 --> 00:03:32.410 align:middle line:84% And then we can see how these different features change 00:03:32.410 --> 00:03:34.040 align:middle line:84% between those different conditions. 00:03:34.040 --> 00:03:37.810 align:middle line:84% So, for example, if you are interested in looking 00:03:37.810 --> 00:03:39.850 align:middle line:84% at dance movements, you could see 00:03:39.850 --> 00:03:43.390 align:middle line:84% how the acceleration and size of dance movements 00:03:43.390 --> 00:03:48.700 align:middle line:84% varies depending on the genre of music that you are dancing to. 
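
NOTE
A minimal sketch of the feature extraction mentioned here: velocity,
acceleration, and one possible measure of movement size, computed from the
3D positions of a single marker. The sampling rate, the position data, and
all function names are assumptions made for illustration; the course does
not specify them at this point.
```python
# Sketch: kinematic features from one marker's 3D positions (hypothetical data).
import math
def finite_difference(samples, rate_hz):
    """Forward-difference derivative of a list of (x, y, z) tuples."""
    return [tuple((b - a) * rate_hz for a, b in zip(p, q))
            for p, q in zip(samples, samples[1:])]
def magnitudes(vectors):
    """Euclidean length of each vector."""
    return [math.sqrt(sum(c * c for c in v)) for v in vectors]
def movement_size(samples):
    """Cumulative distance travelled, one possible reading of 'size'."""
    return sum(math.dist(p, q) for p, q in zip(samples, samples[1:]))
RATE_HZ = 100  # assumed sampling rate in frames per second
# Assumed positions in metres: steady motion along x at 1 m/s.
positions = [(0.01 * i, 0.0, 0.0) for i in range(5)]
velocity = finite_difference(positions, RATE_HZ)      # m/s, one per frame pair
acceleration = finite_difference(velocity, RATE_HZ)   # m/s^2
speed = magnitudes(velocity)                          # scalar speed per frame pair
```
Forward differences amplify measurement noise, so real recordings are
usually smoothed before differentiating.
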
00:03:48.700 --> 00:03:56.410 align:middle line:84% And finally, also, you can use motion capture data 00:03:56.410 --> 00:04:01.030 align:middle line:84% to illustrate a movement or make illustrations 00:04:01.030 --> 00:04:04.000 align:middle line:84% that you can use in publications or even animations 00:04:04.000 --> 00:04:07.000 align:middle line:84% of movements, the sort of stick figure animations 00:04:07.000 --> 00:04:08.740 align:middle line:90% or other types of animations. 00:04:08.740 --> 00:04:12.070 align:middle line:84% And these you can also use them in other types of experiments, 00:04:12.070 --> 00:04:15.430 align:middle line:84% like perceptual experiments, where you ask people to watch 00:04:15.430 --> 00:04:17.709 align:middle line:90% these videos, make evaluations. 00:04:17.709 --> 00:04:22.390 align:middle line:84% And then you can, again, try to find these associations 00:04:22.390 --> 00:04:24.640 align:middle line:84% between these objectively extracted 00:04:24.640 --> 00:04:29.830 align:middle line:84% movement features and people's evaluations, for example. 00:04:29.830 --> 00:04:32.800 align:middle line:90% 00:04:32.800 --> 00:04:37.450 align:middle line:84% But are there any challenges associated with motion capture? 00:04:37.450 --> 00:04:39.933 align:middle line:84% There are just challenges everywhere, really, 00:04:39.933 --> 00:04:41.350 align:middle line:84% when you work with motion capture. 00:04:41.350 --> 00:04:43.720 align:middle line:84% And that's something we have been exploring 00:04:43.720 --> 00:04:45.670 align:middle line:84% quite a lot at RITMO because we have 00:04:45.670 --> 00:04:48.110 align:middle line:84% been looking at using different types of motion capture 00:04:48.110 --> 00:04:48.610 align:middle line:90% systems. 
00:04:48.610 --> 00:04:50.260 align:middle line:84% And many of these challenges we're 00:04:50.260 --> 00:04:53.470 align:middle line:84% going to talk more about in the later chapters in this course 00:04:53.470 --> 00:04:53.970 align:middle line:90% between. 00:04:53.970 --> 00:04:58.060 align:middle line:84% But very briefly, you can say that, for example, the location 00:04:58.060 --> 00:05:00.410 align:middle line:84% that you're using is very important for what type 00:05:00.410 --> 00:05:01.660 align:middle line:90% of motion capture you can use. 00:05:01.660 --> 00:05:04.725 align:middle line:84% For example, in a setting like we have here, 00:05:04.725 --> 00:05:06.600 align:middle line:84% it's a very different kind of setup than when 00:05:06.600 --> 00:05:08.100 align:middle line:90% you are standing in our lab. 00:05:08.100 --> 00:05:10.710 align:middle line:84% And we have a much more controlled environment. 00:05:10.710 --> 00:05:12.730 align:middle line:84% Of course, it also depends on how many people 00:05:12.730 --> 00:05:13.980 align:middle line:90% that you are going to capture. 00:05:13.980 --> 00:05:15.930 align:middle line:90% Are you looking at one person? 00:05:15.930 --> 00:05:18.420 align:middle line:84% Or are you looking at multiple people? 00:05:18.420 --> 00:05:20.890 align:middle line:84% Are you looking at the musician having an instrument, 00:05:20.890 --> 00:05:21.390 align:middle line:90% for example? 00:05:21.390 --> 00:05:23.328 align:middle line:90% That's something else than 00:05:23.328 --> 00:05:24.870 align:middle line:84% if you want to try to capture someone 00:05:24.870 --> 00:05:30.130 align:middle line:84% just kind of moving to this to turn to music, for example. 00:05:30.130 --> 00:05:33.720 align:middle line:84% And what we see is, also, that we have so many different types 00:05:33.720 --> 00:05:34.530 align:middle line:90% of systems.
00:05:34.530 --> 00:05:37.517 align:middle line:84% And also how to integrate them is another challenge. 00:05:37.517 --> 00:05:39.600 align:middle line:84% Often we want to use motion capture together with, 00:05:39.600 --> 00:05:42.990 align:middle line:84% for example, EMG, which measures the muscle tension. 00:05:42.990 --> 00:05:45.740 align:middle line:84% Or ECG for measuring the heart rate. 00:05:45.740 --> 00:05:47.490 align:middle line:84% Of course, since we are music researchers, 00:05:47.490 --> 00:05:49.430 align:middle line:90% we also want to record audio. 00:05:49.430 --> 00:05:53.160 align:middle line:84% Possibly we want to have video as well, perhaps, in addition 00:05:53.160 --> 00:05:54.750 align:middle line:90% to kind of some sensors. 00:05:54.750 --> 00:05:58.120 align:middle line:84% And then the integration of all these things is tricky. 00:05:58.120 --> 00:05:59.820 align:middle line:84% Of course, we generate a lot of data. 00:05:59.820 --> 00:06:02.130 align:middle line:84% And how do we store this data in a meaningful way? 00:06:02.130 --> 00:06:03.850 align:middle line:84% And synchronise all of these, et cetera. 00:06:03.850 --> 00:06:06.090 align:middle line:84% So there are tonnes of challenges really. 00:06:06.090 --> 00:06:09.450 align:middle line:84% And many of these, we will also discuss later in the course 00:06:09.450 --> 00:06:13.020 align:middle line:84% to really give you also some tools 00:06:13.020 --> 00:06:15.700 align:middle line:84% and give you some of our experience, 00:06:15.700 --> 00:06:19.060 align:middle line:84% when it comes to trying to solve these issues. 00:06:19.060 --> 00:06:22.920 align:middle line:84% So that's kind of part of all the challenges of the data 00:06:22.920 --> 00:06:24.930 align:middle line:90% collection itself.
00:06:24.930 --> 00:06:28.290 align:middle line:84% But then, after we have done the data collection, 00:06:28.290 --> 00:06:31.360 align:middle line:84% we are moving then towards the analysis. 00:06:31.360 --> 00:06:33.870 align:middle line:84% Of course, you need to clean up the data, et cetera. 00:06:33.870 --> 00:06:36.270 align:middle line:84% And then we can move on to the analysis. 00:06:36.270 --> 00:06:38.690 align:middle line:84% And we'll talk more about more advanced analysis things 00:06:38.690 --> 00:06:39.190 align:middle line:90% later on. 00:06:39.190 --> 00:06:41.190 align:middle line:84% But just very briefly, what type of analysis 00:06:41.190 --> 00:06:44.250 align:middle line:84% can you do really with motion capture data? 00:06:44.250 --> 00:06:49.380 align:middle line:84% Well, broadly speaking, you can do both different qualitative 00:06:49.380 --> 00:06:53.260 align:middle line:84% as well as quantitative analysis with motion capture. 00:06:53.260 --> 00:06:56.760 align:middle line:84% So I suppose qualitative methods would 00:06:56.760 --> 00:07:00.060 align:middle line:90% involve types of interpretation. 00:07:00.060 --> 00:07:01.500 align:middle line:84% Like if you want to, for example, 00:07:01.500 --> 00:07:03.180 align:middle line:84% categorise different kinds of movements 00:07:03.180 --> 00:07:08.340 align:middle line:84% or find what the functional purpose of those movements is, 00:07:08.340 --> 00:07:09.370 align:middle line:90% for example. 00:07:09.370 --> 00:07:11.220 align:middle line:84% And, of course, then quantitative methods 00:07:11.220 --> 00:07:15.120 align:middle line:84% involve statistics, measurement statistics, 00:07:15.120 --> 00:07:18.085 align:middle line:84% sometimes also machine-learning techniques.
00:07:18.085 --> 00:07:19.710 align:middle line:84% And you can also draw kind of parallels 00:07:19.710 --> 00:07:23.340 align:middle line:84% between these quantitative and qualitative approaches, 00:07:23.340 --> 00:07:25.950 align:middle line:84% as well as the more descriptive types 00:07:25.950 --> 00:07:28.810 align:middle line:84% of analysis and functional types of analysis. 00:07:28.810 --> 00:07:33.420 align:middle line:84% So descriptive types of analysis could, for example, 00:07:33.420 --> 00:07:36.420 align:middle line:84% relate to the kinematics of the movement, 00:07:36.420 --> 00:07:38.100 align:middle line:90% like acceleration or velocity. 00:07:38.100 --> 00:07:41.460 align:middle line:84% Or spatial properties, like the size of movement, 00:07:41.460 --> 00:07:44.190 align:middle line:84% or location, or where the movement is happening 00:07:44.190 --> 00:07:49.650 align:middle line:84% in space or time, time-based aspects 00:07:49.650 --> 00:07:52.950 align:middle line:84% like the frequency, or periodicity, or speed, 00:07:52.950 --> 00:07:57.070 align:middle line:84% or things like that of those movements. 00:07:57.070 --> 00:07:59.430 align:middle line:84% And then, the functional analysis 00:07:59.430 --> 00:08:05.670 align:middle line:84% is looking more at what is the purpose of the movement. 00:08:05.670 --> 00:08:08.760 align:middle line:84% Or, for example, if you're playing a musical instrument, 00:08:08.760 --> 00:08:12.510 align:middle line:84% is it specifically a sound-producing action or just 00:08:12.510 --> 00:08:14.610 align:middle line:84% a sound or a sound-accompanying action 00:08:14.610 --> 00:08:16.830 align:middle line:84% or some other type of communicative action, 00:08:16.830 --> 00:08:19.210 align:middle line:90% for example. 
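
NOTE
A minimal sketch of one time-based descriptive feature mentioned here,
periodicity, estimated with autocorrelation on a single movement signal
(for example the vertical position of one marker). The synthetic 2 Hz
signal, the 50 Hz sampling rate, and the function names are assumptions
for illustration only, not a method prescribed in the course.
```python
# Sketch: estimate the dominant period of a movement signal (hypothetical data).
import math
def autocorrelation(signal, lag):
    """Mean product of the mean-centred signal with itself shifted by `lag`."""
    mean = sum(signal) / len(signal)
    centred = [s - mean for s in signal]
    pairs = zip(centred, centred[lag:])
    return sum(a * b for a, b in pairs) / (len(centred) - lag)
def dominant_period(signal, rate_hz):
    """Period in seconds at the first autocorrelation peak after lag 0, or None."""
    r = [autocorrelation(signal, lag) for lag in range(len(signal) // 2)]
    for lag in range(1, len(r) - 1):
        if r[lag - 1] < r[lag] >= r[lag + 1]:
            return lag / rate_hz
    return None
RATE_HZ = 50  # assumed sampling rate in frames per second
# Assumed signal: 4 seconds of a regular 2 Hz bounce.
signal = [math.sin(2 * math.pi * 2.0 * n / RATE_HZ) for n in range(200)]
period = dominant_period(signal, RATE_HZ)
frequency = 1.0 / period
```
For this synthetic signal the first peak falls at a lag of 25 samples,
which corresponds to a period of 0.5 s, or 2 Hz.
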
00:08:19.210 --> 00:08:22.620 align:middle line:84% And so indeed, these descriptive methods tend to be 00:08:22.620 --> 00:08:26.910 align:middle line:84% quantitative, so extracting these quantitative features 00:08:26.910 --> 00:08:29.250 align:middle line:84% and analysing them with statistics. 00:08:29.250 --> 00:08:31.410 align:middle line:84% Whereas making these interpretations 00:08:31.410 --> 00:08:34.860 align:middle line:84% about the functionality is a bit more qualitative. 00:08:34.860 --> 00:08:42.350 align:middle line:84% And you might use illustrations as help in that, as well. 00:08:42.350 --> 00:08:44.723 align:middle line:84% And we'll get back to some of these 00:08:44.723 --> 00:08:46.140 align:middle line:84% later on in the course to give you 00:08:46.140 --> 00:08:48.780 align:middle line:84% some examples, but also some specific tools 00:08:48.780 --> 00:08:51.260 align:middle line:84% to be able to do this type of analysis. 00:08:51.260 --> 00:08:54.720 align:middle line:84% So I think that's enough for the introduction. 00:08:54.720 --> 00:08:57.000 align:middle line:84% You will be able to learn more throughout. 00:08:57.000 --> 00:08:59.320 align:middle line:84% We have split this course up into different sections, 00:08:59.320 --> 00:09:02.162 align:middle line:84% so we will go through the different technologies. 00:09:02.162 --> 00:09:04.620 align:middle line:84% First of all, of course, we need to start with a little bit 00:09:04.620 --> 00:09:06.870 align:middle line:84% of background about the body and how the body is actually 00:09:06.870 --> 00:09:08.550 align:middle line:90% working, if that makes sense. 00:09:08.550 --> 00:09:11.400 align:middle line:84% And we'll have a combination of videos and text material, 00:09:11.400 --> 00:09:13.500 align:middle line:84% so that you will be able to learn 00:09:13.500 --> 00:09:15.660 align:middle line:90% from both of these modalities.
00:09:15.660 --> 00:09:19.578 align:middle line:84% And of course also, there are some quizzes and tests 00:09:19.578 --> 00:09:21.120 align:middle line:84% that you can take along the way also, 00:09:21.120 --> 00:09:24.090 align:middle line:84% to check that you have learned what is necessary to learn. 00:09:24.090 --> 00:09:26.550 align:middle line:84% So again, welcome to this RITMO course on motion capture. 00:09:26.550 --> 00:09:29.390 align:middle line:90% And enjoy the rest of this. 00:09:29.390 --> 00:09:35.000 align:middle line:90%