Multimodal Temporal Data Processing in Autonomous Driving
Chair of Robotics, Artificial Intelligence and Real-time Systems
Master Seminar
M.Sc. Emec Ercelik
Office 03.07.057
WS 20-21
Autonomous driving
* Geiger, Andreas, Philip Lenz, and Raquel Urtasun. "Are we ready for autonomous driving? The KITTI vision benchmark suite." 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012.
Understanding the surroundings
Listing the objects
Localizing the objects
Matching objects in frames
Estimating intentions
Supervised learning
Data
Multi-modal data?
* Geiger, Andreas, Philip Lenz, and Raquel Urtasun. "Are we ready for autonomous driving? The KITTI vision benchmark suite." 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012.
Images
Multi-view images
Lidar point clouds
* Chadwick, Simon, Will Maddern, and Paul Newman. "Distant vehicle detection using radar and vision." arXiv preprint arXiv:1901.10951 (2019).
** Caesar, Holger, et al. "nuScenes: A multimodal dataset for autonomous driving." arXiv preprint arXiv:1903.11027 (2019).
Radar
** Jain, Ashesh, et al. "Car that knows before you do: Anticipating maneuvers via learning temporal driving models." Proceedings of the IEEE International Conference on Computer Vision. 2015.
GPS
IMU
...
Temporal data?
Sequences of images
Video
Sequences of point clouds
A tensor of shape n x 4 x T (n points, 4 values per point, T time steps)
Sequences of radar feature maps
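As a minimal sketch of the n x 4 x T point cloud tensor mentioned above: the snippet below stacks T lidar sweeps into one NumPy array. It assumes that each point carries 4 values (for example x, y, z and intensity, which the slides do not specify) and that sweeps with different point counts are randomly subsampled or zero-padded to a fixed n. The function name stack_point_clouds and the padding strategy are illustrative choices, not part of the seminar material.

```python
import numpy as np

def stack_point_clouds(clouds, n_points, seed=0):
    """Stack T point clouds of shape (n_i, 4) into one (n_points, 4, T) tensor.

    Assumption: each row holds 4 values per point (e.g. x, y, z, intensity).
    """
    rng = np.random.default_rng(seed)
    frames = []
    for cloud in clouds:
        cloud = np.asarray(cloud, dtype=np.float32)
        if cloud.shape[0] >= n_points:
            # Too many points: keep a random subset without replacement.
            idx = rng.choice(cloud.shape[0], size=n_points, replace=False)
            frame = cloud[idx]
        else:
            # Too few points: pad with all-zero rows.
            pad = np.zeros((n_points - cloud.shape[0], 4), dtype=np.float32)
            frame = np.concatenate([cloud, pad], axis=0)
        frames.append(frame)
    # The new last axis is time, so the result has shape (n_points, 4, T).
    return np.stack(frames, axis=-1)

# Three synthetic sweeps with different numbers of points.
sweeps = [np.random.rand(k, 4) for k in (120, 80, 100)]
tensor = stack_point_clouds(sweeps, n_points=100)
print(tensor.shape)  # (100, 4, 3)
```

Zero-padding is only one possible convention. A real pipeline might instead voxelize the sweeps or keep a per-frame mask so that padded rows are not mistaken for points at the sensor origin.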
Potential advantages of multi-modal data & temporal data
More information about the surroundings
Compensation for degraded data quality under different conditions
Estimation of the intentions of actors in the environment
However, no single solution fits everything into one framework
Always review the literature
Propose your own solutions
Topics
1. Graph neural networks for object detection and tracking
2. Attention-based networks for object detection
3. Representation learning from different data types
4. Transfer learning for 3D object detection
Review studies regarding the topic
Prepare a report
Write a peer review of one of the other reports
Procedure
1. Form a group of two or three (otherwise I will form groups from those who selected the same topics)
2. Choose two topics that interest you and send them via e-mail (09.11.2020)
3. You will get a notification e-mail regarding the assigned topic (13.11.2020)
4. Start looking at the resources and the state-of-the-art papers (13.11-27.11.2020)
5. Initial meeting with groups to discuss the collected materials and expected results (27.11.2020)
6. Midterm session to discuss the progress together (18.12.2020)
7. Write a seminar paper on your work and submit the first draft (15.01.2021)
8. Present your work (05.02, 12.02.2021)
9. Submit the final version of your paper (26.02.2021)
10. Write a peer review of the report of your peers assigned to you (12.03.2021)
Note: All meetings take place online
Information about the Seminar
Time and Location: Zoom / Fridays 9:00-11:00
Check the web page of the seminar regularly
Report: 6-8 pages, two-column IEEE format
GitLab Repository
Each group will be granted access to their own repository
There will be an initial meeting for each group with the supervisor regarding their topic (Date and time will be discussed with the supervisor)
Update your repository regularly with your work
All work should be uploaded to the Git repository before the deadlines
Good documentation for GitLab is available here
Grading
Extracting the related state-of-the-art resources (20%)
Writing a high-quality scientific report (40%)
Revising and writing a review (10%)
Presenting the work (30%)
Attendance at the presentation session is mandatory
Notes on Plagiarism
Avoid any kind of Copy & Paste!
Cite ALL of the scientific works, ideas and the concepts you use!
What if …?
Seminar Grade = 5.0
The responsible department at TUM will officially initiate an investigation
General Information and Resources (Hyperlinks)
IEEE LaTeX template for writing scientific papers
LaTeX editor for the final report
A good reference on How to Write a Scientific Paper
Your presentation must not be like this!
A useful tool to manage your references and citations