Blog

Learn About Our Meetup

5000+ Members

MEETUPS

LEARN, CONNECT, SHARE

Join our meetup, learn, connect, share, and get to know your Toronto AI community. 

JOB POSTINGS

INDEED POSTINGS

Browse through the latest deep learning, ai, machine learning postings from Indeed for the GTA.

CONTACT

CONNECT WITH US

Are you looking to sponsor space, be a speaker, or volunteer, feel free to give us a shout.

[D] Is it common to discard all ‘non-full-time-trajectory’ pedestrian from data in pedestrian prediction domain?

[D] Is it common to discard all 'non-full-time-trajectory' pedestrian from data in pedestrian prediction domain?

Hello, I’m relatively new to pedestrian trajectory prediction, and I have a question about the preprocessing of pedestrian data. Actually, I sent the question e-mail for the author of SocialGAN paper and no reply was received (yet). If anyone knows anything about this topic, your answer would help me a lot.

It is quite common that may ML pedestrian trajectory prediction employs LSTM in order to capture the temporal correlation. For example, look at the diagram below from the paper ”Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks’ (https://arxiv.org/pdf/1803.10892.pdf ). In the paper, the author said that the model was trained to observe 3.2 seconds (8f) then predict 4.8 seconds (12f).

Adopted from SocialGAN paper. Since some of the trajectories terminates earlier than others, LSTM steps should handle this uneven data length.

However, since the common pedestrian dataset (like ETH and UCY) collected from a single still camera, trajectory terminates when the agent moves out of sight. Thus, unlike the conceptual diagram in many papers, we cannot simply gather whole data for 3.2sec (8f) and let them consist of LSTM hidden state.

Therefore, I’ve curious about how other researchers handled this missing value problem.

While following the author’s code on GitHub, I’ve found that the code discarded every individual when their trajectory throughout the given interval is shorter than its whole interval length. (https://github.com/agrimgupta92/sgan/blob/master/sgan/data/trajectories.py, an excerpt from class ‘TrajectoryDataset’)

When the SocialGAN paper preprocesses the raw pedestrian data into pytorch dataset, agents with short trajectory length discarded from the dataset and not used for inference.

In result, the final dataset for training does not contain a nonnegligible portion of trajectories despite that some of them could exist near to ‘considered’ individuals and may affect their trajectories.

I want to raise two questions on this point.

Am I correctly understanding the code and its consequences? I would like to show a small toy example in order to visualize and verify my understandings of SocialGAN code.

Let us assume that seq_len = 4 (obs_len = 2, pred_len = 2) and consider the frame 1~4. Then, according to my understandings, SocialGAN code discards green agent(below) because it does not have a full 4-frame trajectory.

Crude visualization of pedestrian dataset.

But, in my opinion, I think the green agent would affect red agent’s trajectory a lot compared to blue one does. Although the missing value of the trajectory is an inevitable problem of the dataset itself (since no one can collect an indefinite amount of tracked position), ignoring the whole trajectory because of those value might decrease the overall accuracy.

Furthermore, since this preprocessing handles frame-based trajectory data with a moving window (i.e. it makes frame 1~4 as a first data and 2~5 as second, 3~6 as third, and so on.), an agent like green might affect red agent in some data, while discarded in other data2.

– Secondly, (under the assumption that my understanding of the situation is correct), Is this kind of preprocessing (discarding non-full-trajectory individuals) common for preprocessing of the public pedestrian datasets such as ETH and UCY, or is this preprocessing was particular for SocialGAN and there are other ways to preprocess pedestrian data? I’d like to know that is there another way to deal with this missing value problem.

I’m currently trying to implement a model and preprocessing which can alleviate this problem, but I’m not certain whether this problem had been already addressed and well-known in CV/ML society, or it hadn’t.

Again, any response or shared opinion will be grateful 🙂

submitted by /u/nokpil
[link] [comments]

Next Meetup

 

Days
:
Hours
:
Minutes
:
Seconds

 

Plug yourself into AI and don't miss a beat

 


Toronto AI is a social and collaborative hub to unite AI innovators of Toronto and surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, vr, robotics and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.