Comparison of six decoder architectures for mapping primate neural activity to semantic image embeddings.

| Architecture | Top-1 (%) | Top-5 (%) | Median rank | Cosine similarity | Parameters | Time (s) |
|---|---:|---:|---:|---:|---:|---:|
| Linear | 5.2 ± 0.8 | 19.8 ± 1.5 | 42.3 ± 3.1 | 0.312 | 26K | 0.8 |
| MLP | 11.8 ± 1.2 | 35.6 ± 2.1 | 24.7 ± 2.4 | 0.458 | 132K | 3.2 |
| TA-MLP | 15.6 ± 1.4 | 43.2 ± 2.4 | 18.2 ± 2.1 | 0.524 | 148K | 4.5 |
| Temporal CNN | 14.2 ± 1.6 | 40.8 ± 2.6 | 20.4 ± 2.6 | 0.498 | 199K | 5.8 |
| Deep MLP | 13.4 ± 1.8 | 39.2 ± 2.8 | 21.8 ± 2.9 | 0.482 | 525K | 7.1 |
| Wide MLP | 12.8 ± 1.5 | 37.8 ± 2.5 | 22.6 ± 2.7 | 0.471 | 1050K | 9.4 |
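
The table does not state how its retrieval metrics were computed. As a rough illustration only, the sketch below shows one common way to obtain top-k accuracy, median rank, and mean cosine similarity: each decoded embedding is ranked against a candidate set of image embeddings by cosine similarity. The function names, the assumption that `predicted[i]` should retrieve `candidates[i]`, the optimistic tie handling, and the toy vectors are all hypothetical and not taken from the source. Plain Python is used to keep the example dependency-free; a real pipeline would more likely vectorise this.

```python
import math
from statistics import median


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieval_metrics(predicted, candidates, ks=(1, 5)):
    """Score decoded embeddings by retrieval against a candidate set.

    Assumption (not from the source): predicted[i] is the decoder output
    for the stimulus whose true embedding is candidates[i].
    """
    ranks, matched_sims = [], []
    for i, pred in enumerate(predicted):
        sims = [cosine(pred, cand) for cand in candidates]
        # Rank 1 means the true embedding is the most similar candidate.
        # Ties are resolved optimistically (strict comparison).
        ranks.append(1 + sum(s > sims[i] for s in sims))
        matched_sims.append(sims[i])
    n = len(ranks)
    metrics = {f"top{k}_pct": 100.0 * sum(r <= k for r in ranks) / n for k in ks}
    metrics["median_rank"] = median(ranks)
    metrics["mean_cosine"] = sum(matched_sims) / n
    return metrics


# Toy data: three one-hot "image embeddings" and three noisy decoded vectors.
toy_candidates = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_predicted = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.0, 0.1, 0.9]]
print(retrieval_metrics(toy_predicted, toy_candidates))
```

In the toy data the second decoded vector lies closer to the first candidate than to its own target, so it is ranked second and counts as a top-1 miss but a top-5 hit. The `±` spreads in the table would come from repeating such an evaluation across runs or folds; which of these the source used is not stated.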