Researchers have developed the Integral Transform Network (ITNet), a unified neural architecture that shows that convolutional networks, recurrent networks, and transformers are special cases of a single learnable integral transform rather than fundamentally different approaches. The architecture uses an MLP-based kernel to model pairwise interactions, and it matches or exceeds specialized models across vision, language, and multimodal benchmarks, including ImageNet-1K and GLUE.
Why it matters: This unification could simplify deep learning architecture design and training, reducing the need to choose between specialized models and enabling more flexible, adaptable systems across domains.
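To make the "single integral transform" idea concrete, here is a minimal NumPy sketch of a discretized transform, y[i] = sum_j k(i, j, x[i], x[j]) * x[j], where swapping the kernel k changes the model family. This is an illustration under stated assumptions, not ITNet's actual implementation: the function names, the kernel inputs (the two tokens plus their offset), and the tiny untrained MLP are all hypothetical, and the recurrent case covers only a linear recurrence.

```python
import numpy as np

def integral_transform(x, kernel):
    """Discretized transform: y[i] = sum_j kernel(i, j, x[i], x[j]) * x[j].

    x has shape (n, d); kernel returns a scalar weight for each pair (i, j).
    """
    n = x.shape[0]
    K = np.array([[kernel(i, j, x[i], x[j]) for j in range(n)]
                  for i in range(n)])
    return K @ x

def conv_kernel(weights):
    """Weight depends only on the offset j - i and has local support,
    which gives a zero-padded 1-D convolution (cross-correlation form)."""
    radius = len(weights) // 2
    def kernel(i, j, xi, xj):
        off = j - i
        return weights[off + radius] if abs(off) <= radius else 0.0
    return kernel

def recurrent_kernel(decay):
    """Causal, geometrically decaying weight, equivalent to the linear
    recurrence h[i] = decay * h[i-1] + x[i]."""
    def kernel(i, j, xi, xj):
        return decay ** (i - j) if j <= i else 0.0
    return kernel

def mlp_kernel(dim, hidden=16, seed=0):
    """Hypothetical learnable kernel: a small MLP over (x_i, x_j, offset).
    Weights are random here; in practice they would be trained."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(2 * dim + 1, hidden))
    w2 = rng.normal(scale=0.5, size=hidden)
    def kernel(i, j, xi, xj):
        feats = np.concatenate([xi, xj, [float(j - i)]])
        return float(np.tanh(feats @ W1) @ w2)
    return kernel

x = np.array([[1.0], [2.0], [3.0]])
y_conv = integral_transform(x, conv_kernel([0.0, 1.0, 0.5]))  # [[2.0], [3.5], [3.0]]
y_rec = integral_transform(x, recurrent_kernel(0.5))          # [[1.0], [2.5], [4.25]]
y_mlp = integral_transform(x, mlp_kernel(dim=1))              # shape (3, 1)
```

The fixed kernels recover familiar layers exactly, while the MLP kernel is free to learn weights that depend on both position and content. The dense n-by-n kernel matrix is built here only for clarity.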