Neural Network Models
ITKIT includes implementations of state-of-the-art neural networks for medical image segmentation. All models are implemented in PyTorch and are compatible with ITKIT's training frameworks.
Available Models
1. DA-TransUNet
Description: Integrates spatial and channel dual attention with Transformer U-Net for medical image segmentation.
Reference: Sun G, Pan Y, Kong W, Xu Z, Ma J, Racharak T, Nguyen L-M and Xin J (2024) DA-TransUNet: integrating spatial and channel dual attention with transformer U-net for medical image segmentation. Front. Bioeng. Biotechnol. 12:1398237.
DOI: 10.3389/fbioe.2024.1398237
2. DconnNet
Description: Directional Connectivity-based Segmentation for medical images.
Reference: Z. Yang and S. Farsiu, "Directional Connectivity-based Segmentation of Medical Images," in CVPR, 2023, pp. 11525-11535.
Link: IEEE Xplore
3. LM_Net
Description: A light-weight and multi-scale network for medical image segmentation.
Reference: Zhenkun Lu, Chaoyin She, Wei Wang, Qinghua Huang. LM-Net: A light-weight and multi-scale network for medical image segmentation. Computers in Biology and Medicine. Volume 168, 2024. 107717, ISSN 0010-4825.
DOI: 10.1016/j.compbiomed.2023.107717
4. MedNeXt
Description: Transformer-driven Scaling of ConvNets for Medical Image Segmentation.
Reference: Roy, S., Koehler, G., Ulrich, C., Baumgartner, M., Petersen, J., Isensee, F., Jaeger, P.F. & Maier-Hein, K. (2023). MedNeXt: Transformer-driven Scaling of ConvNets for Medical Image Segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2023.
Link: Springer
5. SegFormer
Description: Simple and Efficient Design for Semantic Segmentation with Transformers.
Reference: SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo. NeurIPS 2021.
Link: NeurIPS
6. SegFormer3D
Description: An Efficient Transformer for 3D Medical Image Segmentation.
Reference: Perera, Shehan and Navard, Pouyan and Yilmaz, Alper. SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024.
Link: CVPR
7. SwinUMamba
Description: Mamba-Based UNet with ImageNet-Based Pretraining.
Reference: Liu, J. et al. (2024). Swin-UMamba: Mamba-Based UNet with ImageNet-Based Pretraining. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15009. Springer, Cham.
DOI: 10.1007/978-3-031-72114-4_59
8. VMamba
Description: Visual State Space Model.
Reference: Liu, Yue and Tian, Yunjie and Zhao, Yuzhong and Yu, Hongtian and Xie, Lingxi and Wang, Yaowei and Ye, Qixiang and Jiao, Jianbin and Liu, Yunfan. VMamba: Visual State Space Model. Advances in Neural Information Processing Systems. 2024. pp. 103031-103063.
Link: NeurIPS
9. DSNet
Description: A Novel Way to Use Atrous Convolutions in Semantic Segmentation.
Reference: Z. Guo, L. Bian, H. Wei, J. Li, H. Ni and X. Huang, "DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 4, pp. 3679-3692, April 2025.
DOI: 10.1109/TCSVT.2024.3509504
arXiv: 2406.03702
10. EfficientFormer
Description: Vision Transformers at MobileNet Speed.
Reference: Li, Yanyu and Yuan, Geng and Wen, Yang and Hu, Ju and Evangelidis, Georgios and Tulyakov, Sergey and Wang, Yanzhi and Ren, Jian. EfficientFormer: Vision Transformers at MobileNet Speed. Advances in Neural Information Processing Systems, 35, 2022.
arXiv: 2206.01191
11. EfficientNet
Description: Rethinking Model Scaling for Convolutional Neural Networks.
Reference: Mingxing Tan, Quoc V. Le, EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, ICML 2019.
arXiv: 1905.11946
12. EGE-UNet
Description: An Efficient Group Enhanced UNet for skin lesion segmentation.
Reference: Accepted at the 26th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2023).
Link: Springer
13. MoCo
Description: Momentum Contrast for Unsupervised Visual Representation Learning.
Reference: K. He, H. Fan, Y. Wu, S. Xie and R. Girshick, "Momentum Contrast for Unsupervised Visual Representation Learning," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 9726-9735.
DOI: 10.1109/CVPR42600.2020.00975
Link: IEEE Xplore
14. SegMamba
Description: Long-Range Sequential Modeling Mamba for 3D Medical Image Segmentation.
Reference: Xing, Z., Ye, T., Yang, Y., Liu, G., Zhu, L. (2024). SegMamba: Long-Range Sequential Modeling Mamba for 3D Medical Image Segmentation. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15008. Springer, Cham.
DOI: 10.1007/978-3-031-72111-3_54
15. UNet3+
Description: A Full-Scale Connected UNet for Medical Image Segmentation.
Reference: H. Huang et al., "UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation," ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 1055-1059.
DOI: 10.1109/ICASSP40776.2020.9053405
Link: IEEE Xplore
16. UNETR
Description: Transformers for 3D Medical Image Segmentation.
Reference: A. Hatamizadeh et al., "UNETR: Transformers for 3D Medical Image Segmentation," 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 2022, pp. 1748-1758.
DOI: 10.1109/WACV51458.2022.00181
Link: IEEE Xplore
Usage
All models are located in the itkit/models directory and can be imported directly or used with ITKIT's training frameworks.
Example with OneDL-MMEngine
from itkit.models import DA_TransUNet

# Model config in your experiment config file
model = dict(
    type=DA_TransUNet,
    in_channels=1,
    num_classes=3,
    # ... other parameters
)
Example with PyTorch
import torch
from itkit.models import SegFormer3D

# Initialize model
model = SegFormer3D(
    in_channels=1,
    num_classes=3,
)

# Use in training
# (input shape is illustrative: one single-channel 64x64x64 volume)
input_tensor = torch.randn(1, 1, 64, 64, 64)
output = model(input_tensor)
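Because ITKIT models are ordinary torch.nn.Module subclasses, they drop into a standard supervised training step. The sketch below is illustrative, not part of ITKIT's API: the helper name train_step, the loss choice (cross-entropy over voxel-wise class labels), and the tensor shapes are all assumptions; any ITKIT segmentation model producing (N, num_classes, D, H, W) logits can be passed in.

```python
import torch
import torch.nn as nn


def train_step(model: nn.Module, images: torch.Tensor,
               labels: torch.Tensor,
               optimizer: torch.optim.Optimizer) -> float:
    """Run one supervised segmentation step and return the loss value.

    images: (N, C, D, H, W) float volumes
    labels: (N, D, H, W) integer class ids in [0, num_classes)
    """
    criterion = nn.CrossEntropyLoss()
    model.train()
    optimizer.zero_grad()
    logits = model(images)            # (N, num_classes, D, H, W)
    loss = criterion(logits, labels)  # voxel-wise cross-entropy
    loss.backward()
    optimizer.step()
    return loss.item()
```

With an ITKIT model this could be driven as, for example, train_step(SegFormer3D(in_channels=1, num_classes=3), images, labels, optimizer) inside an epoch loop.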