Neural Network Models
ITKIT includes implementations of state-of-the-art neural networks for medical image segmentation. All models are implemented in PyTorch and are compatible with ITKIT's training frameworks.
Available Models
1. DA-TransUNet
Description: Integrates spatial and channel dual attention with Transformer U-Net for medical image segmentation.
Reference: Sun G, Pan Y, Kong W, Xu Z, Ma J, Racharak T, Nguyen L-M and Xin J (2024) DA-TransUNet: integrating spatial and channel dual attention with transformer U-net for medical image segmentation. Front. Bioeng. Biotechnol. 12:1398237.
DOI: 10.3389/fbioe.2024.1398237
2. DconnNet
Description: Directional Connectivity-based Segmentation for medical images.
Reference: Z. Yang and S. Farsiu, "Directional Connectivity-based Segmentation of Medical Images," in CVPR, 2023, pp. 11525-11535.
Link: IEEE Xplore
3. LM_Net
Description: A light-weight and multi-scale network for medical image segmentation.
Reference: Zhenkun Lu, Chaoyin She, Wei Wang, Qinghua Huang. LM-Net: A light-weight and multi-scale network for medical image segmentation. Computers in Biology and Medicine. Volume 168, 2024. 107717, ISSN 0010-4825.
DOI: 10.1016/j.compbiomed.2023.107717
4. MedNeXt
Description: Transformer-driven Scaling of ConvNets for Medical Image Segmentation.
Reference: Roy, S., Koehler, G., Ulrich, C., Baumgartner, M., Petersen, J., Isensee, F., Jaeger, P.F. & Maier-Hein, K. (2023). MedNeXt: Transformer-driven Scaling of ConvNets for Medical Image Segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2023.
Link: Springer
5. SegFormer
Description: Simple and Efficient Design for Semantic Segmentation with Transformers.
Reference: SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo. NeurIPS 2021.
Link: NeurIPS
6. SegFormer3D
Description: An Efficient Transformer for 3D Medical Image Segmentation.
Reference: Perera, Shehan and Navard, Pouyan and Yilmaz, Alper. SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024.
Link: CVPR
7. SwinUMamba
Description: Mamba-Based UNet with ImageNet-Based Pretraining.
Reference: Liu, J. et al. (2024). Swin-UMamba: Mamba-Based UNet with ImageNet-Based Pretraining. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15009. Springer, Cham.
DOI: 10.1007/978-3-031-72114-4_59
8. VMamba
Description: Visual State Space Model.
Reference: Liu, Yue and Tian, Yunjie and Zhao, Yuzhong and Yu, Hongtian and Xie, Lingxi and Wang, Yaowei and Ye, Qixiang and Jiao, Jianbin and Liu, Yunfan. VMamba: Visual State Space Model. Advances in Neural Information Processing Systems. 2024. pp. 103031-103063.
Link: NeurIPS
9. DSNet
Description: A Novel Way to Use Atrous Convolutions in Semantic Segmentation.
Reference: Z. Guo, L. Bian, H. Wei, J. Li, H. Ni and X. Huang, "DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 4, pp. 3679-3692, April 2025.
DOI: 10.1109/TCSVT.2024.3509504
arXiv: 2406.03702
10. EfficientFormer
Description: Vision Transformers at MobileNet Speed.
Reference: Li, Yanyu and Yuan, Geng and Wen, Yang and Hu, Ju and Evangelidis, Georgios and Tulyakov, Sergey and Wang, Yanzhi and Ren, Jian. EfficientFormer: Vision Transformers at MobileNet Speed. Advances in Neural Information Processing Systems, 35, 2022.
arXiv: 2206.01191
11. EfficientNet
Description: Rethinking Model Scaling for Convolutional Neural Networks.
Reference: Mingxing Tan, Quoc V. Le, EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, ICML 2019.
arXiv: 1905.11946
12. EGE-UNet
Description: An Efficient Group Enhanced UNet for skin lesion segmentation.
Reference: Accepted at the 26th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2023).
Link: Springer
13. MoCo
Description: Momentum Contrast for Unsupervised Visual Representation Learning.
Reference: K. He, H. Fan, Y. Wu, S. Xie and R. Girshick, "Momentum Contrast for Unsupervised Visual Representation Learning," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 9726-9735.
DOI: 10.1109/CVPR42600.2020.00975
Link: IEEE Xplore
14. SegMamba
Description: Long-Range Sequential Modeling Mamba for 3D Medical Image Segmentation.
Reference: Xing, Z., Ye, T., Yang, Y., Liu, G., Zhu, L. (2024). SegMamba: Long-Range Sequential Modeling Mamba for 3D Medical Image Segmentation. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15008. Springer, Cham.
DOI: 10.1007/978-3-031-72111-3_54
15. UNet3+
Description: A Full-Scale Connected UNet for Medical Image Segmentation.
Reference: H. Huang et al., "UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation," ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 1055-1059.
DOI: 10.1109/ICASSP40776.2020.9053405
Link: IEEE Xplore
16. UNETR
Description: Transformers for 3D Medical Image Segmentation.
Reference: A. Hatamizadeh et al., "UNETR: Transformers for 3D Medical Image Segmentation," 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 2022, pp. 1748-1758.
DOI: 10.1109/WACV51458.2022.00181
Link: IEEE Xplore
Usage
All models are located in the itkit/models directory and can be imported directly or used with ITKIT's training frameworks.
Example with OneDL-MMEngine
from itkit.models import DA_TransUNet

# Model config in your experiment config file
model = dict(
    type=DA_TransUNet,
    in_channels=1,
    num_classes=3,
    # ... other parameters
)
Example with PyTorch
import torch
from itkit.models import SegFormer3D

# Initialize model
model = SegFormer3D(
    in_channels=1,
    num_classes=3,
)

# Use in training
# (input shape is illustrative: one single-channel 64x64x64 volume)
input_tensor = torch.randn(1, 1, 64, 64, 64)
output = model(input_tensor)
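Because ITKIT models are ordinary torch.nn.Module subclasses, they drop into a standard supervised training step. The sketch below is illustrative, not part of ITKIT's API: the helper name train_step, the loss choice (cross-entropy over voxel-wise class labels), and the tensor shapes are all assumptions; any ITKIT segmentation model producing (N, num_classes, D, H, W) logits can be passed in.

```python
import torch
import torch.nn as nn


def train_step(model: nn.Module, images: torch.Tensor,
               labels: torch.Tensor,
               optimizer: torch.optim.Optimizer) -> float:
    """Run one supervised segmentation step and return the loss value.

    images: (N, C, D, H, W) float volumes
    labels: (N, D, H, W) integer class ids in [0, num_classes)
    """
    criterion = nn.CrossEntropyLoss()
    model.train()
    optimizer.zero_grad()
    logits = model(images)            # (N, num_classes, D, H, W)
    loss = criterion(logits, labels)  # voxel-wise cross-entropy
    loss.backward()
    optimizer.step()
    return loss.item()
```

With an ITKIT model this could be driven as, for example, train_step(SegFormer3D(in_channels=1, num_classes=3), images, labels, optimizer) inside an epoch loop.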