Human Part Semantic Segmentation Using CDGNET Architecture for Human Activity Recognition

Mayank Lovanshi; Vivek Tiwari

doi:10.52228/JRUB.2023-36-1-3

Journal of Ravishankar University

Pt. Ravishankar Shukla University, Raipur, Chhattisgarh

PART-B (SCIENCE)

ISSN: 0970-5910



Author(s): Mayank Lovanshi, Vivek Tiwari

Email(s): mayank@iiitnr.edu.in

Address: International Institute of Information Technology (IIIT), Naya Raipur, Chhattisgarh, India.

*Corresponding Author: mayank@iiitnr.edu.in

Published In: Volume - 36, Issue - 1, Year - 2023



ABSTRACT:
Human body part segmentation assigns each pixel in an image a label identifying the body part class it belongs to. To improve accuracy, a technique known as sample class distribution was developed, which takes into account the hierarchical structure of the human body and the characteristic position of each part. The technique accumulates the primary human parsing labels along the vertical and horizontal dimensions to exploit the distribution of classes. Combining these guided features produces a spatial guidance map, which is incorporated into the backbone network. The resulting semantic-guided features support effective human activity recognition by providing human pose information derived from semantic segmentation. To assess the approach, extensive experiments were performed on the large-scale CIHP dataset, using mean IoU, pixel accuracy, and mean accuracy as metrics.
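The two ideas in the abstract, accumulating parsing labels along each image axis and scoring a prediction with pixel accuracy, mean accuracy, and mean IoU, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the normalisation of the accumulated labels to per-row and per-column fractions, and the outer-product combination into a guidance map are all assumptions made for illustration. The metric definitions follow the usual confusion-matrix formulation.

```python
import numpy as np


def class_distributions(labels, num_classes):
    """Accumulate an (H, W) integer label map along each image axis.

    Returns (per_row, per_col). per_row has shape (C, H) and gives, for each
    class, the fraction of pixels in each row that carry it; per_col has
    shape (C, W) and gives the same per column. Normalising to fractions is
    an assumption of this sketch.
    """
    classes = np.arange(num_classes)[:, None, None]
    one_hot = (labels[None, :, :] == classes).astype(np.float64)  # (C, H, W)
    per_row = one_hot.mean(axis=2)  # average over the width
    per_col = one_hot.mean(axis=1)  # average over the height
    return per_row, per_col


def guidance_map(per_row, per_col):
    """Combine both distributions into a (C, H, W) map via a per-class
    outer product (an assumed combination rule, for illustration only)."""
    return per_row[:, :, None] * per_col[:, None, :]


def segmentation_metrics(target, pred, num_classes):
    """Return (pixel accuracy, mean accuracy, mean IoU) for integer maps."""
    # Confusion matrix: rows are ground-truth classes, columns predictions.
    idx = target.ravel() * num_classes + pred.ravel()
    cm = np.bincount(idx, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes).astype(np.float64)

    tp = np.diag(cm)                 # correctly labelled pixels per class
    gt = cm.sum(axis=1)              # ground-truth pixels per class
    union = gt + cm.sum(axis=0) - tp

    pixel_acc = tp.sum() / cm.sum()
    present = gt > 0                 # skip classes absent from the target
    mean_acc = (tp[present] / gt[present]).mean()
    valid = union > 0                # skip classes absent from both maps
    mean_iou = (tp[valid] / union[valid]).mean()
    return pixel_acc, mean_acc, mean_iou


# Toy 2x3 example with two classes (hypothetical data).
labels = np.array([[0, 0, 1],
                   [0, 1, 1]])
pred = np.array([[0, 1, 1],
                 [0, 1, 1]])

per_row, per_col = class_distributions(labels, num_classes=2)
guide = guidance_map(per_row, per_col)
pixel_acc, mean_acc, mean_iou = segmentation_metrics(labels, pred, 2)
```

In this sketch the guidance map takes a large value for a class only where both its row and its column frequently contain that class, which is one simple way to encode where a body part tends to appear.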


Cite this article:
Lovanshi and Tiwari (2023). Human Part Semantic Segmentation Using CDGNET Architecture for Human Activity Recognition. Journal of Ravishankar University (Part-B: Science), 36(1), pp. 18-25. DOI: https://doi.org/10.52228/JRUB.2023-36-1-3

