Clustering-based Model Compression Method for Deep Neural Networks 


Vol. 13,  No. 11, pp. 585-589, Nov.  2024
https://doi.org/10.3745/TKIPS.2024.13.11.585


  Abstract

On-device machine learning is gaining popularity for its cost efficiency, data privacy, and responsiveness. However, running deep neural network models on small embedded systems is challenging because of their limited memory capacity. Prior work has proposed various model compression techniques, such as quantization and pruning, but these techniques generally require careful fine-tuning with suitable data samples to minimize the accuracy loss caused by compression. This work proposes a new post-training model compression method that compresses an input model by clustering similar convolution kernels and pruning the redundant ones. The proposed method requires no data samples because it considers only the similarity between kernels. This work evaluates the proposed method on representative neural network models and demonstrates that it can effectively reduce memory usage with only a small loss of accuracy.
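The abstract does not specify the clustering algorithm or similarity measure used by the authors. As a minimal illustration of the general idea only, the sketch below greedily clusters convolution kernels by cosine similarity and keeps one representative per cluster; the function name, the threshold value, and the greedy strategy are all assumptions for illustration, not the paper's method.

```python
import numpy as np

def cluster_prune_kernels(kernels, threshold=0.95):
    """Greedy clustering of convolution kernels by cosine similarity.

    This is an illustrative sketch, not the paper's algorithm.
    kernels: array of shape (n, kh, kw), one 2-D kernel per entry.
    Returns (representatives, assignment), where assignment[i] gives
    the index of the representative kernel used in place of kernel i.
    """
    # Flatten each kernel and normalize to unit length so that a dot
    # product of two rows is their cosine similarity.
    flat = kernels.reshape(len(kernels), -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)

    reps = []        # indices of kept (representative) kernels
    assignment = []
    for i in range(len(unit)):
        for ci, r in enumerate(reps):
            if float(unit[r] @ unit[i]) >= threshold:
                # Similar enough to an existing representative: prune
                # this kernel and reuse the representative instead.
                assignment.append(ci)
                break
        else:
            # No sufficiently similar cluster yet: keep this kernel.
            reps.append(i)
            assignment.append(len(reps) - 1)
    return kernels[reps], np.array(assignment)
```

For example, a kernel and a scaled copy of it have cosine similarity 1.0 and collapse into one cluster, while a dissimilar kernel is kept as its own representative, so only the representatives and a small assignment table need to be stored.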

  Cite this article

[IEEE Style]

B. Chae and S. Heo, "Clustering-based Model Compression Method for Deep Neural Networks," The Transactions of the Korea Information Processing Society, vol. 13, no. 11, pp. 585-589, 2024. DOI: https://doi.org/10.3745/TKIPS.2024.13.11.585.

[ACM Style]

Byungchul Chae and Seonyeong Heo. 2024. Clustering-based Model Compression Method for Deep Neural Networks. The Transactions of the Korea Information Processing Society, 13, 11, (2024), 585-589. DOI: https://doi.org/10.3745/TKIPS.2024.13.11.585.