Residential College: false
Improving Concurrent GC for Latency Critical Services in Multi-tenant Systems
Zhao, Junxian¹; Pi, Aidi¹; Zhou, Xiaobo¹; Chang, Sang Yoon¹; Xu, Chengzhong²
Conference Name: 23rd ACM/IFIP International Middleware Conference, Middleware 2022
Source Publication: Middleware 2022 - Proceedings of the 23rd ACM/IFIP International Middleware Conference
Conference Date: 7 November - 11 November 2022
Conference Place: Quebec

For resource utilization efficiency, latency critical (LC) services are commonly co-located with best-effort batch jobs on datacenter servers. Many LC services, such as Cassandra and HBase, run in the Java Virtual Machine (JVM). We find that LC services often experience heavy-tailed latency due to performance interference from concurrent garbage collection (GC) as well as multi-tenancy. The root cause is a semantic gap in resource allocation between the JVM and the underlying Linux OS in multi-tenant systems. That is, the OS is unaware of the characteristics of the different kinds of threads in the JVM (i.e., GC threads and LC worker threads), which may lead to GC threads competing for CPUs; the JVM is unaware of the resource utilization in the OS, which may trigger CPU-intensive GC operations when CPUs are busy. Furthermore, we find that co-located batch jobs can interfere with LC services due to Simultaneous Multi-Threading (SMT). We propose iGC, a middleware that bridges the semantic gap between the JVM and the Linux OS and improves concurrent GC performance in multi-tenant systems. iGC adaptively triggers GC based on the CPU utilization at runtime, which speeds up the GC process and reduces its CPU contention. Furthermore, iGC deploys a dynamic CPU scheduling and thread placement strategy that avoids or mitigates the interference due to concurrent GC and multi-tenancy and also improves cache performance. We implement iGC on top of two state-of-the-art concurrent GC mechanisms, ZGC and G1 GC. We evaluate it using three NoSQL databases as LC services. Experimental results show that iGC significantly improves the performance of concurrent GC for LC services and the throughput in multi-tenant systems. iGC reduces the p95 tail latency by 83%, 37% and 22% for the three LC services Cassandra, HBase and Solr, respectively. It also increases the throughput of LC services by up to 2.56x.
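
As a rough illustration of the idea of triggering GC adaptively based on runtime CPU utilization, the following Java sketch shows one possible decision rule. It is not the iGC implementation, which the abstract does not detail. The class name `CpuAwareGcTrigger`, the three thresholds, and the use of the system load average as a proxy for CPU utilization are all assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical CPU-aware GC trigger policy; an illustration, not iGC itself. */
final class CpuAwareGcTrigger {
    // All thresholds below are illustrative assumptions, not values from the paper.
    static final double BUSY_CPU = 0.80;   // utilization at or above this counts as "busy"
    static final double EARLY_HEAP = 0.50; // start a GC cycle at this occupancy when CPUs are idle
    static final double LATE_HEAP = 0.85;  // never defer a GC cycle beyond this occupancy

    /** Decide whether to start a concurrent GC cycle now. Both inputs are fractions in [0, 1]. */
    static boolean shouldTriggerGc(double heapOccupancy, double cpuUtilization) {
        if (heapOccupancy >= LATE_HEAP) {
            return true;                   // heap nearly full: collect regardless of CPU load
        }
        if (cpuUtilization >= BUSY_CPU) {
            return false;                  // CPUs busy: defer to avoid contending with workers
        }
        return heapOccupancy >= EARLY_HEAP; // CPUs idle: collect early while cycles are cheap
    }

    /** Fraction of the heap currently in use, as reported by the JVM. */
    static double heapOccupancy() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when the maximum is undefined; fall back to committed memory.
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return (double) heap.getUsed() / limit;
    }

    /** Crude CPU utilization proxy (assumption): 1-minute load average per available CPU. */
    static double cpuUtilization() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double load = os.getSystemLoadAverage(); // negative if unavailable on this platform
        if (load < 0) {
            return 0.0;
        }
        return Math.min(1.0, load / os.getAvailableProcessors());
    }

    public static void main(String[] args) {
        double heap = heapOccupancy();
        double cpu = cpuUtilization();
        System.out.printf("heap=%.2f cpu=%.2f trigger=%b%n", heap, cpu, shouldTriggerGc(heap, cpu));
    }
}
```

The decision rule is kept as a pure function of two numbers so it can be checked independently of any running JVM state. A real system would need a more accurate CPU signal and a hook into the collector's own triggering logic, neither of which is shown here.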

Keyword: CPU Scheduling; Garbage Collection; Interference; Job Co-location; Tail Latency
Scopus ID: 2-s2.0-85132298339
Document Type: Conference paper
Collection: Faculty of Science and Technology
Affiliation: 1. University of Colorado, Colorado Springs, United States
2. University of Macau, Macao
Recommended Citation
GB/T 7714: Zhao, Junxian, Pi, Aidi, Zhou, Xiaobo, et al. Improving Concurrent GC for Latency Critical Services in Multi-tenant Systems[C], 2022: 43-55.
APA: Zhao, Junxian, Pi, Aidi, Zhou, Xiaobo, Chang, Sang Yoon, & Xu, Chengzhong. (2022). Improving Concurrent GC for Latency Critical Services in Multi-tenant Systems. Middleware 2022 - Proceedings of the 23rd ACM/IFIP International Middleware Conference, 43-55.