Flexible Capacity Partitioning in Many-Core Tiled CMPs

2013 
Chip Multi-Processors (CMPs) have become a mainstream computing platform. As transistor density grows and the number of cores increases, more scalable CMP architectures will emerge. Recently, tiled architectures have shown such scalable characteristics and have been used in many industry chips. The memory hierarchy in tiled architectures presents interesting design challenges. One major challenge is the organization of the Last Level Cache (LLC). Shared but distributed LLCs are preferred over private LLCs because they make better use of the aggregate cache capacity. However, such architectures suffer from high on-chip hit latency. Breaking down the shared LLC into smaller domains called clusters, where each cluster is associated with one process or VM, can reduce the on-chip hit latency significantly. However, static cluster sizes may not be the best option, as some processes may need more cache capacity than others. In this paper, we propose a novel inter-cluster capacity partitioning scheme called Flexible Tiled CMP Capacity Partitioning (FlexTCP). FlexTCP maintains the small hit latency of cluster caches while enabling flexible capacity partitioning across clusters, so that clusters with high cache demand can steal capacity from underutilized clusters. FlexTCP proposes multiple ways of shrinking and expanding the cluster size. When applied to a 64-core tiled CMP running a mix of SPEC CPU2006 and PARSEC 2.1 workloads, FlexTCP achieves average improvements of 21% and 18% in Weighted Speedup over two rival schemes.
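For readers unfamiliar with the metric, Weighted Speedup is commonly defined as the sum, over all co-running programs, of each program's IPC in the shared system divided by its IPC when running alone. The sketch below assumes that common definition; the abstract does not state the exact formula the paper uses, and the function names and sample IPC values are hypothetical, for illustration only.

```python
def weighted_speedup(ipc_shared, ipc_alone):
    """Sum of per-program IPC ratios (shared run vs. stand-alone run).

    Assumes the commonly used definition of Weighted Speedup; the paper's
    exact formulation is not given in the abstract.
    """
    if len(ipc_shared) != len(ipc_alone):
        raise ValueError("need one stand-alone IPC per co-running program")
    return sum(shared / alone for shared, alone in zip(ipc_shared, ipc_alone))


def relative_improvement(ws_scheme, ws_baseline):
    """Fractional improvement of one scheme's Weighted Speedup over a baseline."""
    return ws_scheme / ws_baseline - 1.0


# Hypothetical IPC values for a two-program mix (not data from the paper).
ws_baseline = weighted_speedup([0.8, 0.5], [1.0, 1.0])
ws_scheme = weighted_speedup([0.9, 0.6], [1.0, 1.0])
print(f"improvement: {relative_improvement(ws_scheme, ws_baseline):.1%}")
```

Under this reading, a reported 21% improvement means the scheme's Weighted Speedup is 1.21 times that of the rival scheme on the same workload mix.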