TY - GEN
T1 - Increasing power efficiency of multi-core network processors through data filtering
AU - Memik, Gokhan
AU - Mangione-Smith, William H.
PY - 2002
Y1 - 2002
N2 - We propose and evaluate a data filtering method to reduce the power consumption of high-end processors with multiple execution cores. Although the proposed method can be applied to a wide variety of multi-processor systems, including MPPs, SMPs, and any type of single-chip multiprocessor, we concentrate on Network Processors. The proposed method uses an execution unit called the Data Filtering Engine, which processes data with low temporal locality before it is placed on the system bus. The execution cores use locality to decide which load instructions have low temporal locality and which portion of the surrounding code should be off-loaded to the data filtering engine. Our technique reduces power consumption because a) data with low temporal locality is processed on the data filtering engine before it is placed onto the high-capacitance system bus, and b) the conflict misses caused by data with low temporal locality are reduced, resulting in fewer accesses to the L2 cache. Specifically, we show that our technique reduces bus accesses in representative applications by as much as 46.8% (26.5% on average) and reduces overall power by as much as 15.6% (8.6% on average) on a single-core processor. It also improves performance by as much as 76.7% (29.7% on average) for a processor with 16 execution cores.
AB - We propose and evaluate a data filtering method to reduce the power consumption of high-end processors with multiple execution cores. Although the proposed method can be applied to a wide variety of multi-processor systems, including MPPs, SMPs, and any type of single-chip multiprocessor, we concentrate on Network Processors. The proposed method uses an execution unit called the Data Filtering Engine, which processes data with low temporal locality before it is placed on the system bus. The execution cores use locality to decide which load instructions have low temporal locality and which portion of the surrounding code should be off-loaded to the data filtering engine. Our technique reduces power consumption because a) data with low temporal locality is processed on the data filtering engine before it is placed onto the high-capacitance system bus, and b) the conflict misses caused by data with low temporal locality are reduced, resulting in fewer accesses to the L2 cache. Specifically, we show that our technique reduces bus accesses in representative applications by as much as 46.8% (26.5% on average) and reduces overall power by as much as 15.6% (8.6% on average) on a single-core processor. It also improves performance by as much as 76.7% (29.7% on average) for a processor with 16 execution cores.
KW - Chip multiprocessors
KW - Data locality
KW - Network processors
KW - Power reduction
KW - Remote procedure call
UR - http://www.scopus.com/inward/record.url?scp=77953016638&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77953016638&partnerID=8YFLogxK
U2 - 10.1145/581630.581647
DO - 10.1145/581630.581647
M3 - Conference contribution
AN - SCOPUS:77953016638
SN - 1581135750
SN - 9781581135756
T3 - Proceedings of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES '02
SP - 108
EP - 116
BT - Proceedings of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES '02
T2 - 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES '02
Y2 - 8 October 2002 through 11 October 2002
ER -