1
Efficient Custom Instruction Identification with
Exact Enumeration
Pan Yu
and Tulika Mitra
May 21, 2007
DRAFT
2
Abstract
Extensible processors allow addition of application-specific custom instructions to the core instruc-
tion set architecture. These custom instructions are selected through an analysis of the program’s dataflow
graphs. The characteristics of certain applications and the modern compiler optimization techniques
(e.g., loop unrolling, region formation, etc.) have lead to substantially larger dataflow graphs. Hence,
it is computationally expensive to automatically select the optimal set of custom instructions. Heuristic
techniques are often employed to quickly search the design space. In order to leverage full potential of
custom instructions, our previous work [P. Yu and T. Mitra, CASES, 2004, p.69] proposed an efficient
algorithm for exact enumeration of all possible candidate instructions (or patterns) given the dataflow
graphs. But the algorithm was restricted to connected computation patterns. In this report, we describe
efficient algorithms to generate all feasible patterns (connected and disjoint) based on our previous
algorithm. More optimization techniques are used in both connected and disjoint pattern enumeration,
resulting in further reduction of search space.
Keywords: ASIPs, customizable processors, custom instruction, instruction-set extensions, sub-
graph enumeration algorithm.
I. INTRODUCTION
The transition from desktop to embedded computing has made it crucial to design high
performance, low cost embedded software/hardware systems within very short time-to-market
window. The conventional approach of designing “hand-crafted” ASIC is too expensive and
inflexible. On the other hand, general purpose processors, while inexpensive, are yet to meet
the demanding performance requirement and usually consume too much power. These factors
have resulted in the emergence of instruction-set extensible processors that consist of an existing
processor core extended with application-specific cust