原则异步电路设计--以系统观点
Preface xi
Part I Asynchronous circuit design – A tutorial
Author: Jens Sparsø
1
IntroducTIon 3
1.1 Why consider asynchronous circuits? 3
1.2 Aims and background 4
1.3 Clocking versus handshaking 5
1.4 Outline of Part I 8
2
Fundamentals 9
2.1 Handshake protocols 9
2.1.1 Bundled-data protocols 9
2.1.2 The 4-phase dual-rail protocol 11
2.1.3 The 2-phase dual-rail protocol 13
2.1.4 Other protocols 13
2.2 The Muller C-element and the indicaTIon principle 14
2.3 The Muller pipeline 16
2.4 Circuit implementaTIon styles 17
2.4.1 4-phase bundled-data 18
2.4.2 2-phase bundled data (Micropipelines) 19
2.4.3 4-phase dual-rail 20
2.5 Theory 23
2.5.1 The basics of speed-independence 23
2.5.2 ClassificaTIon of asynchronous circuits 25
2.5.3 Isochronic forks 26
2.5.4 Relation to circuits 26
2.6 Test 27
2.7 Summary 28
3
Static data-flow structures 29
3.1 Introduction 29
3.2 Pipelines and rings 30
v
vi PRINCIPLES OF ASYNCHRONOUS CIRCUIT DESIGN
3.3 Building blocks 31
3.4 A simple example 33
3.5 Simple applications of rings 35
3.5.1 Sequential circuits 35
3.5.2 Iterative computations 35
3.6 FOR, IF, and WHILE constructs 36
3.7 A more complex example: GCD 38
3.8 Pointers to additional examples 39
3.8.1 A low-power filter bank 39
3.8.2 An asynchronous microprocessor 39
3.8.3 A fine-grain pipelined vector multiplier 40
3.9 Summary 40
4
Performance 41
4.1 Introduction 41
4.2 A qualitative view of performance 42
4.2.1 Example 1: A FIFO used as a shift register 42
4.2.2 Example 2: A shift register with parallel load 44
4.3 Quantifying performance 47
4.3.1 Latency, throughput and wavelength 47
4.3.2 Cycle time of a ring 49
4.3.3 Example 3: Performance of a 3-stage ring 51
4.3.4 Final remarks 52
4.4 Dependency graph analysis 52
4.4.1 Example 4: Dependency graph for a pipeline 52
4.4.2 Example 5: Dependency graph for a 3-stage ring 54
4.5 Summary 56
5
Handshake circuit implementations 57
5.1 The latch 57
5.2 Fork, join, and merge 58
5.3 Function blocks – The basics 60
5.3.1 Introduction 60
5.3.2 Transparency to handshaking 61
5.3.3 Review of ripple-carry addition 64
5.4 Bundled-data function blocks 65
5.4.1 Using matched delays 65
5.4.2 Delay selection 66
5.5 Dual-rail function blocks 67
5.5.1 Delay insensitive minterm synthesis (DIMS) 67
5.5.2 Null Convention Logic 69
5.5.3 Transistor-level CMOS implementations 70
5.5.4 Martin’s adder 71
5.6 Hybrid function blocks 73
5.7 MUX and DEMUX 75
5.8 Mutual exclusion, arbitration and metastability 77
5.8.1 Mutual exclusion 77
5.8.2 Arbitration 79
5.8.3 Probability of metastability 79
Contents vii
5.9 Summary 80
6
Speed-independent control circuits 81
6.1 Introduction 81
6.1.1 Asynchronous sequential circuits 81
6.1.2 Hazards 82
6.1.3 Delay models 83
6.1.4 Fundamental mode and input-output mode 83
6.1.5 Synthesis of fundamental mode circuits 84
6.2 Signal transition graphs 86
6.2.1 Petri nets and STGs 86
6.2.2 Some frequently used STG fragments 88
6.3 The basic synthesis procedure 91
6.3.1 Example 1: a C-element 92
6.3.2 Example 2: a circuit with choice 92
6.3.3 Example 2: Hazards in the simple gate implementation 94
6.4 Implementations using state-holding gates 96
6.4.1 Introduction 96
6.4.2 Excitation regions and quiescent regions 97
6.4.3 Example 2: Using state-holding elements 98
6.4.4 The monotonic cover constraint 98
6.4.5 Circuit topologies using state-holding elements 99
6.5 Initialization 101
6.6 Summary of the synthesis process 101
6.7 Petrify: A tool for synthesizing SI circuits from STGs 102
6.8 Design examples using Petrify 104
6.8.1 Example 2 revisited 104
6.8.2 Control circuit for a 4-phase bundled-data latch 106
6.8.3 Control circuit for a 4-phase bundled-data MUX 109
6.9 Summary 113
7
Advanced 4-phase bundled-data
protocols and circuits
115
7.1 Channels and protocols 115
7.1.1 Channel types 115
7.1.2 Data-validity schemes 116
7.1.3 Discussion 116
7.2 Static type checking 118
7.3 More advanced latch control circuits 119
7.4 Summary 121
8
High-level languages and tools 123
8.1 Introduction 123
8.2 Concurrency and message passing in CSP 124
8.3 Tangram: program examples 126
8.3.1 A 2-place shift register 126
8.3.2 A 2-place (ripple) FIFO 126
viii PRINCIPLES OF ASYNCHRONOUS CIRCUIT DESIGN
8.3.3 GCD using while and if statements 127
8.3.4 GCD using guarded commands 128
8.4 Tangram: syntax-directed compilation 128
8.4.1 The 2-place shift register 129
8.4.2 The 2-place FIFO 130
8.4.3 GCD using guarded repetition 131
8.5 Martin’s translation process 133
8.6 Using VHDL for asynchronous design 134
8.6.1 Introduction 134
8.6.2 VHDL versus CSP-type languages 135
8.6.3 Channel communication and design flow 136
8.6.4 The abstract channel package 138
8.6.5 The real channel package 142
8.6.6 Partitioning into control and data 144
8.7 Summary 146
Appendix: The VHDL channel packages 148
A.1 The abstract channel package 148
A.2 The real channel package 150
Part II Balsa - An Asynchronous Hardware Synthesis System
Author: Doug Edwards, Andrew Bardsley
9
An introduction to Balsa 155
9.1 Overview 155
9.2 Basic concepts 156
9.3 Tool set and design flow 159
9.4 Getting started 159
9.4.1 A single-place buffer 161
9.4.2 Two-place buffers 163
9.4.3 Parallel composition and module reuse 164
9.4.4 Placing multiple structures 165
9.5 Ancillary Balsa tools 166
9.5.1 Makefile generation 166
9.5.2 Estimating area cost 167
9.5.3 Viewing the handshake circuit graph 168
9.5.4 Simulation 168
10
The Balsa language 173
10.1 Data types 173
10.2 Data typing issues 176
10.3 Control flow and commands 178
10.4 Binary/unary operators 181
10.5 Program structure 181
10.6 Example circuits 183
10.7 Selecting channels 190
Contents ix
11
Building library components 193
11.1 Parameterised descriptions 193
11.1.1 A variable width buffer definition 193
11.1.2 Pipelines of variable width and depth 194
11.2 Recursive definitions 195
11.2.1 An n-way multiplexer 195
11.2.2 A population counter 197
11.2.3 A Balsa shifter 200
11.2.4 An arbiter tree 202
12
A simple DMA controller 205
12.1 Global registers 205
12.2 Channel registers 206
12.3 DMA controller structure 207
12.4 The Balsa description 211
12.4.1 Arbiter tree 211
12.4.2 Transfer engine 212
12.4.3 Control unit 213
Part III Large-Scale Asynchronous Designs
13
Descale 221
Joep Kessels & Ad Peeters, Torsten Kramer and Volker Timm
13.1 Introduction 222
13.2 VLSI programming of asynchronous circuits 223
13.2.1 The Tangram toolset 223
13.2.2 Handshake technology 225
13.2.3 GCD algorithm 226
13.3 Opportunities for asynchronous circuits 231
13.4 Contactless smartcards 232
13.5 The digital circuit 235
13.5.1 The 80C51 microcontroller 236
13.5.2 The prefetch unit 239
13.5.3 The DES coprocessor 241
13.6 Results 243
13.7 Test 245
13.8 The power supply unit 246
13.9 Conclusions 247
14
An Asynchronous Viterbi Decoder 249
Linda E. M. Brackenbury
14.1 Introduction 249
14.2 The Viterbi decoder 250
14.2.1 Convolution encoding 250
14.2.2 Decoder principle 251
14.3 System parameters 253
14.4 System overview 254
x PRINCIPLES OF ASYNCHRONOUS CIRCUIT DESIGN
14.5 The Path Metric Unit (PMU) 256
14.5.1 Node pair design in the PMU 256
14.5.2 Branch metrics 259
14.5.3 Slot timing 261
14.5.4 Global winner identification 262
14.6 The History Unit (HU) 264
14.6.1 Principle of operation 264
14.6.2 History Unit backtrace 264
14.6.3 History Unit implementation 267
14.7 Results and design evaluation 269
14.8 Conclusions 271
14.8.1 Acknowledgement 272
14.8.2 Further reading 272
15
Processors 273
Jim D. Garside
15.1 An introduction to the Amulet processors 274
15.1.1 Amulet1 (1994) 274
15.1.2 Amulet2e (1996) 275
15.1.3 Amulet3i (2000) 275
15.2 Some other asynchronous microprocessors 276
15.3 Processors as design examples 278
15.4 Processor implementation techniques 279
15.4.1 Pipelining processors 279
15.4.2 Asynchronous pipeline architectures 281
15.4.3 Determinism and non-determinism 282
15.4.4 Dependencies 288
15.4.5 Exceptions 297
15.5 Memory – a case study 302
15.5.1 Sequential accesses 302
15.5.2 The Amulet3i RAM 303
15.5.3 Cache 307
15.6 Larger asynchronous systems 310
15.6.1 System-on-Chip (DRACO) 310
15.6.2 Interconnection 310
15.6.3 Balsa and the DMA controller 312
15.6.4 Calibrated time delays 313
15.6.5 Production test 314
15.7 Summary 315
Epilogue 317
References 319
Index 333