AUTHOR INDEX



A
Abandah, Gheith A.
Configuration Independent Analysis for Characterizing Shared-Memory Applications
Session: Multiprocessor Performance Evaluation

Ahlander, Anders
The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing
Session: Signal and Image Processing

Ahmad, Ishfaq
Benchmarking the Task Graph Scheduling Algorithms
Session: Scheduling

Al-Furaih, Ibraheem
Memory Hierarchy Management for Iterative Graph Structures
Session: Memory Hierarchy and I/O

Almohammad, Bader
Resource Placements in 2D Tori
Session: Algorithms II

Alsabti, Khaled
An Efficient Parallel Algorithm for High Dimensional Similarity Join
Session: Databases and Sorting

Altman, E.R.
An Enhanced Co-Scheduling Method Using Reduced MS-State Diagrams
Session: Compilers II

Auletta, Vincenzo
Toward a Universal Mapping Algorithm for Accessing Trees in Parallel Memory Systems
Session: Algorithms II

Avresky, Dimiter R.
The Effect of the Router Arbitration Policy on ServerNet(tm) Topologies
Session: Industrial Track - Environments, Tools, and Evaluation Methods

Azizoglu, M. Cemil
Lower Bounds on Communication Loads and Optimal Placements in Torus Networks
Session: Algorithms II

B
Bae, Seungjo
Vector Prefix and Reduction Computation on Coarse-Grained, Distributed-Memory Parallel Machines
Session: Algorithms I

Bakker, Mirjam G.
Optimizing Parallel Applications for Wide-Area Clusters
Session: Distributed Systems

Bal, Henri E.
Optimizing Parallel Applications for Wide-Area Clusters
Session: Distributed Systems

Bandera, Gerardo
Local Enumeration Techniques for Sparse Algorithms
Session: Compilers I

Banerjee, P.
A Generalized Framework for Global Communication Optimization
Session: Compilers I

Banerjee, Prithviraj
Evaluation of Compiler and Runtime Library Approaches for Supporting Parallel Regular Applications
Session: Compilers I

Bartenstein, P.
NOW Based Parallel Reconstruction of Functional Images
Session: Signal and Image Processing

Bassetti, Federico
C++ Expression Templates Performance Issues in Scientific Computing
Session: Performance Prediction and Evaluation

Bender, Carl
Performance and Experience with LAPI -- a New High-Performance Communication Library for the IBM RS/6000 SP
Session: Collective Communication

Benincasa, Milissa
Rapid Development of Real-Time Systems Using RTExpress
Session: Industrial Track - Environments, Tools, and Evaluation Methods

Benveniste, Caroline D.
The Design of COMPASS: An Execution Driven Simulator for Commercial Applications Running on Shared Memory Multiprocessors
Session: Multiprocessor Performance Evaluation

Berman, Francine
Performance Prediction in Production Environments
Session: Performance Prediction and Evaluation

Besler, Richard
Rapid Development of Real-Time Systems Using RTExpress
Session: Industrial Track - Environments, Tools, and Evaluation Methods

Bhuyan, Laxmi N.
Impact of Switch Design on the Application Performance of Cache Coherent Multiprocessors
Session: Multiprocessor Performance Evaluation

Bischof, Christian
Automatic Differentiation for Message-Passing Parallel Programs
Session: Mathematical Applications

Bishop, Benjamin
Aggressive Dynamic Execution of Multimedia Kernel Traces
Session: Performance Prediction and Evaluation

Bland, I.M.
Synthesis of a Systolic Array Genetic Algorithm
Session: Algorithms I

Board, Jr., John A.
The Implicit Pipeline Method
Session: Scientific Simulation

Bode, A.
NOW Based Parallel Reconstruction of Functional Images
Session: Signal and Image Processing

Bohossian, Vasken
Fault-Tolerant Switched Local Area Networks
Session: Fault Tolerance

Bonang, James
DEEP: A Development Environment for Parallel Programs
Session: Industrial Track - Environments, Tools, and Evaluation Methods

Bornstein, Claudson F.
On the Bisection Width and Expansion of Butterfly Networks
Session: Networks

Bose, Bella
Resource Placements in 2D Tori
Session: Algorithms II

Brassaw, Diane
Rapid Development of Real-Time Systems Using RTExpress
Session: Industrial Track - Environments, Tools, and Evaluation Methods

Broberg, Magnus
VPPB - A Visualization and Performance Prediction Tool for Multithreaded Solaris Programs
Session: Performance and Debugging Tools

Brode, Brian
DEEP: A Development Environment for Parallel Programs
Session: Industrial Track - Environments, Tools, and Evaluation Methods

Bruck, Jehoshua
Fault-Tolerant Switched Local Area Networks
Session: Fault Tolerance

Buck, Bryan
Locality and Performance of Page- and Object-Based DSMs
Session: Software Distributed Shared Memory

Busch, Costas
An Efficient Counting Network
Session: Routing

C
Campbell, Marc
Evaluating ASIC, DSP, and RISC Architectures for Embedded Applications
Session: Industrial Track - Environments, Tools, and Evaluation Methods

Cang, Songluan
Minimizing Total Communication Distance of a Time-Step Optimal Broadcast in Mesh Networks
Session: Communication

Cappello, Peter
Processor Lower Bound Formulas for Array Computations and Parametric Diophantine Systems
Session: Mathematical Applications

Carretero, Jesus
Design and Implementation of a Parallel I/O Runtime System for Irregular Applications
Session: Memory Hierarchy and I/O

Carter, Harold W.
Optimistic Synchronization of Mixed-Mode Simulators
Session: Software Distributed Shared Memory

Catthoor, F.
Code Transformations for Low Power Caching in Embedded Multimedia Processors
Session: Memory Hierarchy and I/O

Chakrabarti, Dhruva R.
Evaluation of Compiler and Runtime Library Approaches for Supporting Parallel Regular Applications
Session: Compilers I

Chanchio, Kasidit
Memory Space Representation for Heterogeneous Network Process Migration
Session: Distributed Systems

Chang, Weng-Long
The Generalized Lambda Test
Session: Compilers II

Chantrapornchai, Chantana
Optimizing Data Scheduling on Processor-In-Memory Arrays
Session: Compilers I

Chen, Pang
Design and Implementation of a Parallel I/O Runtime System for Irregular Applications
Session: Memory Hierarchy and I/O

Choudhary, A.
A Generalized Framework for Global Communication Optimization
Session: Compilers I

Choudhary, Alok
High Performance Data Mining Using Data Cubes on Parallel Computers
Session: Databases and Sorting

Design, Implementation and Evaluation of Parallel Pipelined STAP on Parallel Computers
Session: Signal and Image Processing

Design and Implementation of a Parallel I/O Runtime System for Irregular Applications
Session: Memory Hierarchy and I/O

Chu, Chih-Ping
The Generalized Lambda Test
Session: Compilers II

Cortes, A.
Clustering and Reassignment-Based Mapping Strategy for Message-Passing Architectures
Session: Operating Systems and Scheduling

Cosnard, Michel
Low Memory Cost Dynamic Scheduling of Large Coarse Grain Task Graphs
Session: Scheduling

Coudert, David
Multiprocessor Architectures Using Multi-Hop Multi-OPS Lightwave Networks and Distributed Control
Session: Networks

Culler, David E.
Managing Concurrent Access for Shared Memory Active Messages
Session: Collective Communication

D
Dandamudi, Sivarama P.
Performance Sensitivity of Space-Sharing Processor Scheduling in Distributed-Memory Multicomputers
Session: Operating Systems and Scheduling

Dani, Amod K.
Register-Sensitive Software Pipelining
Session: Compilers II

Das, Sajal K.
Toward a Universal Mapping Algorithm for Accessing Trees in Parallel Memory Systems
Session: Algorithms II

Davidson, Edward S.
Configuration Independent Analysis for Characterizing Shared-Memory Applications
Session: Multiprocessor Performance Evaluation

Davis, Don
ACEcard(tm): A High Performance Architecture for Run-Time Reconfiguration
Session: Industrial Track - Reconfigurable Systems

Davis, Edward W.
Rendering Computer Animations on a Network of Workstations
Session: Scientific Simulation

Davis, Kei
C++ Expression Templates Performance Issues in Scientific Computing
Session: Performance Prediction and Evaluation

Davis, Timothy A.
Rendering Computer Animations on a Network of Workstations
Session: Scientific Simulation

De Man, H.
Code Transformations for Low Power Caching in Embedded Multimedia Processors
Session: Memory Hierarchy and I/O

De Vivo, Amelia
Toward a Universal Mapping Algorithm for Accessing Trees in Parallel Memory Systems
Session: Algorithms II

Demaine, Erik D.
Protocols for Non-Deterministic Communication over Synchronous Channels
Session: Communication

DiNicola, Paul
Performance and Experience with LAPI -- a New High-Performance Communication Library for the IBM RS/6000 SP
Session: Collective Communication

Dietz, Henry G.
A Case for Aggregate Networks
Session: Networks

Dozy, Peter
Optimizing Parallel Applications for Wide-Area Clusters
Session: Distributed Systems

Dutt, Shantanu
Adaptive Quality Equalizing: High-Performance Load Balancing for Parallel Branch-and-Bound Across Applications and Computing Systems
Session: Distributed Systems

E
Egecioglu, Omer
Lower Bounds on Communication Loads and Optimal Placements in Torus Networks
Session: Algorithms II

Processor Lower Bound Formulas for Array Computations and Parametric Diophantine Systems
Session: Mathematical Applications

Enzler, Rolf
Design of a FEM Computation Engine for Real-Time Laparoscopic Surgery Simulation
Session: Scientific Simulation

Evers, Harald
Medical Image Processing and Visualization on Heterogeneous Clusters of Symmetric Multiprocessors using MPI and POSIX Threads
Session: Signal and Image Processing

F
Faerber, Georg
NoWait-RPC: Extending ONC RPC to a Fully Compatible Message Passing System
Session: Collective Communication

Feitelson, Dror G.
Utilization and Predictability in Scheduling the IBM SP2 with Backfilling
Session: Scheduling

Fernandes, Marcio Merino
Partitioned Schedules for Clustered VLIW Architectures
Session: Operating Systems and Scheduling

Ferreira, Afonso
Multiprocessor Architectures Using Multi-Hop Multi-OPS Lightwave Networks and Distributed Control
Session: Networks

Fischer, Franz
NoWait-RPC: Extending ONC RPC to a Fully Compatible Message Passing System
Session: Collective Communication

Frey, Peter
Optimistic Synchronization of Mixed-Mode Simulators
Session: Software Distributed Shared Memory

Fross, Bradley K.
WILDFIRE(tm) Heterogeneous Adaptive Parallel Processing System
Session: Industrial Track - Reconfigurable Systems

Frumkin, Michael
Trace-Driven Debugging of Message Passing Programs
Session: Performance and Debugging Tools

Fujie, Tetsuya
Solving the Maximum Clique Problem using PUBB
Session: Algorithms I

G
Gao, Guang R.
An Enhanced Co-Scheduling Method Using Reduced MS-State Diagrams
Session: Compilers II

Garg, Vijay K.
Predicate Control for Active Debugging of Distributed Programs
Session: Performance and Debugging Tools

Garg, Vivek
Hiding Communication Latency in Data Parallel Applications
Session: Communication

Ghose, Kanad
Caching-Efficient Multithreaded Fast Multiplication of Sparse Matrices
Session: Mathematical Applications

Giampapa, Mark E.
The Design of COMPASS: An Execution Driven Simulator for Commercial Applications Running on Shared Memory Multiprocessors
Session: Multiprocessor Performance Evaluation

Giess, Christoph
Medical Image Processing and Visualization on Heterogeneous Clusters of Symmetric Multiprocessors using MPI and POSIX Threads
Session: Signal and Image Processing

Gildea, Kevin
Performance and Experience with LAPI -- a New High-Performance Communication Library for the IBM RS/6000 SP
Session: Collective Communication

Godin, T.J.
Parallel Performance Visualization Using Moments of Utilization Data
Session: Performance and Debugging Tools

Goil, Sanjay
High Performance Data Mining Using Data Cubes on Parallel Computers
Session: Databases and Sorting

Gomes, Benedict
Efficient Fine-Grain Thread Migration with Active Threads
Session: Operating Systems and Scheduling

Gontmakher, Alex
Characterizations for Java Memory Behavior
Session: Software Distributed Shared Memory

Gonzalez, Antonio
Jacobi Orderings for Multi-Port Hypercubes
Session: Mathematical Applications

Govindarajan, R.
Register-Sensitive Software Pipelining
Session: Compilers II

An Enhanced Co-Scheduling Method Using Reduced MS-State Diagrams
Session: Compilers II

Govindaraju, Rama K.
Performance and Experience with LAPI -- a New High-Performance Communication Library for the IBM RS/6000 SP
Session: Collective Communication

Grahn, Hakan
VPPB - A Visualization and Performance Prediction Tool for Multithreaded Solaris Programs
Session: Performance and Debugging Tools

Gunnels, John
A Flexible Class of Parallel Matrix Multiplication Algorithms
Session: Mathematical Applications

Gupta, Neelima
An Improved Output-size Sensitive Parallel Algorithm for Hidden-Surface Removal for Terrains
Session: Signal and Image Processing

H
Ha, Soonhoi
Efficient Barrier Synchronization Mechanism for the BSP Model on Message-Passing Architectures
Session: Collective Communication

Han, Hwansoo
Compile-time Synchronization Optimizations for Software DSMs
Session: Software Distributed Shared Memory

Harris, Jonathan
ACEcard(tm): A High Performance Architecture for Run-Time Reconfiguration
Session: Industrial Track - Reconfigurable Systems

Harrison, Robert
Performance and Experience with LAPI -- a New High-Performance Communication Library for the IBM RS/6000 SP
Session: Collective Communication

Hawver, Dennis M.
WILDFIRE(tm) Heterogeneous Adaptive Parallel Processing System
Session: Industrial Track - Reconfigurable Systems

Hayashi, Tatsuya
An O((log log n)^2) Time Convex Hull Algorithm on Reconfigurable Meshes
Session: Algorithms II

Helman, David R.
Sorting on Clusters of SMP's
Session: Databases and Sorting

Hirabayashi, Ryuichi
Solving the Maximum Clique Problem using PUBB
Session: Algorithms I

Hoare, Raymond R.
A Case for Aggregate Networks
Session: Networks

Hofman, Rutger F. H.
Optimizing Parallel Applications for Wide-Area Clusters
Session: Distributed Systems

Holtkamp, Michael
Efficient Fine-Grain Thread Migration with Active Threads
Session: Operating Systems and Scheduling

Hood, Robert
Trace-Driven Debugging of Message Passing Programs
Session: Performance and Debugging Tools

Hopfner, Thomas
NoWait-RPC: Extending ONC RPC to a Fully Compatible Message Passing System
Session: Collective Communication

Hori, Atsushi
Pin-down Cache: A Virtual Memory Management Technique for Zero-copy Communication
Session: Memory Hierarchy and I/O

Horst, Robert
The Effect of the Router Arbitration Policy on ServerNet(tm) Topologies
Session: Industrial Track - Environments, Tools, and Evaluation Methods

Hovland, Paul
Automatic Differentiation for Message-Passing Parallel Programs
Session: Mathematical Applications

Hu, Y. Charlie
Optimal All-to-Some Personalized Communication on Hypercubes
Session: Communication

Hu, Yiming
The Design of COMPASS: An Execution Driven Simulator for Commercial Applications Running on Shared Memory Multiprocessors
Session: Multiprocessor Performance Evaluation

Hwang, Gwan-Hwan
An Expression-Rewriting Framework to Generate Communication Sets for HPF Programs with Block-Cyclic Distribution
Session: Compilers I

I
Ikebe, Yoshiko
Solving the Maximum Clique Problem using PUBB
Session: Algorithms I

Irwin, Mary Jane
Aggressive Dynamic Execution of Multimedia Kernel Traces
Session: Performance Prediction and Evaluation

Ishikawa, Yutaka
Pin-down Cache: A Virtual Memory Management Technique for Zero-copy Communication
Session: Memory Hierarchy and I/O

Iyer, R.
Impact of Switch Design on the Application Performance of Cache Coherent Multiprocessors
Session: Multiprocessor Performance Evaluation

J
JaJa, Joseph
Sorting on Clusters of SMP's
Session: Databases and Sorting

Jang, Ju-wook
An AT^2 Optimal Mapping of Sorting onto the Mesh Connected Array without Comparators
Session: Databases and Sorting

Jeannot, Emmanuel
Low Memory Cost Dynamic Scheduling of Large Coarse Grain Task Graphs
Session: Scheduling

Jhon, Chu Shik
Efficient Barrier Synchronization Mechanism for the BSP Model on Message-Passing Architectures
Session: Collective Communication

Jonsson, Magnus
The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing
Session: Signal and Image Processing

Joshi, Mahesh V.
ScalParC: A New Scalable and Efficient Parallel Classification Algorithm for Mining Large Datasets
Session: Databases and Sorting

Jovanovic, Zoran
Predicated Software Pipelining Technique for Loops with Conditions
Session: Compilers II

Juurlink, Ben H.H.
Experimental Validation of Parallel Computation Models on the Intel Paragon
Session: Multiprocessor Performance Evaluation

K
Kandemir, M.
A Generalized Framework for Global Communication Optimization
Session: Compilers I

Kanth, K.V. Ravi
Improved Concurrency Control Techniques for Multi-dimensional Index Structures
Session: Databases and Sorting

Karypis, George
ScalParC: A New Scalable and Efficient Parallel Classification Algorithm for Mining Large Datasets
Session: Databases and Sorting

Keleher, Pete
Locality and Performance of Page- and Object-Based DSMs
Session: Software Distributed Shared Memory

Update Protocols and Iterative Scientific Applications
Session: Software Distributed Shared Memory

Kim, Chulho
Performance and Experience with LAPI -- a New High-Performance Communication Library for the IBM RS/6000 SP
Session: Collective Communication

Kim, Dongmin
Vector Prefix and Reduction Computation on Coarse-Grained, Distributed-Memory Parallel Machines
Session: Algorithms I

Kim, Jin-Soo
Efficient Barrier Synchronization Mechanism for the BSP Model on Message-Passing Architectures
Session: Collective Communication

Ko, Sunghoon
High-Performance External Computations Using User-Controllable I/O
Session: Memory Hierarchy and I/O

Kogge, Peter M.
Optimizing Data Scheduling on Processor-In-Memory Arrays
Session: Compilers I

Kohler, Jr., Ralph L.
Rapid Development of Real-Time Systems Using RTExpress
Session: Industrial Track - Environments, Tools, and Evaluation Methods

Krishnan, Venkata
A Clustered Approach to Multithreaded Processors
Session: Performance Prediction and Evaluation

Ku, Shan-Chyun
Optimally Locating a Structured Facility of a Specified Length in a Weighted Tree Network
Session: Routing

Kulaczewski, Mark B.
SIMD and Mixed-Mode Implementations of a Visual Tracking Algorithm
Session: Scientific Simulation

Kulkarni, C.
Code Transformations for Low Power Caching in Embedded Multimedia Processors
Session: Memory Hierarchy and I/O

Kumar, A.
Impact of Switch Design on the Application Performance of Cache Coherent Multiprocessors
Session: Multiprocessor Performance Evaluation

Kumar, Vipin
ScalParC: A New Scalable and Efficient Parallel Classification Algorithm for Mining Large Datasets
Session: Databases and Sorting

Kwok, Yu-Kwong
Benchmarking the Task Graph Scheduling Algorithms
Session: Scheduling

L
Lain, Antonio
Evaluation of Compiler and Runtime Library Approaches for Supporting Parallel Regular Applications
Session: Compilers I

Lauzac, Sylvain
An Efficient RMS Admission Control and Its Application To Multiprocessor Scheduling
Session: Scheduling

LeMahieu, Paul
Fault-Tolerant Switched Local Area Networks
Session: Fault Tolerance

Lee, Jang Sun
High-Performance External Computations Using User-Controllable I/O
Session: Memory Hierarchy and I/O

Lee, Jenq Kuen
An Expression-Rewriting Framework to Generate Communication Sets for HPF Programs with Block-Cyclic Distribution
Session: Compilers I

Lennerstad, Hakan
Comparing the Optimal Performance of Different MIMD Multiprocessor Architectures
Session: Multiprocessor Performance Evaluation

Li, Keqin
Asymptotically Optimal Randomized Tree Embedding in Static Networks
Session: Algorithms II

Liang, Weifa
Multicasting and Broadcasting in Large WDM Networks
Session: Routing

Liao, Wei-keng
Design, Implementation and Evaluation of Parallel Pipelined STAP on Parallel Computers
Session: Signal and Image Processing

Libeskind-Hadas, Ran
Optimal Contention-Free Unicast-Based Multicasting in Switch-Based Networks of Workstations
Session: Routing

Tree-Based Multicasting in Wormhole-Routed Irregular Topologies
Session: Collective Communication

Lieu, Peter
Airshed Pollution Modeling: A Case Study in Application Development in an HPF Environment
Session: Scientific Simulation

Lilja, David J.
Dynamic Processor Allocation with the Solaris Operating System
Session: Operating Systems and Scheduling

Lin, Calvin
A Flexible Class of Parallel Matrix Multiplication Algorithms
Session: Mathematical Applications

Lin, R.
A Scalable VLSI Architecture for Binary Prefix Sums
Session: Algorithms I

Linderman, Mark
Design, Implementation and Evaluation of Parallel Pipelined STAP on Parallel Computers
Session: Signal and Image Processing

Linderman, Richard
Design, Implementation and Evaluation of Parallel Pipelined STAP on Parallel Computers
Session: Signal and Image Processing

Litman, Ami
On the Bisection Width and Expansion of Butterfly Networks
Session: Networks

Llosa, Josep
Partitioned Schedules for Clustered VLIW Architectures
Session: Operating Systems and Scheduling

Lopez, Louis
Trace-Driven Debugging of Message Passing Programs
Session: Performance and Debugging Tools

Lu, Bo
Compiler Optimization of Implicit Reductions for Distributed Memory Multiprocessors
Session: Compilers I

Ludwig, T.
NOW Based Parallel Reconstruction of Functional Images
Session: Signal and Image Processing

Lumetta, Steven S.
Managing Concurrent Access for Shared Memory Active Messages
Session: Collective Communication

Lundberg, Lars
Comparing the Optimal Performance of Different MIMD Multiprocessor Architectures
Session: Multiprocessor Performance Evaluation

VPPB - A Visualization and Performance Prediction Tool for Multithreaded Solaris Programs
Session: Performance and Debugging Tools

Luque, E.
Clustering and Reassignment-Based Mapping Strategy for Message-Passing Architectures
Session: Operating Systems and Scheduling

M
Macey, Benjamin S.
A Performance Evaluation of CP List Scheduling Heuristics for Communication Intensive Task Graphs
Session: Scheduling

Maciejewski, Anthony A.
A Comparative Study of Five Parallel Genetic Algorithms Using The Traveling Salesman Problem
Session: Algorithms I

Maggs, Bruce M.
On the Bisection Width and Expansion of Butterfly Networks
Session: Networks

Mahapatra, Nihar R.
Adaptive Quality Equalizing: High-Performance Load Balancing for Parallel Branch-and-Bound Across Applications and Computing Systems
Session: Distributed Systems

Mahesh, K.
Scheduling Algorithms Exploiting Spare Capacity and Tasks' Laxities for Fault Detection and Location in Real-Time Multiprocessor Systems
Session: Fault Tolerance

Maier, U.
NOW Based Parallel Reconstruction of Functional Images
Session: Signal and Image Processing

Malishevsky, Alexey
Preliminary Results from a Parallel MATLAB Compiler
Session: Mathematical Applications

Manimaran, G.
Scheduling Algorithms Exploiting Spare Capacity and Tasks' Laxities for Fault Detection and Location in Real-Time Multiprocessor Systems
Session: Fault Tolerance

Mavronicolas, Marios
An Efficient Counting Network
Session: Routing

Mayer, Achim
Medical Image Processing and Visualization on Heterogeneous Clusters of Symmetric Multiprocessors using MPI and POSIX Threads
Session: Signal and Image Processing

Mazzoni, Dominic
Optimal Contention-Free Unicast-Based Multicasting in Switch-Based Networks of Workstations
Session: Routing

Tree-Based Multicasting in Wormhole-Routed Irregular Topologies
Session: Collective Communication

Megson, G.M.
Synthesis of a Systolic Array Genetic Algorithm
Session: Algorithms I

Meinzer, Hans-Peter
Medical Image Processing and Visualization on Heterogeneous Clusters of Symmetric Multiprocessors using MPI and POSIX Threads
Session: Signal and Image Processing

Melhem, Rami
Distributed, Dynamic Control of Circuit-Switched Banyan Networks
Session: Networks

An Efficient RMS Admission Control and Its Application To Multiprocessor Scheduling
Session: Scheduling

Mellor-Crummey, John
Compiler Optimization of Implicit Reductions for Distributed Memory Multiprocessors
Session: Compilers I

Michael, Maged
The Design of COMPASS: An Execution Driven Simulator for Commercial Applications Running on Shared Memory Multiprocessors
Session: Multiprocessor Performance Evaluation

Milicev, Dragan
Predicated Software Pipelining Technique for Loops with Conditions
Session: Compilers II

Min, Byung Eui
High-Performance External Computations Using User-Controllable I/O
Session: Memory Hierarchy and I/O

Mirza, Jamshed
Performance and Experience with LAPI -- a New High-Performance Communication Library for the IBM RS/6000 SP
Session: Collective Communication

Mishra, Shivakant
Thread-based vs Event-based Implementation of a Group Communication Service
Session: Operating Systems and Scheduling

Morrow, Greg
A Flexible Class of Parallel Matrix Multiplication Algorithms
Session: Mathematical Applications

Mosse, Daniel
An Efficient RMS Admission Control and Its Application To Multiprocessor Scheduling
Session: Scheduling

Mueller, Frank
Prioritized Token-Based Mutual Exclusion for Distributed Systems
Session: Distributed Systems

Munoz, Xavier
Multiprocessor Architectures Using Multi-Hop Multi-OPS Lightwave Networks and Distributed Control
Session: Networks

Munz, Frank
NOW Based Parallel Reconstruction of Functional Images
Session: Signal and Image Processing

Murthy, C. Siva Ram
Scheduling Algorithms Exploiting Spare Capacity and Tasks' Laxities for Fault Detection and Location in Real-Time Multiprocessor Systems
Session: Fault Tolerance

N
Nakano, K.
A Scalable VLSI Architecture for Binary Prefix Sums
Session: Algorithms I

Nakano, Koji
Broadcast-Efficient Algorithms on the Coarse-Grain Broadcast Communication Model with Few Channels
Session: Communication

An O((log log n)^2) Time Convex Hull Algorithm on Reconfigurable Meshes
Session: Algorithms II

Nanda, Ashwini K.
The Design of COMPASS: An Execution Driven Simulator for Commercial Applications Running on Shared Memory Multiprocessors
Session: Multiprocessor Performance Evaluation

Nekolla, S.
NOW Based Parallel Reconstruction of Functional Images
Session: Signal and Image Processing

Nicolau, Alexandru
Analyzing the Individual/Combined Effects of Speculative and Guarded Execution on a Superscalar Architecture
Session: Compilers II

Nieplocha, Jarek
Performance and Experience with LAPI -- a New High-Performance Communication Library for the IBM RS/6000 SP
Session: Collective Communication

No, Jaechun
Design and Implementation of a Parallel I/O Runtime System for Irregular Applications
Session: Memory Hierarchy and I/O

O
O'Carroll, Francis
Pin-down Cache: A Virtual Memory Management Technique for Zero-copy Communication
Session: Memory Hierarchy and I/O

Obrenic, Bojana
Emulating Direct Products by Index-Shuffle Graphs
Session: Algorithms I

Ohara, Moriyoshi
The Design of COMPASS: An Execution Driven Simulator for Commercial Applications Running on Shared Memory Multiprocessors
Session: Multiprocessor Performance Evaluation

Olariu, S.
A Scalable VLSI Architecture for Binary Prefix Sums
Session: Algorithms I

Olariu, Stephan
Broadcast-Efficient Algorithms on the Coarse-Grain Broadcast Communication Model with Few Channels
Session: Communication

An O((log log n)^2) Time Convex Hull Algorithm on Reconfigurable Meshes
Session: Algorithms II

Owens, Robert
Aggressive Dynamic Execution of Multimedia Kernel Traces
Session: Performance Prediction and Evaluation

P
Padua, David A.
Experimental Study of Compiler Techniques for Scalable Shared Memory Machines
Session: Compilers II

Paek, Yunheung
Experimental Study of Compiler Techniques for Scalable Shared Memory Machines
Session: Compilers II

Pan, Yi
Permutation Capability of Optical Multistage Interconnection Networks
Session: Networks

Pancake, C.M.
Parallel Performance Visualization Using Moments of Utilization Data
Session: Performance and Debugging Tools

Panda, Dhabaleswar K.
HIPIQS: A High-Performance Switch Architecture using Input Queuing
Session: Networks

Parhami, Behrooz
The Robust-Algorithm Approach to Fault Tolerance on Processor Arrays: Fault Models, Fault Diameter, and Basic Algorithms
Session: Fault Tolerance

Park, Sung-soon
Design and Implementation of a Parallel I/O Runtime System for Irregular Applications
Session: Memory Hierarchy and I/O

Park, Taesoon
An Efficient Logging Scheme for Lazy Release Consistent Distributed Shared Memory System
Session: Software Distributed Shared Memory

Parsons, Ian
Using PI/OT to Support Complex Parallel I/O
Session: Memory Hierarchy and I/O

Peterson, James B.
WILDFIRE(tm) Heterogeneous Adaptive Parallel Processing System
Session: Industrial Track - Reconfigurable Systems

Petrini, Fabrizio
Total-Exchange on Wormhole k-ary n-cubes with Adaptive Routing
Session: Collective Communication

Pietracaprina, Andrea
Deterministic Routing of h-relations on the Multibutterfly
Session: Routing

Pinotti, M. Cristina
Toward a Universal Mapping Algorithm for Accessing Trees in Parallel Memory Systems
Session: Algorithms II

Pinotti, M.C.
A Scalable VLSI Architecture for Binary Prefix Sums
Session: Algorithms I

Plaat, Aske
Optimizing Parallel Applications for Wide-Area Clusters
Session: Distributed Systems

Pormann, John B.
The Implicit Pipeline Method
Session: Scientific Simulation

Q
Quinlan, Dan
C++ Expression Templates Performance Issues in Scientific Computing
Session: Performance Prediction and Evaluation

Quinn, Michael J.
Parallel Performance Visualization Using Moments of Utilization Data
Session: Performance and Debugging Tools

Preliminary Results from a Parallel MATLAB Compiler
Session: Mathematical Applications

Quittek, Jurgen W.
Efficient Fine-Grain Thread Migration with Active Threads
Session: Operating Systems and Scheduling

R
Radhakrishnan, Radharamanan
Optimistic Synchronization of Mixed-Mode Simulators
Session: Software Distributed Shared Memory

Rajagopalan, Ranjith
Optimal Contention-Free Unicast-Based Multicasting in Switch-Based Networks of Workstations
Session: Routing

Tree-Based Multicasting in Wormhole-Routed Irregular Topologies
Session: Collective Communication

Ramanan, V. Janaki
Register-Sensitive Software Pipelining
Session: Compilers II

Ramanujam, J.
A Generalized Framework for Global Communication Optimization
Session: Compilers I

Ranka, Sanjay
Vector Prefix and Reduction Computation on Coarse-Grained, Distributed-Memory Parallel Machines
Session: Algorithms I

High-Performance External Computations Using User-Controllable I/O
Session: Memory Hierarchy and I/O

Memory Hierarchy Management for Iterative Graph Structures
Session: Memory Hierarchy and I/O

An Efficient Parallel Algorithm for High Dimensional Similarity Join
Session: Databases and Sorting

Rao, N.S.S. Narasimha
An Enhanced Co-Scheduling Method Using Reduced MS-State Diagrams
Session: Compilers II

Rhomberg, Alex
Design of a FEM Computation Engine for Real-Time Laparoscopic Surgery Simulation
Session: Scientific Simulation

Ripoll, A.
Clustering and Reassignment-Based Mapping Strategy for Message-Passing Architectures
Session: Operating Systems and Scheduling

Rose, Donald J.
The Implicit Pipeline Method
Session: Scientific Simulation

Rosenberg, Arnold L.
Guidelines for Data-Parallel Cycle-Stealing in Networks of Workstations
Session: Scheduling

Rougeot, Laurence
Low Memory Cost Dynamic Scheduling of Large Coarse Grain Task Graphs
Session: Scheduling

Roychowdhury, Vwani P.
A Comparative Study of Five Parallel Genetic Algorithms Using The Traveling Salesman Problem
Session: Algorithms I

Royo, Dolors
Jacobi Orderings for Multi-Port Hypercubes
Session: Mathematical Applications

Rugina, Radu
Predicting the Running Time of Parallel Programs by Simulation
Session: Performance Prediction and Evaluation

S
Salisbury, Chuck
Distributed, Dynamic Control of Circuit-Switched Banyan Networks
Session: Networks

Scarano, Vittorio
Toward a Universal Mapping Algorithm for Accessing Trees in Parallel Memory Systems
Session: Algorithms II

Schaeffer, Jonathan
Using PI/OT to Support Complex Parallel I/O
Session: Memory Hierarchy and I/O

Schauser, Klaus E.
Predicting the Running Time of Parallel Programs by Simulation
Session: Performance Prediction and Evaluation

Schewel, John
A Hardware / Software Co-Design System using Configurable Computing Technology
Session: Industrial Track - Reconfigurable Systems

Schimmel, David E.
Hiding Communication Latency in Data Parallel Applications
Session: Communication

Schopf, Jennifer M.
Performance Prediction in Production Environments
Session: Performance Prediction and Evaluation

Schuster, Assaf
Characterizations for Java Memory Behavior
Session: Software Distributed Shared Memory

Schwaiger, M.
NOW Based Parallel Reconstruction of Functional Images
Session: Signal and Image Processing

Schwing, J.L.
A Scalable VLSI Architecture for Binary Prefix Sums
Session: Algorithms I

Schwing, James L.
Broadcast-Efficient Algorithms on the Coarse-Grain Broadcast Communication Model with Few Channels
Session: Communication

Seelam, Nagajagadeswar
Preliminary Results from a Parallel MATLAB Compiler
Session: Mathematical Applications

Sen, Sandeep
An Improved Output-size Sensitive Parallel Algorithm for Hidden-Surface Removal for Terrains
Session: Signal and Image Processing

Senar, M.A.
Clustering and Reassignment-Based Mapping Strategy for Message-Passing Architectures
Session: Operating Systems and Scheduling

Serena, F. David
Improved Concurrency Control Techniques for Multi-dimensional Index Structures
Session: Databases and Sorting

Sernec, R.
A Quantitative Code Analysis of Scientific Systolic Programs: DSP Vs. Matrix Algorithms
Session: Signal and Image Processing

Sha, Edwin H.-M.
Optimizing Data Scheduling on Processor-In-Memory Arrays
Session: Compilers I

Shah, Gautam
Performance and Experience with LAPI -- a New High-Performance Communication Library for the IBM RS/6000 SP
Session: Collective Communication

Shan, Hongzhang
Parallel Tree Building on a Range of Shared Address Space Multiprocessors: Algorithms and Application Performance
Session: Multiprocessor Performance Evaluation

Shen, Hong
Multicasting and Broadcasting in Large WDM Networks
Session: Routing

Shenoy, N.
A Generalized Framework for Global Communication Optimization
Session: Compilers I

Shi, Wei
Hyper-Butterfly Network: A Scalable Optimally Fault Tolerant Architecture
Session: Fault Tolerance

Shinano, Yuji
Solving the Maximum Clique Problem using PUBB
Session: Algorithms I

Shurbanov, Vladimer
The Effect of the Router Arbitration Policy on ServerNet(tm) Topologies
Session: Industrial Track - Environments, Tools, and Evaluation Methods

Siegel, Howard Jay
A Comparative Study of Five Parallel Genetic Algorithms Using The Traveling Salesman Problem
Session: Algorithms I

SIMD and Mixed-Mode Implementations of a Visual Tracking Algorithm
Session: Scientific Simulation

Singh, Ambuj K.
Improved Concurrency Control Techniques for Multi-dimensional Index Structures
Session: Databases and Sorting

Singh, Jaswinder Pal
Parallel Tree Building on a Range of Shared Address Space Multiprocessors: Algorithms and Application Performance
Session: Multiprocessor Performance Evaluation

Singh, Vineet
An Efficient Parallel Algorithm for High Dimensional Similarity Join
Session: Databases and Sorting

Sitaraman, Ramesh K.
On the Bisection Width and Expansion of Butterfly Networks
Session: Networks

Sivaram, Rajeev
HIPIQS: A High-Performance Switch Architecture using Input Queuing
Session: Networks

Somani, Arun K.
Scheduling Algorithms Exploiting Spare Capacity and Tasks' Laxities for Fault Detection and Location in Real-Time Multiprocessor Systems
Session: Fault Tolerance

Srimani, Pradip K.
Hyper-Butterfly Network: A Scalable Optimally Fault Tolerant Architecture
Session: Fault Tolerance

Srinivas, M.
Analyzing the Individual/Combined Effects of Speculative and Guarded Execution on a Superscalar Architecture
Session: Compilers II

Steenkiste, Peter
Airshed Pollution Modeling: A Case Study in Application Development in an HPF Environment
Session: Scientific Simulation

Stephan, T.
NOW Based Parallel Reconstruction of Functional Images
Session: Signal and Image Processing

Stichnoth, James
Airshed Pollution Modeling: A Case Study in Application Development in an HPF Environment
Session: Scientific Simulation

Stunkel, Craig B.
HIPIQS: A High-Performance Switch Architecture using Input Queuing
Session: Networks

Subhlok, Jaspal
Airshed Pollution Modeling: A Case Study in Application Development in an HPF Environment
Session: Scientific Simulation

Sulatycke, Peter D.
Caching-Efficient Multithreaded Fast Multiplication of Sparse Matrices
Session: Mathematical Applications

Sun, Xian-He
Memory Space Representation for Heterogeneous Network Process Migration
Session: Distributed Systems

Svensson, Bertil
The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing
Session: Signal and Image Processing

Szafron, Duane
Using PI/OT to Support Complex Parallel I/O
Session: Memory Hierarchy and I/O

T
Tarafdar, Ashis
Predicate Control for Active Debugging of Distributed Programs
Session: Performance and Debugging Tools

Tasic, J.F.
A Quantitative Code Analysis of Scientific Systolic Programs: DSP Vs. Matrix Algorithms
Session: Signal and Image Processing

Taveniku, Mikael
The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing
Session: Signal and Image Processing

Tezuka, Hiroshi
Pin-down Cache: A Virtual Memory Management Technique for Zero-copy Communication
Session: Memory Hierarchy and I/O

Thaler, Markus
Design of a FEM Computation Engine for Real-Time Laparoscopic Surgery Simulation
Session: Scientific Simulation

Tian, Yi
Optimizing Data Scheduling on Processor-In-Memory Arrays
Session: Compilers I

Topham, Nigel
Partitioned Schedules for Clustered VLIW Architectures
Session: Operating Systems and Scheduling

Torrellas, Josep
A Clustered Approach to Multithreaded Processors
Session: Performance Prediction and Evaluation

Trabado, Pablo P.
Local Enumeration Techniques for Sparse Algorithms
Session: Compilers I

Troester, Gerhard
Design of a FEM Computation Engine for Real-Time Laparoscopic Surgery Simulation
Session: Scientific Simulation

Tsay, Jyh-Jong
Nearly Optimal Algorithms for Broadcast on d-Dimensional All-Port and Wormhole-Routed Torus
Session: Communication

Tseng, Chau-Wen
Compile-time Synchronization Optimizations for Software DSMs
Session: Software Distributed Shared Memory

U
Unrau, Ron
Using PI/OT to Support Complex Parallel I/O
Session: Memory Hierarchy and I/O

V
Valero-Garcia, Miguel
Jacobi Orderings for Multi-Port Hypercubes
Session: Mathematical Applications

van de Geijn, Robert
A Flexible Class of Parallel Matrix Multiplication Algorithms
Session: Mathematical Applications

Varshney, Pramod
Design, Implementation and Evaluation of Parallel Pipelined STAP on Parallel Computers
Session: Signal and Image Processing

W
Wang, Biing-Feng
Optimally Locating a Structured Facility of a Specified Length in a Weighted Tree Network
Session: Routing

Wang, H.
Impact of Switch Design on the Application Performance of Cache Coherent Multiprocessors
Session: Multiprocessor Performance Evaluation

Wang, Jianchao
A New Self-Routing Multicast Network
Session: Routing

Permutation Capability of Optical Multistage Interconnection Networks
Session: Networks

Wang, Lee
A Comparative Study of Five Parallel Genetic Algorithms Using The Traveling Salesman Problem
Session: Algorithms I

Wang, Wen-Tsong
Nearly Optimal Algorithms for Broadcast on d-Dimensional All-Port and Wormhole-Routed Torus
Session: Communication

Warber, Chris
DEEP: A Development Environment for Parallel Programs
Session: Industrial Track - Environments, Tools, and Evaluation Methods

Weil, Ahuva Mu'alem
Utilization and Predictability in Scheduling the IBM SP2 with Backfilling
Session: Scheduling

Weiner, Donald
Design, Implementation and Evaluation of Parallel Pipelined STAP on Parallel Computers
Session: Signal and Image Processing

Weissman, Boris
Efficient Fine-Grain Thread Migration with Active Threads
Session: Operating Systems and Scheduling

Wilsey, Philip A.
Optimistic Synchronization of Mixed-Mode Simulators
Session: Software Distributed Shared Memory

Wu, Jesse
The Generalized Lambda Test
Session: Compilers II

Wu, Jie
Minimizing Total Communication Distance of a Time-Step Optimal Broadcast in Mesh Networks
Session: Communication

Y
Yang, Rongguang
Thread-based vs Event-based Implementation of a Group Communication Service
Session: Operating Systems and Scheduling

Yang, Yuanyuan
A New Self-Routing Multicast Network
Session: Routing

Permutation Capability of Optical Multistage Interconnection Networks
Session: Networks

Yatzkar, Tal
On the Bisection Width and Expansion of Butterfly Networks
Session: Networks

Yeh, Chi-Hsiang
The Robust-Algorithm Approach to Fault Tolerance on Processor Arrays: Fault Models, Fault Diameter, and Basic Algorithms
Session: Fault Tolerance

Yeom, Heon Y.
An Efficient Logging Scheme for Lazy Release Consistent Distributed Shared Memory System
Session: Software Distributed Shared Memory

Yu, Hai
Performance Sensitivity of Space-Sharing Processor Scheduling in Distributed-Memory Multicomputers
Session: Operating Systems and Scheduling

Yue, Kelvin K.
Dynamic Processor Allocation with the Solaris Operating System
Session: Operating Systems and Scheduling

Z
Zajc, M.
A Quantitative Code Analysis of Scientific Systolic Programs: DSP Vs. Matrix Algorithms
Session: Signal and Image Processing

Zapata, Emilio L.
Local Enumeration Techniques for Sparse Algorithms
Session: Compilers I

Zhao, Yan
Preliminary Results from a Parallel MATLAB Compiler
Session: Mathematical Applications

Ziegler, S.
NOW Based Parallel Reconstruction of Functional Images
Session: Signal and Image Processing

Zimand, Marius
Sharing Random Bits with No Process Coordination
Session: Algorithms II

Zomaya, A.Y.
A Scalable VLSI Architecture for Binary Prefix Sums
Session: Algorithms I

Zomaya, Albert Y.
A Performance Evaluation of CP List Scheduling Heuristics for Communication Intensive Task Graphs
Session: Scheduling