Session 1 - Routing and Switching
Session 2 - Computational Science
Session 3 - Scheduling I
Session 4 - Memory Systems
Session 5 - Tools
Session 6 - Algorithms
Session 7 - Best Papers
Session 8 - Network Routing
Session 9 - Data Sets and Visualization
Session 10 - Scheduling II
Session 11 - Communication
Session 12 - Distributed Computing
Session 13 - Threading
Session 14 - Wormhole Routing
Session 15 - Input/Output
Session 16 - Shared Memory
Session 17 - Optical Computing
Session 18 - Numerical Algorithms
Session 19 - Meshes and Arrays

Session 1 - Routing and Switching
Switch Scheduling in the Multimedia Router (MMR)
D. Love; S. Yalamanchili; J. Duato; M.B. Caminero; F.J. Quiles

Micro-architectures of High Performance, Multi-user System Area Network Interface Cards
Boon Seong Ang; Derek Chiou; Larry Rudolph; Arvind

Broadcasting in Hypercubes in the Circuit Switched Model (file unavailable)
J.-C. Bermond; T. Kodate; S. Perennes; A. Bonnecaze; P. Solé

Improving Routing Performance in Myrinet Networks
J. Flich; M.P. Malumbres; P. López; J. Duato

Efficient Virtual Interface Architecture Support for the IBM SP Switch-Connected NT Clusters
M. Banikazemi; V. Moorthy; L. Herger; D.K. Panda; B. Abali

Adaptive Routing in RS/6000 SP-like Bidirectional Multistage Interconnection Networks
M. Banikazemi; C.B. Stunkel; D.K. Panda; B. Abali

Session 2 - Computational Science
A General Parallel Simulated Annealing Library and Its Application in Airline Industry (file unavailable)
Georg Kliewer; Stefan Tschöke

Parallel Computation for Chromosome Reconstruction on a Cluster of Workstations
Suchendra M. Bhandarkar; Salem A. Machaka; Sanjay S. Shete; Jonathan Arnold

Parallel Maximum-Likelihood Inversion for Estimating Wavenumber-Ordered Spectra in Emission Spectroscopy
Hoda El-Sayed; Marc Salit; John Travis; Judith Devaney; William George

A Provably Optimal, Distribution-Independent Parallel Fast Multipole Method
Fatih E. Sevilgen; Srinivas Aluru; Natsuhiko Futamura

Efficiency of Dynamic Load Balancing Based on Permanent Cells for Parallel Molecular Dynamics Simulation
Ryoko Hayashi; Susumu Horiguchi

Parallel Performance Study of Monte Carlo Photon Transport Code on Shared-, Distributed-, and Distributed-Shared-Memory Architectures
Amitava Majumdar

Session 3 - Scheduling I
Optimal Periodic Remapping of Bulk Synchronous Computations on Multiprogrammed Distributed Systems
Ngo-Tai Fong; Cheng-Zhong Xu; Le Yi Wang

Gang Scheduling with Memory Considerations
Anat Batat; Dror G. Feitelson

A Decision-Process Analysis of Implicit Coscheduling
R. Poovendran; P. Keleher; J.S. Baras

Improving Throughput and Utilization in Parallel Machines Through Concurrent Gang
Fabricio A.B. da Silva; Isaac D. Scherson

Scheduling with Advanced Reservations
Warren Smith; Ian Foster; Valerie Taylor

Improving Parallel Job Scheduling by Combining Gang Scheduling and Backfilling Techniques
Yanyong Zhang; Hubertus Franke; Anand Sivasubramaniam; Jose Moreira

Session 4 - Memory Systems
A Mechanism for Speculative Memory Accesses Following Synchronizing Operations
Takayuki Sato; Kazuhiko Ohno; Hiroshi Nakashima

Safe Caching in a Distributed File System for Network Attached Storage
Randal C. Burns; Robert M. Rees; Darrell D.E. Long

Exploration of the Spatial Locality on Emerging Applications and the Consequences for Cache Performance
Martin Kämpe; Fredrik Dahlgren

Using Time Skewing to Eliminate Idle Time due to Memory Bandwidth and Network Limitations
David Wonnacott

The Memory Bandwidth Bottleneck and its Amelioration by a Compiler
Chen Ding; Ken Kennedy

Support for Recoverable Memory in the Distributed Virtual Communication Machine
Marcel-Catalin Rosu; Karsten Schwan

Session 5 - Tools
Multiclock Esterel: A Reactive Framework for Asynchronous Design (file unavailable)
Basant Rajan; R.K. Shyamasundar

Register Assignment for Software Pipelining with Partitioned Register Banks
Jason Hiser; Steve Carr; Philip Sweany; Steven J. Beaty

Deterministic Replay of Distributed Java Applications
Ravi Konuru; Harini Srinivasan; Jong-Deok Choi

Evaluation of P3T+: A Performance Estimator for Distributed and Parallel Programs
T. Fahringer; A. Pozgaj; J. Luitz; H. Moritsch

Applying Interposition Techniques for Performance Analysis of OpenMP Parallel Applications
Marc González; Albert Serra; Xavier Martorell; José Oliver; Eduard Ayguadé; Jesús Labarta; Nacho Navarro

FIMD-MPI: A Tool for Injecting Faults into MPI Applications (file unavailable)
Douglas M. Blough; Peng Liu

Session 6 - Algorithms
Semigroup and Prefix Computations on Improved Generalized Mesh-Connected Computers with Multiple Buses
Yi Pan; S.Q. Zheng; Keqin Li; Hong Shen

On Sorting an Intransitive Total Ordered Set Using Semi-Heap
Jie Wu

Skiplist-Based Concurrent Priority Queues
Itay Lotan; Nir Shavit

Sorting on the OTIS-Mesh
Andre Osterloh

Sorting Multisets in Anonymous Rings
Paola Flocchini; Evangelos Kranakis; Danny Krizanc; Flaminia Luccio; Nicola Santoro

Session 7 - Best Papers
Scalable Parallel Matrix Multiplication on Distributed Memory Parallel Computers
Keqin Li

Speed vs. Accuracy in Simulation for I/O-Intensive Applications
Hyeonsang Eom; Jeffrey K. Hollingsworth

A Parallel Implementation of a Fast Multipole Based 3-D Capacitance Extraction Program on Distributed Memory Multicomputers
Yanhong Yuan; Prith Banerjee

Efficient Integration of Compiler-directed Cache Coherence and Data Prefetching
Hock-Beng Lim; Pen-Chung Yew

Session 8 - Network Routing
Optimal On Demand Packet Scheduling in Single-Hop Multichannel Communication Systems
Maurizio A. Bonuccelli; Susanna Pelagatti

Optimal Broadcasting in All-Port Meshes of Trees with Distance-Insensitive Routing
Petr Salinger; Pavel Tvrdík

Distributed Models and Algorithms for Survivability in Network Routing (file unavailable)
Fred S. Annexstein; Kenneth A. Berman

Gray Codes for Torus and Edge Disjoint Hamiltonian Cycles
Myung M. Bae; Bella Bose

Power Aware Localized Routing in Wireless Networks
Ivan Stojmenovic; Xu Lin

Exploiting Hierarchy in Parallel Computer Networks to Optimize Collective Operation Performance
Nicholas T. Karonis; Bronis R. de Supinski; Ian Foster; William Gropp; Ewing Lusk; John Bresnahan

Session 9 - Data Sets and Visualization
PaDDMAS: Parallel and Distributed Data Mining Application Suite
Omer Rana; David Walker; Maozhen Li; Steven Lynden; Mike Ward

VisOK: A Flexible Visualization System for Distributed Java Object Application (file unavailable)
Dong-Woo Lee; R.S. Ramakrishna

Bounded-Response-Time Self-Stabilizing OPS5 Production Systems
Albert M.K. Cheng; Seiya Fujii

Optimizing Retrieval and Processing of Multi-Dimensional Scientific Datasets
Chialin Chang; Tahsin Kurc; Alan Sussman; Joel Saltz

Using Available Remote Memory Dynamically for Parallel Data Mining Application on ATM-Connected PC Cluster
Masato Oguchi; Masaru Kitsuregawa

Image Layer Decomposition for Distributed Real-Time Rendering on Clusters
Thu D. Nguyen; John Zahorjan

Session 10 - Scheduling II
Effective Load Sharing on Heterogeneous Networks of Workstations
Li Xiao; Xiaodong Zhang; Yanxia Qu

Buffered Coscheduling: A New Methodology for Multitasking Parallel Jobs on Distributed Systems
Fabrizio Petrini; Wu-chun Feng

A Task Duplication Based Scheduling Algorithm for Heterogeneous Systems
Samantha Ranaweera; Dharma P. Agrawal

S3MP: A Task Duplication Based Scalable Scheduling Algorithm for Symmetric Multiprocessors
Oh-Han Kang; Dharma P. Agrawal

Job Scheduling that Minimizes Network Contention due to both Communication and I/O
Jens Mache; Virginia Lo; Sharad Garg

Self-Stabilizing Mutual Exclusion Using Unfair Distributed Scheduler
Ajoy K. Datta; Maria Gradinariu; Sébastien Tixeuil

Session 11 - Communication
A New Portable and Seamless Pure Java Framework for Distributed Programming Over a TCP/IP Network (file unavailable)
Zvi Har'El; Zvi Rosberg

Reduction Optimization in Heterogeneous Cluster Environments
Pangfeng Liu; Da-Wei Wang

Template Based Structured Collections
Jürg Nolte; Mitsuhisa Sato; Yutaka Ishikawa

Bandwidth-efficient Collective Communication for Clustered Wide Area Systems
Thilo Kielmann; Henri E. Bal; Sergei Gorlatch

Replicating the Contents of a WWW Multimedia Repository to Minimize Download Time
Thanasis Loukopoulos; Ishfaq Ahmad

Enhancing NWS for use in an SNMP Managed Internetwork
Robert E. Busby Jr.; Mitchell L. Neilsen; Daniel Andresen

Session 12 - Distributed Computing
Consensus Based on Failure Detectors with a Perpetual Accuracy Property
Achour Mostefaoui; Michel Raynal

High Performance Parametric Modeling with Nimrod/G: Killer Application for the Global Grid?
David Abramson; Jon Giddy; Lew Kotler

Space and Time Efficient Self-Stabilizing ℓ-Exclusion in Tree Networks (file unavailable)
Rachid Hadid

Virtual BUS: A Network Technology for Setting up Distributed Resources in Your Own Computer
Toshiaki Miyazaki; Atsushi Takahara; Shinya Ishihara; Seiichiro Tani; Takahiro Murooka; Tomoo Fukazawa; Mitsuo Teramoto; Kazuyoshi Matsuhiro

Limits and Power of the Simplest Uniform and Self-Stabilizing Phase Clock Algorithm (file unavailable)
Florent Nolot; Vincent Villain

Are Global Computing Systems Useful? - Comparison of Client-Server Global Computing Systems Ninf, NetSolve versus CORBA - (file unavailable)
Toyotaro Suzumura; Takayuki Nakagawa; Satoshi Matsuoka; Hidemoto Nakada; Satoshi Sekiguchi

Session 13 - Threading
JavaSpMT: A Speculative Thread Pipelining Parallelization Model for Java Programs
Iffat H. Kazi; David J. Lilja

On the Scheduling Algorithm of the Dynamic Trace Scheduled VLIW Architecture
Alberto Ferreira de Souza; Peter Rounce

Monotonic Counters: A New Mechanism for Thread Synchronization
John Thornley; K. Mani Chandy

Thread Migration and Load Balancing in Non-Dedicated Environments
Kritchalach Thitikamol; Peter Keleher

Caching Single-Assignment Structures to Build a Robust Fine-Grain Multi-Threading System
Wen-Yen Lin; Jean-Luc Gaudiot; José Nelson Amaral; Guang R. Gao

A Quantitative Assessment of Thread-Level Speculation Techniques
Pedro Marcuello; Antonio Gonzalez

Session 14 - Wormhole Routing
An Analytical Model of Fully-Adaptive Wormhole-Routed k-Ary n-Cubes in the Presence of Hot-Spot Traffic
H. Sarbazi-Azad; M. Ould-Khaoua; L.M. Mackenzie

Balancing Traffic Load for Multi-Node Multicast in a Wormhole 2D Torus/Mesh
San-Yuan Wang; Yu-Chee Tseng; Ching-Sung Shiu; Jang-Ping Sheu

A Simple and Efficient Mechanism to Prevent Saturation in Wormhole Networks
E. Baydal; P. López; J. Duato

Fair and Efficient Packet Scheduling in Wormhole Networks
Salil S. Kanhere; Alpa B. Parekh; Harish Sethu

Fault-Tolerant Wormhole Routing Algorithms in Meshes in the Presence of Concave Faults
Seungjin Park; Jong-Hoon Youn; Bella Bose

Session 15 - Input/Output
ACDS: Adapting Computational Data Streams for High Performance
Carsten Isert; Karsten Schwan

A Component Framework for Communication in Distributed Applications
Jeffrey M. Fischer; Milos D. Ercegovac

Design and Evaluation of I/O Strategies for Parallel Pipelined STAP Applications
Wei-keng Liao; Alok Choudhary; Donald Weiner; Pramod Varshney

A Multi-tier RAID Storage System with RAID1 and RAID5 (file unavailable)
Nitin Muppalaneni; K. Gopinath

Performance of the IBM General Parallel File System (file unavailable)
Alice Koniges; Terry Jones; R. Kim Yates

Session 16 - Shared Memory
Reducing Ownership Overhead for Load-Store Sequences in Cache-Coherent Multiprocessors
Jim Nilsson; Fredrik Dahlgren

Dynamic Data Layouts for Cache-conscious Factorization of DFT (file unavailable)
Neungsoo Park; Dongsoo Kang; Kiran Bondalapati; Viktor K. Prasanna

Exploring the Switch Design Space in a CC-NUMA Multiprocessor Environment
Marius Pirvu; Nan Ni; Laxmi Bhuyan

Fast Synchronization on Scalable Cache-Coherent Multiprocessors using Hybrid Primitives
Dimitrios S. Nikolopoulos; Theodore S. Papatheodorou

Using Switch Directories to Speed Up Cache-to-Cache Transfers in CC-NUMA Multiprocessors
Ravi Iyer; Laxmi Bhuyan; Ashwini Nanda

Predicting Performance on SMPs. A Case Study: The SGI Power Challenge
Nancy M. Amato; Jack Perdue; Andrea Pietracaprina; Geppino Pucci; Mark Mathis

Session 17 - Optical Computing
An Optimal Parallel Algorithm for Computing Moments on Arrays with Reconfigurable Optical Buses
Chin-Hsiung Wu; Shi-Jinn Horng; Horng-Ren Tsai; Jinn-Fu Lin; Tsrong-Lay Lin

Relating Two-Dimensional Reconfigurable Meshes with Optically Pipelined Buses
Anu G. Bourgeois; Jerry L. Trahan

Optimal All-to-All Personalized Exchange in a Class of Optical Multistage Networks
Yuanyuan Yang; Jianchao Wang

Wavelengths Requirement for Permutation Routing in All-Optical Multistage Interconnection Networks
Qian-Ping Gu; Shietung Peng

De Bruijn Isomorphisms and Free Space Optical Networks
D. Coudert; A. Ferreira; S. Perennes

Session 18 - Numerical Algorithms
Parallel Lagrange Interpolation on the Star Graph
H. Sarbazi-Azad; M. Ould-Khaoua; L.M. Mackenzie; S.G. Akl

Data Allocation Strategies for Dense Linear Algebra Kernels on Heterogeneous Two-dimensional Grids
Olivier Beaumont; Vincent Boudet; Fabrice Rastello; Yves Robert

Multicomputer Algorithms for Wavelet Packet Image Decomposition (file unavailable)
Manfred Feil; Andreas Uhl

On Optimal Fill-Preserving Orderings of Sparse Matrices for Parallel Cholesky Factorizations
Wen-Yang Lin; Chuen-Liang Chen

Using Postordering and Static Symbolic Factorization for Parallel Sparse LU
Michel Cosnard; Laura Grigori

Session 19 - Meshes and Arrays
A Constructive Solution to the Juggling Problem in Processor Array Synthesis (file unavailable)
Alain Darte; Robert Schreiber; B. Ramakrishna Rau; Frédéric Vivien

Repartitioning Unstructured Adaptive Meshes
José G. Castaños; John E. Savage

Study of a Multilevel Approach to Partitioning for Parallel Logic Simulation
Swaminathan Subramanian; Dhananjai M. Rao; Philip A. Wilsey