
SESSION INDEX
- Keynote Speakers:
- David E. Culler, University of California at Berkeley - What Is So Different About Cluster Architectures?
- Jim Gray, Microsoft Research - Parallel Data Access and Parallel Execution in a World of CyberBricks
- Greg Papadopoulos, Sun Microsystems - The Future of Scalable Systems: The Interplay of Architecture and Management
- Panel Discussion:
- Data Intensive vs. Scientific Computing: Will the Twain Meet for Parallel Processing?
Moderator: Vipin Kumar, University of Minnesota
Nearly Optimal Algorithms for Broadcast on d-Dimensional All-Port and Wormhole-Routed Torus
Jyh-Jong Tsay, Wen-Tsong Wang, National Chung Cheng University
Minimizing Total Communication Distance of a Time-Step Optimal Broadcast in Mesh Networks
Songluan Cang, Jie Wu, Florida Atlantic University
Hiding Communication Latency in Data Parallel Applications
Vivek Garg, David E. Schimmel, Georgia Institute of Technology
Protocols for Non-Deterministic Communication over Synchronous Channels
Erik D. Demaine, University of Waterloo
Broadcast-Efficient Algorithms on the Coarse-Grain Broadcast Communication Model with Few Channels
Koji Nakano, Nagoya Institute of Technology, Stephan Olariu, James L. Schwing, Old Dominion University
Optimal All-to-Some Personalized Communication on Hypercubes
Y. Charlie Hu, Rice University
Compiler Optimization of Implicit Reductions for Distributed Memory Multiprocessors
Bo Lu, John Mellor-Crummey, Rice University
Local Enumeration Techniques for Sparse Algorithms
Gerardo Bandera, Pablo P. Trabado, Emilio L. Zapata, University of Malaga - Campus of Teatinos
Optimizing Data Scheduling on Processor-In-Memory Arrays
Yi Tian, Edwin H.-M. Sha, Chantana Chantrapornchai, Peter M. Kogge, University of Notre Dame
An Expression-Rewriting Framework to Generate Communication Sets for HPF Programs with Block-Cyclic Distribution
Gwan-Hwan Hwang, Jenq Kuen Lee, National Tsing-Hua University
A Generalized Framework for Global Communication Optimization
M. Kandemir, Syracuse University, P. Banerjee, A. Choudhary, Northwestern University, J. Ramanujam, Louisiana State University, N. Shenoy, Northwestern University
Evaluation of Compiler and Runtime Library Approaches for Supporting Parallel Regular Applications
Dhruva R. Chakrabarti, Northwestern University, Antonio Lain, Hewlett Packard Labs, Prithviraj Banerjee, Northwestern University
Preliminary Results from a Parallel MATLAB Compiler
Michael J. Quinn, Alexey Malishevsky, Nagajagadeswar Seelam, Yan Zhao, Oregon State University
Jacobi Orderings for Multi-Port Hypercubes
Dolors Royo, Antonio Gonzalez, Miguel Valero-Garcia, Universitat Politecnica de Catalunya
Automatic Differentiation for Message-Passing Parallel Programs
Paul Hovland, Christian Bischof, Argonne National Laboratory
Processor Lower Bound Formulas for Array Computations and Parametric Diophantine Systems
Peter Cappello, Omer Egecioglu, University of California at Santa Barbara
A Flexible Class of Parallel Matrix Multiplication Algorithms
John Gunnels, Calvin Lin, Greg Morrow, Robert van de Geijn, University of Texas at Austin
Caching-Efficient Multithreaded Fast Multiplication of Sparse Matrices
Peter D. Sulatycke, Kanad Ghose, State University of New York (Binghamton)
Permutation Capability of Optical Multistage Interconnection Networks
Yuanyuan Yang, University of Vermont, Jianchao Wang, GTE Laboratories, Yi Pan, University of Dayton
HIPIQS: A High-Performance Switch Architecture using Input Queuing
Rajeev Sivaram, Ohio State University, Craig B. Stunkel, IBM T.J. Watson Research Center, Dhabaleswar K. Panda, Ohio State University
On the Bisection Width and Expansion of Butterfly Networks
Claudson F. Bornstein, Carnegie Mellon University, Ami Litman, Technion, Bruce M. Maggs, Carnegie Mellon University, Ramesh K. Sitaraman, University of Massachusetts, Tal Yatzkar, Technion
Multiprocessor Architectures Using Multi-Hop Multi-OPS Lightwave Networks and Distributed Control
David Coudert, Afonso Ferreira, LIP ENS Lyon, Xavier Munoz, UPC
Distributed, Dynamic Control of Circuit-Switched Banyan Networks
Chuck Salisbury, Rami Melhem, University of Pittsburgh
A Case for Aggregate Networks
Raymond R. Hoare, Henry G. Dietz, Purdue University
An Enhanced Co-Scheduling Method Using Reduced MS-State Diagrams
R. Govindarajan, Supercomputer Education and Research Center and Indian Institute of Science, N.S.S. Narasimha Rao, Indian Institute of Science, E.R. Altman, IBM T.J. Watson Research Center, Guang R. Gao, University of Delaware
Predicated Software Pipelining Technique for Loops with Conditions
Dragan Milicev, Zoran Jovanovic, University of Belgrade
The Generalized Lambda Test
Weng-Long Chang, Chih-Ping Chu, Jesse Wu, National Cheng Kung University
Experimental Study of Compiler Techniques for Scalable Shared Memory Machines
Yunheung Paek, New Jersey Institute of Technology, David A. Padua, University of Illinois at Urbana-Champaign
Register-Sensitive Software Pipelining
Amod K. Dani, Indian Institute of Science, V. Janaki Ramanan, R. Govindarajan, Supercomputer Education and Research Center and Indian Institute of Science
Analyzing the Individual/Combined Effects of Speculative and Guarded Execution on a Superscalar Architecture
M. Srinivas, Silicon Graphics Inc., Alexandru Nicolau, University of California at Irvine
NOW Based Parallel Reconstruction of Functional Images
Frank Munz, T. Stephan, U. Maier, T. Ludwig, A. Bode, S. Ziegler, S. Nekolla, P. Bartenstein, M. Schwaiger, Nuklearmedizinische Klinik und Poliklinik des Klinikums rechts der Isar
An Improved Output-size Sensitive Parallel Algorithm for Hidden-Surface Removal for Terrains
Neelima Gupta, Sandeep Sen, Indian Institute of Technology (New Delhi)
Design, Implementation and Evaluation of Parallel Pipelined STAP on Parallel Computers
Alok Choudhary, Northwestern University, Wei-keng Liao, Donald Weiner, Pramod Varshney, Syracuse University, Richard Linderman, Mark Linderman, Air Force Research Laboratory
The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing
Mikael Taveniku, Ericsson Microwave Systems AB and Chalmers University of Technology, Anders Ahlander, Ericsson Microwave Systems AB and Halmstad University, Magnus Jonsson, Halmstad University, Bertil Svensson, Halmstad University and Chalmers University of Technology
Medical Image Processing and Visualization on Heterogeneous Clusters of Symmetric Multiprocessors using MPI and POSIX Threads
Christoph Giess, Achim Mayer, Harald Evers, Hans-Peter Meinzer, Deutsches Krebsforschungszentrum
A Quantitative Code Analysis of Scientific Systolic Programs: DSP Vs. Matrix Algorithms
R. Sernec, BIA D.o.o., M. Zajc, J.F. Tasic, University of Ljubljana
Tree-Based Multicasting in Wormhole-Routed Irregular Topologies
Ran Libeskind-Hadas, Dominic Mazzoni, Ranjith Rajagopalan, Harvey Mudd College
NoWait-RPC: Extending ONC RPC to a Fully Compatible Message Passing System
Thomas Hopfner, Franz Fischer, Georg Faerber, Technische Universitat Munchen
ftp://ftp.lpr.e-technik.tu-muenchen.de/pub/papers/rtsg/ipps98.ps.gz
Efficient Barrier Synchronization Mechanism for the BSP Model on Message-Passing Architectures
Jin-Soo Kim, Soonhoi Ha, Chu Shik Jhon, Seoul National University
Performance and Experience with LAPI -- a New High-Performance Communication Library for the IBM RS/6000 SP
Gautam Shah, IBM Power Parallel Systems, Jarek Nieplocha, Pacific Northwest National Laboratory, Jamshed Mirza, Chulho Kim, IBM Power Parallel Systems, Robert Harrison, Pacific Northwest National Laboratory, Rama K. Govindaraju, Kevin Gildea, Paul DiNicola, Carl Bender, Pacific Northwest National Laboratory
Total-Exchange on Wormhole k-ary n-cubes with Adaptive Routing
Fabrizio Petrini, International Computer Science Institute
Managing Concurrent Access for Shared Memory Active Messages
Steven S. Lumetta, David E. Culler, University of California at Berkeley
Design and Implementation of a Parallel I/O Runtime System for Irregular Applications
Jaechun No, Syracuse University, Sung-soon Park, Anyang University, Jesus Carretero, Universidad Politecnica de Madrid, Alok Choudhary, Northwestern University, Pang Chen, Sandia National Laboratory
Using PI/OT to Support Complex Parallel I/O
Ian Parsons, Jonathan Schaeffer, Duane Szafron, Ron Unrau, University of Alberta
Code Transformations for Low Power Caching in Embedded Multimedia Processors
C. Kulkarni, IMEC, F. Catthoor, H. De Man, IMEC and Katholieke Universiteit Leuven
Memory Hierarchy Management for Iterative Graph Structures
Ibraheem Al-Furaih, Syracuse University, Sanjay Ranka, University of Florida
High-Performance External Computations Using User-Controllable I/O
Jang Sun Lee, A.I. Section ETRI, Sunghoon Ko, Syracuse University, Sanjay Ranka, University of Florida, Byung Eui Min, A.I. Section ETRI
Pin-down Cache: A Virtual Memory Management Technique for Zero-copy Communication
Hiroshi Tezuka, Francis O'Carroll, Atsushi Hori, Yutaka Ishikawa, Real World Computing Partnership
Synthesis of a Systolic Array Genetic Algorithm
G.M. Megson, I.M. Bland, University of Reading
Vector Prefix and Reduction Computation on Coarse-Grained, Distributed-Memory Parallel Machines
Seungjo Bae, Dongmin Kim, Syracuse University, Sanjay Ranka, University of Florida
Solving the Maximum Clique Problem using PUBB
Yuji Shinano, Science University of Tokyo, Tetsuya Fujie, Tokyo Institute of Technology, Yoshiko Ikebe, Ryuichi Hirabayashi, Science University of Tokyo
A Scalable VLSI Architecture for Binary Prefix Sums
R. Lin, SUNY Geneseo, K. Nakano, Nagoya Institute of Technology, S. Olariu, Old Dominion University, M.C. Pinotti, I.E.I. C.N.R., J.L. Schwing, Old Dominion University, A.Y. Zomaya, University of Western Australia
Emulating Direct Products by Index-Shuffle Graphs
Bojana Obrenic, Queens College and Graduate Center of CUNY
A Comparative Study of Five Parallel Genetic Algorithms Using The Traveling Salesman Problem
Lee Wang, Anthony A. Maciejewski, Howard Jay Siegel, Purdue University, Vwani P. Roychowdhury, UCLA
A New Self-Routing Multicast Network
Yuanyuan Yang, University of Vermont, Jianchao Wang, GTE Laboratories
Optimal Contention-Free Unicast-Based Multicasting in Switch-Based Networks of Workstations
Ran Libeskind-Hadas, Dominic Mazzoni, Ranjith Rajagopalan, Harvey Mudd College
Multicasting and Broadcasting in Large WDM Networks
Weifa Liang, University of Queensland, Hong Shen, Griffith University
Optimally Locating a Structured Facility of a Specified Length in a Weighted Tree Network
Shan-Chyun Ku, Biing-Feng Wang, National Tsing Hua University
Deterministic Routing of h-relations on the Multibutterfly
Andrea Pietracaprina, Universita di Padova
An Efficient Counting Network
Costas Busch, Brown University, Marios Mavronicolas, University of Cyprus
Partitioned Schedules for Clustered VLIW Architectures
Marcio Merino Fernandes, University of Edinburgh, Josep Llosa, Universitat Politecnica de Catalunya, Nigel Topham, University of Edinburgh
Dynamic Processor Allocation with the Solaris Operating System
Kelvin K. Yue, Sun Microsystems Inc., David J. Lilja, University of Minnesota
Thread-based vs Event-based Implementation of a Group Communication Service
Shivakant Mishra, Rongguang Yang, University of Wyoming
Performance Sensitivity of Space-Sharing Processor Scheduling in Distributed-Memory Multicomputers
Sivarama P. Dandamudi, Hai Yu, Carleton University
Efficient Fine-Grain Thread Migration with Active Threads
Boris Weissman, Benedict Gomes, University of California at Berkeley and International Computer Science Institute, Jurgen W. Quittek, International Computer Science Institute, Michael Holtkamp, Technical University of Hamburg-Harburg
Clustering and Reassignment-Based Mapping Strategy for Message-Passing Architectures
M.A. Senar, A. Ripoll, A. Cortes, E. Luque, Universitat Autonoma de Barcelona
Asymptotically Optimal Randomized Tree Embedding in Static Networks
Keqin Li, State University of New York (New Paltz)
Resource Placements in 2D Tori
Bader Almohammad, Bella Bose, Oregon State University
An O((log log n)^2) Time Convex Hull Algorithm on Reconfigurable Meshes
Tatsuya Hayashi, Koji Nakano, Nagoya Institute of Technology, Stephan Olariu, Old Dominion University
Toward a Universal Mapping Algorithm for Accessing Trees in Parallel Memory Systems
Vincenzo Auletta, Universita di Salerno, Sajal K. Das, University of North Texas, Amelia De Vivo, Universita di Salerno, M. Cristina Pinotti, I.E.I. Consiglio Nazionale delle Ricerche, Vittorio Scarano, Universita di Salerno
Sharing Random Bits with No Process Coordination
Marius Zimand, Georgia Southwestern State University
Lower Bounds on Communication Loads and Optimal Placements in Torus Networks
M. Cemil Azizoglu, Omer Egecioglu, University of California at Santa Barbara
Impact of Switch Design on the Application Performance of Cache Coherent Multiprocessors
Laxmi N. Bhuyan, H. Wang, R. Iyer, Texas A&M University, A. Kumar, Intel Corporation
Parallel Tree Building on a Range of Shared Address Space Multiprocessors: Algorithms and Application Performance
Hongzhang Shan, Jaswinder Pal Singh, Princeton University
Configuration Independent Analysis for Characterizing Shared-Memory Applications
Gheith A. Abandah, Edward S. Davidson, University of Michigan
Experimental Validation of Parallel Computation Models on the Intel Paragon
Ben H.H. Juurlink, University of Paderborn
Comparing the Optimal Performance of Different MIMD Multiprocessor Architectures
Lars Lundberg, Hakan Lennerstad, University of Karlskrona/Ronneby
The Design of COMPASS: An Execution Driven Simulator for Commercial Applications Running on Shared Memory Multiprocessors
Ashwini K. Nanda, IBM T.J. Watson Research Center, Yiming Hu, University of Rhode Island, Moriyoshi Ohara, IBM Tokyo Research Lab, Caroline D. Benveniste, Mark E. Giampapa, Maged Michael, IBM T.J. Watson Research Center
An Efficient RMS Admission Control and Its Application To Multiprocessor Scheduling
Sylvain Lauzac, Rami Melhem, Daniel Mosse, University of Pittsburgh
Guidelines for Data-Parallel Cycle-Stealing in Networks of Workstations
Arnold L. Rosenberg, University of Massachusetts
Low Memory Cost Dynamic Scheduling of Large Coarse Grain Task Graphs
Michel Cosnard, LORIA-INRIA Lorraine, Emmanuel Jeannot, Laurence Rougeot, LIP ENS de Lyon
Benchmarking the Task Graph Scheduling Algorithms
Yu-Kwong Kwok, Ishfaq Ahmad, Hong Kong University of Science and Technology
A Performance Evaluation of CP List Scheduling Heuristics for Communication Intensive Task Graphs
Benjamin S. Macey, Albert Y. Zomaya, University of Western Australia
Utilization and Predictability in Scheduling the IBM SP2 with Backfilling
Dror G. Feitelson, Ahuva Mu'alem Weil, Hebrew University of Jerusalem
filename sp2.ps.gz at:
High Performance Data Mining Using Data Cubes on Parallel Computers
Sanjay Goil, Alok Choudhary, Northwestern University
An Efficient Parallel Algorithm for High Dimensional Similarity Join
Khaled Alsabti, Syracuse University, Sanjay Ranka, University of Florida, Vineet Singh, Hitachi America Ltd.
Sorting on Clusters of SMP's
David R. Helman, Joseph JaJa, University of Maryland
An $AT^{2}$ Optimal Mapping of Sorting onto the Mesh Connected Array without Comparators
Ju-wook Jang, Sogang University
ScalParC: A New Scalable and Efficient Parallel Classification Algorithm for Mining Large Datasets
Mahesh V. Joshi, George Karypis, Vipin Kumar, University of Minnesota
Improved Concurrency Control Techniques for Multi-dimensional Index Structures
K.V. Ravi Kanth, F. David Serena, Ambuj K. Singh, University of California at Santa Barbara
A Clustered Approach to Multithreaded Processors
Venkata Krishnan, Josep Torrellas, University of Illinois at Urbana-Champaign
C++ Expression Templates Performance Issues in Scientific Computing
Federico Bassetti, New Mexico State University and Scientific Computing Group CIC-19, Kei Davis, Dan Quinlan, Scientific Computing Group CIC-19
Aggressive Dynamic Execution of Multimedia Kernel Traces
Benjamin Bishop, Robert Owens, Mary Jane Irwin, Pennsylvania State University
Performance Prediction in Production Environments
Jennifer M. Schopf, Francine Berman, University of California at San Diego
Predicting the Running Time of Parallel Programs by Simulation
Radu Rugina, Klaus E. Schauser, University of California at Santa Barbara
Compile-time Synchronization Optimizations for Software DSMs
Hwansoo Han, Chau-Wen Tseng, University of Maryland
An Efficient Logging Scheme for Lazy Release Consistent Distributed Shared Memory System
Taesoon Park, Sejong University, Heon Y. Yeom, Seoul National University
Update Protocols and Iterative Scientific Applications
Pete Keleher, University of Maryland
Characterizations for Java Memory Behavior
Alex Gontmakher, Assaf Schuster, Technion
Locality and Performance of Page- and Object-Based DSMs
Bryan Buck, Pete Keleher, University of Maryland
Optimistic Synchronization of Mixed-Mode Simulators
Peter Frey, Radharamanan Radhakrishnan, Harold W. Carter, Philip A. Wilsey, University of Cincinnati
Airshed Pollution Modeling: A Case Study in Application Development in an HPF Environment
Jaspal Subhlok, Peter Steenkiste, James Stichnoth, Peter Lieu, Carnegie Mellon University
Design of a FEM Computation Engine for Real-Time Laparoscopic Surgery Simulation
Alex Rhomberg, Rolf Enzler, Markus Thaler, Gerhard Troester, Eidgenossische Technische Hochschule
SIMD and Mixed-Mode Implementations of a Visual Tracking Algorithm
Mark B. Kulaczewski, Howard Jay Siegel, Purdue University
The Implicit Pipeline Method
John B. Pormann, John A. Board Jr., Donald J. Rose, Duke University
Rendering Computer Animations on a Network of Workstations
Timothy A. Davis, Edward W. Davis, North Carolina State University
Hyper-Butterfly Network: A Scalable Optimally Fault Tolerant Architecture
Wei Shi, Pradip K. Srimani, Colorado State University
Scheduling Algorithms Exploiting Spare Capacity and Tasks' Laxities for Fault Detection and Location in Real-Time Multiprocessor Systems
K. Mahesh, G. Manimaran, C. Siva Ram Murthy, Indian Institute of Technology, Arun K. Somani, University of Washington
The Robust-Algorithm Approach to Fault Tolerance on Processor Arrays: Fault Models, Fault Diameter, and Basic Algorithms
Behrooz Parhami, Chi-Hsiang Yeh, University of California at Santa Barbara
Fault-Tolerant Switched Local Area Networks
Paul LeMahieu, Vasken Bohossian, Jehoshua Bruck, California Institute of Technology
Trace-Driven Debugging of Message Passing Programs
Michael Frumkin, Robert Hood, Louis Lopez, NASA Ames Research Center
Predicate Control for Active Debugging of Distributed Programs
Ashis Tarafdar, Vijay K. Garg, University of Texas at Austin
VPPB - A Visualization and Performance Prediction Tool for Multithreaded Solaris Programs
Magnus Broberg, Lars Lundberg, Hakan Grahn, University of Karlskrona/Ronneby
Parallel Performance Visualization Using Moments of Utilization Data
T.J. Godin, Michael J. Quinn, C.M. Pancake, Oregon State University
Optimizing Parallel Applications for Wide-Area Clusters
Henri E. Bal, Aske Plaat, Mirjam G. Bakker, Peter Dozy, Rutger F. H. Hofman, Vrije Universiteit
Prioritized Token-Based Mutual Exclusion for Distributed Systems
Frank Mueller, Humboldt-Universitat zu Berlin
Adaptive Quality Equalizing: High-Performance Load Balancing for Parallel Branch-and-Bound Across Applications and Computing Systems
Nihar R. Mahapatra, State University of New York at Buffalo, Shantanu Dutt, University of Illinois at Chicago
Memory Space Representation for Heterogeneous Network Process Migration
Kasidit Chanchio, Xian-He Sun, Louisiana State University
WILDFIRE(tm) Heterogeneous Adaptive Parallel Processing System
Bradley K. Fross, Senior WILDFIRE Application Engineer, Dennis M. Hawver, Principal Design Engineer, James B. Peterson, Principal Design Engineer
Annapolis Micro Systems, Inc.
ACEcard(tm): A High Performance Architecture for Run-Time Reconfiguration
Don Davis, Manager (Strategic Engineering), Jonathan Harris
TSI TelSys, Inc.
A Hardware / Software Co-Design System using Configurable Computing Technology
John Schewel, Vice President of Sales & Marketing
Virtual Computer Corporation
DEEP: A Development Environment for Parallel Programs
Brian Brode, Vice President, Chris Warber, Senior Analyst, James Bonang, Software Engineer
Pacific-Sierra Research Corporation
Rapid Development of Real-Time Systems Using RTExpress
Milissa Benincasa, Senior Software Engineer, Richard Besler, Senior Software Engineer, Diane Brassaw, Senior Software Engineer, Ralph L. Kohler Jr., Program Manager (Air Force Research Laboratory)
Integrated Sensors, Inc.
Evaluating ASIC, DSP, and RISC Architectures for Embedded Applications
Marc Campbell, Technical Lead (High Performance Computing)
Northrop Grumman
The Effect of the Router Arbitration Policy on ServerNet(tm) Topologies
Vladimer Shurbanov, Research Assistant (Boston University), Dimiter R. Avresky, Associate Professor (Boston University), Robert Horst, Technical Director
Tandem Computers, a Compaq Company