Contents

Education
Professional Experience
Current Research Areas
National Recognition
Institutional Recognition
Section A: Teaching and Research Supervision
  A.1 Teaching
    A.1.a Teaching at University of Delaware
    A.1.b Other Teaching Experience
  A.2 Research Supervision
Section B: Scholarship
  B.1 Research Activity and Interests
  B.2 List of Research Contributions
    Refereed Journal Publications
    Publications in Refereed Conference Proceedings (Last Six Years Only)
    Monographs, Books and Book Chapters
  B.3 Research Significance
  B.4 Research Support
Section C: Services
  C.1 University Activities and Services
  C.2 Professional Services
CURRICULUM VITAE
NAME: Guang R. Gao
OFFICE ADDRESS:
Department of Electrical Engineering
104 Evans Hall
University of Delaware
Newark, DE 19716
Tel: 302-831-8218
Fax: 302-831-4316
ggao@eecis.udel.edu
EDUCATION
Ph.D. in Electrical Engineering and Computer Science, Massachusetts Institute of Technology, August 1986.
Member of the Computational Structures Group at the Laboratory for Computer Science, MIT, June 1982 to August 1986.
Master's Degree in Electrical Engineering and Computer Science, Massachusetts Institute of Technology, June 1982.
BS in Electrical Engineering, Tsinghua University, Beijing.
PROFESSIONAL EXPERIENCE
University of Delaware, Newark, DE
Associate Professor, Department of Electrical and Computer Engineering, Sept. 1996-present.
Founder and leader of the Computer Architecture and Parallel Systems Laboratory (CAPSL).

McGill University, Montreal, Canada
Associate Professor, School of Computer Science, June 1992-August 1996.
Assistant Professor, School of Computer Science, Aug. 1987-June 1992.
Founder and leader of the Advanced Compilers, Architectures and Parallel Systems (ACAPS) group at McGill since 1988.

Sept. 1986 - June 1987, Briarcliff Manor, NY, USA
Senior member of research staff of the Computer Architecture and Programming Systems Group. Played a major role in founding a multiprocessor system project and in research on parallelizing compilers.

June 1980 - Aug. 1986
Member of the Computational Structures Group at the Laboratory for Computer Science, MIT. Participated in the MIT Static Dataflow Architecture Project and other projects. Proposed a novel methodology for organizing array operations to exploit the fine-grain parallelism of dataflow computation models, and developed a unique pipelined code mapping scheme for dataflow machines (later known as dataflow software pipelining).

Aug. 1993 - June 1994
Visiting scientist with an NSERC Senior Industrial Fellowship.
CURRENT RESEARCH AREAS:
Computer Architecture and Systems
Parallel and Distributed Systems
Optimizing and Parallelizing Compilers, Parallel Programming
VLSI and Application-Specific System Design
PROFESSIONAL MEMBERSHIP
I am a Senior Member of the IEEE and a Member of the ACM, ACM SIGARCH, and ACM SIGPLAN. I am currently a Distinguished Visitor of the IEEE Computer Society.
NATIONAL RECOGNITION:
· IEEE Computer Society Distinguished Visitor, 1998-2001
· IEEE, Senior Member
· Program Committee Member of Recognized International Conferences
- IEEE International Symposium on High-Performance Computer Architecture (HPCA-95, HPCA-99, HPCA-00)
- ACM Symposium on Programming Language Design and Implementation (PLDI-98)
- ACM International Conference on Supercomputing (ICS-95)
- ACM/IEEE International Symposium on Microarchitecture (MICRO-95, 96, 97)
- International Parallel Processing Symposium (IPPS-95)
- IFIP and ACM SIGARCH International Conference on Parallel Architectures and Compilation Techniques (PACT-94, 95, 96, 97, 98, 99)
- International Conference on Algorithms and Architectures for Parallel Processing (ICAPP-95)
- Parallel Architectures and Languages Europe (PARLE-91, 92, 93, 94, 95)
- European Conference on Parallel Processing (Euro-Par-95, 96)
- Working Conference on Massively Parallel Programming Models (MPPM-93, 95, 97, 99)
- High Performance Computing Symposium (HPCS-95, 96, 98), Canada
- International Conference on Compiler Construction (CC-98, 99, 00), Europe
- International Symposium on High Performance Computing (ISHPC99), Japan
· Conference Committee Chairmanship
- Program Chairman of the 1994 ACM SIGARCH International Conference on Parallel Architectures and Compilation Techniques (PACT '94), Aug. 1994, Montreal, Canada, co-sponsored by IFIP and held in association with ACM SIGPLAN, IEEE TCCA (Technical Committee on Computer Architecture), and IEEE TCPP (Technical Committee on Parallel Processing).
- General Co-Chair of the 1998 International Conference on Parallel Architectures and Compilation Techniques (PACT '98), Oct. 1998, Paris, France, co-sponsored by IFIP and the IEEE Computer Society.
- Chair of the Third Workshop on Petaflop Computing, Feb. 1999, Annapolis, MD.
- Co-Chair of the Multithreaded Architecture Workshop, held in conjunction with HPCA-99, Orlando, Florida, Jan. 1999.
- Co-Chair of the Workshop on Compiler and Architecture Support for Embedded Systems (CASES '98), Washington, D.C., Oct. 1998.
· Journal Editorship
- Elected to the Editorial Board of IEEE Transactions on Computers (1998-).
- Elected to the Editorial Board of IEEE Concurrency (1997-).
- Joined the Editorial Board of the Journal of Programming Languages in Jan. 1996, and subsequently became one of the two Co-Editors of the journal.
- Guest Editor for the Special Issue on Dataflow and Multithreaded Computers, Journal of Parallel and Distributed Computing, Academic Press, June 1993.
· Invited Seminars and Distinguished Seminars
I have given seminars at many industrial and academic organizations: IBM T.J. Watson Research Center, IBM Toronto Lab, AT&T Bell Laboratories, BNR, HP Labs, SGI, DEC, NRL (Naval Research Laboratory), MIT, Stanford, UC Berkeley, NYU, Cornell University, and the University of Victoria, to name just a few.
Section A: Teaching and Research Supervision
A.1.a Teaching at University of Delaware
· New courses introduced and taught:
This is a new undergraduate CE design course which I developed and taught in Spring 97. It will be offered again under the title CPEG-422 this fall.
This course stresses the principal design concepts embodied in modern computer architectures, and emphasizes ideas which we believe will continue to apply in the future, in spite of a rapidly changing technological environment. The primary objective of the course is to show how the design and evaluation of architectural features, based on both qualitative and quantitative studies, can be used to achieve balanced, efficient systems, well matched to the class of problems they are intended to solve.
ELEG-652: Topics in High-Performance Architecture
This is a graduate core course, which I taught for the first time in the fall of 1997.
This course examines the basic principles and methodology used in the design and evaluation of high-performance computer architectures, and their relation to the underlying program execution and architecture models. Topics include pipelining and vector processing, instruction-level parallelism (ILP) architectures, multiprocessor architectures and high-speed networking, memory consistency models and cache-coherence issues, fine-grain parallelism and multithreaded architectures, and the roles of optimizing and parallelizing compilers.
ELEG 867-14: Topics in Hardware/Software Codesign
This new course introduces the concepts, principles and methods of digital system design from both a hardware and a software viewpoint. In the context of general-purpose computer systems, the principles studied in this course include the close interaction between compiler technology and architecture design. In the context of special-purpose systems, such as embedded systems, the course deals with the close interaction between software synthesis and hardware system design.
Topics to be discussed include the fundamentals of analysis, generation, synthesis, and optimization of computer code. Specific topics in this area include dependency analysis, code motion, scheduling, and register and resource allocation. Among the hardware micro-architecture topics studied are pipeline co-design and memory models. Important case studies that illustrate the basic principles of software/hardware co-design will be introduced, and topics in the newly emerging field of adaptive computing system design will be discussed.
· Activities
- New hardware/software tools introduced or developed for teaching laboratories:
Modern computer architecture and system design involve intensive software and hardware design activities. In the new courses introduced, students are exposed both to software tools and methodology for computer architecture design (e.g., software simulation toolsets) and to hardware design tools and methodology (e.g., VHDL tools and environments) for digital systems. Students are expected to learn modern design tools and related skills through lab assignments and course projects. To this end, we have invested extensive effort to develop the laboratory and to introduce the VHDL design environment in the courses.
- The SEMi instruction set architecture simulator, which provides accurate timing simulation for RISC-like architectures and their cache memories.
- The EARTH architecture emulation testbed with a 20-node multiprocessor hardware engine, which provides tools to study parallel and multithreaded programming paradigms and architecture models. The EARTH-MANNA platform and the PC-based EARTH-Beowulf platform have been developed and made available for teaching.
- A series of VHDL-based hardware design and simulation tools has been introduced and established, including a VHDL behaviour simulation tool, VLSI synthesis tools, FPGA place-and-route tools, and FPGA-based hardware experimental test boards. This has considerably enhanced the teaching capacity for the undergraduate design course and for courses in computer architecture.
- Various benchmark suites for architecture/compiler studies have been introduced: SPEC, LINPACK, Whetstone, Dhrystone, Livermore Loops, NAS, Spice, GCC, etc.
- The CAPSL laboratory seminar series.
I have directed the Computer Architecture and Parallel Systems Laboratory (CAPSL) since I joined UDel. In addition to performing research, one important objective of this laboratory has been to facilitate the teaching of the computer architecture and digital system courses, and the training of graduate research and teaching assistants. The new courses and software tools described above depend directly on this laboratory. The laboratory is now equipped with various workstations, a wide variety of research and teaching software is installed, and a number of my best graduate students have actively participated in and contributed to teaching. Activities organized include:
- organization of the CAPSL research seminar series;
- invitation of a number of distinguished speakers of international reputation to give such seminars.
A.1.b Other Teaching Experience
At McGill University, I introduced and developed a set of new courses (308-505, 308-605, 308-622) on high-performance computer architectures, parallel systems and parallelizing compilers. These courses were consolidated and improved over time, forming a core for students interested in the related subject areas. I have also taught a number of graduate seminar courses (details can be provided upon request). The excellence of my teaching has been recognized through the following outstanding teaching award nominations:
- nomination for the McGill Engineering Class of '51 Award for Outstanding Teaching (1988);
- nomination reconsideration for the Engineering Class of '51 Award for Outstanding Teaching (1989);
- nomination for the McGill Engineering Class of '51 Award for Outstanding Teaching (1990);
- nomination reconsideration for the Engineering Class of '51 Award for Outstanding Teaching (1991).
A.2. Research Supervision
Currently, graduate students under my supervision include:
Gieger, Thomas (processing-in-memory and multithreading)
Marquez, Andres (multithreaded architectures)
Ryan, Sean (optimizing compilers)
Stouchinin, Artour (instruction-level parallelism, software pipelining)
Tang, Xi-Nan (compiling for multithreading)
Thulasiraman, Parimala (parallel algorithms and applications)
Yang, Hongbo (instruction-level parallelism)
Douillet, Alban (compiling for multithreading)
Current postdoctoral fellows under my supervision include:
Amaral, Nelson (system software, compilers)
Theobald, Kevin (computer architecture, parallel systems)
Thulasiram, Rupak (parallel applications)
Already Completed:
The applicant has completed the supervision of 7 Ph.D. and 18 M.Sc. students, and 5 postdoctoral fellows in the proposed research areas of high-performance computing.
Post-Doctoral Fellows (4 completed):
G. Liao (1991-1993), G. Ramaswamy (1990-1994), O. Maquelin (1994-1998), X. Tian (1993-1996)

Ph.D. Level (7 graduated):
E. Altman (1991-1996), H. Hum (1988-1992), S. Nemawarkar (1989-1996), Q. Ning (1990-1993), V. C. Sreedhar (1990-1995), G. Tremblay (1988-1994), R. Yates (1988-1992)

M.Sc. Level (18 graduated):
H. Cai (1995-1997), N. Elmasri (1992-1995), A. Emtage (1988-1991), S. H. Han (1996-1997), A. Jimenez (1993-1996), L. Lozano (1992-1994), S. Merali (1993-1996), C. Moura (1991-1993), C. Mukerji (1991-1994), R. Olsen (1989-1992), Z. Paraskevas (1987-1989), H. Petry (1995-1997), R. Shanker (1991-1993), N. Shiri (1990-1992), R. Silvera (1996-1997), A. Stouchinin (1994-1996), J. Wang (1995-1997), R. Wen (1993-1995), Y-B Wong (1989-1991)
Those who have graduated are highly trained in the field of parallel architectures and compilers, as evidenced by the fact that they have been working (or have worked) as tenure-track university professors (Ramaswamy, Tremblay); as engineers in key industrial sectors, e.g., Intel (Hum), Nortel (Wang), IBM (Altman, Nemawarkar, Sreedhar), BNR (Liao, Wen), HP (Lozano), Convex (Ning), NCUBE (Olsen), CAE (Nassur), AT&T (Petry); as researchers in government labs, e.g., LLNL (Yates); or in other professional positions.
Section B: Scholarship
B.1: Research Activity and Interests
1. Computer Architecture and Systems.
One main question facing modern computer architects is: is it possible to build a high-performance parallel architecture combining the power of hundreds, or even thousands, of processors to solve real-world applications (regular or irregular) with scalable performance? My research has been seeking an answer to this challenge. In particular, our primary work has concentrated on multithreaded program execution models and architectures. To this end, I have initiated, led, or played a major role in a number of research projects in this area.
- In the EARTH (Efficient Architecture for Running THreads) project, our focus has been on how, given conventional off-the-shelf processor technology, a multithreaded program execution model and architecture can be developed that exploits fine-grain parallelism and delivers scalable performance at affordable cost. Our current activities include: refinement of the EARTH program execution model and shared-memory architecture support (partially supported via an NSF-MIPS grant joint with USC); study and implementation of the EARTH model on a cluster of SMP workstations linked with high-speed networks (via an NSF-CISE infrastructure grant); study and implementation of a real-world large irregular application (crack propagation) on EARTH platforms (partially supported via an NSF-CISE grant joint with Cornell); and investigation of compiling techniques for multithreaded architectures (partially supported via an NSF-CISE grant).
- In the HTMT (Hybrid Technology Multithreaded Architecture) project, our focus is on very high-end parallel supercomputer architectures based on advanced technology, beyond off-the-shelf processor architectures. Our recent and current activities include an initial "point design" study of the HTMT architecture model (funded partially via an NSF grant), and subsequently a feasibility study of the HTMT program execution and architecture model for a petaflop-scale architecture which employs and integrates the combined capabilities of semiconductor, superconductor, and optical technologies, as well as PIM (processing-in-memory) technology (funded via a grant from DARPA/NSA/NASA through JPL/Caltech).
- In the Data-IntensiVe Architecture (DIVA) project, our goal is to exploit an alternative multithreaded execution model to fully utilize the processing power and memory bandwidth provided by the DIVA PIM chip for large-scale high-performance database applications (funded through a grant from DARPA via Caltech/JPL).
2. Optimizing Compiler Technology.
- Modulo scheduling and software pipelining. My interest in modulo scheduling and software pipelining stemmed from work on register allocation for loops on dataflow machines. This work culminated in a mathematical formulation of the problem in a linear periodic form. It was soon discovered that this formulation could also be applied to software pipelining for conventional architectures. The formulation was then used to prove an interesting theoretical result: the minimum storage assignment problem for rate-optimal software-pipelined schedules can be solved by an efficient polynomial-time method, provided the target machine has enough functional units that resource constraints can be ignored. At the same time, my colleagues, students, and I proposed "interval graph" based register allocation algorithms, which appear to provide a good representation for studying combined instruction scheduling and register allocation. Subsequently, we extended our framework to handle resource constraints, resulting in a unified integer linear programming formulation of the problem for simple pipelined architectures; the work was later generalized to more complex architectures. This work was implemented in MOST, the Modulo Scheduling Toolset, developed in my group. Recently, my co-workers and I proposed "co-scheduling", an FSA (finite state automaton) based framework for the simultaneous design of hardware pipeline structures and software-pipelined schedules (partially funded through an NSF-CCR grant).
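To convey the flavor of modulo scheduling (a minimal sketch only, not the MOST toolset or the ILP formulations above), the following toy scheduler searches for the smallest initiation interval (II) at which a hypothetical machine with a fixed number of identical functional units can start a new loop iteration every II cycles, using a modulo reservation table; the DAG, latencies, and machine model are all assumed for illustration:

```python
# Toy modulo scheduler: greedy placement into a modulo reservation table.
# Hypothetical machine model: num_units identical functional units per cycle.
import math

def modulo_schedule(ops, deps, latency, num_units, max_ii=32):
    """ops: op names in topological order; deps: (pred, succ) edges within
    one iteration; latency: op -> cycles. Returns (II, op -> start cycle)."""
    res_mii = math.ceil(len(ops) / num_units)      # resource-constrained lower bound
    for ii in range(res_mii, max_ii + 1):
        sched, mrt = {}, [0] * ii                  # mrt[c] = units busy in modulo slot c
        ok = True
        for op in ops:
            earliest = 0                           # respect intra-iteration dependences
            for p, s in deps:
                if s == op and p in sched:
                    earliest = max(earliest, sched[p] + latency[p])
            t = earliest
            while mrt[t % ii] >= num_units:        # find a free modulo slot
                t += 1
                if t - earliest > ii:              # scanned all slots at this II
                    ok = False
                    break
            if not ok:
                break
            sched[op] = t
            mrt[t % ii] += 1
        if ok:
            return ii, sched
    raise ValueError("no schedule found up to max_ii")
```

For a four-op loop body on a one-unit machine, the resource bound forces II = 4, and the greedy placement finds a schedule that respects both the dependence chain and the modulo reservation table.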
- Program analysis techniques. I have been interested in program analysis techniques for compiler optimization. V. C. Sreedhar (my Ph.D. student) and I proposed a novel program representation called the DJ graph. Based on the DJ graph, we developed a surprisingly simple algorithm for computing phi-nodes for arbitrary flowgraphs (reducible or irreducible) that runs in linear time. Based on DJ graphs, we have also developed other novel and efficient algorithms for a series of flowgraph analysis problems, such as multiple-node immediate dominator analysis, identification of reducible and irreducible graphs, an incremental algorithm for maintaining dominator trees, and exhaustive and incremental dataflow analysis.
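The representation itself is easy to sketch: a DJ graph is the dominator tree (D-edges) plus the join edges (J-edges), i.e., the CFG edges x -> y where x does not strictly dominate y. The sketch below builds one for a small hypothetical diamond-shaped CFG; for clarity it computes dominators with a naive iterative set algorithm rather than the linear-time machinery of the published work:

```python
# Build a DJ graph for a toy CFG: D-edges from the dominator tree,
# J-edges = CFG edges whose source does not strictly dominate the target.

def dominators(cfg, entry):
    """cfg: node -> list of successors. Returns node -> set of dominators
    (naive iterative fixed point, adequate for a small example)."""
    nodes = set(cfg) | {s for ss in cfg.values() for s in ss}
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    preds = {n: [p for p in nodes if n in cfg.get(p, [])] for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = ({n} | set.intersection(*(dom[p] for p in preds[n]))) if preds[n] else {n}
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def dj_graph(cfg, entry):
    dom = dominators(cfg, entry)
    idom = {}
    for n, ds in dom.items():
        strict = ds - {n}
        if strict:  # the immediate dominator is the deepest strict dominator,
            idom[n] = max(strict, key=lambda d: len(dom[d]))  # i.e. largest dom set
    d_edges = [(p, c) for c, p in idom.items()]
    j_edges = [(x, y) for x in cfg for y in cfg[x] if x not in (dom[y] - {y})]
    return d_edges, j_edges
```

On the diamond e -> {a, b} -> m, the two edges into the join node m are exactly the J-edges, which is what makes the representation convenient for phi-node placement.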
- Program parallelization. I have been interested in methodologies for collective loop optimization. We have developed a methodology, applied to collections of loops, that performs a novel optimization called array contraction, which saves space and time by converting an array variable into a scalar variable or into a buffer containing a small number of scalar variables. We have shown that the array contraction problem can be solved efficiently for a class of loops.
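A minimal before/after sketch of the idea on a hypothetical two-pass loop nest (the transformation is applied by hand here; deciding when it is legal is what the analysis provides): once producer and consumer loops are fused, each element of the temporary array is live for only one iteration, so the array contracts to a single scalar:

```python
# Array contraction, illustrated on a toy producer/consumer loop pair.

def before(x, n):
    t = [0] * n                 # full-length temporary array
    for i in range(n):
        t[i] = 2 * x[i]         # producer loop
    y = [0] * n
    for i in range(n):
        y[i] = t[i] + 1         # consumer loop reads each t[i] exactly once
    return y

def after(x, n):
    y = [0] * n
    for i in range(n):          # fused loop: t[i] is live for one iteration,
        t = 2 * x[i]            # so the array contracts to a scalar
        y[i] = t + 1
    return y
```

Both versions compute the same result, but the contracted form needs O(1) rather than O(n) temporary storage and improves locality.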
- Thread partitioning. I have been interested in automatic thread partitioning and the threaded-code generation problem. We have developed new heuristic algorithms based on an extension of the classical list scheduling algorithm. Based on a cost model, our algorithm groups instructions into threads by considering the trade-offs among the following characteristics: exploitation of parallelism, latency tolerance, minimization of thread-switching costs, and sequential execution efficiency. The proposed algorithm has been implemented, and a quantitative performance study of it has been conducted. Currently, we are extending our method and studying new thread partitioning algorithms that integrate scheduling and register allocation under the same framework (partially funded via an NSF-CCR grant).
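A toy version of such a list-scheduling-style partitioner can be sketched as follows; the single switch_cost threshold stands in for the full cost model described above, and the DAG and latencies are hypothetical:

```python
# Toy thread partitioner: walk a dependence DAG in list-scheduling order and
# keep an op in its predecessor's thread only when waiting for the dependence
# is cheaper than a thread switch (e.g., long-latency remote loads split).
from collections import defaultdict

def partition(nodes, edges, latency, switch_cost=2):
    """nodes: ids in topological order; edges: (pred, succ) dependences;
    latency: (pred, succ) -> cycles. Returns a list of threads (node lists)."""
    preds = defaultdict(list)
    for p, s in edges:
        preds[s].append(p)
    thread_of, threads = {}, []
    for n in nodes:
        best = None
        for p in preds[n]:
            if latency.get((p, n), 1) <= switch_cost:  # cheap to wait: same thread
                best = thread_of[p]
        if best is None:                               # long-latency or root: new thread
            best = len(threads)
            threads.append([])
        threads[best].append(n)
        thread_of[n] = best
    return threads
```

With one long-latency edge (say a remote load feeding c), the consumer side is split into its own thread, which is the latency-tolerance trade-off the heuristic is balancing.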
3. Other Areas
- Memory consistency models. I am interested in the problem of defining a memory model that does not rely on the memory coherence assumption, and in the problem of designing a cache consistency protocol based on such a memory model. My colleague and I have defined a new memory consistency model, called Location Consistency (LC), in which the state of a memory location is modeled as a partially ordered multiset (pomset) of write and synchronization operations. We have proved that LC is strictly weaker than existing memory models, yet still equivalent to stronger models for parallel programs that have no data races. We have also introduced a new multiprocessor cache consistency protocol based on the LC memory model.
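The pomset view of a location's state can be made concrete with a toy model (the ordering edges are supplied by hand here; LC itself derives them from program order and synchronization): a read may legally return the value of any maximal write in the partial order, so unordered concurrent writes leave several legal results until a synchronization orders them:

```python
# Toy pomset-of-writes model of one memory location, in the spirit of
# Location Consistency. Not the LC protocol itself: ordering edges are
# given explicitly instead of being derived from synchronization.
class Location:
    def __init__(self):
        self.writes = {}        # write id -> value written
        self.order = set()      # (earlier, later) pairs of the partial order

    def write(self, wid, value, after=()):
        self.writes[wid] = value
        for a in after:         # record ordering edges from earlier writes
            self.order.add((a, wid))

    def legal_read_values(self):
        """Values of maximal writes: those not ordered before another write."""
        dominated = {a for (a, b) in self.order}
        return {self.writes[w] for w in self.writes if w not in dominated}
```

Two unordered writes leave both values legal; a later write ordered after both (as a synchronization would impose) collapses the set to a single value.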
B.2: List of Research Contributions
Refereed Journal Publications
X. Tang and Guang R. Gao, Automatic partitioning of threads for multithreaded architectures, Special Issue on Compilation and Architectural Support for Parallel Applications, accepted for publication, June 1999.
1.
Vugranam C. Sreedhar, Guang R. Gao and Yong-Fong Lee, A New Framework for Elimination-Based Dataflow Analysis Using DJ Graphs, ACM Transactions on Programming Languages and Systems, Vol. 20, No. 2, pp. 388-433, March 1998.
2.
Erik
Altman, Guang R. Gao, Optimal Modulo Scheduling Through Enumeration,
International Journal on Parallel Programming, Accepted for publication, 1998.
3.
Erik
Altman, Guang R. Gao, A Unified Framework for Instruction Scheduling and
Mapping for Function Units with Structural Hazards, Journal of Parallel and
Distributed Computing, No. 39, pp 259-293, 1998.
4.
Vugranam C. Sreedhar, Guang R. Gao, and Yong-Fong Lee. Incremental computation of dominator trees. ACM Transactions on Programming Languages and Systems, Vol. 19, No. 2, pp. 239-252, March 1997.
5.
Vugranam C. Sreedhar, Guang R. Gao, and Yong-Fong Lee. A quadratic time algorithm for computing multiple-node immediate dominators. Journal of Programming Languages, accepted for publication, 1996.
6.
R.
Govindarajan, Erik R. Altman, and Guang R. Gao. A framework for
resource-constrained rate-optimal software pipelining. IEEE Transactions on
Parallel and Distributed Systems, pages 1133-1149, November 1996.
7.
Herbert
H. J. Hum, Olivier Maquelin, Kevin B. Theobald, Xinmin Tian, Guang R. Gao, and
Laurie J. Hendren. A study of the EARTH-MANNA multithreaded system. International
Journal of Parallel Programming, 24(4):319-347, August 1996.
8.
Vugranam C. Sreedhar, Guang R. Gao, and Yong-Fong Lee. Identifying loops using DJ graphs. ACM Transactions on Programming Languages and Systems, accepted, 1996.
9.
Vugranam C. Sreedhar and Guang R. Gao. A linear time algorithm for placing phi-nodes. Journal of Programming Languages, accepted, 1995.
10.
Qi Ning, Vincent Van Dongen, and Guang R. Gao. Automatic data and computation decomposition for distributed memory machines. Parallel Processing Letters, 5(4):539-550, April 1995.
11.
Vugranam C. Sreedhar and Guang R. Gao. Computing phi-nodes in linear time using DJ graphs. Journal of Programming Languages, 3(1995), pp. 191-213, April 1995.
12.
E.
Arjomandi, W. O'Farrell, I. Kalas, G. Koblents, F. Ch. Eigler, and G. R. Gao.
ABC++: Concurrency by inheritance in C++. IBM Systems Journal, 34(1):120-137,
1995.
13.
R.
Govindarajan and Guang R. Gao. Rate-optimal schedule for multi-rate DSP
computations. Journal of VLSI Signal Processing, 9(3), April 1995. page
211-232.
14.
G. R. Gao. An efficient hybrid dataflow
architecture model. Journal of Parallel and Distributed Computing,
19(4):293-307, December 1993.
15.
Laurie
J. Hendren, Guang R. Gao, Erik R. Altman, and Chandrika Mukerji. A register
allocation framework based on hierarchical cyclic interval graphs. The Journal
of Programming Languages, 1(3):155-185, 1993.
16.
Qi
Ning and Guang R. Gao. Optimal loop storage allocation for argument-fetching
dataflow machines. International Journal of Parallel Programming,
21(6):421-448, December 1992.
17.
H.
H. J. Hum and G. R. Gao. A high-speed memory organization for hybrid
dataflow/von Neumann computing. Future Generation Computer Systems, 8:287-301,
1992.
18.
G.
R. Gao, H. H. J. Hum, and Y-B Wong. Toward efficient fine-grain software pipelining
and the limited balancing techniques. International Journal of Mini and
Microcomputers, 13(2):57-68, 1991.
19.
Guang
R. Gao. Exploiting fine-grain parallelism on dataflow architectures. Parallel
Computing, 13(3):309-320, March 1990.
Publications in Refereed Conference Proceedings (Last Six Years Only)
I have more than 80 publications in refereed conferences. Due to space limitations, only those from the last six years are listed; the rest can be provided upon request.
1.
G. Heber, R. Biswas, and Guang R. Gao, Self-Avoiding Walks over Adaptive Unstructured Grids, In Proceedings of Irregular'99, held in conjunction with the International Parallel Processing Symposium (IPPS/SPDP), pp. 969-977, San Juan, Puerto Rico, April 12-16, 1999.
2. G. Heber, R. Biswas, P. Thulasiram and Guang R. Gao, Using Multithreading for Automatic Load Balancing of Adaptive Finite Element Meshes, In Proceedings of Irregular'99, held in conjunction with the International Parallel Processing Symposium (IPPS/SPDP), pp. 969-977, San Juan, Puerto Rico, April 12-16, 1999.
3.
A. Khokhar, G. Heber, Parimala Thulasiraman and Guang R. Gao, Load Adaptive Algorithms and Implementation for the 2D Discrete Wavelet Transform on Fine-Grain Multithreaded Architectures, In Proceedings of the International Parallel Processing Symposium (IPPS/SPDP), pp. 360-364, San Juan, Puerto Rico, April 12-16, 1999.
4. G. Heber, R. Biswas, and Guang R. Gao, Self-Avoiding Walks over Adaptive Triangular Grids, In Proceedings of SIAM Parallel Processing Conference for Scientific Computing, San Antonio, Texas, April, 1999.
5.
Chihong Zhang, R. Govindarajan, and Guang R. Gao, Efficient State-Diagram Construction Methods for Software Pipelining, In Proceedings of the International Conference on Compiler Construction (CC'99), held as part of ETAPS'99, Amsterdam, The Netherlands, March 22-26, 1999.
6.
K. Theobald, Guang R. Gao and T. Sterling, Superconducting Processors for HTMT: Issues and Challenges, In Proceedings of the Seventh Symposium on the Frontiers of Massively Parallel Computation (Frontiers'99), pp. 260-267, Annapolis, Maryland, February 21-25, 1999.
7. H. Cai, O. Maquelin, P. Kakulavarapu and Guang R. Gao, Design and Evaluation of Dynamic Load Balancing Schemes under a Fine-Grain Multithreaded Execution Model, In Proceedings of the Workshop on Multithreaded Execution, Architecture and Compilation (MTEAC), held in conjunction with the 1999 IEEE Symposium on High-Performance Computer Architecture (HPCA-99), Orlando, Florida, January 1999.
8. A. Marquez, K. Theobald, X. Tang and Guang R. Gao, The Superstrand Model, In Proceedings of the Workshop on Multithreaded Execution, Architecture and Compilation (MTEAC), held in conjunction with the 1999 IEEE Symposium on High-Performance Computer Architecture (HPCA-99), Orlando, Florida, January 1999.
9.
Sylvain Lelait, Guang R. Gao and Christine Eisenbeis, A New Fast Algorithm for Optimal Register Allocation in Modulo Scheduled Loops, In Proceedings of the International Conference on Compiler Construction (CC'98), held as part of ETAPS'98, Kai Koskimies (ed.), Lecture Notes in Computer Science, volume 1383, pp. 204-218, Springer, Lisbon, Portugal, March 28 - April 4, 1998.
10.
R. Govindarajan, Narasimha Rao, E. R. Altman and Guang R. Gao, An Enhanced Co-Scheduling Method using Reduced MS-State Diagrams, In Proceedings of the International Parallel Processing Symposium (IPPS/SPDP), pp. 168-175, Orlando, Florida, April 1998.
11.
Maria-Dana Tarlescu, Kevin Theobald and Guang R. Gao, Elastic History Buffer: A Low-Cost Method to Improve Branch Prediction Accuracy, In Proceedings of the International Conference on Computer Design (ICCD'97), pp. 82-87, Austin, TX, Oct. 1997.
12.
Raul Silvera, Jian Wang, Guang R. Gao and R. Govindarajan, A Register Pressure Sensitive Instruction Scheduler for Dynamic Issue Processors, In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT'97), San Francisco, CA, Nov. 1997.
13.
X. N. Tang, Rakesh Ghiya, Laurie Hendren, Guang R. Gao, Heap Analysis and Optimizations for Threaded Programs, In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT'97), San Francisco, CA, Nov. 1997.
14.
Xinan Tang, Guang R. Gao, How "Hard" is Thread Partitioning and How "Bad" is a List Scheduling Based Partitioning Algorithm?, In Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures, Puerto Vallarta, Mexico, pp. 130-139, June 1998.
15.
Angela
Sodan, Guang R. Gao, Olivier Maquelin, Jens-Uwe Schultz, and Xin-Min Tian.
Experience with non-numeric applications on multithreaded architectures. In Proceedings of the ACM SIGPLAN Symposium
on Principles and Practice of Parallel Programming, Las Vegas, Nevada,
pp124-135, June, 1997
16.
X.
N. Tang, J. Wang, K. Theobald, and Guang R. Gao. Thread Partition and Schedule
Based on Cost Model. In Proceedings of
the 9th Annual Symposium on Parallel Algorithms and Architectures
(SPAA), Newport, Rhode Island, pp272-281, July 1997.
17.
Shashank S. Nemawarkar and Guang R. Gao. Latency tolerance: A metric for performance analysis of multithreaded architectures. In Proceedings of the International Parallel Processing Symposium, April 1997.
18.
Parimala Thulasiraman, Xin-Min Tian, and Guang R. Gao. Multithreaded implementation of a distributed shortest path algorithm on the EARTH multiprocessor. In Proceedings of the International Conference on High Performance Computing, Trivandrum, India, pp. 336-341, December 1996.
19.
Xin-Min Tian, Shashank S. Nemawarkar, Guang R. Gao, et al. Quantitative studies of data locality sensitivity on the EARTH multithreaded architecture: Preliminary results. In Proceedings of the International Conference on High Performance Computing, Trivandrum, India, pp. 362-367, December 1996.
20.
Guang
Gao, Konstantin K. Likharev, Paul C. Messina, and Thomas L. Sterling. Hybrid
technology multi-threaded architecture. In Proceedings of Frontiers '96: The
Sixth Symposium on the Frontiers of Massively Parallel Computation, pages
98-105, Annapolis, Maryland, October 1996.
21. Laurie J. Hendren, Xinan Tang, Yingchun Zhu, Guang R. Gao, Xun Xue, Haiying Cai, and Pierre Ouellet. Compiling C for the EARTH multithreaded architecture. In Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques (PACT '96), pages 12-23, Boston, Massachusetts, October 1996. IEEE Computer Society Press.
22. Erik R. Altman and Guang R. Gao. Optimal software pipelining through enumeration of schedules. In Proceedings of Euro-Par '96, pages 833-840, Lyon, France, August 1996.
23. Vivek Sarkar, Guang R. Gao, and Shaohua Han. Data locality analysis for distributed shared memory multiprocessors. In Proceedings of the Ninth Workshop on Languages and Compilers for Parallel Computing, San Jose, California, August 1996.
24. Olivier Maquelin, Guang R. Gao, Herbert H. J. Hum, Kevin B. Theobald, and Xin-Min Tian. Polling watchdog: Combining polling and interrupts for efficient message handling. In Proceedings of the 23rd Annual International Symposium on Computer Architecture, pages 178-188, Philadelphia, Pennsylvania, May 1996.
25. John Ruttenberg, G. R. Gao, A. Stouchinin, and W. Lichtenstein. Software pipelining showdown: Optimal vs. heuristic methods in a production compiler. In Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design and Implementation, pages 1-11, Philadelphia, Pennsylvania, May 1996.
26. Vugranam C. Sreedhar, Guang R. Gao, and Yong-fong Lee. A new framework for exhaustive and incremental data flow analysis using DJ graphs. In Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design and Implementation, pages 278-290, Philadelphia, Pennsylvania, May 1996.
27. Jian Wang and Guang R. Gao. Pipelining-dovetailing: A transformation to enhance software pipelining for nested loops. In Proceedings of the 6th International Conference on Compiler Construction, Lecture Notes in Computer Science, Linköping, Sweden, April 1996. Springer-Verlag.
28. R. Govindarajan, Erik R. Altman, and Guang R. Gao. Instruction scheduling in the presence of structural hazards: An integer programming approach to software pipelining. In Proceedings of the International Conference on High Performance Computing, Goa, India, December 1995.
29. R. Govindarajan, Erik R. Altman, and Guang R. Gao. Co-scheduling hardware and software pipelines. In Second International Symposium on High-Performance Computer Architecture, San Jose, California, February 1996.
30. Shashank S. Nemawarkar and Guang R. Gao. Measurement and modeling of EARTH-MANNA multithreaded architecture. In Proceedings of the Fourth International Workshop on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pages 109-114, San Jose, California, February 1996. IEEE Computer Society TCCA and TCS.
31. Luis A. Lozano C. and Guang R. Gao. Exploiting short-lived variables in superscalar processors. In Proceedings of the 28th Annual International Symposium on Microarchitecture, pages 292-302, Ann Arbor, Michigan, November-December 1995.
32. J. B. Dennis and G. R. Gao. On memory models and cache management for shared-memory multiprocessors. In Proceedings of the Seventh IEEE International Symposium on Parallel and Distributed Processing. IEEE, October 1995.
33. Olivier C. Maquelin, Herbert H. J. Hum, and Guang R. Gao. Costs and benefits of multithreading with off-the-shelf RISC processors. In Proceedings of the First International EURO-PAR Conference, number 966 in Lecture Notes in Computer Science, pages 117-128, Stockholm, Sweden, August 1995. Springer-Verlag.
34. R. Wen, Guang R. Gao, and Vincent Van Dongen. The design and implementation of the accurate array data-flow analysis in the HPC compiler. In Proceedings of High Performance Computing Symposium '95, Canada's Ninth Annual International High Performance Computing Conference and Exhibition, pages 144-155, Montréal, Québec, July 1995. Centre de recherche informatique de Montréal.
35. Nasser Elmasri, Herbert H. J. Hum, and Guang R. Gao. The Threaded Communication Library: Preliminary experiences on a multiprocessor with dual-processor nodes. In Conference Proceedings, 1995 International Conference on Supercomputing, pages 195-199, Barcelona, Spain, July 1995.
36. Erik R. Altman, R. Govindarajan, and Guang R. Gao. An experimental study of an ILP-based exact solution method for software pipelining. In Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science, pages 2.1-2.15, Columbus, Ohio, August 1995. Springer-Verlag.
37. Guang R. Gao and Vivek Sarkar. Location consistency: Stepping beyond the memory coherence barrier. In 24th International Conference on Parallel Processing, pages II-73-II-76, University Park, Pennsylvania, August 1995.
38. Herbert H. J. Hum, Olivier Maquelin, Kevin B. Theobald, Xinmin Tian, Xinan Tang, Guang R. Gao, Phil Cupryk, Nasser Elmasri, Laurie J. Hendren, Alberto Jimenez, Shoba Krishnan, Andres Marquez, Shamir Merali, Shashank S. Nemawarkar, Prakash Panangaden, Xun Xue, and Yingchun Zhu. A design study of the EARTH multiprocessor. In Proceedings of the IFIP WG 10.3 Working Conference on Parallel Architectures and Compilation Techniques, PACT '95, pages 59-68, Limassol, Cyprus, June 1995. ACM Press.
39. E. R. Altman, R. Govindarajan, and G. R. Gao. Scheduling and mapping: Software pipelining in the presence of structural hazards. In ACM SIGPLAN Symposium on Programming Language Design and Implementation, pages 139-150, June 1995.
40. G. Tremblay and G. R. Gao. The impact of laziness on parallelism and the limits of strictness analysis. In Proceedings of the High Performance Functional Computing Conference, pages 119-133, Denver, Colorado, April 1995. Lawrence Livermore National Laboratory. CONF-9504126.
41. Vugranam C. Sreedhar and Guang R. Gao. A linear time algorithm for placing φ-nodes. In Conference Record of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 62-73, San Francisco, California, January 1995.
42. Vugranam C. Sreedhar, Guang R. Gao, and Yong-fong Lee. Incremental computation of dominator trees. In Proceedings of the ACM SIGPLAN Workshop on Intermediate Representations (IR '95), pages 1-12, San Francisco, California, January 22, 1995. SIGPLAN Notices, 30(3), March 1995.
43. Kevin B. Theobald, Herbert H. J. Hum, and Guang R. Gao. A design framework for hybrid-access caches. In Proceedings of the First International Symposium on High-Performance Computer Architecture, pages 144-153, Raleigh, North Carolina, January 1995.
44. R. Govindarajan, Erik R. Altman, and Guang R. Gao. Minimizing register requirements under resource-constrained rate-optimal software pipelining. In Proceedings of the 27th Annual International Symposium on Microarchitecture, pages 85-94, San Jose, California, November-December 1994.
45. R. Govindarajan, Erik R. Altman, and Guang R. Gao. A framework for resource-constrained rate-optimal software pipelining. In Proceedings of the Third Joint International Conference on Vector and Parallel Processing (CONPAR 94 - VAPP VI), number 854 in Lecture Notes in Computer Science, pages 640-651, Linz, Austria, September 1994. Springer-Verlag.
46. R. Govindarajan, Guang R. Gao, and Palash Desai. Minimizing memory requirements in rate-optimal schedules. In Proceedings of the 1994 International Conference on Application Specific Array Processors, pages 75-86, San Francisco, California, August 1994. IEEE Computer Society.
47. S. S. Nemawarkar, R. Govindarajan, G. R. Gao, and V. K. Agarwal. Performance of interconnection network in multithreaded architectures. In Proceedings of PARLE '94 - Parallel Architectures and Languages Europe, number 817 in Lecture Notes in Computer Science, pages 823-826, Athens, Greece, July 1994. Springer-Verlag.
48. V. Van Dongen, C. Bonello, and Guang R. Gao. Data parallelism with High Performance C. In Proceedings of Supercomputing Symposium '94, Canada's Eighth Annual High Performance Computing Conference, pages 128-135, Toronto, Ontario, June 1994. University of Toronto.
49. Herbert H. J. Hum, Kevin B. Theobald, and Guang R. Gao. Building multithreaded architectures with off-the-shelf microprocessors. In Proceedings of the 8th International Parallel Processing Symposium, pages 288-294, Cancún, Mexico, April 1994. IEEE Computer Society.
50. G. Liao, E. R. Altman, V. K. Agarwal, and Guang R. Gao. A comparative study of DSP multiprocessor list scheduling heuristics. In Proceedings of the 27th Annual Hawaii International Conference on System Sciences, Kihei, Hawaii, 1994.
51. S. S. Nemawarkar, R. Govindarajan, Guang R. Gao, and V. K. Agarwal. Analysis of multithreaded multiprocessors with distributed shared memory. In Proceedings of the Fifth IEEE Symposium on Parallel and Distributed Processing, pages 114-121, Dallas, Texas, December 1993.
52. R. Govindarajan and Guang R. Gao. A novel framework for multi-rate scheduling in DSP applications. In Proceedings of the 1993 International Conference on Application Specific Array Processors, pages 77-88, Venice, Italy, October 1993. IEEE Computer Society.
53. Guang R. Gao, Vivek Sarkar, and Lelia A. Vazquez. Beyond the data parallel paradigm: Issues and options. In W. K. Giloi, S. Jahnichen, and B. D. Shriver, editors, Proceedings - 1993 Programming Models for Massively Parallel Computers, pages 191-197, Berlin, Germany, September 20-23, 1993. IEEE Computer Society Press.
54. Guang R. Gao, Qi Ning, and Vincent Van Dongen. Extending software pipelining techniques for scheduling nested loops. In Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing, number 768 in Lecture Notes in Computer Science, pages 340-357, Portland, Oregon, August 1993. Springer-Verlag.
55. Erik R. Altman, Vinod K. Agarwal, and Guang R. Gao. A novel methodology using genetic algorithms for the design of caches and cache replacement policy. In Stephanie Forrest, editor, Proceedings of the 5th International Conference on Genetic Algorithms, pages 392-399, University of Illinois at Urbana-Champaign, July 1993. Morgan Kaufmann Publishers, Inc.
56. Kevin B. Theobald, Guang R. Gao, and Laurie J. Hendren. Speculative execution and branch prediction on parallel machines. In Conference Proceedings, 1993 ACM International Conference on Supercomputing, pages 77-86, Tokyo, Japan, July 1993.
57. Robert Kim Yates and Guang R. Gao. A Kahn principle for networks of nonmonotonic real-time processes. In Proceedings of PARLE '93 - Parallel Architectures and Languages Europe, number 694 in Lecture Notes in Computer Science, pages 209-227, Munich, Germany, June 1993. Springer-Verlag.
58. Herbert H. J. Hum and Guang R. Gao. Supporting a dynamic SPMD model in a multi-threaded architecture. In Digest of Papers, 38th IEEE Computer Society International Conference, COMPCON Spring '93, pages 165-174, San Francisco, California, February 1993.
59. Qi Ning and Guang R. Gao. A novel framework of register allocation for software pipelining. In Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 29-42, Charleston, South Carolina, January 1993.
60. Kevin B. Theobald, Guang R. Gao, and Laurie J. Hendren. On the limits of program parallelism and its smoothability. In Proceedings of the 25th Annual International Symposium on Microarchitecture, pages 10-19, Portland, Oregon, December 1992.
61. V. Van Dongen, Guang R. Gao, and Q. Ning. A polynomial time method for optimal software pipelining. In Proceedings of the Conference on Vector and Parallel Processing, CONPAR-92, number 634 in Lecture Notes in Computer Science, pages 613-624, Lyon, France, September 1-4, 1992. Springer-Verlag.
62. J. M. Monti and Guang R. Gao. Efficient interprocessor synchronization and communication on a dataflow multiprocessor architecture. In Proceedings of the 1992 International Conference on Parallel Processing, pages I-220-I-224, St. Charles, IL, August 1992.
63. Guang R. Gao, R. Olsen, V. Sarkar, and R. Thekkath. Collective loop fusion for array contraction. In Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing, number 757 in Lecture Notes in Computer Science, pages 281-295, New Haven, Connecticut, August 1992. Springer-Verlag.
64. L. Hendren, C. Donawa, M. Emami, Guang R. Gao, Justiani, and B. Sridharan. Designing the McCAT compiler based on a family of structured intermediate representations. In Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing, number 757 in Lecture Notes in Computer Science, pages 406-420, New Haven, Connecticut, August 1992. Springer-Verlag.
Monographs, Books and Book Chapters
1. G. R. Gao, J.-L. Gaudiot, and L. Bic, editors. Advanced Topics in Dataflow and Multithreaded Computers. IEEE Computer Society Press, 1995.
2. Jack B. Dennis and Guang R. Gao. Multithreaded architectures: Principles, projects, and issues. In Robert A. Iannucci, Guang R. Gao, Robert H. Halstead, Jr., and Burton Smith, editors, Multithreaded Computer Architecture: A Summary of the State of the Art, chapter 1, pages 1-72. Kluwer Academic Publishers, Norwell, Massachusetts, 1994.
3. Robert A. Iannucci, Guang R. Gao, Robert H. Halstead, Jr., and Burton Smith, editors. Multithreaded Computer Architecture: A Summary of the State of the Art. Kluwer Academic Publishers, Norwell, Massachusetts, 1994. The book contains papers presented at the Workshop on Multithreaded Computers, Albuquerque, New Mexico, November 1991.
4. G. R. Gao. A Code Mapping Scheme for Dataflow Software Pipelining. Kluwer Academic Publishers, Boston, Massachusetts, December 1990.
B.3 Research Significance
The theme of my research in computer architecture and systems, compiler technology, and memory models not only enriches the field of parallel computing and encompasses a host of new techniques for high-performance architectures and compiling technology, but also provides a new horizon for mapping applications, both regular and irregular, onto these architectures. Furthermore, the research activities are not only intellectually stimulating, interesting, and competitive in themselves, but also expose students to a dynamic new field with excellent prospects for employment and a productive career.
My work on the EARTH model and architecture has important relevance to the design and development of future generations of parallel computer architectures. The research results have been published widely in a range of recognized international professional conferences and journals. This work has attracted a considerable level of research support from NSF through four research grants, encompassing architecture and memory support, the efficient implementation of multithreaded execution models on parallel systems based on SMP workstation clusters, the application of the EARTH model to large irregular applications such as the crack propagation problem, and compilation technology for multithreading. It has also attracted industry interest and funding, such as the DRP grant we received with support from ACORN Inc. An extension of our work on fine-grain multithreading and EARTH to high-end supercomputing has become an important component of the HTMT project, one of the nation's few ongoing petaflops architecture projects, funded by DARPA, NSA, and NASA.
My work on modulo scheduling and software pipelining also has immediate relevance to the computer industry's effort to achieve high performance through instruction-level parallelism. The research results have been published widely in a range of recognized international professional conferences and journals. The technology developed in our group has been used in the evaluation of the software pipelining techniques in the SGI production compiler, and to foster future collaboration we have received the donation of two SGI workstations with special SGI software.
The co-scheduling technique has been funded by NSF through a research grant. The co-scheduling technology developed by my colleagues and me has also attracted strong industry attention: Rockwell Semiconductor Systems has already committed funding to this research, and a DRP grant on retargetable compilers for DSP architectures, with Rockwell funding and university matching, has just been awarded.
The significance and novelty of my work on program analysis and memory models have also been recognized by the research community. Three papers from the work on program analysis have been accepted for publication in the most prestigious journal in the field, ACM Transactions on Programming Languages and Systems.
B.4 Research Support
-------------------------------------------------------------------------------------------------------
Agency       Grant Number   Title                                    Amount     Period        Role
-------------------------------------------------------------------------------------------------------
NSF          CCR 9808522    Compiling Irregular Applications         $319,156   08/97-07/00   co-PI
                            on a Multithreaded Architecture
NSF          MIPS 9707125   A New Generation of Multithreaded        $400,000   07/97-06/00   co-I
                            Processors
NSF          CDA 9703088    Parallel and Distributed Computing:      $633,513   07/97-06/02   co-PI
                            Systems and Applications Development
                            (Infrastructure Grant)
DARPA/NSA/   ASC 9612105    Hybrid Technology Multithreaded          $800,000   06/97-05/99   co-I
NASA                        Architecture for Petaflops
NSF          CCR 9711477    A Framework of Modulo Scheduling         $139,263   06/97-05/99   PI
                            Based on Finite Automaton
                            (with REU supplement)                    $6,250     06/97-05/98
NSF          CISE 9726388   Challenges in CISE: Crack                $264,952   01/98-12/00   co-I
                            Propagation Project
DRP          (approved)     Retargetable Compilers for Embedded      $75,000    98-00         PI
                            DSP Processors (with Rockwell
                            Semiconductor Systems Inc.)
C.1 University Activities and Services
- Special Activities:
  - attended recruiting activities for new faculty members
  - served on the tenure reviews of Prof. Dan Van Weide and Prof. Paul Berger
  - participated in the faculty retreat meeting (1998)
  - Dean's ad hoc group for supercomputing (1998)
  - participated in the Engineering Outreach program
  - advisor in the university Undergraduate Research Opportunity program
- Departmental and College Committees:
  - chaired the departmental Committee on Promotion & Tenure (1998)
  - College Election Committee (1998)
- University Committees:
  - ICRSS Committee (Instructional, Computing and Research Support Services Committee)
C.2 Professional Services
· IEEE Computer Society Distinguished Visitor, 1998-2001
· IEEE Senior Member (since 1997)
· Program Committee Member of Recognized International Conferences
- IEEE International Symposium on High-Performance Computer Architecture (HPCA-95, HPCA-99)
- ACM Symposium on Programming Language Design and Implementation (PLDI '98)
- ACM International Conference on Supercomputing (ICS-95)
- ACM/IEEE International Symposium on Microarchitecture (MICRO-95, 96, 97)
- International Parallel Processing Symposium (IPPS '95)
- IFIP and ACM SIGARCH International Conference on Parallel Architectures and Compilation Techniques (PACT '94, 95, 96, 97, 98)
- International Conference on Algorithms and Architectures for Parallel Processing (ICAPP-95)
- Parallel Architectures and Languages Europe (PARLE-91, 92, 93, 94, 95)
- European Conference on Parallel Processing (Euro-Par-95, 96)
- Working Conference on Massively Parallel Programming Models (MPPM-93, 95, 97, 99)
- High Performance Computing Symposium (HPCS-95, 96, 98)
· Program Committee Chairmanships
- I was elected Program Chairman of the 1994 ACM SIGARCH International Conference on Parallel Architectures and Compilation Techniques (PACT '94), Aug. 1994, Montreal, Canada, co-sponsored by IFIP and held in association with ACM SIGPLAN, IEEE TCCA (Technical Committee on Computer Architecture), and IEEE TCPP (Technical Committee on Parallel Processing).
- I was elected General Co-Chair of the 1998 International Conference on Parallel Architectures and Compilation Techniques (PACT '98), Oct. 1998, Paris, France, co-sponsored by IFIP and the IEEE Computer Society.
- I was elected Chair of the Third Workshop on Petaflop Computing, Feb. 1999, Annapolis, MD.
· Other Activities in Recognized Professional Conferences
I have served as a workshop chair, session chair, and organizing or steering committee member of many international conferences.
· Journal Editorships
- I was elected to the Editorial Board of IEEE Transactions on Computers (1998-).
- I was elected to the Editorial Board of IEEE Concurrency (1997-).
- I joined the Editorial Board of the Journal of Programming Languages in Jan. 1996, and subsequently became one of the journal's two Co-Editors.
- I served as Guest Editor for the Special Issue on Dataflow and Multithreaded Computers, Journal of Parallel and Distributed Computing, Academic Press, June 1993.
· Invited Seminars and Distinguished Seminars
I have given seminars at many industrial and academic organizations: IBM T.J. Watson Research Center, IBM Toronto Lab, AT&T Bell Laboratories, BNR, HP Labs, SGI, DEC, NRL (Naval Research Laboratory), MIT, Stanford, UC Berkeley, and the University of Victoria, to name just a few.
· Others: A panelist, session chair, organizing/steering committee member, and advisory board member for many recognized professional conferences (details provided upon request).