Conference Papers
"Toward High Throughput Algorithms on Many Core Architectures"
In Proceedings of 7th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC),
Paris, France. January 23-25, 2012.
Daniel Orozco, Elkin Garcia, Rishi Khan, Kelly Livingston and Guang R. Gao.
"TIDeFlow: The Time Iterated Dependency Flow Execution Model"
In Proceedings of Workshop on Data-Flow Execution Models for Extreme Scale Computing (DFM 2011);
20th International Conference on Parallel Architectures and Compilation Techniques (PACT 2011),
Galveston Island, TX, USA. October 10 - 14, 2011.
Daniel Orozco, Elkin Garcia, Robert Pavel, Rishi Khan and Guang R. Gao
"Exploring Fine-Grained Task-based Execution on Multi-GPU Systems"
In Proceedings of Workshop on Parallel Programming on Accelerator Clusters (PPAC 2011);
IEEE Cluster 2011.
Austin, TX, USA. September 26, 2011.
Long Chen, Oreste Villa and Guang R. Gao
"Towards an integrated multiscale simulation of turbulent clouds on PetaScale computers"
In Proceedings of 13th European Turbulence Conference (ETC13),
Warsaw, Poland. September 12-15, 2011.
Lian-Ping Wang, Orlando Ayala, Hossein Parishani, Wojciech W Grabowski, Andrzej A Wyszogrodzki, Zbigniew Piotrowski, Guang R Gao, Chandra Kambhamettu, Xiaoming Li, Louis Rossi, Daniel Orozco and Claudio Torres.
"Polytasks: A Compressed Task Representation for HPC Runtimes"
In Proceedings of the 24th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2011),
Fort Collins, CO, USA. September 8-10, 2011.
Daniel Orozco, Elkin Garcia, Robert Pavel, Rishi Khan and Guang R. Gao
"OPELL and PM: A Case Study on Porting Shared Memory Programming Models to Accelerators Architectures"
In Proceedings of the 24th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2011),
Fort Collins, CO, USA. September 8-10, 2011.
Joseph B. Manzano, Ge Gan, Juergen Ributzka, Sunil Shrestha and Guang R. Gao
"Hardware and Software Tradeoffs for Task Synchronization on Manycore Architectures"
In Proceedings of International European Conference on Parallel and Distributed Computing (Euro-Par'11),
Bordeaux, France. August 29 - September 2, 2011.
Yonghong Yan, Sanjay Chatterjee, Daniel Orozco, Elkin Garcia, Zoran Budimlic, Jun Shirako, Robert Pavel, Guang R. Gao and Vivek Sarkar
"Experiments with the Fresh Breeze Tree-Based Memory Model"
In Proceedings of International Supercomputing Conference (ISC'11),
Hamburg, Germany, June 19 - 23, 2011.
Jack B. Dennis, Guang R. Gao and Xiao X. Meng
"Position Paper: Using a "Codelet" Program Execution Model for Exascale Machines"
In Proceedings of ACM SIGPLAN 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era (EXADAPT 2011);
Programming Language Design and Implementation (PLDI 2011).
San Jose, CA, USA. June 5, 2011.
Stephane Zuckerman, Joshua Suetterlein, Rob Knauerhase and Guang R. Gao
"The Elephant and the Mice: Non-Strict Fine-Grain Synchronization for Many-Core Architectures"
In Proceedings of 25th International Conference on Supercomputing (ICS'11),
Tucson, AZ, USA. May 31 - June 4, 2011.
Juergen Ributzka, Joseph B. Manzano, Yuhei Hayashi and Guang R. Gao
"DEEP: An Iterative FPGA-based Many-Core Emulation System for Chip Verification and Architecture Research"
In Proceedings of 19th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA'11),
Monterey, CA, USA. February 27 - March 1, 2011.
Juergen Ributzka, Yuhei Hayashi, Fei Chen and Guang R. Gao
"Energy efficient tiling on a Many-Core Architecture"
In Proceedings of 4th Workshop on Programmability Issues for Heterogeneous Multicores (MULTIPROG 2011);
6th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC),
Heraklion, Greece. January 23, 2011.
Elkin Garcia, Daniel Orozco and Guang R. Gao
"Locality Optimization of Stencil Applications using Data Dependency Graphs"
In Proceedings of the 23rd International Workshop on Languages and Compilers for Parallel Computing (LCPC 2010),
Houston, TX, USA. October 7-9, 2010.
Daniel Orozco, Elkin Garcia and Guang R. Gao
"Optimized Dense Matrix Multiplication on a Many-Core Architecture"
In Proceedings of International European Conference on Parallel and Distributed Computing (Euro-Par'10),
Ischia, Italy. August 31 - September 3, 2010.
Elkin Garcia, Ioannis E. Venetis, Rishi Khan and Guang R. Gao
"A Study of a Software Cache Implementation of the OpenMP Memory Model for Multicore and Manycore Architectures"
In Proceedings of International European Conference on Parallel and Distributed Computing (Euro-Par'10),
Ischia, Italy. August 31 - September 3, 2010.
Chen Chen, Joseph B Manzano, Ge Gan, Guang R. Gao and Vivek Sarkar
"TiNy threads on BlueGene/P: Exploring many-core parallelisms beyond The traditional OS"
In Proceedings of Workshop on Multithreaded Architectures and Applications (MTAAP);
24th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2010),
Atlanta, GA, USA. April 23, 2010.
Handong Ye, Robert Pavel, Aaron Landwehr and Guang Gao
"Minimizing Communication in Rate-Optimal Software Pipelining for Stream Programs"
In Proceedings of Symposium on Code Generation and Optimization (CGO 2010),
Toronto, Canada. April 24-28, 2010.
Haitao Wei, Junqing Yu, Huafei Yu and Guang R. Gao
"Dynamic Load Balancing on Single- and Multi-GPU Systems"
In Proceedings of the 24th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2010),
Atlanta, GA, USA. April 19-23, 2010.
Long Chen, Oreste Villa, Sriram Krishnamoorthy, and Guang R. Gao
"Performance Analysis of Cooley-Tukey FFT Algorithms for a Many-core Architecture"
In Proceedings of The High Performance Computing Symposium (HPC 2010),
Orlando, FL, USA. April 12-15, 2010.
Long Chen and Guang R. Gao
"MODA: A Memory Centric Performance Analysis Tool"
In Proceedings of 11th LCI International Conference on High-Performance Clustered Computing,
Pittsburgh, PA, USA. March 9-11, 2010
Joseph B. Manzano, Andres Marquez and Guang R. Gao
"Iterative Layer-Based Raytracing on CUDA"
In Proceedings of 28th IEEE International Performance Computing and Communications Conference (IPCCC 2009),
Phoenix, AZ, USA. December 14-16, 2009.
Alejandro Segovia, Xiaoming Li and Guang R. Gao
"Mapping the FDTD Application to Many-Core Chip Architectures"
In Proceedings of the 38th International Conference on Parallel Processing (ICPP 2009),
Vienna, Austria. September 22-25, 2009.
Daniel Orozco and Guang R. Gao
"Tile Percolation: an OpenMP Tile Aware Parallelization Technique for the Cyclops-64 Multicore Processor"
In Proceedings of International European Conference on Parallel and Distributed Computing (Euro-Par'09),
Delft, The Netherlands. August 25-28, 2009
Ge Gan, Xu Wang, Joseph Manzano and Guang R. Gao
"Tile reduction: the first step towards OpenMP tile aware parallelization"
In Proceedings of the 5th International Workshop on OpenMP (IWOMP'09),
Dresden, Germany, June 3-5, 2009
Ge Gan, Xu Wang, Joseph Manzano, Guang R. Gao
"Mapping the LU Decomposition on a Many Core Architecture: Challenges and Solutions"
In Proceedings of ACM International Conference on Computing Frontiers (CF 2009),
Ischia, Italy. May 18-20, 2009
Ioannis E. Venetis and Guang R. Gao
"Just-In-Time Locality and Percolation for Optimizing Irregular Applications on a Manycore Architecture"
In Proceedings of The 21st Annual Languages and Compilers for Parallel Computing Workshop (LCPC 2008),
Alberta, Canada. July 31 - August 2, 2008
Guangming Tan, Vugranam Sreedhar, Guang R. Gao
"Experience on Optimizing Irregular Computation for Memory Hierarchy in Manycore Architecture"
Poster Paper. 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2008),
Salt Lake City, UT, USA. February 20-23, 2008
Guangming Tan, Dongrui Fan, Junchao Zhang, Andrew Russo, Guang R. Gao
"Performance Tuning of the Fast Fourier Transform on a Multi-core Architecture"
In Proceedings of First Workshop on Programmability Issues for Multi-Core Computers (MULTIPROG 2008),
Goteborg, Sweden. January 27, 2008.
Liping Xue, Long Chen, Ziang Hu, Guang R. Gao
"Server I/O Acceleration Using an Embedded Multi-core Architecture"
In Proceedings of Workshop on Application Specific Processors (WASP 2007),
Salzburg, Austria. October 4-5, 2007.
Lurng-Kuo Liu, Fei Chen, Christos J. Georgiou and Guang R. Gao
"Software-Pipelining on Multi-Core Architectures"
In Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007),
Brasov, Romania. September 15-19, 2007.
Alban Douillet and Guang R. Gao
"Concurrency Analysis for Shared Memory Programs with Textually Unaligned Barriers"
In Proceedings of The 20th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2007),
Urbana, IL, USA. October 11-13, 2007
Yuan Zhang, Evelyn Duesterwald and Guang R. Gao
"Synchronization State Buffer: Supporting Efficient Fine-Grain Synchronization for Many-Core Architectures"
In Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007),
San Diego, CA, USA. June 9-13, 2007
Weirong Zhu, Vugranam C. Sreedhar, Ziang Hu, and Guang R. Gao
Available in pdf format
"A Parallel Dynamic Programming Algorithm on a Multi-core Architecture"
In Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2007),
San Diego, CA, USA. June 9-11, 2007
Guangming Tan, Ninghui Sun, and Guang R. Gao
"ParalleX: A Study of A New Parallel Computation Model"
In Proceedings of the 21st IEEE International Parallel and Distributed Processing Symposium (IPDPS 2007),
Long Beach, CA, USA. March 26 - 30, 2007.
Guang R. Gao, Thomas Sterling, Rick Stevens, Mark Hereld and Weirong Zhu
"On the Role of Deterministic Fine Grain Data Synchronization for Scientific Applications: A Revisit in the Emerging Many-Core Era"
In Proceedings of First Workshop on Multithreaded Architectures and Applications
in the 21st IEEE International Parallel and Distributed Processing Symposium (IPDPS 2007),
Long Beach, CA, USA. March 26 - 30, 2007.
Weirong Zhu, Ziang Hu, and Guang R. Gao
"Exploring a multithreaded Methodology to Implement a Network Communication Protocol on the Cyclops-64 Multithreaded Architecture"
In Proceedings of First Workshop on Multithreaded Architectures and Applications
in the 21st IEEE International Parallel and Distributed Processing Symposium (IPDPS 2007),
Long Beach, CA, USA. March 26 - 30, 2007.
Ge Gan, Ziang Hu, Juan del Cuvillo, and Guang R. Gao
Also available in pdf format
"Experience of Optimizing FFT on Intel Core Architecture"
In Proceedings of Workshop on Performance Optimization for High-Level Languages and Libraries
in the 21st IEEE International Parallel and Distributed Processing Symposium (IPDPS 2007),
Long Beach, CA, USA. March 26 - 30, 2007.
Daniel Orozco, Liping Xue, Murat Bolat, Xiaoming Li and Guang Gao
Also available in pdf format
"Automatic Program Segment Similarity Detection in Targeted Program Performance Improvement"
In Proceedings of Workshop on Performance Optimization for High-Level Languages and Libraries
in the 21st IEEE International Parallel and Distributed Processing Symposium (IPDPS 2007),
Long Beach, CA, USA. March 26 - 30, 2007.
Haiping Wu, Eunjung Park, Mihailo Kaplarevic, Yingping Zhang, Murat Bolat, Xiaoming Li and Guang Gao
Also available in pdf format
"Optimizing Fast Fourier Transform on a Multi-core Architecture"
In Proceedings of Workshop on Performance Optimization for High-Level Languages and Libraries
in the 21st IEEE International Parallel and Distributed Processing Symposium (IPDPS 2007),
Long Beach, CA, USA. March 26 - 30, 2007.
Long Chen and Ziang Hu
Also available in pdf format
"Optimized lock assignment and allocation: a method for exploiting concurrency among critical sections"
In the Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming (PPoPP 2007),
San Jose, CA, USA, March 14 - 17, 2007.
Yuan Zhang, Vugranam C. Sreedhar, Weirong Zhu, Vivek Sarkar and Guang R. Gao
"Exploring Financial Applications on Many-core-on-a-chip Architecture: A First Experiment"
In Proceedings of Workshop on Frontiers of High Performance Computing and Networking (FHPCN2006),
4th International Symposium on Parallel and Distributed Processing and Applications (ISPA 2006),
Sorrento, Italy. December 4-7, 2006.
Weirong Zhu, Parimala Thulasiraman, Ruppa K. Thulasiram and Guang R. Gao
Available in pdf format
"Optimization of Dense Matrix Multiplication on IBM Cyclops-64: Challenges and Experiences"
In Proceedings of the 12th International European Conference on Parallel Processing (Euro-Par 2006),
Dresden, Germany. August 29 - September 1, 2006.
Ziang Hu, Juan del Cuvillo, Weirong Zhu, and Guang R. Gao
Also available in pdf format
"Multi-Dimensional Kernel Generation for Loop Nest Software Pipelining"
In Proceedings of the 12th International European Conference on Parallel Processing (Euro-Par 2006),
Dresden, Germany. August 29 - September 1, 2006.
Alban Douillet, Hongbo Rong, and Guang R. Gao
Also available in pdf format
"A User-Friendly Methodology for Automatic Exploration of Compiler Options"
In Proceedings of The International Conference on Programming Languages and Compilers (PLC06).
Las Vegas, Nevada. June 26-29, 2006.
Haiping Wu, Long Chen, Joseph Manzano, and Guang Gao
Also available in pdf format
"A User-Friendly Methodology for Automatic Exploration of Compiler Options: A Case Study on the Intel XScale Microarchitecture"
In Proceedings of The International Conference on Programming Languages and Compilers (PLC06).
Las Vegas, Nevada. June 26-29, 2006.
Haiping Wu, Eunjung Park, Long Chen, Juan del Cuvillo, and Guang Gao
Also available in pdf format
"Performance Characteristics of OpenMP Language Constructs on a Many-core-on-a-chip Architecture"
In Proceedings of the 2nd International Workshop on OpenMP (IWOMP2006),
Reims, France. June 12-15, 2006.
Weirong Zhu, Juan del Cuvillo, and Guang R. Gao
Also available in pdf format
"Towards a Software Infrastructure for the Cyclops-64 Cellular Architecture"
In Proceedings of the 20th International Symposium on High Performance Computing Systems and Applications (HPCS'06),
St. John's, Canada. May 14 - 17, 2006.
Juan del Cuvillo, Weirong Zhu, Ziang Hu, and Guang R. Gao
Also available in pdf format
"Landing OpenMP on Cyclops-64: An Efficient Mapping of OpenMP to a many-core System-on-a-chip"
In Proceedings of the 3rd ACM International Conference on Computing Frontiers,
Ischia, Italy. May 2-5, 2006.
Juan del Cuvillo, Weirong Zhu, Guang R. Gao
Also available in pdf format
"A Study of the On-Chip Interconnection Network for the IBM Cyclops-64 Multi-Core Architecture"
In Proceedings of 20th International Parallel and Distributed Processing Symposium (IPDPS2006),
Rhodes Island, Greece. April 25 - 29, 2006.
Ying M. P. Zhang, Taikyeong Jeong, Fei Chen, Haiping Wu, Ronny Nitzsche, and Guang R. Gao
Also available in pdf format
"Hierarchical Multithreading: Programming Model and System Software"
In Proceedings of Workshop on NSF Next Generation Software Program (NSFNGS'06),
in conjunction with 20th International Parallel and Distributed Processing Symposium (IPDPS2006),
Rhodes Island, Greece. April 25 - 29, 2006.
Guang R. Gao, Thomas Sterling, Rick Stevens, Mark Hereld, and Weirong Zhu
"Performance Modelling and Optimization of Memory Access on Cellular Computer Architecture Cyclops-64"
In Proceedings of Network and Parallel Computing (NPC 2005),
Beijing, China. November 30 - December 3, 2005.
Yanwei Niu, Ziang Hu, Kenneth Barner, Guang R. Gao
Also available in pdf format
"Register Pressure in Software-Pipelined Loop Nests: Fast Computation and Impact on Architecture Design"
In Proceedings of the 18th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2005),
Hawthorne, NY, USA. October 20-22, 2005.
Alban Douillet and Guang R. Gao
Also available in pdf format
"Identifying Multiply-Add Operations in Kylin Compiler"
In the proceedings of the 2005 International Conference on Embedded Systems and Applications (ESA'05),
Las Vegas, NV, USA. June 27-30, 2005.
Haiping Wu, Ziang Hu, Joseph Manzano, Yingping Zhang and Guang R. Gao
"Register Allocation for Software Pipelined Multi-dimensional Loops"
In Proceedings of Conference on Programming Language Design and Implementation (PLDI 2005),
Chicago, IL, USA. June 11 - 15, 2005.
Hongbo Rong, Alban Douillet, and Guang R. Gao
Also available in pdf format
"FAST: A Functionally Accurate Simulation Toolset for the Cyclops-64 Cellular Architecture"
In Proceedings of Workshop on Modeling, Benchmarking and Simulation (MoBS),
held in conjunction with the 32nd Annual International Symposium on Computer Architecture (ISCA 2005),
Madison, WI, USA. June 4, 2005.
Juan del Cuvillo, Weirong Zhu, Ziang Hu, and Guang R. Gao
Also available in pdf format
"P3I: The Delaware Programmability, Productivity and Proficiency Inquiry"
In Proceedings of the Second International Workshop On Software Engineering for High Performance Computing System Applications (SE-HPCS '05),
St. Louis, MO, USA. May 15, 2005
Joseph B. Manzano, Yuan Zhang and Guang R. Gao
"Atomic Section: Concept and Implementation"
In Proceedings of Mid-Atlantic Student Workshop on Programming Languages and Systems (MASPLAS '05),
Newark, DE, USA. April 30, 2005.
Yuan Zhang, Joseph B. Manzano and Guang R. Gao
"TiNy Threads: a Thread Virtual Machine for the Cyclops-64 Cellular Architecture"
In Proceedings of the Fifth Workshop on Massively Parallel Processing (WMPP),
held in conjunction with the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005),
Denver, CO, USA. April 3 - 8, 2005
Juan del Cuvillo, Weirong Zhu, Ziang Hu, and Guang R. Gao
Also available in pdf format
"Performance Portability on EARTH: A Case Study across Several Parallel Architectures"
In Proceedings of the 4th International Workshop on Performance Modeling, Evaluation, and Optimization of Parallel and Distributed Systems (PMEO-PDS'05),
held in conjunction with IPDPS 2005,
Denver, CO, USA. April 4 - 8, 2005.
Weirong Zhu, Yanwei Niu, and Guang Gao
"Sequential Consistency Revisited: The Sufficient Conditions and Method to Reason Consistency Model of a Multiprocessor-on-a-chip Architecture"
In Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Networks (PDCN2005),
Innsbruck, Austria. February 15 - 17, 2005.
Yuan Zhang, Weirong Zhu, Fei Chen, Ziang Hu, and Guang R. Gao
"If-Conversion in SSA Form"
In Proceedings of the International European Conference on Parallel and Distributed Computing (Euro-Par 2004),
Pisa, Italy. August 31 - September 3, 2004.
Artour Stoutchinin and Guang R. Gao
"Single-Dimension Software Pipelining for Multi-Dimensional Loops"
In Proceedings of International Symposium on Code Generation and Optimization (CGO 2004),
San Jose, CA. March 21-24, 2004.
Hongbo Rong, Zhizhong Tang, R. Govindarajan, Alban Douillet and Guang Gao
Also available in pdf format
"Code Generation for Single-Dimension Software Pipelining of Multi-Dimensional Loops"
In Proceedings of International Symposium on Code Generation and Optimization (CGO 2004),
San Jose, CA. March 21-24, 2004.
Hongbo Rong, Alban Douillet, R. Govindarajan and Guang Gao
Also available in pdf format
"DIMES: An Iterative Emulation Platform for Multiprocessor-System-on-Chip Designs"
In Proceedings of IEEE International Conference on Field-Programmable Technology (FPT'03),
Tokyo, Japan. December 15 - 17, 2003.
Hirofumi Sakane, Levent Yakay, Vishal Karna, Clement Leung and Guang R. Gao
"Code Size Oriented Memory Allocation for Temporary Variables"
In Proceedings of Fifth Workshop on Media and Streaming Processors (MSP-5/MICRO-36),
San Diego, CA, USA. December 1, 2003.
Ziang Hu, Yan Xie and Guang R. Gao
"Code Size Reduction with Global Code Motion"
In Proceedings of Workshop on Compilers and Tools for Constrained Embedded Systems (CTCES/CASES) 2003,
San Jose, CA, USA. October 29, 2003.
Ziang Hu, Yuan Zhang, Hongbo Yang and Guang R. Gao
"Performance Study of a Whole Genome Comparison Tool on a Hyper-Threading Multiprocessor"
In Proceedings of Fifth International Symposium on High Performance Computing,
Tokyo, Japan. October 20 - 22, 2003.
Juan del Cuvillo, Xinmin Tian, Guang Gao and Milind Girkar
"CARE: Overview of an Adaptive Multithreaded Architecture"
In Proceedings of Fifth International Symposium on High Performance Computing,
Tokyo, Japan. October 20 - 22, 2003.
Andres Marquez and Guang R. Gao
"Compiler-Assisted Cache Replacement: Problem Formulation and Performance Evaluation"
In Proceedings of 16th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2003),
College Station, TX, USA. October 2 - 4, 2003.
Hongbo Yang, R. Govindarajan, Guang R. Gao and Ziang Hu
"A Cluster-Based Solution for High Performance Hmmpfam Using EARTH Execution Model"
In Proceedings of Fifth IEEE International Conference on Cluster Computing (CLUSTER2003),
Hong Kong, China. September 20-23, 2003.
Weirong Zhu, Yanwei Niu, Jizhu Lu, Chuan Shen and Guang R. Gao
"An Executable Analytical Performance Evaluation Approach for Early Performance Prediction"
In Proceedings of International Parallel and Distributed Processing Symposium (IPDPS 2003),
Nice, France. April 22 - 26, 2003.
Adeline Jacquet, Vincent Janot, Clement Leung, Guang R. Gao, R. Govindarajan, Thomas L. Sterling
"Programming Models and System Software for Future High-End Computing Systems: Work-in-Progress"
In Proceedings of International Parallel and Distributed Processing Symposium (IPDPS 2003),
Nice, France. April 22 - 26, 2003.
Guang R. Gao, Kevin B. Theobald, R. Govindarajan, Clement Leung, Ziang Hu, Haiping Wu, Jizhu Lu, Juan del Cuvillo, Adeline Jacquet, Vincent Janot and Thomas L. Sterling
"On Achieving Balanced Power Consumption in Software Pipelined Loops"
In Proceedings of International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES 2002),
Grenoble, France. October 8 - 11, 2002.
Hongbo Yang, Guang R. Gao, Clement Leung, R. Govindarajan and Haiping Wu
Available as gzipped Postscript.
"Exploiting Schedule Slacks for Rate-Optimal Power-Minimum Software Pipelining"
In Proceedings of 3rd Workshop on Compilers and Operating Systems for Low Power (COLP),
held in conjunction with The 11th International Conference on Parallel Architecture and Compilation Techniques (PACT),
Charlottesville, VA, USA. September 22 - 25, 2002.
Hongbo Yang, R. Govindarajan, Guang R. Gao, George Cai and Ziang Hu
Available as gzipped Postscript.
"Power-Performance Trade-offs for Energy-Efficient Architectures: A Quantitative Study"
In Proceedings of 20th International Conference on Computer Design (ICCD) 2002,
Freiburg, Germany. September 16 - 18, 2002.
Hongbo Yang, R. Govindarajan, Guang R. Gao and Kevin B. Theobald
Available as gzipped Postscript.
"Whole Genome Alignment using a Multithreaded Parallel Implementation"
In Proceedings of Symposium on Computer Architecture and High Performance Computing,
Pirenopolis, Brazil. September 10 - 12, 2001.
Wellington S. Martins, Juan del Cuvillo, Wenwu Cui and Guang R. Gao
"Power and Energy Impact by Loop Transformations"
In Proceedings of Workshop on Compilers and Operating Systems for Low Power 2001,
in conjunction with Parallel Architecture and Compilation Techniques 2001,
Barcelona, Spain. September 8, 2001.
Hongbo Yang, Guang R. Gao, Andres Marquez, George Cai and Ziang Hu
Available as gzipped Postscript.
"A Multi-Threaded Runtime System for a Multi-Processor/Multi-Node Cluster"
In Proceedings of 15th Annual International Symposium on High Performance Computing Systems and Applications,
Windsor, ON, Canada. June 18 - 20, 2001.
Christopher J. Morrone, Jose N. Amaral, Guy Tremblay, and Guang R. Gao
"Minimum Register Instruction Sequence Problem: Revisiting Optimal Code Generation for DAGs"
In Proceedings of International Parallel and Distributed Processing Symposium (IPDPS 2001),
San Francisco, CA, USA. April 24 - 28, 2001.
R. Govindarajan, Hongbo Yang, C. Zhang, Jose N. Amaral and Guang R. Gao
Available as gzipped Postscript.
"Multithreaded Algorithms for Pricing a Class of Complex Options"
In Proceedings of International Parallel and Distributed Processing Symposium (IPDPS 2001),
San Francisco, CA, USA. April 24 - 28, 2001.
Ruppa K. Thulasiram, Lubomir Litov, Hassan Nojumi, Christopher T. Downing and Guang R. Gao
Available as gzipped Postscript.
"Speculative Prefetching of Induction Pointers"
In Proceedings of International Conference on Compiler Construction (CC 2001),
Genova, Italy. April 2 - 6, 2001.
Artour Stoutchinin, Jose N. Amaral, Guang R. Gao, Jim Dehnert, Suneel Jain and Alban Douillet
Available as gzipped Postscript.
"Computer Detection of Single Nucleotide Polymorphisms (SNPs) in Maize ESTs"
In Proceedings of Plant & Animal Genome IX Conference (PAG-IX),
San Diego, CA, USA. January 13 - 17, 2001.
F. Useche, M. Morgante, M. Hanafey, Scott Tingey, Guang R. Gao and Antoni Rafalski
"A Multithreaded Parallel Implementation of a Dynamic Programming Algorithm for Sequence Comparison"
In Proceedings of Pacific Symposium on Biocomputing (PSB 2001), pp. 311-322,
Hawaii, HI, USA. January 3 - 7, 2001.
W.S. Martins, J.B. del Cuvillo, F.J. Useche, K.B. Theobald and Guang R. Gao
"Landing CG on EARTH: A Case Study of Fine-Grained Multithreading on an Evolutionary Path"
In Proceedings of Super Computing (SC2000),
Dallas, TX, USA. November 4-10, 2000.
Kevin B. Theobald, Gagan Agrawal, Rishi Kumar, Gerd Heber, Guang R. Gao, Paul Stodghill and Keshav Pingali
"Developing a Communication Intensive Application on the EARTH Multithreaded Architecture"
In Proceedings of International European Conference on Parallel and Distributed Computing (Euro-Par 2000),
Munchen, Germany. August 28 - September 1, 2000.
Kevin B. Theobald, Rishi Kumar, Gagan Agrawal, Gerd Heber, Ruppa K. Thulasiram and Guang R. Gao
"Parallel FEM Simulation of Crack Propagation -- Challenges, Status, and Perspectives"
In Proceedings of International Parallel and Distributed Processing Symposium (IPDPS'00), pp. 443-449,
Cancun, Mexico. May 1-5, 2000.
Bruce Carter, Chuin-Shan Chen, L. Paul Chew, Nikos Chrisochoides, Guang R. Gao, Gerd Heber, Antony R. Ingraffea, Roland Krause, Chris Myers, Demian Nave, Keshav Pingali, Paul Stodghill, Stephen Vavasis, Paul A. Wawrzynek
"Caching Single-Assignment Structures to Build a Robust Fine-Grain Multi-Threading System"
In Proceedings of International Parallel and Distributed Processing Symposium (IPDPS'00), pp. 589-594,
Cancun, Mexico. May 1-5, 2000.
Wen-Yen Lin, Jean-Luc Gaudiot, Jose N. Amaral and Guang R. Gao
"Performance Analysis of the I-Structure Software Cache on Multi-Threading Systems"
In Proceedings of 19th IEEE International Performance, Computing and Communication Conference (IPCCC2000),
Phoenix, AZ, USA. February 20-22, 2000.
Wen-Yen Lin, Jean-Luc Gaudiot, Jose N. Amaral and Guang R. Gao
"A Comparative Performance Study of Fine-Grain Multi-threading on Distributed Memory Machines"
In Proceedings of 19th IEEE International Performance, Computing and Communication Conference (IPCCC2000),
Phoenix, AZ, USA. February 20-22, 2000.
Prasad Kakulavarapu, Christopher J. Morrone, Kevin B. Theobald, Jose N. Amaral and Guang R. Gao
"Coping With Very High Latencies in Petaflops Computer Systems"
In Proceedings of High Performance Computing, Second International Symposium, ISHPC'99,
Kyoto, Japan. May 26-28, 1999.
Sean Ryan, Jose N. Amaral, Guang Gao, Zachary Ruiz, Andres Marquez and Kevin Theobald.
"A Multithreading Parallel Computational Approach for Valuing Derivatives"
In Proceedings of First WAFA Finance Research Conference,
Fairfax, VA, USA. April 30, 1999.
R.K. Thulasiram and Guang R. Gao
"Load Adaptive Algorithms and Implementations for the 2D Discrete Wavelet Transform on Fine-Grain Multithreaded Architectures"
In Proceedings of Workshop on SPDP '99,
San Juan, Puerto Rico, April 12-16, 1999.
Ashfaq A. Khokhar, Gerd Heber, Parimala Thulasiraman and Guang R. Gao
Available as gzipped Postscript.
"A New Approach to Parallel Dynamic Partitioning for Adaptive Unstructured Meshes"
In Proceedings of Workshop on SPDP '99,
San Juan, Puerto Rico, April 12-16, 1999.
Gerd Heber, Rupak Biswas and Guang R. Gao.
Available as gzipped Postscript.
"Self-Avoiding Walks over Adaptive Unstructured Grids"
In Proceedings of Workshop on Parallel Algorithms for Irregularly Structured Problems (IRREGULAR 1999),
San Juan, Puerto Rico, April 12-16, 1999.
Gerd Heber, Rupak Biswas and Guang R. Gao.
Available as gzipped Postscript.
"Efficient State-Diagram Construction Methods for Software Pipelining"
In Proceedings of International Conference on Compiler Construction (CC 1999),
Amsterdam, The Netherlands. March 20-28, 1999.
Chihong Zhang, R. Govindarajan, Sean Ryan and Guang R. Gao.
Available as gzipped Postscript.
"HTMT-C: Proposing A Programming Language For A Petaflop Machine"
In Proceedings of the Mid-Atlantic Student workshop on Programming Languages and Systems (MASPLAS 1999), pp 53-68,
Baltimore, MD. March 27, 1999.
Sean Ryan, Jose Nelson Amaral, Zachary Ruiz and Guang Gao
"Superconducting Processors for HTMT: Issues and Challenges"
In Proceedings of the 7th Symposium on the Frontiers of Massively Parallel Computation (FRONTIERS '99), pp 260-267,
Annapolis, MD, USA. February 21-25, 1999.
Kevin B. Theobald, Guang R. Gao and Thomas L. Sterling.
Available as gzipped Postscript.
"Performance Prediction for the HTMT: A Programming Example"
TFP3 '99, Annapolis, Maryland, February 22, 1999
Jose Nelson Amaral, Guang R. Gao, Phillip Merkey, Thomas Sterling, Zachary Ruiz and Sean Ryan.
"A Superstrand Architecture and its Compilation"
In Proceedings of Workshop on Multithreaded Execution, Architecture and Compilation,
held in conjunction with HPCA-V,
Orlando, FL, USA. January 9-12, 1999.
Andres Marquez, Kevin B. Theobald, Xinan Tang, and Guang R. Gao
"Design and Evaluation of Dynamic Load Balancing Schemes under a Fine-grain Multithreaded Execution Model"
In Proceedings of Workshop on Multithreaded Execution, Architecture and Compilation,
held in conjunction with HPCA-V,
Orlando, FL, USA. January 9-12, 1999.
Haiying Cai, Olivier Maquelin, Prasad Kakulavarapu and Guang R. Gao.
"An Implementation of a Hopfield Network Kernel on EARTH"
Brazilian Symposium on Computer Architecture and High Performance Processing,
Buzios, Brazil, September, 1998.
Jose N. Amaral, Guang Gao and Xinan Tang
Available as gzipped Postscript.
"Using Multithreading for the Automatic Load Balancing of Adaptive Finite Element Meshes"
In Proceedings of Workshop on Parallel Algorithms for Irregularly Structured Problems (IRREGULAR 1998),
Berkeley, CA, USA. August 9-11, 1998.
Gerd Heber, Rupak Biswas, Parimala Thulasiraman and Guang R. Gao
Available as gzipped Postscript.
"Elastic History Buffer: A Low-Cost Method to Improve Branch Prediction Accuracy"
In Proceedings of International Conference on Computer Design: VLSI in Computers & Processors (ICCD 1997),
Austin, TX, USA. October 12-15, 1997
Guang R. Gao, Maria-Dana Tarlescu and Kevin B. Theobald.
Available as gzipped Postscript.
"Thread Partitioning and Scheduling Based on Cost Model"
In Proceedings of 9th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA 1997),
Newport, RI, USA. June 22 - 25, 1997.
Guang R. Gao, Xinan Tang, Jian Wang and Kevin B. Theobald.
Available as gzipped Postscript.
Journal Papers
Toward High Throughput Algorithms on Many Core Architectures
ACM Transactions on Architecture and Code Optimization (TACO), Volume 8, Issue 4, January 2012, Article No. 49.
Daniel Orozco, Elkin Garcia, Rishi Khan, Kelly Livingston and Guang R. Gao.
Analysis and Performance Results of Computing Betweenness Centrality on IBM Cyclops64
The Journal of Supercomputing, Vol. 56, No. 1, April 2011, pp. 1-24.
Guangming Tan, Vugranam C. Sreedhar and Guang R. Gao.
Improving Performance of Dynamic Programming via Parallelism and Locality on Multi-core Architectures
IEEE Transactions on Parallel and Distributed Systems, Vol.20, No.2, 2009, pp. 261-274.
Guangming Tan, Ninghui Sun and Guang R. Gao
Register allocation for software pipelined multidimensional loops
ACM Trans. Program. Lang. Syst. 30(4), July 2008.
Hongbo Rong, Alban Douillet, Guang R. Gao
EnGENIUS - Environmental Genome Informational Utility System
Journal of Bioinformatics and Computational Biology, JBCB-119R1, July 2008
M. Kaplarevic, A.E. Murray, Guang R. Gao
Single-Dimension Software Pipelining for Multidimensional Loops
ACM Transactions on Architecture and Code Optimization (TACO), Volume 4, Issue 1, March 2007, Article No. 7.
Hongbo Rong, Zhizhong Tang, R. Govindarajan, Alban Douillet, Guang R. Gao
Performance Portability on EARTH: A Case Study across Several Parallel
Architectures
Cluster Computing, Volume 10, Number 2, June, 2007, pages 115-126.
Weirong Zhu, Yanwei Niu, and Guang R. Gao
Madd Operation Aware Redundancy Elimination
International Journal of Software Engineering and Knowledge Engineering,
Vol. 15, No. 2, 2005, pp. 357-362
Haiping Wu, Ziang Hu, Joseph Manzano and Guang R. Gao.
Improving Power Efficiency with Compiler-Assisted Cache Replacement
Journal of Embedded Computing, 2005
Hongbo Yang, R. Govindarajan, Guang R. Gao, Ziang Hu
A Cluster-Based Solution for High Performance Hmmpfam Using EARTH
Execution Model
International Journal of High Performance Computing and Networking, Vol 2,
Issue 2/3/4, 2004
Weirong Zhu, Yanwei Niu, Jizhu Lu, Chuan Shen and Guang R. Gao
An Improved Hidden Markov Model for Transmembrane Protein Topology
Prediction and Its Applications to Complete Genomes
Bioinformatics, Volume 21, Number 9, pp. 1853-1858, 2005
Robel Kahsay, Li Liao, Guang Gao
Quasi-Consensus Based COMParison of Profile Hidden Markov Models for
Protein Sequences
Bioinformatics, Volume 21, Number 10, pp. 2287-2293, 2005
Robel Kahsay, Guoli Wang, Guang Gao, Li Liao and Roland
Dunbrack.
Efficient Multithreaded Algorithms for the Fast Fourier Transform
Parallel and Distributed Computing Practices, Vol. 5, No. 2, Pages:
177-191, 2004
Parimala Thulasiraman, Kevin B. Theobald, Ashfaq A. Khokhar, and Guang
R. Gao
A Fine-Grain Load Adaptive Algorithm of the 2D Discrete Wavelet
Transform for Multithreaded Architectures
Journal of Parallel and Distributed Computing (JPDC), Vol.64, No.1, Pages:
68-78, January 2004
Parimala Thulasiraman, Ashfaq A. Khokhar, Gerd Heber, Guang R. Gao
Evaluation and Choice of Various Branch Predictors for Low-Power
Embedded Processor
Journal of Computer Science and Technology, Vol. 18, No. 6, Pages:
833-838, November, 2003
Dong Rui Fan, Hongbo Yang, Guang R. Gao, and Rong Cai Zhao
Minimum Register Instruction Sequencing to Reduce Register Spills in
Out-of-Order Issue Superscalar Architectures
IEEE Transactions on Computers, Vol. 52, No. 1, Pages: 4-20, January
2003
Ramaswamy Govindarajan, Hongbo Yang, Jose N Amaral, Chihong Zhang, and
Guang R. Gao
Implementation of the EARTH Programming Model on SMP Clusters: a
Multi-Threaded Language and Runtime System
Concurrency and Computation: Practice and Experience, Vol. 15, No. 9,
Pages: 821-844, August 2003
Guy Tremblay, Christopher J. Morrone, Jose N. Amaral, and Guang
R. Gao
Minimizing Buffer Requirements in Rate-Optimal Schedules in Regular
Dataflow Networks
Journal of VLSI Signal Processing, Vol. 31, No. 3, Pages: 207-229, Jul
2002
Ramaswamy Govindarajan and Guang R. Gao
A Theory for Co-Scheduling Hardware and Software Pipelines in ASIPs and
Embedded Processors
Design Automation for Embedded Systems, Vol. 6, No. 3, Pages: 243-275,
March 2002
Ramaswamy Govindarajan, Erik R. Altman, and Guang R. Gao
CASA: A Server for The Critical Assessment of Sequence Alignment
Accuracy
Bioinformatics, Vol. 18, No. 3, Pages: 496-497, March 2002
Robel Y. Kahsay, Nataraj Dongre, Guang R. Gao, Guoli Wang, and Roland
L. Dunbrack Jr.
TROLL--Tandem Repeat Occurrence Locator
Bioinformatics, Vol. 18, No. 4, Pages: 634-636, April 2002
Adalberto T. Castelo, Wellington S. Martins, and Guang R. Gao
Exploiting Locality in Single Assignment Data Structures Updated
through Split Phase Transactions
Cluster Computing, Special issue on Internet Scalability: Advances in
Parallel, Distributed and Mobile Systems, Vol. 4, No. 4, Pages: 281-293,
October 2001
Jose N. Amaral, Wen-Yen Lin, Jean-Luc Gaudiot, and Guang R. Gao
Dynamic Load Balancers for a Multithreaded Multiprocessor System
Parallel Processing Letters, Vol. 11, No. 1, Pages: 169-184, March
2001
Prasad Kakulavarapu, Olivier Maquelin, Jose N. Amaral, and Guang R.
Gao
Location Consistency -- A New Memory Model and Cache Consistency Protocol
IEEE Transactions on Computers, Vol. 49, No. 8, Pages: 798-813, August
2000
Guang R. Gao and Vivek Sarkar
Automatically Partitioning Threads for Multithreaded
Architectures
Special Issues on Compilation and Architectural Support for Parallel
Applications, Journal of Parallel and Distributed Computing, Vol. 58, No.
2, Pages: 159-189, August 1999
Xinan Tang and Guang R. Gao
Advances in the Dataflow Computational Model
Parallel Computing, Vol. 25, No. 13-14, Pages: 1907-1927, 1999
Walid A. Najjar, Edward A Lee, and Guang R Gao
A New Framework for Elimination Based Data Flow Analysis Using DJ
Graphs
ACM Transactions on Programming Languages and Systems, Vol. 20, No. 2,
Pages 388-435, March 1998
Vugranam C. Sreedhar, Guang R. Gao, and Yong-Fong Lee
Optimal Modulo Scheduling Through Enumeration
International Journal of Parallel Programming, Vol. 26, No. 2, Pages:
313-344, 1998
Erik R. Altman and Guang R. Gao
A Unified Framework for Instruction Scheduling and Mapping for Function
Units with Structural Hazards
Journal of Parallel and Distributed Computing, Vol. 49, No. 2, Pages:
259-293, 1998
Erik R. Altman, Ramaswamy Govindarajan, and Guang R. Gao
Incremental Computation of Dominator Trees
ACM Transactions on Programming Languages and Systems, Vol. 19, No. 2,
Pages: 239-252, March 1997
Vugranam C. Sreedhar, Guang R. Gao, and Yong-fong Lee
A Quadratic Time Algorithm for Computing Multiple Node Immediate
Dominators
Journal of Programming Languages, 1996
Vugranam C. Sreedhar, Guang R. Gao, and Yongfong Lee
A Framework for Resource-constrained Rate-optimal Software Pipelining
IEEE Transactions on Parallel and Distributed Systems, Vol. 7, No. 11,
Pages: 1133-1149, November 1996
Ramaswamy Govindarajan, Erik R. Altman, and Guang R. Gao
A Study of the EARTH-MANNA Multithreaded System
International Journal of Parallel Programming, Vol. 24, No. 4, Pages:
319-347, August 1996
Herbert H. J. Hum, Olivier Maquelin, Kevin B. Theobald, Xinmin Tian,
Guang R. Gao, and Laurie J. Hendren
Identifying Loops Using DJ Graphs
ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 18,
No. 6, Pages: 649-658, November 1996
Vugranam Sreedhar, Guang R. Gao, and Yongfong Lee
A Linear Time Algorithm for Placing phi-nodes
Journal of Programming Languages, 1995. Accepted
Vugranam C. Sreedhar and Guang R. Gao
Automatic Data and Computation Decomposition for Distributed Memory
Machines
Parallel Processing Letters, Vol. 5, No. 4, Pages: 539-550, April 1995
Qi Ning, Vincent V. Dongen, and Guang R. Gao
Computing phi-nodes in Linear Time Using DJ Graphs
Journal of Programming Languages, Vol. 3, Pages: 191-213, April 1995
Vugranam C. Sreedhar and Guang R. Gao
ABC++: Concurrency by Inheritance in C++
IBM Systems Journal, Vol. 34, No. 1, Pages: 120-137, 1995
Eshrat Arjomandi, William O'Farrell, Ivan Kalas, Gita Koblents, Frank
Ch. Eigler, and Guang R. Gao
Rate-optimal Schedule for Multi-rate DSP Computations
Journal of VLSI Signal Processing, Vol. 9, No.3, Pages: 211-232, April
1995
Ramaswamy Govindarajan and Guang R. Gao
An Efficient Hybrid Dataflow Architecture Model
Journal of Parallel and Distributed Computing, Vol. 19, No. 4, Pages:
293-307, December 1993
Guang R. Gao
A Register Allocation Framework Based on Hierarchical Cyclic Interval
Graphs
The Journal of Programming Languages, Vol. 1, No. 3, Pages: 155-185,
1993
Laurie J. Hendren, Guang R. Gao, Erik R. Altman, and Chandrika
Mukerji
Optimal Loop Storage Allocation for Argument-fetching Dataflow
Machines
International Journal of Parallel Programming, Vol. 21, No. 6, Pages:
421-448, December 1992
Qi Ning and Guang R. Gao
A High-speed Memory Organization for Hybrid Dataflow/von Neumann
Computing
Future Generation Computer Systems, Vol. 8, Pages: 287-301, 1992
Herbert H. J. Hum and Guang R. Gao
Toward Efficient Fine-grain Software Pipelining and the Limited
Balancing Techniques
International Journal of Mini and Microcomputers, Vol. 13, No. 2, Pages:
57-68, 1991
Guang R. Gao, Herbert H. J. Hum, and Yue-Bong Wong
Exploiting Fine-grain Parallelism on Dataflow Architectures
Parallel Computing, Vol. 13, No. 3, Pages: 309-320, March 1990
Guang R. Gao
Technical Memos
CAPSL Technical Memo 111:
Toward Efficient Fine-grained Dynamic Scheduling on Many-Core Architectures
Elkin Garcia, Daniel Orozco, Robert Pavel and Guang R. Gao
February, 2012
Available on request
CAPSL Technical Memo 110:
SHF: Large: Collaborative Research: Power-Efficient Fault Resilience in Massively Parallel Computing
Guang R. Gao, Jack B. Dennis and Chengmo Yang
November, 2011
Available on request
CAPSL Technical Memo 109:
Comparative Evaluation of Alternative Program Execution Models
Jack B. Dennis, Robert Pavel and Guang R. Gao
September, 2011
Available on request
CAPSL Technical Memo 108:
Code Partition and Overlays: A Reintroduction to High Performance Computing
Joseph B. Manzano, Ge Gan, Juergen Ributzka, Sunil Shrestha and Guang R. Gao
August, 2011
CAPSL Technical Memo 107:
TIDeFlow: The Time Iterated Dependency Flow Execution Model
Daniel Orozco, Elkin Garcia, Robert Pavel, Rishi Khan and Guang R. Gao
August, 2011
CAPSL Technical Memo 106:
C64prof: A Parallel Profiling Environment for the Cyclops64 Architecture
Mark Pellegrini and Guang R. Gao
June, 2011
CAPSL Technical Memo 105:
Polytasks: A Compressed Task Representation for HPC Runtimes
Daniel Orozco, Elkin Garcia, Robert Pavel, Rishi Khan and Guang Gao
June, 2011
CAPSL Technical Memo 104:
Toward an Execution Model for Extreme-Scale Systems-Runnemede and Beyond
Guang R. Gao, Joshua Suetterlein and Stephane Zuckerman
April, 2011
Available on request
CAPSL Technical Memo 103:
High Throughput Queue Algorithms
Daniel Orozco, Elkin Garcia, Rishi Khan, Kelly Livingston and Guang R. Gao
January, 2011
CAPSL Technical Memo 102:
Energy efficient tiling on a Many-Core Architecture
Elkin Garcia, Daniel Orozco and Guang R. Gao
October, 2010
CAPSL Technical Memo 101:
Locality Optimization of Stencil Applications using Data Dependency Graphs
Daniel Orozco, Elkin Garcia and Guang R. Gao
October, 2010
CAPSL Technical Memo 100:
Experiments with the Fresh Breeze Tree-Based Memory Model
Jack B. Dennis, Guang R. Gao and Xiao X. Meng
October, 2010
CAPSL Technical Memo 99 Revised:
The Elephant and the Mouse: Non-Strict Fine-Grain Synchronization for Many-Core Architectures
Juergen Ributzka, Yuhei Hayashi and Guang R. Gao
April, 2011
CAPSL Technical Memo 99:
The Elephant and the Mouse: Non-Strict Fine-Grain Synchronization for Many-Core Architectures
Juergen Ributzka, Yuhei Hayashi and Guang R. Gao
June, 2010
CAPSL Technical Memo 98:
Dynamic Percolation - Mapping Dense Matrix Multiplication on a Many-Core Architecture
Elkin Garcia, Rishi Khan, Kelly Livingston, Ioannis E. Venetis and Guang R. Gao
June, 2010
Available on request
CAPSL Technical Memo 97:
TiNy Threads on BlueGene/P: Exploring Many-Core Parallelisms Beyond The Traditional OS
Handong Ye, Robert Pavel, Aaron Landwehr, and Guang R. Gao
May, 2010
CAPSL Technical Memo 96:
Many-Core Chip Architecture - A Report on a Novel Architecture/Software Co-Verification Platform
Juergen Ributzka, Yuhei Hayashi and Guang R. Gao
April, 2010
CAPSL Technical Memo 95:
Optimized Dense Matrix Multiplication on a Many-Core Architecture
Elkin Garcia, Ioannis E. Venetis, Rishi Khan and Guang R. Gao
February, 2010
CAPSL Technical Memo 94:
Synchronization for Dynamic Task Parallelism on Manycore Architectures
Yonghong Yan, Sanjay Chatterjee, Daniel Orozco, Elkin Garcia, Jun Shirako, Zoran Budimlic, Vivek Sarkar and Guang Gao
February, 2010
CAPSL Technical Memo 93:
A Study of a Software Cache Implementation of the OpenMP Memory Model for Multicore and Manycore Architectures
Chen Chen, Joseph B Manzano, Ge Gan, Guang R. Gao, Vivek Sarkar
February, 2010
CAPSL Technical Memo 92:
Establishing Causality as a Desideratum for Memory Models and Transformations of Parallel Programs
Chen Chen, Wenguang Chen, Vugranam Sreedhar, Rajkishore Barik, Vivek Sarkar and Guang Gao
January, 2010
CAPSL Technical Memo 91:
Diamond Tiling: A Tiling Framework for Time-iterated
Scientific Applications.
Daniel Orozco and Guang Gao
December, 2009
CAPSL Technical Memo 90:
Analysis and Performance Results of Computing Betweenness Centrality on IBM Cyclops64
Guangming Tan, Vugranam Sreedhar, Guang R. Gao
October, 2009
CAPSL Technical Memo 89:
Formalizing Causality as a Desideratum for Memory Models and Transformations of Parallel Programs
Chen Chen, Wenguang Chen, Vugranam Sreedhar, Rajkishore Barik, Vivek Sarkar and Guang Gao
July, 2009
CAPSL Technical Memo 88:
Collaborative Research: Programming Models and Storage System for High Performance Computation with Many-Core Processors
Jack B. Dennis, Guang R Gao and Vivek Sarkar
May 11th, 2009
CAPSL Technical Memo 87:
Mapping the FDTD Application to Many-Core Chip Architectures
Daniel A. Orozco and Guang R. Gao.
March 3rd, 2009
CAPSL Technical Memo 86:
A Study of Different Instantiations of the OpenMP Memory
Model and Their Software Cache Implementations
Chen Chen, Joseph B Manzano, Ge Gan, Guang R. Gao and Vivek Sarkar.
January, 2009
CAPSL Technical Memo 85:
Tile Reduction: an OpenMP Extension for Tile Aware Parallelization
Ge Gan, Xu Wang, Joseph B Manzano and Guang R. Gao
December, 2008
CAPSL Technical Memo 84:
Optimizing the LU Benchmark for the Cyclops-64 Architecture.
Ioannis E. Venetis and Guang R. Gao
July 8th, 2009
CAPSL Technical Memo 83:
Analysis and Performance Results of Computing Betweenness Centrality on IBM Cyclops64
Guangming Tan, Andrew Russom, Vugranam Sreedhar and Guang R Gao
April 9th, 2008
CAPSL Technical Memo 82:
A New Cache Protocol Based on the Order Free Consistency Memory Model
Chen Chen, Joseph B Manzano, Ge Gan, Guang R Gao and Vivek Sarkar
May, 2008
CAPSL Technical Memo 81:
Performance Tuning of the Fast Fourier Transform on a Multicore Architecture
Liping Xue, Long Chen, Ziang Hu and Guang R Gao
February 8th, 2008
CAPSL Technical Memo 80:
Order Free Consistency: Towards a Fully Asynchronous Memory Model
Chen Chen, Joseph B Manzano, Wenguang Chen and Guang R Gao
November, 2007
CAPSL Technical Memo 79:
Concurrency Analysis for Shared Memory Programs with Textually Unaligned
Barriers
Yuan Zhang, Evelyn Duesterwald and Guang R Gao
November, 2007
CAPSL Technical Memo 78:
Implementation of the Smith-Waterman Algorithm on A Reconfigurable Supercomputing Platform
Peiheng Zhang, Guangming Tan and Guang R. Gao
April 16th, 2007
CAPSL Technical Memo 77:
A Study of Parallel Betweenness Centrality Algorithm on a Many-core architecture
Guangming Tan and Guang R. Gao
June 27th, 2007
Also available in pdf format
CAPSL Technical Memo 76:
FAME: Financial Application with Many-core-on-a-chip Architecture
Weirong Zhu, Parimala Thulasiraman, Ruppa K. Thulasiram and Guang R. Gao
February 17th, 2006
Also available in pdf format
CAPSL Technical Memo 75:
Optimizing the LU Benchmark for the Cyclops-64 Architecture
Ioannis E. Venetis and Guang R. Gao
February, 2007
Also available in pdf format
CAPSL Technical Memo 74:
Exploring a Multithreaded Methodology to Implement a Network Communication Protocol on the IBM Cyclops-64 Multithreaded Architecture
Ge Gan, Ziang Hu, Juan del Cuvillo and Guang R. Gao
January, 2007
Also available in pdf format
CAPSL Technical Memo 73:
A Parallel Dynamic Programming Algorithm on a Multi-core Architecture
Guangming Tan and Guang R. Gao
February, 2007
Also available in pdf format
CAPSL Technical Memo 72:
Automatic Program Segment Similarity Detection in Targeted Program Performance Improvement
Haiping Wu, Eunjung Park, Mihailo Kaplarevic, Yingping Zhang, Murat Bolat and Guang R. Gao
December 30, 2006
Also available in pdf format
CAPSL Technical Memo 71:
An Automatic Methodology for Program Segment-based Compiler Optimization Search
Haiping Wu, Eunjung Park, Murat Bolat, Mihailo Kaplarevic, Yingping Zhang, Xiaoming Li and Guang R. Gao
November 14, 2006
Also available in pdf format
CAPSL Technical Memo 70:
Handling Massive Parallelism Efficiently: Introducing Batches of Threads
Ioannis E. Venetis, Theodore S. Papatheodorou and Guang R. Gao
October 18, 2006
Also available in pdf format
CAPSL Technical Memo 69:
Software Pipelining On Multi-core Chip Architectures: A
case study on IBM Cyclops-64 Chip Architecture
Alban Douillet, Junmin Lin and Guang R. Gao
February 14, 2006
CAPSL Technical Memo 68:
Server I/O Acceleration Using an Embedded Multi-core Architecture
Lurng-Kuo Liu, Fei Chen, Christos J. Georgiou and Guang R. Gao
May 12, 2006
CAPSL Technical Memo 67 Revised:
Synchronization State Buffer: Supporting Efficient Fine-Grain Synchronization on Many-Core Architectures
Weirong Zhu, Vugranam C. Sreedhar, Ziang Hu and Guang R. Gao
November 20, 2006
Available upon request
CAPSL Technical Memo 67:
Efficient Fine-Grain Synchronization on a Multi-Core Chip Architecture: A Fresh Look
Weirong Zhu, Ziang Hu, and Guang R. Gao
July 17, 2006
CAPSL Technical Memo 66:
An Efficient Communication Infrastructure for the IBM Cyclops-64 Computer System
Ge Gan, Ziang Hu, Juan del Cuvillo and Guang R. Gao
June 12, 2006
Also available in pdf format
CAPSL Technical Memo 65:
Optimized Lock Assignment and Allocation for Productivity:
A Method for Exploiting Concurrency among Critical Sections
Yuan Zhang, Vugranam C. Sreedhar, Weirong Zhu, Vivek Sarkar and Guang R. Gao
May 10th, 2006
Also available in pdf format
CAPSL Technical Memo 64:
Multidimensional Kernel Generation for Loop Nest Software Pipelining
Alban Douillet, Hongbo Rong and Guang R. Gao
February 13th, 2006
Also available in pdf format
CAPSL Technical Memo 63:
"A New Framework for Analysis and Optimization
of Shared Memory Parallel Programs"
Vugranam C. Sreedhar, Yuan Zhang and Guang R. Gao
July 18th, 2005
CAPSL Technical Memo 62:
"FAST: A Functionally Accurate Simulation Toolset for the Cyclops-64 Cellular Architecture"
Juan del Cuvillo, Weirong Zhu, Ziang Hu and Guang R. Gao
June 17th, 2005
Also available in pdf format
CAPSL Technical Memo 61:
"P3I: Delaware's Programmability, Productivity and Proficiency Inquiry"
Joseph B. Manzano, Yuan Zhang and Guang R. Gao
June 10th, 2005
CAPSL Technical Memo 60:
"Performance Analysis of Interconnection Network of Cyclops-64 Chip Architecture"
Yingping Zhang, Taikyeong Jeong, Fei Chen, Ronny Nitzsche and Guang R. Gao
June 1st, 2005
CAPSL Technical Memo 59:
"Concurrency Analysis and Its Applications"
Yuan Zhang and Guang Gao
May 28th, 2005
CAPSL Technical Memo 58:
"Register Pressure in Software Pipelined Loop Nests: Fast
Computation and Impact on Architecture Design"
Alban Douillet, Hongbo Rong and Guang R. Gao
May 3rd, 2005
CAPSL Technical Memo 57:
"Parallel Reconstruction for Parallel Imaging SPACERIP on Cellular Architecture"
Yuanwei Niu, Ziang Hu and Guang R. Gao
June 15, 2004
CAPSL Technical Memo 56:
"Quasi consensus based comparison of profile hidden Markov models for protein sequences"
Robel Y. Kahsay, Guoli Wang, Li Liao, Roland Dunbrack and Guang R. Gao
May 28, 2004
CAPSL Technical Memo 55:
"Toward a Software Infrastructure for the Cyclops-64 Cellular Architecture"
Juan B. del Cuvillo, Ziang Hu, Weirong Zhu, Fei Chen and Guang R. Gao
April 26, 2004
Also available in pdf format
CAPSL Technical Memo 54:
"Speeding up CG on Cluster with Two Dimensional Blocking Method and EARTH Runtime Support"
Fei Chen, Kevin B. Theobald and Guang R. Gao
April 23, 2004
CAPSL Technical Memo 53:
"Lamport Order Revisit: A Study on How to Efficiently Achieve Sequential Consistency on a Modern
Multiprocessor-on-a-Chip Architecture"
Yuan Zhang, Weirong Zhu, Fei Chen, Ziang Hu and Guang R. Gao
March 01, 2004
Also available in pdf format
CAPSL Technical Memo 52:
"Analyzable Atomic Sections: Integrating Fine-Grained Synchronization and Weak Consistency Models for Scalable
Parallelism"
Vivek Sarkar and Guang R. Gao
February 09, 2004
Also available in pdf format
CAPSL Technical Memo 51:
"Code Generation for Single-Dimension Software Pipelining of Multi-Dimensional Loops"
Hongbo Rong, Alban Douillet, R. Govindarajan and Guang R. Gao
September 26, 2003
Also available in pdf format
CAPSL Technical Memo 49:
"Single-Dimension Software Pipelining for Multi-Dimensional Loops"
Hongbo Rong, Zhizhong Tang, R. Govindarajan, Alban Douillet and Guang R. Gao
September 26, 2003
Also available in pdf format
CAPSL Technical Memo 48:
"Programming Method and Software Infrastructure for Cellular Architecture"
Guang R. Gao, Juan del Cuvillo, Ziang Hu, Robert Klosiewicz, Clement Leung, Jason McGuiness, Hirofumi Sakane, Yingping Zhang
September 16, 2003
CAPSL Technical Memo 47:
"Compiler-Assisted Cache Replacement: Problem Formulation and Performance Evaluation"
Hongbo Yang, R. Govindarajan, Guang R. Gao and Ziang Hu
September 9, 2003
CAPSL Technical Memo 45:
"Selective Slim Scheduling: On Software Pipelining of Loop Nests"
Hongbo Rong, Zhizhong Tang, R. Govindarajan, Guang R. Gao
June 8, 2003
CAPSL Technical Memo 44:
"Algorithms, Applications, and Environments for Emerging Petascale Architectures"
R. Govindarajan, H. Tufo, S. Thomas, R. Loft, Guang R. Gao, J. Moreira and J. Castanos
March 6, 2003
Also available in pdf format
CAPSL Technical Memo 43:
"Executable Performance Model and Evaluation of High Performance Architectures with Percolation"
Adeline Jacquet, Vincent Janot, R. Govindarajan, Clement Leung, Guang R. Gao and Thomas Sterling
November 21, 2002
Also available in pdf format
CAPSL Technical Memo 42:
"A Quantitative Study on Performance-Power Impact of Dual-Speed Pipeline Architectures"
Hongbo Yang, R. Govindarajan, Guang R. Gao and Kevin B. Theobald
June 13, 2002
Also available in pdf format
CAPSL Technical Memo 41:
"Maximizing Pipelined Functional Units Usage for Minimum Power Software Pipelining"
Hongbo Yang, R. Govindarajan, Guang R. Gao and George Cai
September 27, 2001
Also available in pdf format
CAPSL Technical Memo 40:
"New Normalization Method and Error Analysis for Gene Expression Microarray Data"
Stanley D. Luck, Francisco Jose Useche G., Wellington S. Martins and Guang R. Gao
December 11, 2000
Also available in pdf format
CAPSL Technical Memo 39:
"Threaded-C Language Reference Manual (Release 2.0)"
Guy Tremblay, Kevin B. Theobald, Christopher J. Morrone, Mark D. Butala, Jose Nelson Amaral and Guang R. Gao
September 23, 2000
CAPSL Technical Memo 38:
"Automatic Prefetching of Induction Pointers"
Artour Stoutchinin, Jose Nelson Amaral, Guang R. Gao, Jim Dehnert, Suneel Jain and Alban Douillet
April 18, 2000
CAPSL Technical Memo 37:
"Automatic Prefetching of Induction Pointers for Software Pipelining"
Artour Stoutchinin, Jose Nelson Amaral, Guang R. Gao, Jim Dehnert and Suneel Jain
November 12, 1999
CAPSL Technical Memo 36:
"Minimum Register Instruction Sequence Problem: Revisiting Large Optimal"
R. Govindarajan, Hongbo Yang, Chihong Zhang, Jose Nelson Amaral and Guang R. Gao
November 12, 1999
CAPSL Technical Memo 35:
"A Comparative Performance Study of Fine-Grain Multi-Threading on Distributed Memory Machines"
Prasad Kakulavarapu, Christopher J. Morrone, Kevin B. Theobald, Jose Nelson Amaral and Guang R. Gao
November 11, 1999
CAPSL Technical Memo 34:
"Caching Single-Assignment Structures to Build a Robust Fine-Grain Multi-Threading System"
Wen-Yen Lin, Jose Nelson Amaral, Jean-Luc Gaudiot and Guang Gao
October 13, 1999
CAPSL Technical Memo 33:
"Definition of the EARTH Model"
Kevin B. Theobald
October 6, 1999
CAPSL Technical Memo 32:
"The Benefits of Hardware-Assisted Fine-Grain Multithreading"
Kevin B. Theobald and Guang R. Gao
July 20, 1999
CAPSL Technical Memo 31:
"HTMT Phase 2 Report"
Guang R Gao, Jose Nelson Amaral, Andres Marquez, Kevin B. Theobald, Sean Ryan, Zachary Ruiz, Thomas Geiger and Christopher J. Morrone
July 19, 1999
CAPSL Technical Memo 30:
"Design and Implementation of an Efficient Thread Partitioning Algorithm"
Jose Nelson Amaral, Guang R. Gao, Erturk Dogan Kocalar, Patrick O'Neil and Xiang Tang
July 1, 1999
CAPSL Technical Memo 29:
"Advances in Dataflow Computational Model"
Walid A Najjar, Edward A. Lee and Guang R. Gao
April 1, 1999
CAPSL Technical Memo 28:
"Efficient State-Diagram Construction Methods for Software Pipelining"
Chihong Zhang, R. Govindarajan, Sean Ryan and Guang R. Gao
March 5, 1999
CAPSL Technical Memo 27:
"SEMi: A Simulator for EARTH, MANNA, and i860"
Kevin Theobald
March 1, 1999
CAPSL Technical Memo 26:
"An HTMT Performance Prediction Case Study: Implementing Cannon's Dense Matrix Multiply Algorithm"
Jose Nelson Amaral, Guang R. Gao, Phillip Merkey, Thomas Sterling, Zachary Ruiz and Sean Ryan
February 17, 1999
CAPSL Technical Memo 25:
"Option Pricing Problem on a Multithreaded Parallel Architecture"
Ruppa K. Thulasiram and Guang R. Gao
November 11, 1998
CAPSL Technical Memo 24:
"Design of the Runtime System for the Portable Threaded-C Language"
Prasad Kakulavarapu, Olivier Maquelin and Guang R. Gao
July 21, 1998
CAPSL Technical Memo 23:
"Automatically Partitioning Threads Based on Remote Paths"
Xinan Tang and Guang R. Gao
July 20, 1998
CAPSL Technical Memo 22:
"A Refinement of the HTMT Program Execution Model"
Guang Gao, Jose Nelson Amaral, Andres Marquez and Kevin Theobald
July 13, 1998
CAPSL Technical Memo 21:
"Self-Avoiding Walks Over Two-Dimensional Adaptive Unstructured Grids"
Gerd Heber, Rupak Biswas and Guang R. Gao
April 20, 1998
CAPSL Technical Memo 20:
"Using Multithreading for the Automatic Load Balancing of 2-D Adaptive Finite Element Meshes"
Gerd Heber, Rupak Biswas, Parimala Thulasiraman and Guang R. Gao
March 16, 1998
CAPSL Technical Memo 19:
"Overview of the Threaded-C Language"
Kevin B. Theobald, Jose Nelson Amaral, Gerd Heber, Olivier Maquelin, Xinan Tang and Guang R. Gao
March 16, 1998
CAPSL Technical Memo 18:
"A Superstrand Architecture"
Andres Marquez, Kevin B. Theobald, Xinan Tang, Thomas L. Sterling and Guang R. Gao
March 14, 1998
CAPSL Technical Memo 17:
"An Enhanced Co-Scheduling Method Using Reduced MS-State Diagrams"
R. Govindarajan, N.S.S. Narasimha Rao, Erik R. Altman and Guang R. Gao
February 18, 1998
CAPSL Technical Memo 16:
"Location Consistency -- A New Memory Model and Cache Consistency Protocol"
Guang R. Gao and Vivek Sarkar
February 16, 1998
CAPSL Technical Memo 15:
"Superconducting Processors for HTMT: Issues and Challenges"
Kevin B. Theobald, Guang R. Gao and Thomas L. Sterling
December 15, 1997
CAPSL Technical Memo 14:
"A Superstrand Architecture"
Andres Marquez, Kevin B. Theobald, Xinan Tang and Guang R. Gao
December 1, 1997
CAPSL Technical Memo 13:
"Partial Sampling with Reverse State Reconstruction: A New Technique for Branch Predictor Performance Estimation"
Darren E. Vengroff and Guang R. Gao
CAPSL Technical Memo 11:
"Heap Analysis and Optimizations for Threaded Programs"
Xinan Tang, Rakesh Ghiya, Laurie J. Hendren and Guang R. Gao
November 7, 1997
CAPSL Technical Memo 10:
"A Register Pressure Sensitive Instruction Scheduler for Dynamic Issue Processors"
Raul Silvera, Jian Wang and Guang R. Gao
CAPSL Technical Memo 09:
"The HTMT Program Execution Model"
Guang R. Gao, Kevin B. Theobald, Andres Marquez and Thomas Sterling
July 18, 1997
CAPSL Technical Memo 08:
"Benefits of Efficient Multithreading on Distributed Memory for the Parallelization of Communication-Intensive Applications"
Angela C. Sodan and Guang R. Gao
CAPSL Technical Memo 07:
"An Integer Linear Programming Model of Software Pipelining for the MIPS R8000 Processor"
Artour Stoutchinin
CAPSL Technical Memo 06:
"A New Fast Algorithm for Optimal Register Allocation in Modulo Scheduled Loops"
Sylvain Lelait, Guang R. Gao and Christine Eisenbeis
CAPSL Technical Memo 05:
"Design and Evaluation of Dynamic Load Balancing Schemes under A Multithreaded Execution Model"
Haiying Cai, Olivier Maquelin and Guang R. Gao
CAPSL Technical Memo 04:
"Non-Clustered Statistical Trace Sampling for Large Cache Design Space Exploration"
Darren E. Vengroff, Kenneth Simpson and Guang R. Gao
CAPSL Technical Memo 03:
"Thread Partitioning and Scheduling Based on Cost Model"
Xinan Tang, Jian Wang, Kevin B. Theobald and Guang R. Gao
April 15, 1997
CAPSL Technical Memo 02:
"Elastic History Buffer: A Low-Cost Method to Improve Branch Prediction Accuracy"
Maria-Dana Tarlescu, Kevin B. Theobald and Guang R. Gao
November 14, 1996
CAPSL Technical Memo 01:
"Hybrid Technology Multithreaded Architecture"
Guang R. Gao, Konstantin K. Likharev, Paul C. Messina and Thomas L. Sterling
Technical Notes
CAPSL Technical Note 21:
"Experiences Porting Mstack to ParalleX"
Mark Pellegrini
August, 2008
CAPSL Technical Note 20:
"The EDIF2KSF Converter"
Jonathan Barton
August, 2007
CAPSL Technical Note 19:
"Mrs. Clops Tool Chain Manual"
Matthew Wells
March, 2006
CAPSL Technical Note 18:
"ASAP Low-Level Connection Library"
Inanc Dogru
March, 2006
CAPSL Technical Note 17:
"C64 DDR Verification and Critical Path Reduction"
Michael Bodnar
September, 2005
CAPSL Technical Note 16:
"The Cyclops-E Emulation Environment"
Juan del Cuvillo and Nathaniel Merritt.
August, 2005
CAPSL Technical Note 15:
"SLICED: a Source Level Interacting Cyclops-64 Effective
Debugger"
Geoff Gerfin and Ziang Hu.
August 26, 2004
CAPSL Technical Note 14:
"DISC64: A Disassembler for the Instruction Set of Cyclops-64"
John Tully
August 5, 2004
CAPSL Technical Note 13:
"Generate the Multiply and Add Operation during the WHIRL Lowering Phase"
Joseph Bryant Manzano Franco and Haiping Wu
May 31, 2004
CAPSL Technical Note 12:
"Integrate EBO with Pattern Matching"
Divya Parthasarathi
May 28, 2004
CAPSL Technical Note 11:
"A DIMES Demonstration Application: Mandelbrot-Set Generation Using a Work-Stealing Algorithm"
Jason M. McGuiness
June 15, 2002
CAPSL Technical Note 10 Revised:
"A Software Development Kit for CeDIMES"
Juan del Cuvillo, Robert Klosiewicz and Yingping Zhang
March 15, 2005
CAPSL Technical Note 10:
"A Software Development Kit for CeDIMES"
Juan del Cuvillo, Robert Klosiewicz and Yingping Zhang
September 30, 2002
CAPSL Technical Note 09:
"Threaded-C Release 2.0: Motivation, Description, and Rationale"
Guy Tremblay
June 15, 2000
CAPSL Technical Note 08:
"Runtime Locality Transformations for NAS Conjugate Gradient (Sparse Matrix Computation)"
Rishi Kumar, Nathaniel Johnson, Ruppa K. Thulasiram, Gagan Agrawal, Guang R. Gao
December 17, 1999
CAPSL Technical Note 07:
"Computational Financial Derivatives -- A Primer"
Ruppa K. Thulasiram, Guang R. Gao
October 9, 1998
CAPSL Technical Note 06:
"Debugging: The `Feedback' Way"
James P. Durbano
October 9, 1998
CAPSL Technical Note 05:
"Portable Threaded-C Release 1.1"
José Nelson Amaral, Zachary Ruiz, Sean Ryan, Andres Marquez, Christopher Morrone, Prasad Kakulavarapu, Guang R. Gao
October 8, 1998
CAPSL Technical Note 04:
"Implementation of I-Structures as a Library of Functions in Portable Threaded-C"
José Nelson Amaral, Guang R. Gao
June 15, 1998
CAPSL Technical Note 03:
"Proposed Changes to Threaded-C"
Kevin B. Theobald
January 20, 1998
CAPSL Technical Note 02:
"A Portable Threaded-C Language for EARTH Multiprocessors"
Xinan Tang, Olivier Maquelin, Kevin B. Theobald, Guang R. Gao, Prasad Kakulavarapu
January 6, 1998
CAPSL Technical Note 01:
"An Overview of the Threaded-C Language"
Guang R. Gao, Xinan Tang, Parimala Thulasiraman, Kevin B. Theobald
July 25, 1997
CAPSL Theses
Ph.D. Theses:
"Exploring novel many-core architectures for scientific computing"
Long Chen
Fall 2010
"Programming Model and Execution Model for OpenMP on the Cyclops-64 many-core processor"
Ge Gan
Spring 2010
Available on request
"Enabling System Validation for the many-core Supercomputer"
Fei Chen
Summer 2009
Available on request
"Breaking away from the OS Shadow: A Program Execution Model Aware Thread Virtual Machine for Multicore Architectures"
Juan del Cuvillo
Summer 2008
"Static Analyses and Optimizations for Parallel Programs with Synchronization"
Yuan Zhang
Summer 2008
"Advanced Protein Sequence Analysis Methods for Structure and Function Prediction"
Robel Y. Kahsay
Spring 2005
"The CARE Architecture"
Andrés Marquez
Winter 2004
"Power-Aware Compilation Techniques for High Performance Processors"
Hongbo Yang
Fall 2003
"Irregular Computations on Fine-Grain Multithreaded Architecture"
Parimala Thulasiraman
Fall 2000
"Compiling for Multithreaded Architectures"
Xinan Tang
Fall 1999
"EARTH: An Efficient Architecture for Running Threads"
Kevin Bryan Theobald
Spring 1999
"Toward a software pipelining framework for many-core chips"
Juergen Ributzka
Summer 2009
"Optimizing the Fast Fourier Transform on a Many-core Architecture"
Long Chen
Winter 2008
"Design and Implementation of Tool-chain framework to support OpenMP Single Source Compilation on CELL platform"
Yi Jiang
Winter 2007
"A Study of Simulation and Verification of a Many-core Architecture on two modern reconfigurable platforms"
Dimitrij Krepis
Summer 2007
"Methodology of Dynamic Compiler Option Selection Based on Static Program Analysis - Implementation and Evaluation"
Eun Jung Park
Summer 2007
"Efficient Mapping of Fast Fourier Transform on the Cyclops-64 Multithreaded Architecture"
Liping Xue
Summer 2007
"Tower Methodology for Verification of Multi-Core Architecture - A Case Study"
Divya Parthasarathi
Summer 2005
"A Study of Architecture and Performance of IBM Cyclops-64 Interconnection Network"
Yingping Zhang
Summer 2005
"Quantitative Study of Human-Computer Interaction in Adaptive Search on Mobile Handsets and its Localization for Mandarin Chinese"
Xing Wang
Fall 2004
"A Parallel Debugger for the Cyclops Architecture"
Robert S. Klosiewicz Jr.
Summer 2004
"Multithreaded Parallel Implementation of HPMMPFAM on EARTH"
Weirong Zhu
Spring 2004
"Implementing Parallel CG Algorithm on the EARTH Multithreaded Architecture"
Fei Chen
Spring 2004
"Code Size Oriented Memory Allocation for Temporary Variables"
Yan Xie
Winter 2004
"Binary Diffing"
Kapil Khosla
Fall 2003
"A Portable Runtime System and its Derivation for the Hardware SU Implementation"
Chuan Shen
Fall 2003
"An Interconnect Architecture for Commodity Off-the-Shelf Multiprocessor Emulation Testbed"
Mark Lawrence Legutko
Spring 2002
"A Visual Perspective to Motif/Pattern Analysis"
Praveen R Thiagarajan
Summer 2001
"Automated Single Nucleotide Polymorphism Discovery Pipeline"
Francisco Jose Useche Gomez
Summer 2001
"Efficient Parallelization of Reductions and Loop Based Programs on EARTH"
Rishi Kumar
Summer 2001
"Whole Genome Comparison Using A Multithreaded Parallel Implementation"
Juan del Cuvillo
Summer 2001
"An EARTH Runtime System For Multi-Processor/Multi-Node Beowulf Cluster"
Christopher Jason Morrone
Spring 2001
"Implementation Issues of a Hardware-Based EARTH Synchronization Unit"
Thomas Geiger
Spring 2001
"Register Stack and Optimal Allocation Instruction Placement"
Alban Douillet
Spring 2001
"Advanced Compilers, Architectures and Parallel Systems"
ShaoHua Han
Spring 2001
"Dynamic Load Balancing Issues in the EARTH Runtime System"
Kamala Prasad Kakulavarapu
Fall 1999
"Towards a Custom EARTH Synchronization Unit"
Ian Stuart MacKenzie Walker
Summer 1999
"Static Instruction Schedule For Dynamic Issue Processor"
Raul E. Silvera Muñoz
Spring 1997