Performance Modelling and Optimization of Memory Access on Cellular Computer Architecture Cyclops64

  • Conference paper
  • pp 132–143
Network and Parallel Computing (NPC 2005)
  • Yanwei Niu,
  • Ziang Hu,
  • Kenneth Barner &
  • Guang R. Gao

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3779)

Included in the following conference series:

  • IFIP International Conference on Network and Parallel Computing

Abstract

This paper focuses on the Cyclops64 computer architecture and presents an analytical model and performance simulation results for two approaches, data preloading and loop unrolling, to optimizing the performance of an SVD (Singular Value Decomposition) benchmark. A performance model that dissects the total execution cycles is presented. Data preloading, using either “memcpy” or hand-optimized inline assembly code, and loop unrolling are implemented and compared in terms of the total number of memory access cycles. The key idea is to preload data from off-chip to on-chip memory before the computation and to store the data back afterwards. These approaches reduce the total memory access cycles and can thus improve the benchmark performance significantly.
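
The preload, compute, store-back pattern described in the abstract can be sketched in plain C. This is only an illustration under stated assumptions, not the authors' implementation: the local arrays `a` and `b` stand in for on-chip buffers, the function names `dot_unrolled` and `rotate_columns`, the column length `N`, and the unroll factor of 4 are all hypothetical, and the plane rotation is assumed here as a typical kernel of a Jacobi-type SVD rather than taken from the paper. No Cyclops64-specific API is used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 8  /* assumed column length; must be a multiple of 4 */

/* Dot product with the loop unrolled four times (assumed unroll factor).
 * Four independent accumulators avoid one long dependency chain. */
static double dot_unrolled(const double *x, const double *y, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    for (size_t i = 0; i < n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }
    return (s0 + s1) + (s2 + s3);
}

/* Preload two columns into local buffers, work only on the copies,
 * then store the results back. Returns the dot product of the two
 * columns as they were before the rotation. */
static double rotate_columns(double *col_i, double *col_j, double c, double s)
{
    double a[N], b[N];               /* stand-ins for on-chip memory */

    memcpy(a, col_i, sizeof a);      /* preload: "off-chip" to "on-chip" */
    memcpy(b, col_j, sizeof b);

    double dot = dot_unrolled(a, b, N);

    for (size_t k = 0; k < N; k++) { /* plane rotation on the local copies */
        double t = c * a[k] - s * b[k];
        b[k] = s * a[k] + c * b[k];
        a[k] = t;
    }

    memcpy(col_i, a, sizeof a);      /* store back after the computation */
    memcpy(col_j, b, sizeof b);
    return dot;
}
```

In this sketch every element is read from and written to the original arrays exactly once, through the bulk copies, while all repeated accesses during the computation hit the local buffers. Whether that pays off on a real machine depends on the relative latencies of the two memories, which is the kind of trade-off the paper's performance model is meant to quantify.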

Author information

Authors and Affiliations

  1. Department of ECE, University of Delaware, Newark, DE, 19711, USA

    Yanwei Niu, Ziang Hu, Kenneth Barner & Guang R. Gao

Editor information

Editors and Affiliations

  1. Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, 430074, Wuhan, China

    Hai Jin

  2. University of North Carolina at Chapel Hill, CB 3175, Sitterson Hall, 27599-3175, Chapel Hill, NC, USA

    Daniel Reed

  3. Cluster and Grid Computing Lab, Huazhong University of Science and Technology, P.O. Box, 430074, Wuhan, China

    Wenbin Jiang

Copyright information

© 2005 IFIP International Federation for Information Processing

About this paper

Cite this paper

Niu, Y., Hu, Z., Barner, K., Gao, G.R. (2005). Performance Modelling and Optimization of Memory Access on Cellular Computer Architecture Cyclops64. In: Jin, H., Reed, D., Jiang, W. (eds) Network and Parallel Computing. NPC 2005. Lecture Notes in Computer Science, vol 3779. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11577188_18

  • DOI: https://doi.org/10.1007/11577188_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29810-6

  • Online ISBN: 978-3-540-32246-7

  • eBook Packages: Computer Science, Computer Science (R0)
