Abstract
Recent commercial microprocessors concentrate on multi-core CPU architectures, while most parallel and distributed computing methods target multi-CPU architectures. There is therefore a need to analyze and adapt traditional parallel algorithms for the new multi-core environments. In this paper, we take matrix multiplication as the target problem and implement several matrix multiplication methods, including the traditional serial version and parallel versions based on OpenMP and Windows threads, among others. We measured the execution times of both the integer and the floating-point implementations across a range of matrix sizes in order to analyze their overall performance. According to our experimental results, one of the most important factors in execution time was the efficient use of the CPU's level-2 cache. We are now designing a new matrix multiplication algorithm for multi-core CPUs and developing more efficient implementation methods.
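To make the comparison concrete, the following is a minimal sketch in C of the kind of variants the abstract describes: a serial triple loop and an OpenMP-parallel version with a loop order that is typically friendlier to the cache. It is not the paper's code. The function names `matmul_ijk` and `matmul_ikj_omp`, the row-major `double` layout, and the choice of the i-k-j loop order are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Serial baseline (hypothetical name): classic i-j-k ordering.
 * Matrices are n x n, stored row-major in flat arrays.
 * The inner loop reads b with stride n, i.e. down a column. */
void matmul_ijk(size_t n, const double *a, const double *b, double *c)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Parallel variant (hypothetical name): i-k-j ordering, so the inner
 * loop walks one row of b and one row of c contiguously. Rows of c are
 * divided among threads, so no two threads write the same element.
 * Without OpenMP support the pragma is ignored and the loop runs
 * serially. A signed loop index is used because older OpenMP versions
 * require it. */
void matmul_ikj_omp(size_t n, const double *a, const double *b, double *c)
{
    assert(n > 0);
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        double *c_row = c + (size_t)i * n;
        for (size_t j = 0; j < n; j++)
            c_row[j] = 0.0;
        for (size_t k = 0; k < n; k++) {
            double a_ik = a[(size_t)i * n + k];
            const double *b_row = b + k * n;
            for (size_t j = 0; j < n; j++)
                c_row[j] += a_ik * b_row[j];
        }
    }
}
```

Both functions compute the same product; they differ only in memory access pattern and in whether the outer loop is shared among threads. With GCC or Clang, OpenMP is enabled with `-fopenmp`. Timing the two on growing matrix sizes is one way to observe the kind of cache effect the abstract reports.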
Original language | English |
---|---|
Pages (from-to) | 1168-1173 |
Number of pages | 6 |
Journal | WSEAS Transactions on Computers |
Volume | 6 |
Issue number | 12 |
State | Published - Dec 2007 |
Keywords
- Matrix multiplication
- Multi-core CPU
- Parallel computing
- Performance analysis