An Implementation of Parallel 1-D FFT on the K computer
https://nied-repo.bosai.go.jp/records/6384
Item type | researchmap(1) |
---|---|
Release date | 2023-09-20 |
Title | An Implementation of Parallel 1-D FFT on the K computer |
Title language | en |
Language | eng |
Authors | Daisuke Takahashi; Atsuya Uno; Mitsuo Yokokawa |
Abstract (description type: Other) | In this paper, we propose an implementation of a parallel one-dimensional fast Fourier transform (FFT) on the K computer. The proposed algorithm is based on the six-step FFT algorithm, which can be altered into the recursive six-step FFT algorithm to reduce the number of cache misses. The recursive six-step FFT algorithm improves performance by utilizing the cache memory effectively. We use the recursive six-step FFT algorithm to implement the parallel one-dimensional FFT algorithm. The performance results of one-dimensional FFTs on the K computer are reported. We successfully achieved a performance of over 18 TFlops on 8192 nodes of the K computer (82944 nodes, 128 GFlops/node, 10.6 PFlops peak performance) for a 2^41-point FFT. |
Abstract language | en |
Bibliographic information | 2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), pp. 344-350, issued 2012 |
Publisher | IEEE COMPUTER SOC |
Publisher language | en |
ISSN (EISSN) | 2576-3512 |
DOI | 10.1109/HPCC.2012.53 |
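
The abstract names the six-step FFT algorithm without describing it. The sketch below shows the textbook form of that decomposition for a length n = n1 * n2 transform on a single process; it is not the authors' parallel or recursive implementation. The function names (`dft`, `transpose`, `six_step_fft`), the forward-transform sign convention exp(-2πi·jk/n), and the naive O(n²) kernel used for the short transforms are all assumptions made for illustration.

```python
import cmath


def dft(x):
    """Naive O(n^2) forward DFT; stands in for the short-FFT kernel (assumption)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def transpose(rows):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*rows)]


def six_step_fft(x, n1, n2):
    """Forward DFT of x (length n1 * n2) via the six-step decomposition."""
    n = n1 * n2
    assert len(x) == n, "input length must equal n1 * n2"

    # View the input as an n2 x n1 matrix: a[j2][j1] = x[j1 + j2 * n1].
    a = [list(x[r * n1:(r + 1) * n1]) for r in range(n2)]

    # Step 1: transpose so each length-n2 transform reads a contiguous row.
    b = transpose(a)  # n1 x n2

    # Step 2: n1 independent transforms of length n2.
    c = [dft(row) for row in b]

    # Step 3: multiply element (j1, k2) by the twiddle factor w_n^(j1 * k2).
    c = [
        [c[j1][k2] * cmath.exp(-2j * cmath.pi * j1 * k2 / n) for k2 in range(n2)]
        for j1 in range(n1)
    ]

    # Step 4: transpose so each length-n1 transform reads a contiguous row.
    d = transpose(c)  # n2 x n1

    # Step 5: n2 independent transforms of length n1.
    e = [dft(row) for row in d]

    # Step 6: final transpose; row-major flattening gives X[k2 + k1 * n2].
    f = transpose(e)  # n1 x n2
    return [value for row in f for value in row]
```

For example, `six_step_fft(x, 4, 8)` on a 32-element list should agree with `dft(x)` up to floating-point rounding. A recursive variant of the kind the abstract mentions would presumably replace the short-transform kernel in steps 2 and 5 with further six-step calls until the sub-problems fit in cache; that reading is an inference from the abstract, not a description of the paper's code.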