Fast Point Transformer
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Park, Chunghyun | - |
dc.contributor.author | Jeong, Yoonwoo | - |
dc.contributor.author | Cho, Minsu | - |
dc.contributor.author | Park, Jaesik | - |
dc.date.accessioned | 2024-05-09T04:12:35Z | - |
dc.date.available | 2024-05-09T04:12:35Z | - |
dc.date.created | 2024-05-08 | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol.2022-June, pp.16928-16937 | - |
dc.identifier.issn | 1063-6919 | - |
dc.identifier.uri | https://hdl.handle.net/10371/201294 | - |
dc.description.abstract | The recent success of neural networks enables a better interpretation of 3D point clouds, but processing a large-scale 3D scene remains a challenging problem. Most current approaches divide a large-scale scene into small regions and combine the local predictions. However, this scheme inevitably involves additional stages for pre- and post-processing and may also degrade the final output because predictions are made from a local perspective. This paper introduces Fast Point Transformer, which is built on a new lightweight self-attention layer. Our approach encodes continuous 3D coordinates, and its voxel hashing-based architecture boosts computational efficiency. The proposed method is demonstrated on 3D semantic segmentation and 3D detection. The accuracy of our approach is competitive with the best voxel-based method, and our network achieves a 129 times faster inference time than the state-of-the-art Point Transformer, with a reasonable accuracy trade-off, in 3D semantic segmentation on the S3DIS dataset. | - |
dc.language | English | - |
dc.publisher | IEEE | - |
dc.title | Fast Point Transformer | - |
dc.type | Article | - |
dc.identifier.doi | 10.1109/CVPR52688.2022.01644 | - |
dc.citation.journaltitle | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | - |
dc.identifier.wosid | 000870783002072 | - |
dc.identifier.scopusid | 2-s2.0-85132778978 | - |
dc.citation.endpage | 16937 | - |
dc.citation.startpage | 16928 | - |
dc.citation.volume | 2022-June | - |
dc.description.isOpenAccess | N | - |
dc.contributor.affiliatedAuthor | Park, Jaesik | - |
dc.type.docType | Conference Paper | - |
dc.description.journalClass | 1 | - |
dc.subject.keywordAuthor | 3D from multi-view and sensors | - |
dc.subject.keywordAuthor | grouping and shape analysis | - |
dc.subject.keywordAuthor | Scene analysis and understanding | - |
dc.subject.keywordAuthor | Segmentation | - |
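The abstract credits much of the method's speed to a voxel hashing-based architecture over continuous 3D coordinates. As a rough illustration of that general idea (not the paper's actual implementation), the sketch below quantizes continuous point coordinates into integer voxel indices and deduplicates them via hashing; the `voxel_size` value is an assumed placeholder, not a parameter taken from the paper.

```python
import numpy as np

def voxelize(points, voxel_size=0.05):
    """Quantize continuous 3D coordinates into integer voxel indices,
    then deduplicate them so each occupied voxel is stored once.

    `points` is an (N, 3) float array; `voxel_size` is an assumed
    hyperparameter. Returns the unique occupied voxel indices and,
    for each input point, the index of the voxel it falls into.
    """
    voxel_idx = np.floor(points / voxel_size).astype(np.int64)
    # np.unique over rows plays the role of the hash table here:
    # identical integer triples collapse to one occupied-voxel entry.
    unique_voxels, inverse = np.unique(voxel_idx, axis=0, return_inverse=True)
    return unique_voxels, inverse

# Toy usage: three points, two of which fall into the same voxel.
pts = np.array([[0.01, 0.02, 0.03],
                [0.02, 0.01, 0.04],   # shares a voxel with the first point
                [0.30, 0.30, 0.30]])
voxels, inv = voxelize(pts)
print(len(voxels))  # number of occupied voxels
```

A sparse network then operates only on the occupied voxels, which is what makes large scenes tractable without the tiling into small regions that the abstract criticizes.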
- Files in This Item:
- There are no files associated with this item.