Multi-modal alignment via hyperbolic geometry
Page: 1-48
2024
Metrics Details
- Usage: 41
- Downloads: 32
- Abstract Views: 9
Thesis / Dissertation Description
Strong generalization to unseen domains is vital for deep neural networks. While existing methods have shown promising results without source domain access, they mostly either rely on models that are extensively pre-trained on source domains or overlook the intricate hierarchical structures inherent in visual and textual features. These limitations can degrade performance, especially on datasets with many classes. To overcome them, we propose a novel approach that projects the model into hyperbolic space and employs geometric optimal transport to align cross-modal features in an unsupervised manner. Unlike Euclidean geometry, hyperbolic geometry is naturally suited to representing hierarchical data structures, which can facilitate understanding of diverse classes. To fully capture hierarchical information from text, we enrich the model with fine-grained concepts extracted from WordNet, further enhancing that understanding. Extensive experiments on standard benchmarks demonstrate the superior performance of our method compared to strong baselines.
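The description above does not say which model of hyperbolic space or which transport solver the thesis uses. As a rough illustration only, the sketch below assumes the Poincaré ball model with curvature -c and entropic optimal transport solved by Sinkhorn iterations. The function names (`expmap0`, `poincare_dist`, `sinkhorn`), the random stand-in features, and all parameter values are hypothetical and are not taken from the thesis.

```python
import numpy as np

def expmap0(v, c=1.0, eps=1e-9):
    """Map Euclidean vectors into the Poincare ball (curvature -c)
    via the exponential map at the origin. Assumed model, not the thesis's."""
    sqrt_c = np.sqrt(c)
    norm = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)
    return np.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

def poincare_dist(x, y, c=1.0, eps=1e-9):
    """Pairwise geodesic distances between rows of x (n, d) and y (m, d)."""
    x2 = np.sum(x * x, axis=-1)[:, None]          # squared norms, shape (n, 1)
    y2 = np.sum(y * y, axis=-1)[None, :]          # squared norms, shape (1, m)
    diff2 = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    denom = np.maximum((1 - c * x2) * (1 - c * y2), eps)
    arg = 1 + 2 * c * diff2 / denom
    return np.arccosh(np.maximum(arg, 1.0)) / np.sqrt(c)

def sinkhorn(cost, reg=0.5, n_iters=200):
    """Entropic optimal transport between uniform marginals.
    Returns a transport plan of shape (n, m)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)                       # uniform source weights
    b = np.full(m, 1.0 / m)                       # uniform target weights
    K = np.exp(-cost / reg)                       # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Random stand-ins for image and text features (hypothetical data).
rng = np.random.default_rng(0)
img = expmap0(0.5 * rng.normal(size=(5, 4)))
txt = expmap0(0.5 * rng.normal(size=(3, 4)))

cost = poincare_dist(img, txt)
plan = sinkhorn(cost / cost.max())                # normalize cost for stability
```

Each entry `plan[i, j]` can be read as a soft correspondence between image feature `i` and text feature `j`. Because the cost is the hyperbolic geodesic distance rather than a Euclidean one, points near the boundary of the ball (deeper in a hierarchy, under this assumed model) are far apart even when their Euclidean gap is small.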