ChartMuseum is a chart question answering benchmark designed to evaluate reasoning capabilities of large vision-language models (LVLMs) over real-world chart images. The benchmark consists of 1162 ...
Prithvi-EO-2.0 is based on the ViT architecture, pretrained using a masked autoencoder (MAE) approach, with two major modifications as shown in the figure below. Second, we considered geolocation ...
Abstract: Deep neural networks (DNNs) reveal significant robustness deficiencies due to their susceptibility to being misled by small and imperceptible adversarial examples, thus it is crucial to ...
Abstract: Although the generative novel view synthesis frameworks have already achieved the generation of target views from specific viewpoints, they still rely on either direct or indirect input of ...